We conducted a comprehensive analysis of intron positions in the Mus musculus genome by comparing genomic sequences in the GenBank database with cDNA sequences in the mouse cDNA library recently developed by the RIKEN Genomic Sciences Center. Our results confirm that introns tend to be located toward the 5′ end of the gene. We conducted the same type of analysis on the coding regions of seven eukaryotes (Saccharomyces cerevisiae, Plasmodium falciparum, Caenorhabditis elegans, Drosophila melanogaster, M. musculus, Homo sapiens, Arabidopsis thaliana). In genes with a single intron, the intron is biased toward the 5′ end in all species except A. thaliana. We also measured the distance from the start codon to the intron position and found that, in S. cerevisiae and P. falciparum, single introns are preferentially located immediately after the start codon. We discuss three possible explanations for these findings: (1) they are a consequence of intron loss mediated by reverse transcriptase; (2) they are functionally necessary; and (3) they are related to the mechanism of pre-mRNA splicing.
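The two measurements described above (the distance from the start codon to an intron, and the intron's relative location within the coding region) can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes plus-strand genes, 0-based half-open genomic coordinates, and exon and CDS boundaries already derived from a genomic/cDNA alignment; the function names `coding_intron_offsets` and `relative_positions` are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), 0-based, half-open, plus strand (assumed)


def coding_intron_offsets(
    exons: List[Interval], cds_start: int, cds_end: int
) -> Tuple[List[int], int]:
    """Return (offsets, cds_length).

    Each offset is the number of coding nucleotides between the first base
    of the start codon and an intron, measured along the spliced coding
    sequence. Introns lying entirely in the UTRs are ignored.
    """
    # Clip every exon to the coding region and keep the non-empty parts.
    segments = []
    for start, end in sorted(exons):
        s, e = max(start, cds_start), min(end, cds_end)
        if s < e:
            segments.append((s, e))

    cds_length = sum(e - s for s, e in segments)

    # An intron follows every coding segment except the last one.
    offsets = []
    running = 0
    for s, e in segments[:-1]:
        running += e - s
        offsets.append(running)
    return offsets, cds_length


def relative_positions(
    exons: List[Interval], cds_start: int, cds_end: int
) -> List[float]:
    """Intron positions as fractions of the coding length (0 = 5' end, 1 = 3' end)."""
    offsets, cds_length = coding_intron_offsets(exons, cds_start, cds_end)
    if cds_length == 0:
        return []
    return [offset / cds_length for offset in offsets]


# Illustrative single-intron gene (made-up coordinates, not real data).
example_exons = [(0, 100), (200, 500)]
example_offsets, example_length = coding_intron_offsets(example_exons, 40, 500)
example_relative = relative_positions(example_exons, 40, 500)
```

In this made-up example the intron lies 60 coding nucleotides after the start of the start codon in a 360-nucleotide coding sequence, so its relative position falls in the 5′ part of the coding region. A 5′ bias of the kind reported here would appear as an excess of relative positions below 0.5 across many single-intron genes.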