Computer analyses of the entire GenBank database were conducted to examine the correlation between splicing sites and codon positions in reading frames. Intron insertion patterns (i.e., splicing site locations with respect to codon positions) were analyzed for all 64 codons in all the eukaryote taxonomic groups: primates, rodents, mammals, vertebrates, invertebrates, and plants. We found that reading frames are interrupted by an intron at a codon boundary (as opposed to the middle of a codon) significantly more often than expected. This observation is consistent with the exon shuffling hypothesis, because exons that end at codon boundaries can be concatenated without causing a frame shift and are thus evolutionarily advantageous. On the other hand, when introns interrupt the middle of a codon, they lie between the first and second bases much more frequently than between the second and third bases, despite the fact that boundaries between the first and second bases of codons are generally far more important than those between the second and third bases. The reason for this is unclear and remains to be explained. We also show that the length of an exon is a multiple of 3 more frequently than expected. Furthermore, the total length of two consecutive exons is also a multiple of 3 more frequently than expected. All of the observations above are consistent with results recently published by Long, Rosenberg, and Gilbert (1995).
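The quantities discussed in the abstract can be illustrated with a minimal Python sketch. It is not the authors' program. It assumes each gene is given as a list of coding exon lengths (in bases) starting at the first base of the start codon; the function names and the example lengths are hypothetical. An intron's position within a codon is taken as the cumulative coding length before it, modulo 3: 0 means a codon boundary, 1 means between the first and second bases, and 2 means between the second and third bases.

```python
from collections import Counter


def intron_positions(exon_lengths):
    """Return the codon position (0, 1, or 2) of each intron.

    Assumes exon_lengths holds only coding bases, in transcript order,
    with the first exon starting at the first base of the start codon.
    """
    positions = []
    coding_bases_so_far = 0
    # The last exon is not followed by an intron, so it is skipped.
    for length in exon_lengths[:-1]:
        coding_bases_so_far += length
        positions.append(coding_bases_so_far % 3)
    return positions


def count_multiple_of_three_exons(exon_lengths):
    """Count exons whose length is a multiple of 3."""
    return sum(1 for length in exon_lengths if length % 3 == 0)


def count_multiple_of_three_pairs(exon_lengths):
    """Count consecutive exon pairs whose combined length is a multiple of 3."""
    return sum(
        1
        for first, second in zip(exon_lengths, exon_lengths[1:])
        if (first + second) % 3 == 0
    )


# Hypothetical gene with four coding exons (lengths are made up).
example_exons = [90, 61, 62, 30]

positions = intron_positions(example_exons)
position_counts = Counter(positions)

print(positions)                                     # [0, 1, 0]
print(position_counts[0])                            # introns at codon boundaries: 2
print(count_multiple_of_three_exons(example_exons))  # 2
print(count_multiple_of_three_pairs(example_exons))  # 1
```

The sketch only tallies observed counts for one gene. Deciding whether a count is "more often than expected" requires a null model and a statistical test over many genes, which the abstract does not specify and which is not attempted here.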
|Number of pages||5|
|Journal||Molecular Biology and Evolution|
|Publication status||Published - 1996 Nov|
- sequence analysis
ASJC Scopus subject areas
- Ecology, Evolution, Behavior and Systematics
- Molecular Biology