TY - JOUR
T1 - Comparison of correspondence analysis methods for synonymous codon usage in bacteria
AU - Suzuki, Haruo
AU - Brown, Celeste J.
AU - Forney, Larry J.
AU - Top, Eva M.
PY - 2008/12
Y1 - 2008/12
N2 - Synonymous codon usage varies both between organisms and among genes within a genome; this variation arises from differences in G + C content, replication strand skew, or gene expression levels. Correspondence analysis (CA) is widely used to identify major sources of variation in synonymous codon usage among genes and provides a way to identify horizontally transferred or highly expressed genes. Four methods of CA have been developed: three are distinguished by their input data (absolute codon frequency, relative codon frequency, or relative synonymous codon usage (RSCU)), and the fourth is within-group CA (WCA). Although different CA methods have been used in the past, no comprehensive comparative study has been performed to evaluate their effectiveness. Here, the four CA methods were evaluated by applying them to 241 bacterial genome sequences. The results indicate that WCA is more effective than the other three methods in generating axes that reflect variations in synonymous codon usage. Furthermore, WCA reveals sources of variation that were previously unnoticed in some genomes; for example, synonymous codon usage related to replication strand skew was detected in Rickettsia prowazekii. Though CA based on RSCU is widely used, our evaluation indicates that this method does not perform as well as WCA.
KW - Correspondence analysis
KW - Horizontal gene transfer
KW - Strand-specific mutational bias
KW - Synonymous codon usage
KW - Translational selection
UR - http://www.scopus.com/inward/record.url?scp=59149091773&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=59149091773&partnerID=8YFLogxK
U2 - 10.1093/dnares/dsn028
DO - 10.1093/dnares/dsn028
M3 - Article
C2 - 18940873
AN - SCOPUS:59149091773
SN - 1340-2838
VL - 15
SP - 357
EP - 365
JO - DNA Research
JF - DNA Research
IS - 6
ER -