Comparison of correspondence analysis methods for synonymous codon usage in bacteria

Haruo Suzuki, Celeste J. Brown, Larry J. Forney, Eva M. Top

Research output: Contribution to journal (Article)

50 Citations (Scopus)

Abstract

Synonymous codon usage varies both between organisms and among genes within a genome, and arises due to differences in G + C content, replication strand skew, or gene expression levels. Correspondence analysis (CA) is widely used to identify major sources of variation in synonymous codon usage among genes and provides a way to identify horizontally transferred or highly expressed genes. Four methods of CA have been developed based on three kinds of input data: absolute codon frequency, relative codon frequency, and relative synonymous codon usage (RSCU) as well as within-group CA (WCA). Although different CA methods have been used in the past, no comprehensive comparative study has been performed to evaluate their effectiveness. Here, the four CA methods were evaluated by applying them to 241 bacterial genome sequences. The results indicate that WCA is more effective than the other three methods in generating axes that reflect variations in synonymous codon usage. Furthermore, WCA reveals sources that were previously unnoticed in some genomes; e.g. synonymous codon usage related to replication strand skew was detected in Rickettsia prowazekii. Though CA based on RSCU is widely used, our evaluation indicates that this method does not perform as well as WCA.
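As a rough illustration of the three kinds of input data named in the abstract, the sketch below computes absolute codon frequency, relative codon frequency, and RSCU for a single toy gene. It is not the authors' code: the function names, the partial codon table, and the example sequence are all hypothetical, and RSCU is assumed to follow the usual definition (observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally). The correspondence analysis step itself is not shown.

```python
from collections import Counter

# Partial synonymous-codon table (standard genetic code). This is an
# illustrative subset; a real analysis would cover every amino acid
# that has more than one codon.
SYNONYMS = {
    "Phe": ["TTT", "TTC"],
    "Lys": ["AAA", "AAG"],
    "Ala": ["GCT", "GCC", "GCA", "GCG"],
}


def codon_counts(seq):
    """Absolute codon frequency: count in-frame codons listed in SYNONYMS."""
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    known = {c for group in SYNONYMS.values() for c in group}
    return Counter(c for c in codons if c in known)


def relative_frequency(counts):
    """Relative codon frequency: each count divided by the gene's total."""
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}


def rscu(counts):
    """RSCU: observed count over the count expected under equal usage."""
    values = {}
    for codons in SYNONYMS.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from this gene: RSCU undefined
        expected = total / len(codons)
        for c in codons:
            values[c] = counts[c] / expected
    return values


# Hypothetical toy gene: TTT TTT TTC AAA AAG GCT GCT GCT GCC
gene = "TTTTTTTTCAAAAAGGCTGCTGCTGCC"
counts = codon_counts(gene)
rel = relative_frequency(counts)
usage = rscu(counts)
```

In a full analysis, each gene would contribute one row of such values, and the resulting gene-by-codon table would be the input to correspondence analysis.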

Original language: English
Pages (from-to): 357-365
Number of pages: 9
Journal: DNA Research
Volume: 15
Issue number: 6
DOI: 10.1093/dnares/dsn028
Publication status: Published - 2008 Dec
Externally published: Yes

Fingerprint

Codon
Bacteria
Rickettsia prowazekii
Genome
Genes
Bacterial Genomes
Base Composition
Gene Expression

Keywords

  • Correspondence analysis
  • Horizontal gene transfer
  • Strand-specific mutational bias
  • Synonymous codon usage
  • Translational selection

ASJC Scopus subject areas

  • Medicine (all)
  • Molecular Biology
  • Genetics

Cite this

Comparison of correspondence analysis methods for synonymous codon usage in bacteria. / Suzuki, Haruo; Brown, Celeste J.; Forney, Larry J.; Top, Eva M.

In: DNA Research, Vol. 15, No. 6, 12.2008, p. 357-365.

@article{0245918137ff4ef6b0ab5bbc6ebb12c5,
title = "Comparison of correspondence analysis methods for synonymous codon usage in bacteria",
keywords = "Correspondence analysis, Horizontal gene transfer, Strand-specific mutational bias, Synonymous codon usage, Translational selection",
author = "Haruo Suzuki and Brown, {Celeste J.} and Forney, {Larry J.} and Top, {Eva M.}",
year = "2008",
month = "12",
doi = "10.1093/dnares/dsn028",
language = "English",
volume = "15",
pages = "357--365",
journal = "DNA Research",
issn = "1340-2838",
publisher = "Oxford University Press",
number = "6",

}

TY - JOUR
T1 - Comparison of correspondence analysis methods for synonymous codon usage in bacteria
AU - Suzuki, Haruo
AU - Brown, Celeste J.
AU - Forney, Larry J.
AU - Top, Eva M.
PY - 2008/12
Y1 - 2008/12
KW - Correspondence analysis
KW - Horizontal gene transfer
KW - Strand-specific mutational bias
KW - Synonymous codon usage
KW - Translational selection
UR - http://www.scopus.com/inward/record.url?scp=59149091773&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=59149091773&partnerID=8YFLogxK
U2 - 10.1093/dnares/dsn028
DO - 10.1093/dnares/dsn028
M3 - Article
C2 - 18940873
AN - SCOPUS:59149091773
VL - 15
SP - 357
EP - 365
JO - DNA Research
JF - DNA Research
SN - 1340-2838
IS - 6
ER -