A problem in multivariate analysis of codon usage data and a possible solution

Haruo Suzuki, Rintaro Saito, Masaru Tomita

Research output: Contribution to journal › Article

23 Citations (Scopus)

Abstract

Multivariate analyses are often used to identify major trends of variation in synonymous codon usage among genes. These analyses need to be performed on properly normalized codon usage data to avoid biases masking this synonymous variation, i.e., gene length, amino acid usage, and codon degeneracy; however, previous studies have failed to do so. In this paper, we demonstrate that the use of alternative normalized data (called 'relative adaptiveness' in the literature) can avoid all these biases and furthermore, can identify more trends of variation among genes, including GC-ending codon usage, GT-ending codon usage, and gene expression level.
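The normalization named in the abstract can be sketched as follows. This is a minimal illustration, assuming the usual definition of relative adaptiveness: within one gene, each codon count is divided by the count of the most frequent synonymous codon for the same amino acid. The function name, the abbreviated codon table, and the choice to return None for amino acids absent from the gene are assumptions made for this example, not details taken from the paper.

```python
from collections import Counter

# Hypothetical, abbreviated synonymous-codon table (illustration only);
# a real analysis would cover every degenerate amino acid of the genetic code.
SYNONYMOUS = {
    "Phe": ["TTT", "TTC"],
    "Lys": ["AAA", "AAG"],
    "Gly": ["GGT", "GGC", "GGA", "GGG"],
}


def relative_adaptiveness(cds, table=SYNONYMOUS):
    """Return {codon: count / max synonymous count} for one coding sequence."""
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    values = {}
    for codons in table.values():
        top = max(counts[c] for c in codons)
        for c in codons:
            # Assumption: if the amino acid never occurs in the gene, the
            # value is undefined and reported as None.
            values[c] = counts[c] / top if top else None
    return values


# Toy gene: Phe x3 (TTT, TTT, TTC), Lys x2 (AAA, AAG), Gly x1 (GGT)
gene = "".join(["TTT", "TTT", "TTC", "AAA", "AAG", "GGT"])
w = relative_adaptiveness(gene)
print(w["TTT"], w["TTC"], w["GGT"], w["GGC"])
```

Because every value is a ratio within a single synonymous group, it does not grow with gene length or with how often the amino acid is used, and the most frequent codon of each group scores 1 whether the group has two members or four.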

Original language: English
Pages (from-to): 6499-6504
Number of pages: 6
Journal: FEBS Letters
Volume: 579
Issue number: 28
DOI: 10.1016/j.febslet.2005.10.032
Publication status: Published - 2005 Nov 21

Fingerprint

Codon
Multivariate Analysis
Genes
Gene expression
Amino Acids

Keywords

  • Amino acid usage
  • Codon degeneracy
  • GC-ending codon usage
  • Gene expression level
  • Gene length
  • GT-ending codon usage
  • Multivariate analysis
  • Principal component analysis
  • Synonymous codon usage

ASJC Scopus subject areas

  • Biochemistry
  • Biophysics
  • Molecular Biology

Cite this

A problem in multivariate analysis of codon usage data and a possible solution. / Suzuki, Haruo; Saito, Rintaro; Tomita, Masaru.

In: FEBS Letters, Vol. 579, No. 28, 21.11.2005, p. 6499-6504.

@article{a002106a38b0435396b9e76e36be26a2,
title = "A problem in multivariate analysis of codon usage data and a possible solution",
abstract = "Multivariate analyses are often used to identify major trends of variation in synonymous codon usage among genes. These analyses need to be performed on properly normalized codon usage data to avoid biases masking this synonymous variation, i.e., gene length, amino acid usage, and codon degeneracy; however, previous studies have failed to do so. In this paper, we demonstrate that the use of alternative normalized data (called 'relative adaptiveness' in the literature) can avoid all these biases and furthermore, can identify more trends of variation among genes, including GC-ending codon usage, GT-ending codon usage, and gene expression level.",
keywords = "Amino acid usage, Codon degeneracy, GC-ending codon usage, Gene expression level, Gene length, GT-ending codon usage, Multivariate analysis, Principal component analysis, Synonymous codon usage",
author = "Haruo Suzuki and Rintaro Saito and Masaru Tomita",
year = "2005",
month = "11",
day = "21",
doi = "10.1016/j.febslet.2005.10.032",
language = "English",
volume = "579",
pages = "6499--6504",
journal = "FEBS Letters",
issn = "0014-5793",
publisher = "Elsevier",
number = "28",

}

TY - JOUR
T1 - A problem in multivariate analysis of codon usage data and a possible solution
AU - Suzuki, Haruo
AU - Saito, Rintaro
AU - Tomita, Masaru
PY - 2005/11/21
Y1 - 2005/11/21
N2 - Multivariate analyses are often used to identify major trends of variation in synonymous codon usage among genes. These analyses need to be performed on properly normalized codon usage data to avoid biases masking this synonymous variation, i.e., gene length, amino acid usage, and codon degeneracy; however, previous studies have failed to do so. In this paper, we demonstrate that the use of alternative normalized data (called 'relative adaptiveness' in the literature) can avoid all these biases and furthermore, can identify more trends of variation among genes, including GC-ending codon usage, GT-ending codon usage, and gene expression level.
KW - Amino acid usage
KW - Codon degeneracy
KW - GC-ending codon usage
KW - Gene expression level
KW - Gene length
KW - GT-ending codon usage
KW - Multivariate analysis
KW - Principal component analysis
KW - Synonymous codon usage
UR - http://www.scopus.com/inward/record.url?scp=27744507140&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=27744507140&partnerID=8YFLogxK
U2 - 10.1016/j.febslet.2005.10.032
DO - 10.1016/j.febslet.2005.10.032
M3 - Article
VL - 579
SP - 6499
EP - 6504
JO - FEBS Letters
JF - FEBS Letters
SN - 0014-5793
IS - 28
ER -