TY - JOUR
T1 - Comparison and characterization of proteomes in the three domains of life using 2D correlation analysis
AU - Fujishima, Kosuke
AU - Komasa, Mizuki
AU - Kitamura, Sayaka
AU - Tomita, Masaru
AU - Kanai, Akio
PY - 2008
Y1 - 2008
N2 - Proteins are a major regulatory component in complex biological systems. Among them, DNA/RNA-binding proteins, the key components of the central dogma of molecular biology, and membrane proteins, which are necessary for both signal transduction and metabolite transport, are suggested to be the most important protein families that arose in the early stage of life. In this study, we computationally analyzed the whole proteome data of six model species to survey the protein diversity in the three domains of life (Bacteria, Archaea and Eukaryota), focusing especially on the above two protein families. To compare the protein distribution among the six model species, we calculated various protein profiles: hydropathy, molecular weight, amino acid composition and periodicity for each protein. We found a domain-specific distribution of the proteome based on 2D correlation analysis of hydropathy and molecular weight. Further, the merged protein distribution of Archaea and other domains revealed many membrane proteins localized in Bacteria-specific regions with a high ratio of hydropathy and many DNA/RNA-binding proteins localized in Eukaryota-specific regions with a low ratio of hydropathy. Since about half of the proteins encoded in the genome are still functionally unknown, we further conducted Support Vector Machine (SVM)-based functional prediction using amino acid composition (CO score) and periodicity (PD score) as feature vectors to predict the overall number of DNA/RNA-binding proteins and membrane proteins in the proteome. Our estimates indicated that these two functional categories account for approximately 60% to 80% of the proteome, and further, that the proportion of the two categories varies among the three domains of life, suggesting that the proteome has gone through different selective pressures during evolution.
AB - Proteins are a major regulatory component in complex biological systems. Among them, DNA/RNA-binding proteins, the key components of the central dogma of molecular biology, and membrane proteins, which are necessary for both signal transduction and metabolite transport, are suggested to be the most important protein families that arose in the early stage of life. In this study, we computationally analyzed the whole proteome data of six model species to survey the protein diversity in the three domains of life (Bacteria, Archaea and Eukaryota), focusing especially on the above two protein families. To compare the protein distribution among the six model species, we calculated various protein profiles: hydropathy, molecular weight, amino acid composition and periodicity for each protein. We found a domain-specific distribution of the proteome based on 2D correlation analysis of hydropathy and molecular weight. Further, the merged protein distribution of Archaea and other domains revealed many membrane proteins localized in Bacteria-specific regions with a high ratio of hydropathy and many DNA/RNA-binding proteins localized in Eukaryota-specific regions with a low ratio of hydropathy. Since about half of the proteins encoded in the genome are still functionally unknown, we further conducted Support Vector Machine (SVM)-based functional prediction using amino acid composition (CO score) and periodicity (PD score) as feature vectors to predict the overall number of DNA/RNA-binding proteins and membrane proteins in the proteome. Our estimates indicated that these two functional categories account for approximately 60% to 80% of the proteome, and further, that the proportion of the two categories varies among the three domains of life, suggesting that the proteome has gone through different selective pressures during evolution.
UR - http://www.scopus.com/inward/record.url?scp=54049149033&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=54049149033&partnerID=8YFLogxK
U2 - 10.1143/PTPS.173.206
DO - 10.1143/PTPS.173.206
M3 - Article
AN - SCOPUS:54049149033
SN - 0375-9687
SP - 206
EP - 218
JO - Progress of Theoretical Physics Supplement
JF - Progress of Theoretical Physics Supplement
IS - 173
ER -