Proteome-wide prediction of novel DNA/RNA-binding proteins using amino acid composition and periodicity in the hyperthermophilic archaeon Pyrococcus furiosus

Kosuke Fujishima, Mizuki Komasa, Sayaka Kitamura, Haruo Suzuki, Masaru Tomita, Akio Kanai

Research output: Contribution to journal › Article

17 Citations (Scopus)

Abstract

Proteins play a critical role in complex biological systems, yet about half of the proteins in publicly available databases are annotated as functionally unknown. Proteome-wide functional classification using bioinformatics approaches thus is becoming an important method for revealing unknown protein functions. Using the hyperthermophilic archaeon Pyrococcus furiosus as a model species, we used the support vector machine (SVM) method to discriminate DNA/RNA-binding proteins from proteins with other functions, using amino acid composition and periodicities as feature vectors. We defined this value as the composition score (CO) and periodicity score (PD). The P. furiosus proteins were classified into three classes (I-III) on the basis of the two-dimensional correlation analysis of CO score and PD score. As a result, approximately 87% of the functionally known proteins categorized as class I proteins (CO score + PD score > 0.6) were found to be DNA/RNA-binding proteins. Applying the two-dimensional correlation analysis to the 994 hypothetical proteins in P. furiosus, a total of 151 proteins were predicted to be novel DNA/RNA-binding protein candidates. DNA/RNA-binding activities of randomly chosen hypothetical proteins were experimentally verified. Six out of seven candidate proteins in class I possessed DNA/RNA-binding activities, supporting the efficacy of our method.
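The two ingredients the abstract states explicitly, an amino acid composition feature vector and the class I rule (CO score + PD score > 0.6), can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names are hypothetical, the periodicity features and the SVM training that produce the CO and PD scores are not described in the abstract and are left out, and the rules for classes II and III are likewise not given here.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order that defines the feature vector layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> list[float]:
    """Return the fraction of each standard residue, in AMINO_ACIDS order.

    Hypothetical helper: non-standard characters are ignored.
    """
    seq = sequence.upper()
    counts = Counter(res for res in seq if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def is_class_one(co_score: float, pd_score: float, threshold: float = 0.6) -> bool:
    """Class I rule as quoted in the abstract: CO score + PD score > 0.6.

    The scores themselves are assumed to come from separately trained SVMs.
    """
    return co_score + pd_score > threshold
```

In such a setup, the composition vector would be one input to a classifier, and `is_class_one` would be applied to the two resulting scores for each protein.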

Original language: English
Pages (from-to): 91-102
Number of pages: 12
Journal: DNA Research
Volume: 14
Issue number: 3
DOI: 10.1093/dnares/dsm011
Publication status: Published - 2007 Jun

Fingerprint

Pyrococcus furiosus
RNA-Binding Proteins
Archaea
DNA-Binding Proteins
Periodicity
Proteome
Amino Acids
Proteins
RNA
DNA
Computational Biology

Keywords

  • Amino acid periodicity
  • Archaea
  • DNA/RNA-binding protein
  • Support vector machine

ASJC Scopus subject areas

  • Genetics
  • Molecular Biology

Cite this

Proteome-wide prediction of novel DNA/RNA-binding proteins using amino acid composition and periodicity in the hyperthermophilic archaeon Pyrococcus furiosus. / Fujishima, Kosuke; Komasa, Mizuki; Kitamura, Sayaka; Suzuki, Haruo; Tomita, Masaru; Kanai, Akio.

In: DNA Research, Vol. 14, No. 3, 06.2007, p. 91-102.


@article{3d7440d47d62417bb20694f945db1a79,
title = "Proteome-wide prediction of novel DNA/RNA-binding proteins using amino acid composition and periodicity in the hyperthermophilic archaeon {P}yrococcus furiosus",
abstract = "Proteins play a critical role in complex biological systems, yet about half of the proteins in publicly available databases are annotated as functionally unknown. Proteome-wide functional classification using bioinformatics approaches thus is becoming an important method for revealing unknown protein functions. Using the hyperthermophilic archaeon Pyrococcus furiosus as a model species, we used the support vector machine (SVM) method to discriminate DNA/RNA-binding proteins from proteins with other functions, using amino acid composition and periodicities as feature vectors. We defined this value as the composition score (CO) and periodicity score (PD). The P. furiosus proteins were classified into three classes (I-III) on the basis of the two-dimensional correlation analysis of CO score and PD score. As a result, approximately 87\% of the functionally known proteins categorized as class I proteins (CO score + PD score > 0.6) were found to be DNA/RNA-binding proteins. Applying the two-dimensional correlation analysis to the 994 hypothetical proteins in P. furiosus, a total of 151 proteins were predicted to be novel DNA/RNA-binding protein candidates. DNA/RNA-binding activities of randomly chosen hypothetical proteins were experimentally verified. Six out of seven candidate proteins in class I possessed DNA/RNA-binding activities, supporting the efficacy of our method.",
keywords = "Amino acid periodicity, Archaea, DNA/RNA-binding protein, Support vector machine",
author = "Kosuke Fujishima and Mizuki Komasa and Sayaka Kitamura and Haruo Suzuki and Masaru Tomita and Akio Kanai",
year = "2007",
month = "6",
doi = "10.1093/dnares/dsm011",
language = "English",
volume = "14",
pages = "91--102",
journal = "DNA Research",
issn = "1340-2838",
publisher = "Oxford University Press",
number = "3",

}

TY  - JOUR
T1  - Proteome-wide prediction of novel DNA/RNA-binding proteins using amino acid composition and periodicity in the hyperthermophilic archaeon Pyrococcus furiosus
AU  - Fujishima, Kosuke
AU  - Komasa, Mizuki
AU  - Kitamura, Sayaka
AU  - Suzuki, Haruo
AU  - Tomita, Masaru
AU  - Kanai, Akio
PY  - 2007/6
Y1  - 2007/6
AB  - Proteins play a critical role in complex biological systems, yet about half of the proteins in publicly available databases are annotated as functionally unknown. Proteome-wide functional classification using bioinformatics approaches thus is becoming an important method for revealing unknown protein functions. Using the hyperthermophilic archaeon Pyrococcus furiosus as a model species, we used the support vector machine (SVM) method to discriminate DNA/RNA-binding proteins from proteins with other functions, using amino acid composition and periodicities as feature vectors. We defined this value as the composition score (CO) and periodicity score (PD). The P. furiosus proteins were classified into three classes (I-III) on the basis of the two-dimensional correlation analysis of CO score and PD score. As a result, approximately 87% of the functionally known proteins categorized as class I proteins (CO score + PD score > 0.6) were found to be DNA/RNA-binding proteins. Applying the two-dimensional correlation analysis to the 994 hypothetical proteins in P. furiosus, a total of 151 proteins were predicted to be novel DNA/RNA-binding protein candidates. DNA/RNA-binding activities of randomly chosen hypothetical proteins were experimentally verified. Six out of seven candidate proteins in class I possessed DNA/RNA-binding activities, supporting the efficacy of our method.
KW  - Amino acid periodicity
KW  - Archaea
KW  - DNA/RNA-binding protein
KW  - Support vector machine
UR  - http://www.scopus.com/inward/record.url?scp=34748889938&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=34748889938&partnerID=8YFLogxK
U2  - 10.1093/dnares/dsm011
DO  - 10.1093/dnares/dsm011
M3  - Article
VL  - 14
SP  - 91
EP  - 102
JO  - DNA Research
JF  - DNA Research
SN  - 1340-2838
IS  - 3
ER  - 