TY - JOUR
T1 - Sequence and structural analyses for functional non-coding RNAs
AU - Sakakibara, Yasubumi
AU - Sato, Kengo
N1 - Funding Information:
This work is supported in part by Grant-in-Aid for Scientific Research on Priority Area No. 17018029 and grants from the Non-coding RNA Project by New Energy and Industrial Technology Development Organization (NEDO) of Japan.
Publisher Copyright:
© Springer-Verlag Berlin Heidelberg 2009.
PY - 2009
Y1 - 2009
AB - Analysis and detection of functional RNAs are currently important topics in both molecular biology and bioinformatics research. Several computational methods based on stochastic context-free grammars (SCFGs) have been developed for modeling and analyzing functional RNA sequences. These grammatical methods have succeeded in modeling typical secondary structures of RNAs and are used for structural alignments of RNA sequences. Such stochastic models, however, are not sufficient to discriminate member sequences of an RNA family from non-members, and hence to detect non-coding RNA regions in genome sequences. Recently, support vector machines (SVMs) and kernel function techniques have been actively studied and proposed as solutions to various problems in bioinformatics. SVMs are trained on positive and negative samples and have strong, accurate discrimination abilities, and hence are more appropriate for discrimination tasks. A few kernel functions that extend the string kernel to measure the similarity of two RNA sequences from the viewpoint of secondary structure have been proposed. In this article, we give an overview of recent progress in SCFG-based methods for RNA sequence analysis and of novel kernel functions that measure the similarity of two RNA sequences and were developed for use with SVMs in discriminating members of an RNA family from non-members.
UR - http://www.scopus.com/inward/record.url?scp=85019974545&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85019974545&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-88869-7_5
DO - 10.1007/978-3-540-88869-7_5
M3 - Article
AN - SCOPUS:85019974545
SN - 1619-7127
SP - 63
EP - 79
JO - Natural Computing Series
JF - Natural Computing Series
IS - 9783540888680
ER -