Directed acyclic graph kernels for structural RNA analysis

Kengo Sato, Toutai Mituyama, Kiyoshi Asai, Yasubumi Sakakibara

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

Background: Recent discoveries of a large variety of important roles for non-coding RNAs (ncRNAs) have been reported by numerous researchers. In order to analyze ncRNAs by kernel methods including support vector machines, we propose stem kernels as an extension of string kernels for measuring the similarities between two RNA sequences from the viewpoint of secondary structures. However, applying stem kernels directly to large data sets of ncRNAs is impractical due to their computational complexity.

Results: We have developed a new technique based on directed acyclic graphs (DAGs) derived from base-pairing probability matrices of RNA sequences that significantly increases the computation speed of stem kernels. Furthermore, we propose profile-profile stem kernels for multiple alignments of RNA sequences which utilize base-pairing probability matrices for multiple alignments instead of those for individual sequences. Our kernels outperformed the existing methods with respect to the detection of known ncRNAs and kernel hierarchical clustering.

Conclusion: Stem kernels can be utilized as a reliable similarity measure of structural RNAs, and can be used in various kernel-based applications.
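The abstract does not spell out how a DAG is derived from a base-pairing probability matrix. The following is a minimal sketch of one plausible construction, assuming that nodes are base pairs whose probability meets a cutoff and that an edge runs from each pair to every pair nested strictly inside it. The names `build_pair_dag` and `longest_nested_chain`, the `threshold` value, and the toy probabilities are hypothetical illustrations, not the authors' implementation.

```python
from typing import Dict, List, Tuple

Pair = Tuple[int, int]  # (i, j) with i < j: positions of two paired bases


def build_pair_dag(bpp: Dict[Pair, float], threshold: float = 0.1) -> Dict[Pair, List[Pair]]:
    """Build a DAG over base pairs (assumed construction, see lead-in).

    bpp maps a pair (i, j) to its base-pairing probability.
    Pairs below `threshold` are dropped; an edge (i, j) -> (k, l)
    is added whenever (k, l) is nested strictly inside (i, j).
    """
    nodes = sorted(p for p, prob in bpp.items() if prob >= threshold)
    dag: Dict[Pair, List[Pair]] = {p: [] for p in nodes}
    for (i, j) in nodes:
        for (k, l) in nodes:
            if i < k and l < j:  # strict nesting, so no cycles can arise
                dag[(i, j)].append((k, l))
    return dag


def longest_nested_chain(dag: Dict[Pair, List[Pair]]) -> int:
    """Length of the longest path in the DAG, counted in nodes."""
    memo: Dict[Pair, int] = {}

    def depth(node: Pair) -> int:
        if node not in memo:
            memo[node] = 1 + max((depth(c) for c in dag[node]), default=0)
        return memo[node]

    return max((depth(n) for n in dag), default=0)


# Toy probabilities (made up for illustration only).
example_bpp = {(0, 9): 0.9, (1, 8): 0.8, (2, 7): 0.05, (3, 6): 0.6, (10, 15): 0.7}
example_dag = build_pair_dag(example_bpp, threshold=0.1)
print(example_dag)
print(longest_nested_chain(example_dag))
```

Because strict nesting is a partial order, the resulting graph cannot contain a cycle, and filtering out low-probability pairs keeps the node set small. That kind of pruning is the sort of saving a DAG-based formulation could offer over considering every possible pair.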

Original language: English
Article number: 318
Journal: BMC Bioinformatics
Volume: 9
DOI: 10.1186/1471-2105-9-318
Publication status: Published - 2008 Jul 22

Fingerprint

Untranslated RNA
Directed Acyclic Graph
RNA
kernel
Base Pairing
Support vector machines
Cluster Analysis
Pairing
Computational complexity
Research Personnel
Alignment
Kernel Methods
Hierarchical Clustering
Secondary Structure
Similarity Measure
Large Data Sets
Support Vector Machine
Computational Complexity
Strings

ASJC Scopus subject areas

  • Medicine(all)
  • Structural Biology
  • Applied Mathematics
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications

Cite this

Directed acyclic graph kernels for structural RNA analysis. / Sato, Kengo; Mituyama, Toutai; Asai, Kiyoshi; Sakakibara, Yasubumi.

In: BMC Bioinformatics, Vol. 9, 318, 22.07.2008.

@article{96f13466835845719eb1de76bd864791,
title = "Directed acyclic graph kernels for structural RNA analysis",
abstract = "Background: Recent discoveries of a large variety of important roles for non-coding RNAs (ncRNAs) have been reported by numerous researchers. In order to analyze ncRNAs by kernel methods including support vector machines, we propose stem kernels as an extension of string kernels for measuring the similarities between two RNA sequences from the viewpoint of secondary structures. However, applying stem kernels directly to large data sets of ncRNAs is impractical due to their computational complexity. Results: We have developed a new technique based on directed acyclic graphs (DAGs) derived from base-pairing probability matrices of RNA sequences that significantly increases the computation speed of stem kernels. Furthermore, we propose profile-profile stem kernels for multiple alignments of RNA sequences which utilize base-pairing probability matrices for multiple alignments instead of those for individual sequences. Our kernels outperformed the existing methods with respect to the detection of known ncRNAs and kernel hierarchical clustering. Conclusion: Stem kernels can be utilized as a reliable similarity measure of structural RNAs, and can be used in various kernel-based applications.",
author = "Kengo Sato and Toutai Mituyama and Kiyoshi Asai and Yasubumi Sakakibara",
year = "2008",
month = "7",
day = "22",
doi = "10.1186/1471-2105-9-318",
language = "English",
volume = "9",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",

}

TY - JOUR

T1 - Directed acyclic graph kernels for structural RNA analysis

AU - Sato, Kengo

AU - Mituyama, Toutai

AU - Asai, Kiyoshi

AU - Sakakibara, Yasubumi

PY - 2008/7/22

Y1 - 2008/7/22

N2 - Background: Recent discoveries of a large variety of important roles for non-coding RNAs (ncRNAs) have been reported by numerous researchers. In order to analyze ncRNAs by kernel methods including support vector machines, we propose stem kernels as an extension of string kernels for measuring the similarities between two RNA sequences from the viewpoint of secondary structures. However, applying stem kernels directly to large data sets of ncRNAs is impractical due to their computational complexity. Results: We have developed a new technique based on directed acyclic graphs (DAGs) derived from base-pairing probability matrices of RNA sequences that significantly increases the computation speed of stem kernels. Furthermore, we propose profile-profile stem kernels for multiple alignments of RNA sequences which utilize base-pairing probability matrices for multiple alignments instead of those for individual sequences. Our kernels outperformed the existing methods with respect to the detection of known ncRNAs and kernel hierarchical clustering. Conclusion: Stem kernels can be utilized as a reliable similarity measure of structural RNAs, and can be used in various kernel-based applications.

AB - Background: Recent discoveries of a large variety of important roles for non-coding RNAs (ncRNAs) have been reported by numerous researchers. In order to analyze ncRNAs by kernel methods including support vector machines, we propose stem kernels as an extension of string kernels for measuring the similarities between two RNA sequences from the viewpoint of secondary structures. However, applying stem kernels directly to large data sets of ncRNAs is impractical due to their computational complexity. Results: We have developed a new technique based on directed acyclic graphs (DAGs) derived from base-pairing probability matrices of RNA sequences that significantly increases the computation speed of stem kernels. Furthermore, we propose profile-profile stem kernels for multiple alignments of RNA sequences which utilize base-pairing probability matrices for multiple alignments instead of those for individual sequences. Our kernels outperformed the existing methods with respect to the detection of known ncRNAs and kernel hierarchical clustering. Conclusion: Stem kernels can be utilized as a reliable similarity measure of structural RNAs, and can be used in various kernel-based applications.

UR - http://www.scopus.com/inward/record.url?scp=49649096693&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=49649096693&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-9-318

DO - 10.1186/1471-2105-9-318

M3 - Article

C2 - 18647390

AN - SCOPUS:49649096693

VL - 9

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

M1 - 318

ER -