Pair stochastic tree adjoining grammars for aligning and predicting pseudoknot RNA structures

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

Motivation: Now that whole-genome sequences are available for many species, computational prediction of RNA secondary structures and computational identification of non-coding RNA regions by comparative genomics have become important, and both require more advanced alignment methods. Structural alignment of RNA sequences has recently been introduced to address these problems. By structural alignment we mean a pairwise alignment that aligns an unfolded RNA sequence to a folded RNA sequence of known secondary structure. Pair HMMs on tree structures (PHMMTSs), proposed by Sakakibara, are efficient automata-theoretic models for structural alignment of RNA secondary structures, but they cannot handle pseudoknots. Tree adjoining grammars (TAGs), on the other hand, are a subclass of context-sensitive grammars that is suitable for modeling pseudoknots. Our goal is to extend PHMMTSs by incorporating TAGs so that they can handle pseudoknots.

Results: We propose pair stochastic tree adjoining grammars (PSTAGs) for modeling RNA secondary structures including pseudoknots, and present strong experimental evidence that modeling pseudoknot structures significantly improves the prediction accuracy of RNA secondary structures. First, we extend the notion of PHMMTSs, defined on alignments of "trees", to PSTAGs, defined on alignments of "TAG (derivation) trees", which represent a top-down parsing process of TAGs and are functionally equivalent to derived trees of TAGs. Second, we modify PSTAGs so that they take as input a pair consisting of a linear sequence and a TAG tree representing a pseudoknot structure of RNA, and produce a structural alignment. Then, we develop a polynomial-time algorithm for obtaining an optimal structural alignment by PSTAGs, based on a dynamic programming parser. We have carried out several computational experiments on predicting pseudoknots with PSTAGs, and the results suggest that prediction of RNA pseudoknot structures by our method is more efficient and biologically plausible than by other conventional methods. The binary code for the PSTAG method is freely available from our website at http://www.dna.bio.keio.ac.jp/pstag/.
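
To make concrete why pseudoknots call for something beyond nested (tree-like) models, the following minimal Python sketch tests whether a set of base pairs contains a crossing, i.e. two pairs (i, j) and (k, l) with i < k < j < l. It is an illustration only and is not part of the PSTAG method. The bracket notation, in which `()` and `[]` mark two separate families of base pairs and `.` marks an unpaired base, is a common convention assumed here, and the function names `parse_brackets`, `crossing_pairs`, and `has_pseudoknot` are hypothetical.

```python
from typing import List, Tuple

Pair = Tuple[int, int]


def parse_brackets(structure: str) -> List[Pair]:
    """Return base pairs (i, j), i < j, from a bracket string.

    Assumed notation (not from the source): '(' / ')' and '[' / ']' are two
    independent bracket families; any other character is an unpaired base.
    """
    closers = {")": "(", "]": "["}
    stacks = {"(": [], "[": []}
    pairs: List[Pair] = []
    for pos, ch in enumerate(structure):
        if ch in stacks:
            stacks[ch].append(pos)
        elif ch in closers:
            stack = stacks[closers[ch]]
            if not stack:
                raise ValueError(f"unmatched {ch!r} at position {pos}")
            pairs.append((stack.pop(), pos))
    if any(stacks.values()):
        raise ValueError("unclosed bracket in structure")
    return sorted(pairs)


def crossing_pairs(pairs: List[Pair]) -> List[Tuple[Pair, Pair]]:
    """List every pair of base pairs that cross: i < k < j < l."""
    ordered = sorted(pairs)
    crossings = []
    for idx, (i, j) in enumerate(ordered):
        for k, l in ordered[idx + 1:]:
            # k > i is guaranteed by the sort; the pairs cross when the
            # second pair opens inside the first and closes outside it.
            if k < j < l:
                crossings.append(((i, j), (k, l)))
    return crossings


def has_pseudoknot(structure: str) -> bool:
    """True if the structure contains at least one crossing of base pairs."""
    return bool(crossing_pairs(parse_brackets(structure)))


if __name__ == "__main__":
    print(has_pseudoknot("((..))"))          # purely nested hairpin
    print(has_pseudoknot("((..[[..))..]]"))  # two interleaved stems
```

A purely nested structure such as `((..))` has no crossings and can be drawn as a tree, whereas the interleaved stems in `((..[[..))..]]` cross each other, which is the situation the abstract describes tree-based models as unable to handle.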

Original language: English
Title of host publication: Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004
Publisher: IEEE Computer Society
Pages: 290-299
Number of pages: 10
ISBN (Print): 0769521940, 9780769521947
Publication status: Published - 2004 Jan 1
Event: Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004 - Stanford, CA, United States
Duration: 2004 Aug 16 to 2004 Aug 19

Publication series

Name: Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004

Other

Other: Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004
Country: United States
City: Stanford, CA
Period: 04/8/16 to 04/8/19

ASJC Scopus subject areas

  • Engineering (all)

  • Cite this

    Matsui, H., Sato, K., & Sakakibara, Y. (2004). Pair stochastic tree adjoining grammars for aligning and predicting pseudoknot RNA structures. In Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004 (pp. 290-299). (Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004). IEEE Computer Society.