Prediction of gene structures from RNA-seq data using dual decomposition

Tatsumu Inatsuki, Kengo Sato, Yasubumi Sakakibara

Research output: Article

1 citation (Scopus)

Abstract

Numerous computational algorithms for predicting protein-coding genes from genomic sequences have been developed, and hidden Markov models (HMMs) have frequently been used to model gene structures. For eukaryotes, more complex gene structures such as introns make gene prediction much harder due to isoforms of transcripts by alternative splicing machinery. We develop a novel gene prediction method for eukaryote genomes that extends the traditional HMM-based gene prediction model by incorporating comprehensive evidence of transcripts by using RNA sequencing (RNA-seq) technology. We formulate gene prediction as an integer programming problem, and solve it by the dual decomposition technique. To confirm the utility of the proposed algorithm, computational experiments on benchmark datasets were conducted. The results show that our algorithm efficiently and effectively employs RNA-seq data in gene structure prediction.
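The abstract names dual decomposition but does not give the formulation, so the following is only a minimal sketch of the general technique on a toy problem, not the authors' model. It assumes two hypothetical per-position score vectors (`hmm_scores` and `rnaseq_scores`, both invented here for illustration), labels each position as exon (1) or non-exon (0), and relaxes the constraint that the two labelings agree with Lagrange multipliers updated by subgradient steps.

```python
from typing import List, Tuple


def dual_decomposition(
    hmm_scores: List[float],
    rnaseq_scores: List[float],
    max_iters: int = 100,
) -> Tuple[List[int], bool]:
    """Toy problem: maximise sum(hmm*x) + sum(rnaseq*y) subject to x == y,
    with x, y binary vectors. Both score vectors are illustrative assumptions."""
    n = len(hmm_scores)
    u = [0.0] * n  # one Lagrange multiplier per agreement constraint x[i] == y[i]
    x = [0] * n
    for t in range(1, max_iters + 1):
        # Solve each subproblem independently with multiplier-adjusted scores.
        x = [1 if a + ui > 0 else 0 for a, ui in zip(hmm_scores, u)]
        y = [1 if b - ui > 0 else 0 for b, ui in zip(rnaseq_scores, u)]
        if x == y:
            # Agreement certifies that x is optimal for the original problem.
            return x, True
        # Subgradient step on the dual, with a decaying step size 1/t.
        step = 1.0 / t
        u = [ui - step * (xi - yi) for ui, xi, yi in zip(u, x, y)]
    return x, False  # no agreement within max_iters; x is only a heuristic answer


labels, converged = dual_decomposition(
    [2.0, -1.0, 0.5, -3.0],  # hypothetical sequence-model scores
    [-1.0, 3.0, 0.2, 1.0],   # hypothetical RNA-seq evidence scores
)
print(labels, converged)
```

In a realistic setting each subproblem would be a structured decoder (for example, Viterbi decoding over an HMM) rather than an independent per-position threshold; the multiplier update would stay the same.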

Original language: English
Pages (from-to): 1-6
Number of pages: 6
Journal: IPSJ Transactions on Bioinformatics
Volume: 9
DOI: 10.2197/ipsjtbio.9.1
Publication status: Published - 1 Mar 2016

Fingerprint

RNA Sequence Analysis
RNA
Genes
Decomposition
Eukaryota
Hidden Markov models
Benchmarking
Alternative Splicing
Introns
Integer programming
Protein Isoforms
Genome
Machinery
Technology
Proteins

ASJC Scopus subject areas

  • Computer Science Applications
  • Biochemistry, Genetics and Molecular Biology (miscellaneous)

Cite this

@article{1bd70bd08e7a475985f30d374d8b7fa6,
title = "Prediction of gene structures from RNA-seq data using dual decomposition",
abstract = "Numerous computational algorithms for predicting protein-coding genes from genomic sequences have been developed, and hidden Markov models (HMMs) have frequently been used to model gene structures. For eukaryotes, more complex gene structures such as introns make gene prediction much harder due to isoforms of transcripts by alternative splicing machinery. We develop a novel gene prediction method for eukaryote genomes that extends the traditional HMM-based gene prediction model by incorporating comprehensive evidence of transcripts by using RNA sequencing (RNA-seq) technology. We formulate gene prediction as an integer programming problem, and solve it by the dual decomposition technique. To confirm the utility of the proposed algorithm, computational experiments on benchmark datasets were conducted. The results show that our algorithm efficiently and effectively employs RNA-seq data in gene structure prediction.",
keywords = "Dual decomposition, Gene structure prediction, Hidden Markov models, Lagrangian relaxation, RNA-seq",
author = "Tatsumu Inatsuki and Kengo Sato and Yasubumi Sakakibara",
year = "2016",
month = "3",
day = "1",
doi = "10.2197/ipsjtbio.9.1",
language = "English",
volume = "9",
pages = "1--6",
journal = "IPSJ Transactions on Bioinformatics",
issn = "1882-6679",
publisher = "Information Processing Society of Japan",
}

TY - JOUR

T1 - Prediction of gene structures from RNA-seq data using dual decomposition

AU - Inatsuki, Tatsumu

AU - Sato, Kengo

AU - Sakakibara, Yasubumi

PY - 2016/3/1

Y1 - 2016/3/1

N2 - Numerous computational algorithms for predicting protein-coding genes from genomic sequences have been developed, and hidden Markov models (HMMs) have frequently been used to model gene structures. For eukaryotes, more complex gene structures such as introns make gene prediction much harder due to isoforms of transcripts by alternative splicing machinery. We develop a novel gene prediction method for eukaryote genomes that extends the traditional HMM-based gene prediction model by incorporating comprehensive evidence of transcripts by using RNA sequencing (RNA-seq) technology. We formulate gene prediction as an integer programming problem, and solve it by the dual decomposition technique. To confirm the utility of the proposed algorithm, computational experiments on benchmark datasets were conducted. The results show that our algorithm efficiently and effectively employs RNA-seq data in gene structure prediction.

AB - Numerous computational algorithms for predicting protein-coding genes from genomic sequences have been developed, and hidden Markov models (HMMs) have frequently been used to model gene structures. For eukaryotes, more complex gene structures such as introns make gene prediction much harder due to isoforms of transcripts by alternative splicing machinery. We develop a novel gene prediction method for eukaryote genomes that extends the traditional HMM-based gene prediction model by incorporating comprehensive evidence of transcripts by using RNA sequencing (RNA-seq) technology. We formulate gene prediction as an integer programming problem, and solve it by the dual decomposition technique. To confirm the utility of the proposed algorithm, computational experiments on benchmark datasets were conducted. The results show that our algorithm efficiently and effectively employs RNA-seq data in gene structure prediction.

KW - Dual decomposition

KW - Gene structure prediction

KW - Hidden Markov models

KW - Lagrangian relaxation

KW - RNA-seq

UR - http://www.scopus.com/inward/record.url?scp=84975090081&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84975090081&partnerID=8YFLogxK

U2 - 10.2197/ipsjtbio.9.1

DO - 10.2197/ipsjtbio.9.1

M3 - Article

AN - SCOPUS:84975090081

VL - 9

SP - 1

EP - 6

JO - IPSJ Transactions on Bioinformatics

JF - IPSJ Transactions on Bioinformatics

SN - 1882-6679

ER -