Prediction of non-coding and antisense RNA genes in Escherichia coli with Gapped Markov Model

Nozomu Yachie, Koji Numata, Rintaro Saito, Akio Kanai, Masaru Tomita

Research output: Contribution to journal › Article

34 Citations (Scopus)

Abstract

A new mathematical index was developed to identify and characterize non-coding RNA (ncRNA) genes encoded in the Escherichia coli (E. coli) genome. Designated the GMMI (Gapped Markov Model Index), it was used to evaluate, on the basis of a Markov model, sequence patterns located at separate positions in consensus sequences, codon biases and/or possible RNA structures. The GMMI was able to separate a set of known mRNA sequences from a mixture of ncRNAs including tRNAs and rRNAs. The GMMI was therefore employed to predict novel ncRNA candidates. First, possible transcription units were extracted from the E. coli genome using consensus sequences for the sigma70 promoter and the rho-independent terminator. These units were then evaluated with the GMMI. This identified 133 candidate ncRNAs, including 29 previously annotated small RNA genes and 46 possible antisense ncRNAs. Furthermore, expression analysis confirmed 12 transcripts (including five antisense RNAs). These data suggest that the expression of small antisense RNAs in E. coli might be more common than previously thought.
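
The abstract does not give the GMMI formula, so the following Python sketch is only an illustration of the general idea of scoring "gapped" dependencies with a Markov model, not the authors' method. It assumes a first-order model over pairs of nucleotides separated by a fixed gap, trained separately on two sequence classes, and it scores a query by the mean log-likelihood ratio between the two models. The function names, the pseudocount, the maximum gap and the toy training sequences are all hypothetical.

```python
import math

ALPHABET = "ACGT"


def gapped_transition_probs(seqs, gap, pseudocount=1.0):
    """Estimate P(base at i+gap | base at i) from training sequences."""
    counts = {a: {b: pseudocount for b in ALPHABET} for a in ALPHABET}
    for s in seqs:
        for i in range(len(s) - gap):
            a, b = s[i], s[i + gap]
            # Skip pairs containing symbols outside the alphabet (e.g. N).
            if a in counts and b in counts[a]:
                counts[a][b] += 1
    probs = {}
    for a in ALPHABET:
        total = sum(counts[a].values())
        probs[a] = {b: counts[a][b] / total for b in ALPHABET}
    return probs


def train(seqs, max_gap):
    """Build one gapped transition table per gap 1..max_gap."""
    return {g: gapped_transition_probs(seqs, g) for g in range(1, max_gap + 1)}


def log_odds_score(seq, model_a, model_b):
    """Mean log-likelihood ratio (model_a vs model_b) over all gapped pairs."""
    total, n = 0.0, 0
    for g in model_a:
        pa, pb = model_a[g], model_b[g]
        for i in range(len(seq) - g):
            a, b = seq[i], seq[i + g]
            if a in pa and b in pa[a]:
                total += math.log(pa[a][b] / pb[a][b])
                n += 1
    return total / n if n else 0.0


# Toy training data (hypothetical): period-3 repeats stand in for
# codon-biased mRNA, period-4 repeats stand in for ncRNA.
mrna_like = ["ATG" * 20, "GCA" * 20]
ncrna_like = ["ACGT" * 15, "AATT" * 15]

mrna_model = train(mrna_like, max_gap=3)
ncrna_model = train(ncrna_like, max_gap=3)

# Positive scores favour the first model, negative scores the second.
score_mrna_query = log_odds_score("ATG" * 10, mrna_model, ncrna_model)
score_ncrna_query = log_odds_score("ACGT" * 8, mrna_model, ncrna_model)
print(score_mrna_query > 0, score_ncrna_query < 0)
```

In this toy setting the gap-3 table captures the codon-like periodicity of the first training set, which is why a period-3 query scores on the mRNA side. A real application would train on annotated mRNA and ncRNA sequences and would need a principled choice of gaps and threshold.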

Original language: English
Pages (from-to): 171-181
Number of pages: 11
Journal: Gene
Volume: 372
Issue number: 1-2
DOI: 10.1016/j.gene.2005.12.034
Publication status: Published - 2006 May 10

Fingerprint

Antisense RNA
Untranslated RNA
Consensus Sequence
Genome
Escherichia coli
RNA
Genes
Transfer RNA
Codon
Messenger RNA

Keywords

  • Bioinformatics
  • Markov model
  • Rho-independent terminator
  • Sigma70 promoter
  • Small RNA (sRNA)

ASJC Scopus subject areas

  • Genetics

Cite this

Prediction of non-coding and antisense RNA genes in Escherichia coli with Gapped Markov Model. / Yachie, Nozomu; Numata, Koji; Saito, Rintaro; Kanai, Akio; Tomita, Masaru.

In: Gene, Vol. 372, No. 1-2, 10.05.2006, p. 171-181.

@article{7c16b569fdb74ab3960aa73f77a597ba,
title = "Prediction of non-coding and antisense RNA genes in Escherichia coli with Gapped Markov Model",
keywords = "Bioinformatics, Markov model, Rho-independent terminator, Sigma70 promoter, Small RNA (sRNA)",
author = "Nozomu Yachie and Koji Numata and Rintaro Saito and Akio Kanai and Masaru Tomita",
year = "2006",
month = "5",
day = "10",
doi = "10.1016/j.gene.2005.12.034",
language = "English",
volume = "372",
pages = "171--181",
journal = "Gene",
issn = "0378-1119",
publisher = "Elsevier",
number = "1-2",

}

TY - JOUR

T1 - Prediction of non-coding and antisense RNA genes in Escherichia coli with Gapped Markov Model

AU - Yachie, Nozomu

AU - Numata, Koji

AU - Saito, Rintaro

AU - Kanai, Akio

AU - Tomita, Masaru

PY - 2006/5/10

Y1 - 2006/5/10

KW - Bioinformatics

KW - Markov model

KW - Rho-independent terminator

KW - Sigma70 promoter

KW - Small RNA (sRNA)

UR - http://www.scopus.com/inward/record.url?scp=33646029404&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33646029404&partnerID=8YFLogxK

U2 - 10.1016/j.gene.2005.12.034

DO - 10.1016/j.gene.2005.12.034

M3 - Article

C2 - 16564143

AN - SCOPUS:33646029404

VL - 372

SP - 171

EP - 181

JO - Gene

JF - Gene

SN - 0378-1119

IS - 1-2

ER -