TY - JOUR
T1 - Prediction of non-coding and antisense RNA genes in Escherichia coli with Gapped Markov Model
AU - Yachie, Nozomu
AU - Numata, Koji
AU - Saito, Rintaro
AU - Kanai, Akio
AU - Tomita, Masaru
PY - 2006/5/10
Y1 - 2006/5/10
N2 - A new mathematical index was developed to identify and characterize non-coding RNA (ncRNA) genes encoded within the Escherichia coli (E. coli) genome. It was designated the GMMI (Gapped Markov Model Index) and used to evaluate sequence patterns located at the separate positions of consensus sequences, codon biases and/or possible RNA structures on the basis of the Markov model. The GMMI was able to separate a set of known mRNA sequences from a mixture of ncRNAs including tRNAs and rRNAs. Consequently, the GMMI was employed to predict novel ncRNA candidates. First, possible transcription units were extracted from the E. coli genome using consensus sequences for the sigma70 promoter and the rho-independent terminator. These units were then evaluated using the GMMI. This identified 133 candidate ncRNAs, including 29 previously annotated small RNA genes and 46 possible antisense ncRNAs. Furthermore, 12 transcripts (including five antisense RNAs) were confirmed by expression analysis. These data suggest that the expression of small antisense RNAs might be more common than previously thought in the E. coli genome.
AB - A new mathematical index was developed to identify and characterize non-coding RNA (ncRNA) genes encoded within the Escherichia coli (E. coli) genome. It was designated the GMMI (Gapped Markov Model Index) and used to evaluate sequence patterns located at the separate positions of consensus sequences, codon biases and/or possible RNA structures on the basis of the Markov model. The GMMI was able to separate a set of known mRNA sequences from a mixture of ncRNAs including tRNAs and rRNAs. Consequently, the GMMI was employed to predict novel ncRNA candidates. First, possible transcription units were extracted from the E. coli genome using consensus sequences for the sigma70 promoter and the rho-independent terminator. These units were then evaluated using the GMMI. This identified 133 candidate ncRNAs, including 29 previously annotated small RNA genes and 46 possible antisense ncRNAs. Furthermore, 12 transcripts (including five antisense RNAs) were confirmed by expression analysis. These data suggest that the expression of small antisense RNAs might be more common than previously thought in the E. coli genome.
KW - Bioinformatics
KW - Markov model
KW - Rho-independent terminator
KW - Sigma70 promoter
KW - Small RNA (sRNA)
UR - http://www.scopus.com/inward/record.url?scp=33646029404&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33646029404&partnerID=8YFLogxK
U2 - 10.1016/j.gene.2005.12.034
DO - 10.1016/j.gene.2005.12.034
M3 - Article
C2 - 16564143
AN - SCOPUS:33646029404
SN - 0378-1119
VL - 372
SP - 171
EP - 181
JO - Gene
JF - Gene
IS - 1-2
ER -