TY - JOUR
T1 - SHARAKU
T2 - An algorithm for aligning and clustering read mapping profiles of deep sequencing in non-coding RNA processing
AU - Tsuchiya, Mariko
AU - Amano, Kojiro
AU - Abe, Masaya
AU - Seki, Misato
AU - Hase, Sumitaka
AU - Sato, Kengo
AU - Sakakibara, Yasubumi
N1 - Publisher Copyright:
© 2016 The Author 2016. Published by Oxford University Press.
PY - 2016/6/15
Y1 - 2016/6/15
AB - Motivation: Deep sequencing of the transcripts of regulatory non-coding RNA generates footprints of post-transcriptional processes. After obtaining sequence reads, the short reads are mapped to a reference genome, and specific mapping patterns called read mapping profiles, which are distinct from random non-functional degradation patterns, can be detected. These patterns reflect the maturation processes that lead to the production of shorter RNA sequences. Recent next-generation sequencing studies have revealed not only the typical maturation process of miRNAs but also the various processing mechanisms of small RNAs derived from tRNAs and snoRNAs. Results: We developed an algorithm termed SHARAKU to align two read mapping profiles of next-generation sequencing outputs for non-coding RNAs. In contrast with previous work, SHARAKU incorporates the primary and secondary sequence structures into an alignment of read mapping profiles to allow for the detection of common processing patterns. Using a benchmark simulated dataset, SHARAKU exhibited superior performance to previous methods for correctly clustering the read mapping profiles with respect to 5′-end processing and 3′-end processing from degradation patterns and in detecting similar processing patterns in deriving the shorter RNAs. Further, using experimental data of small RNA sequencing for the common marmoset brain, SHARAKU succeeded in identifying the significant clusters of read mapping profiles for similar processing patterns of small derived RNA families expressed in the brain.
UR - http://www.scopus.com/inward/record.url?scp=84976481188&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84976481188&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btw273
DO - 10.1093/bioinformatics/btw273
M3 - Article
C2 - 27307639
AN - SCOPUS:84976481188
SN - 1367-4803
VL - 32
SP - i369-i377
JO - Bioinformatics
JF - Bioinformatics
IS - 12
ER -