TY - JOUR
T1 - An improved de novo genome assembly of the common marmoset genome yields improved contiguity and increased mapping rates of sequence data
AU - Jayakumar, Vasanthan
AU - Ishii, Hiromi
AU - Seki, Misato
AU - Kumita, Wakako
AU - Inoue, Takashi
AU - Hase, Sumitaka
AU - Sato, Kengo
AU - Okano, Hideyuki
AU - Sasaki, Erika
AU - Sakakibara, Yasubumi
N1 - Funding Information:
Publication of this supplement is funded by grants from the Strategic Research Program for Brain Sciences; and by grants from the Ministry of Education, Culture, Sports, Science, and Technology (MEXT) of Japan to E.S., H.O., and Y.S.; Y.S. was also supported by JSPS KAKENHI Grant Numbers 16H06279 and 18H04127 of Japan. V.J. and Y.S. were also supported by Innovative Areas [no. 221S0002] from MEXT, Japan, and Program for Promoting Platform of Genomics based Drug Discovery from the Japan Agency AMED under Grant Number JP19kk0305008.
Publisher Copyright:
© 2020 The Author(s).
PY - 2020/4/2
Y1 - 2020/4/2
N2 - Background: The common marmoset (Callithrix jacchus) is one of the most studied primate model organisms. However, the marmoset genomes available in the public databases are highly fragmented and filled with sequence gaps, hindering research advances related to marmoset genomics and transcriptomics. Results: Here we utilize single-molecule, long-read sequence data to improve and update the existing genome assembly and report a near-complete genome of the common marmoset. The assembly is 2.79 Gb in size, with a contig N50 length of 6.37 Mb and a chromosomal scaffold N50 length of 143.91 Mb, representing the most contiguous and high-quality marmoset genome to date. Approximately 90% of the assembled genome was represented in contigs longer than 1 Mb, with approximately 104-fold improvement in contiguity over the previously published marmoset genome. More than 98% of the gaps from the previously published genomes were filled successfully, which improved the mapping rates of genomic and transcriptomic data onto the assembled genome. Conclusions: Altogether, the updated, high-quality common marmoset genome assembly provides improvements at various levels over the previous versions of the marmoset genome assemblies. This will allow researchers working on primate genomics to apply the genome more efficiently for their genomic and transcriptomic sequence data.
AB - Background: The common marmoset (Callithrix jacchus) is one of the most studied primate model organisms. However, the marmoset genomes available in the public databases are highly fragmented and filled with sequence gaps, hindering research advances related to marmoset genomics and transcriptomics. Results: Here we utilize single-molecule, long-read sequence data to improve and update the existing genome assembly and report a near-complete genome of the common marmoset. The assembly is 2.79 Gb in size, with a contig N50 length of 6.37 Mb and a chromosomal scaffold N50 length of 143.91 Mb, representing the most contiguous and high-quality marmoset genome to date. Approximately 90% of the assembled genome was represented in contigs longer than 1 Mb, with approximately 104-fold improvement in contiguity over the previously published marmoset genome. More than 98% of the gaps from the previously published genomes were filled successfully, which improved the mapping rates of genomic and transcriptomic data onto the assembled genome. Conclusions: Altogether, the updated, high-quality common marmoset genome assembly provides improvements at various levels over the previous versions of the marmoset genome assemblies. This will allow researchers working on primate genomics to apply the genome more efficiently for their genomic and transcriptomic sequence data.
KW - Callithrix jacchus
KW - Chromosome-scale scaffolds
KW - Common marmoset
KW - De novo assembly
KW - Non-human primate genomics
UR - http://www.scopus.com/inward/record.url?scp=85082738542&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85082738542&partnerID=8YFLogxK
U2 - 10.1186/s12864-020-6657-2
DO - 10.1186/s12864-020-6657-2
M3 - Article
C2 - 32241258
AN - SCOPUS:85082738542
SN - 1471-2164
VL - 21
JO - BMC Genomics
JF - BMC Genomics
M1 - 243
ER -