MetaVelvet-DL: a MetaVelvet deep learning extension for de novo metagenome assembly

Kuo ching Liang, Yasubumi Sakakibara

Research output: Contribution to journal, Article, peer-review

Abstract

Background: The increasing use of whole metagenome sequencing has spurred the need to improve de novo assemblers to facilitate the discovery of unknown species and the analysis of their genomic functions. MetaVelvet-SL is a short-read de novo metagenome assembler that partitions a multi-species de Bruijn graph into single-species sub-graphs. This study aimed to improve the performance of MetaVelvet-SL by using a deep learning-based model to predict the partition nodes in a multi-species de Bruijn graph.

Results: This study showed that recent advances in deep learning offer the opportunity to better exploit sequence information and differentiate the genomes of different species in a metagenomic sample. We developed an extension to MetaVelvet-SL, named MetaVelvet-DL, that builds an end-to-end architecture using a convolutional neural network and long short-term memory units. The deep learning model in MetaVelvet-DL predicts how to partition a de Bruijn graph more accurately than the support vector machine-based model in MetaVelvet-SL does. Assembly of the Critical Assessment of Metagenome Interpretation (CAMI) dataset showed that, after removing chimeric assemblies, MetaVelvet-DL produced longer single-species contigs with fewer misassembled contigs than MetaVelvet-SL did.

Conclusions: MetaVelvet-DL provides more accurate de novo assemblies of whole metagenome data. The authors believe that this improvement can further the understanding of microbiomes by providing a more accurate description of the metagenomic samples under analysis.
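To make the idea of a partition node concrete, the following is a minimal Python sketch of a multi-species de Bruijn graph in which two sequences share a single (k-1)-mer. It is not the MetaVelvet-DL implementation: the function names `build_de_bruijn` and `candidate_partition_nodes`, the toy reads, and the purely structural rule (more than one predecessor and more than one successor) are assumptions chosen for illustration. The abstract describes a learned model that makes the partition prediction; this sketch only shows the kind of branching node such a model would have to classify.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build successor and predecessor maps over (k-1)-mers.

    Each k-mer in a read contributes one edge from its (k-1)-mer prefix
    to its (k-1)-mer suffix.
    """
    succ = defaultdict(set)
    pred = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            left, right = kmer[:-1], kmer[1:]
            succ[left].add(right)
            pred[right].add(left)
    return succ, pred


def candidate_partition_nodes(succ, pred):
    """Return nodes with more than one predecessor and more than one successor.

    Assumption for illustration: such nodes are where paths from different
    genomes may cross, so they are the candidates a classifier would label.
    """
    nodes = set(succ) | set(pred)
    return sorted(
        n for n in nodes
        if len(succ.get(n, ())) > 1 and len(pred.get(n, ())) > 1
    )


# Two toy "species" whose sequences share only the 3-mer GCT.
reads = ["AAGCTT", "CCGCTA"]
succ, pred = build_de_bruijn(reads, k=4)
print(candidate_partition_nodes(succ, pred))
```

In this toy graph the shared node has two incoming and two outgoing edges, so a partitioning step would need to decide which incoming path pairs with which outgoing path. Real assemblers work on compacted graphs with coverage information, which this sketch omits.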

Original language: English
Article number: 427
Journal: BMC Bioinformatics
Volume: 22
DOIs
Publication status: Published - 2021 Jun

Keywords

  • Convolutional neural network
  • Deep learning
  • Long short-term memory
  • Metagenome analysis
  • de Bruijn graph
  • de novo assembly

ASJC Scopus subject areas

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics
