Development of a large-scale comparative genome system and its application to the analysis of mycobacteria genomes

Yasubumi Sakakibara, Yasunori Osana, Kris Popendorf

Research output: Contribution to journal › Article

5 Citations (Scopus)

Abstract

As the number of whole genome sequences available continues to increase rapidly, the raw scale of the sequence data being used in analysis is the first hurdle for comparative genome analysis. When performing whole genome alignments, large-scale rearrangements make it necessary to first find out roughly which short well-conserved segments correspond to what other segments (termed anchors). Successful results have been achieved by adapting tools like BLAT and BLASTZ on a problem-to-problem basis, but the work required to perform a single alignment is considerable. Recently, new programs such as Mauve and Pattern-Hunter can handle slightly larger inputs, but the memory/time requirements for sequences like Human and Chimp X chromosomes are prohibitive for most computational environments. Our novel algorithm, which we have implemented in a program called Murasaki (available at http://murasaki.dna.bio.keio.ac.jp), makes it possible to identify anchors of multiple large sequences on the scale of several hundred megabases (e.g. three mammal chromosomes) in a matter of minutes. We also demonstrate an application of Murasaki to the comparative analysis of multiple mycobacteria genomes.

Original language: English
Pages (from-to): 251-256
Number of pages: 6
Journal: Japanese Journal of Leprosy
Volume: 76
Issue number: 3
Publication status: Published - 2007

Fingerprint

Mycobacterium
Genome
Chromosomes, Human, X
Mammals
Chromosomes

Keywords

  • Comparative genomics
  • Dotplot
  • Mycobacteria
  • Pseudogene
  • Sequence analysis

ASJC Scopus subject areas

  • Dermatology
  • Infectious Diseases

Cite this

Development of a large-scale comparative genome system and its application to the analysis of mycobacteria genomes. / Sakakibara, Yasubumi; Osana, Yasunori; Popendorf, Kris.

In: Japanese Journal of Leprosy, Vol. 76, No. 3, 2007, p. 251-256.


@article{f5e6f07a627f4a22a3cf5ff44b194467,
title = "Development of a large-scale comparative genome system and its application to the analysis of mycobacteria genomes",
abstract = "As the number of whole genome sequences available continues to increase rapidly, the raw scale of the sequence data being used in analysis is the first hurdle for comparative genome analysis. When performing whole genome alignments, large-scale rearrangements make it necessary to first find out roughly which short well-conserved segments correspond to what other segments (termed anchors). Successful results have been achieved by adapting tools like BLAT and BLASTZ on a problem-to-problem basis, but the work required to perform a single alignment is considerable. Recently, new programs such as Mauve and Pattern-Hunter can handle slightly larger inputs, but the memory/time requirements for sequences like Human and Chimp X chromosomes are prohibitive for most computational environments. Our novel algorithm, which we have implemented in a program called Murasaki (available at http://murasaki.dna.bio.keio.ac.jp), makes it possible to identify anchors of multiple large sequences on the scale of several hundred megabases (e.g. three mammal chromosomes) in a matter of minutes. We also demonstrate an application of Murasaki to the comparative analysis of multiple mycobacteria genomes.",
keywords = "Comparative genomics, Dotplot, Mycobacteria, Pseudogene, Sequence analysis",
author = "Yasubumi Sakakibara and Yasunori Osana and Kris Popendorf",
year = "2007",
language = "English",
volume = "76",
pages = "251--256",
journal = "Japanese Journal of Leprosy",
issn = "1342-3681",
publisher = "Japanese Leprosy Association",
number = "3",
}

TY - JOUR
T1 - Development of a large-scale comparative genome system and its application to the analysis of mycobacteria genomes
AU - Sakakibara, Yasubumi
AU - Osana, Yasunori
AU - Popendorf, Kris
PY - 2007
Y1 - 2007
N2 - As the number of whole genome sequences available continues to increase rapidly, the raw scale of the sequence data being used in analysis is the first hurdle for comparative genome analysis. When performing whole genome alignments, large-scale rearrangements make it necessary to first find out roughly which short well-conserved segments correspond to what other segments (termed anchors). Successful results have been achieved by adapting tools like BLAT and BLASTZ on a problem-to-problem basis, but the work required to perform a single alignment is considerable. Recently, new programs such as Mauve and Pattern-Hunter can handle slightly larger inputs, but the memory/time requirements for sequences like Human and Chimp X chromosomes are prohibitive for most computational environments. Our novel algorithm, which we have implemented in a program called Murasaki (available at http://murasaki.dna.bio.keio.ac.jp), makes it possible to identify anchors of multiple large sequences on the scale of several hundred megabases (e.g. three mammal chromosomes) in a matter of minutes. We also demonstrate an application of Murasaki to the comparative analysis of multiple mycobacteria genomes.

AB - As the number of whole genome sequences available continues to increase rapidly, the raw scale of the sequence data being used in analysis is the first hurdle for comparative genome analysis. When performing whole genome alignments, large-scale rearrangements make it necessary to first find out roughly which short well-conserved segments correspond to what other segments (termed anchors). Successful results have been achieved by adapting tools like BLAT and BLASTZ on a problem-to-problem basis, but the work required to perform a single alignment is considerable. Recently, new programs such as Mauve and Pattern-Hunter can handle slightly larger inputs, but the memory/time requirements for sequences like Human and Chimp X chromosomes are prohibitive for most computational environments. Our novel algorithm, which we have implemented in a program called Murasaki (available at http://murasaki.dna.bio.keio.ac.jp), makes it possible to identify anchors of multiple large sequences on the scale of several hundred megabases (e.g. three mammal chromosomes) in a matter of minutes. We also demonstrate an application of Murasaki to the comparative analysis of multiple mycobacteria genomes.

KW - Comparative genomics
KW - Dotplot
KW - Mycobacteria
KW - Pseudogene
KW - Sequence analysis
UR - http://www.scopus.com/inward/record.url?scp=35548955038&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=35548955038&partnerID=8YFLogxK
M3 - Article
C2 - 17877037
AN - SCOPUS:35548955038
VL - 76
SP - 251
EP - 256
JO - Japanese Journal of Leprosy
JF - Japanese Journal of Leprosy
SN - 1342-3681
IS - 3
ER -