Accurate identification of orthologous segments among multiple genomes

Tsuyoshi Hachiya, Yasunori Osana, Kris Popendorf, Yasubumi Sakakibara

Research output: Contribution to journal (article)

23 Citations (Scopus)

Abstract

Motivation: The accurate detection of orthologous segments (also referred to as syntenic segments) plays a key role in comparative genomics, as it is useful for inferring genome rearrangement scenarios and computing whole-genome alignments. Although a number of algorithms for detecting orthologous segments have been proposed, none of them contain a framework for optimizing their parameter values.

Methods: In the present study, we propose an algorithm, named OSfinder (Orthologous Segment finder), which uses a novel scoring scheme based on stochastic models. OSfinder takes as input the positions of short homologous regions (also referred to as anchors) and explicitly discriminates orthologous anchors from non-orthologous anchors by using Markov chain models which represent respective geometric distributions of lengths of orthologous and non-orthologous anchors. Such stochastic modeling makes it possible to optimize parameter values by maximizing the likelihood of the input dataset, and to automate the setting of the optimal parameter values.

Results: We validated the accuracies of orthology-mapping algorithms on the basis of their consistency with the orthology annotation of genes. Our evaluation tests using mammalian and bacterial genomes demonstrated that OSfinder shows higher accuracy than previous algorithms.
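The following is a minimal illustrative sketch of the general idea mentioned in the Methods paragraph: lengths modeled by two geometric distributions, parameters chosen by maximum likelihood, and a log-odds score to tell the two classes apart. It is not OSfinder's actual scoring scheme or code. The function names, the toy length values, and the use of pre-labeled training lengths are all assumptions made for illustration; the abstract does not specify how the models are parameterized or fitted.

```python
import math

def fit_geometric(lengths):
    """Maximum-likelihood estimate of p for a geometric distribution
    on {1, 2, ...}: p_hat = n / sum(lengths) = 1 / mean length."""
    if not lengths or min(lengths) < 1:
        raise ValueError("lengths must be a non-empty list of integers >= 1")
    return len(lengths) / sum(lengths)

def log_pmf(length, p):
    """log P(L = length) = (length - 1) * log(1 - p) + log(p), for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return (length - 1) * math.log1p(-p) + math.log(p)

def log_likelihood(lengths, p):
    """Log-likelihood of independent lengths under a geometric model."""
    return sum(log_pmf(n, p) for n in lengths)

def log_odds(length, p_orth, p_non):
    """Positive values favor the 'orthologous' model, negative the other."""
    return log_pmf(length, p_orth) - log_pmf(length, p_non)

# Hypothetical toy data (made up for this sketch, not from the paper).
orth_lengths = [120, 95, 210, 160]
non_lengths = [12, 30, 18, 20]

p_orth = fit_geometric(orth_lengths)
p_non = fit_geometric(non_lengths)

for length in (10, 200):
    score = log_odds(length, p_orth, p_non)
    label = "orthologous-like" if score > 0 else "non-orthologous-like"
    print(length, round(score, 2), label)
```

A geometric length distribution arises naturally from a Markov chain state with a fixed self-transition probability, which is why the two views are interchangeable in this sketch. With the toy values above, longer lengths score in favor of the model fitted to the longer training set.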

Original language: English
Pages (from-to): 853-860
Number of pages: 8
Journal: Bioinformatics
Volume: 25
Issue number: 7
DOI: 10.1093/bioinformatics/btp070
Publication status: Published - 2009

Fingerprint

Anchors
Genome
Genes
Bacterial Genomes
Molecular Sequence Annotation
Markov Chains
Stochastic models
Genomics
Markov processes
Genome Rearrangement
Geometric distribution
Comparative Genomics
Markov Chain Model
Stochastic Modeling
Optimal Parameter
Scoring
Stochastic Model
Annotation
Likelihood
High Accuracy

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Computational Theory and Mathematics
  • Computer Science Applications
  • Computational Mathematics
  • Statistics and Probability

Cite this

Accurate identification of orthologous segments among multiple genomes. / Hachiya, Tsuyoshi; Osana, Yasunori; Popendorf, Kris; Sakakibara, Yasubumi.

In: Bioinformatics, Vol. 25, No. 7, 2009, p. 853-860.

@article{8155aac0eb014d90afec9aee1d169b05,
title = "Accurate identification of orthologous segments among multiple genomes",
abstract = "Motivation: The accurate detection of orthologous segments (also referred to as syntenic segments) plays a key role in comparative genomics, as it is useful for inferring genome rearrangement scenarios and computing whole-genome alignments. Although a number of algorithms for detecting orthologous segments have been proposed, none of them contain a framework for optimizing their parameter values. Methods: In the present study, we propose an algorithm, named OSfinder (Orthologous Segment finder), which uses a novel scoring scheme based on stochastic models. OSfinder takes as input the positions of short homologous regions (also referred to as anchors) and explicitly discriminates orthologous anchors from non-orthologous anchors by using Markov chain models which represent respective geometric distributions of lengths of orthologous and non-orthologous anchors. Such stochastic modeling makes it possible to optimize parameter values by maximizing the likelihood of the input dataset, and to automate the setting of the optimal parameter values. Results: We validated the accuracies of orthology-mapping algorithms on the basis of their consistency with the orthology annotation of genes. Our evaluation tests using mammalian and bacterial genomes demonstrated that OSfinder shows higher accuracy than previous algorithms.",
author = "Tsuyoshi Hachiya and Yasunori Osana and Kris Popendorf and Yasubumi Sakakibara",
year = "2009",
doi = "10.1093/bioinformatics/btp070",
language = "English",
volume = "25",
pages = "853--860",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "7",

}

TY - JOUR

T1 - Accurate identification of orthologous segments among multiple genomes

AU - Hachiya, Tsuyoshi

AU - Osana, Yasunori

AU - Popendorf, Kris

AU - Sakakibara, Yasubumi

PY - 2009

Y1 - 2009

AB - Motivation: The accurate detection of orthologous segments (also referred to as syntenic segments) plays a key role in comparative genomics, as it is useful for inferring genome rearrangement scenarios and computing whole-genome alignments. Although a number of algorithms for detecting orthologous segments have been proposed, none of them contain a framework for optimizing their parameter values. Methods: In the present study, we propose an algorithm, named OSfinder (Orthologous Segment finder), which uses a novel scoring scheme based on stochastic models. OSfinder takes as input the positions of short homologous regions (also referred to as anchors) and explicitly discriminates orthologous anchors from non-orthologous anchors by using Markov chain models which represent respective geometric distributions of lengths of orthologous and non-orthologous anchors. Such stochastic modeling makes it possible to optimize parameter values by maximizing the likelihood of the input dataset, and to automate the setting of the optimal parameter values. Results: We validated the accuracies of orthology-mapping algorithms on the basis of their consistency with the orthology annotation of genes. Our evaluation tests using mammalian and bacterial genomes demonstrated that OSfinder shows higher accuracy than previous algorithms.

UR - http://www.scopus.com/inward/record.url?scp=63549088678&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=63549088678&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btp070

DO - 10.1093/bioinformatics/btp070

M3 - Article

C2 - 19188192

AN - SCOPUS:63549088678

VL - 25

SP - 853

EP - 860

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 7

ER -