Discriminative detection of cis-acting regulatory variation from location data

Yuji Kawada, Yasubumi Sakakibara

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

The interaction between transcription factors and their DNA binding sites is key to understanding gene regulation mechanisms. Recent studies have revealed the presence of "functional polymorphism" in genes, defined as regulatory variation in transcription levels caused by cis-acting sequence differences. These regulatory variants are assumed to contribute to modulating gene function. However, computational identification of such functional cis-regulatory variants is a much greater challenge than identifying consensus sequences, because cis-regulatory variants differ from the main consensus sequence by only a few bases yet have important consequences for organismal phenotype. No previous study has directly addressed this problem. We propose a novel discriminative detection method for precisely identifying transcription factor binding sites and their functional variants from both positive and negative samples (sets of upstream sequences of genes bound and not bound by a transcription factor), based on genome-wide location data. Our goal is to find discriminative substrings that best explain the location data, in the sense that the substrings precisely discriminate the positive samples from the negative ones, rather than substrings that are simply over-represented among the positive ones. Our method consists of two steps. First, we apply a decision tree learning method to discover discriminative substrings and a hierarchical relationship among them. Second, we extract a main motif, and then a second motif as a cis-regulatory variant, by utilizing functional annotations. Our genome-wide experimental results on the yeast Saccharomyces cerevisiae show that our method performs significantly better at detecting experimentally verified consensus sequences than current motif detection methods. In addition, our method successfully discovered second motifs of putative functional cis-regulatory variants that are associated with genes of different functional annotations, and the correctness of those variants has been verified by expression profile analyses.
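
The first step described in the abstract, learning a decision tree over substrings, can be illustrated with a minimal sketch. This is not the authors' implementation: the fixed substring length `k`, the information-gain criterion, the depth limit, and the toy sequences are all assumptions made for illustration. Each internal node tests whether a substring is present in an upstream sequence, and the substring is chosen because it separates bound from unbound sequences, not because it is frequent among the bound ones.

```python
import math
from collections import Counter


def entropy(pos, neg):
    """Binary entropy (in bits) of a node with pos bound and neg unbound sequences."""
    if pos == 0 or neg == 0:
        return 0.0
    p = pos / (pos + neg)
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def kmers(seq, k):
    """Distinct substrings of length k occurring in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def best_split(samples, k):
    """Pick the substring whose presence/absence test has the highest information gain.

    samples is a list of (sequence, is_bound) pairs. Returns (substring, gain),
    or (None, 0.0) when no substring improves on the unsplit node.
    """
    n = len(samples)
    pos = sum(1 for _, bound in samples if bound)
    base = entropy(pos, n - pos)
    with_pos, with_neg = Counter(), Counter()
    for seq, bound in samples:
        for sub in kmers(seq, k):
            (with_pos if bound else with_neg)[sub] += 1
    best, best_gain = None, 0.0
    # Sorted iteration makes tie-breaking deterministic (alphabetically first wins).
    for sub in sorted(set(with_pos) | set(with_neg)):
        a, b = with_pos[sub], with_neg[sub]    # sequences containing sub
        c, d = pos - a, (n - pos) - b          # sequences lacking sub
        remainder = ((a + b) * entropy(a, b) + (c + d) * entropy(c, d)) / n
        gain = base - remainder
        if gain > best_gain + 1e-12:
            best, best_gain = sub, gain
    return best, best_gain


def build_tree(samples, k, depth=2):
    """Recursively split on discriminative substrings up to the given depth."""
    pos = sum(1 for _, bound in samples if bound)
    leaf = {"bound": pos, "unbound": len(samples) - pos}
    if depth == 0:
        return leaf
    sub, gain = best_split(samples, k)
    if sub is None:
        return leaf
    present = [s for s in samples if sub in s[0]]
    absent = [s for s in samples if sub not in s[0]]
    return {
        "substring": sub,
        "gain": gain,
        "present": build_tree(present, k, depth - 1),
        "absent": build_tree(absent, k, depth - 1),
    }


# Toy data (invented for illustration): True = bound, False = unbound.
toy_samples = [
    ("AACACGTT", True), ("GGCACGAA", True), ("TTCACGGG", True),
    ("AACTCGTT", False), ("GGCTCGAA", False), ("TTCTCGGG", False),
]
toy_tree = build_tree(toy_samples, k=4)
```

In this toy set, flanking substrings such as `CGTT` occur in both classes and gain nothing, while `CACG` occurs only in the bound sequences and is selected at the root. In a deeper tree on real data, substrings chosen below the root would give the kind of hierarchical relationship among substrings that the abstract mentions; how the authors derive main and second motifs from that hierarchy using functional annotations is not specified here and is not modelled.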

Original language: English
Title of host publication: Series on Advances in Bioinformatics and Computational Biology
Pages: 89-98
Number of pages: 10
Volume: 3
Publication status: Published - 2006
Event: 4th Asia-Pacific Bioinformatics Conference, APBC 2006 - Taipei, Taiwan, Province of China
Duration: 2006 Feb 13 to 2006 Feb 16

Other

Other: 4th Asia-Pacific Bioinformatics Conference, APBC 2006
Country: Taiwan, Province of China
City: Taipei
Period: 06/2/13 to 06/2/16

Fingerprint

Genes
Transcription factors
Binding sites
Yeast
Transcription
Decision trees
Polymorphism
Gene expression
DNA

ASJC Scopus subject areas

  • Bioengineering
  • Information Systems

Cite this

Kawada, Y., & Sakakibara, Y. (2006). Discriminative detection of cis-acting regulatory variation from location data. In Series on Advances in Bioinformatics and Computational Biology (Vol. 3, pp. 89-98)

Discriminative detection of cis-acting regulatory variation from location data. / Kawada, Yuji; Sakakibara, Yasubumi. Series on Advances in Bioinformatics and Computational Biology. Vol. 3 2006. p. 89-98.

Kawada, Y & Sakakibara, Y 2006, Discriminative detection of cis-acting regulatory variation from location data. in Series on Advances in Bioinformatics and Computational Biology. vol. 3, pp. 89-98, 4th Asia-Pacific Bioinformatics Conference, APBC 2006, Taipei, Taiwan, Province of China, 06/2/13.
Kawada Y, Sakakibara Y. Discriminative detection of cis-acting regulatory variation from location data. In Series on Advances in Bioinformatics and Computational Biology. Vol. 3. 2006. p. 89-98
Kawada, Yuji ; Sakakibara, Yasubumi. / Discriminative detection of cis-acting regulatory variation from location data. Series on Advances in Bioinformatics and Computational Biology. Vol. 3 2006. pp. 89-98
@inproceedings{52380480cf394faf8bd8f6045ebfff43,
title = "Discriminative detection of cis-acting regulatory variation from location data",
abstract = "The interaction between transcription factors and their DNA binding sites is key to understanding gene regulation mechanisms. Recent studies have revealed the presence of ``functional polymorphism'' in genes, defined as regulatory variation in transcription levels caused by cis-acting sequence differences. These regulatory variants are assumed to contribute to modulating gene function. However, computational identification of such functional cis-regulatory variants is a much greater challenge than identifying consensus sequences, because cis-regulatory variants differ from the main consensus sequence by only a few bases yet have important consequences for organismal phenotype. No previous study has directly addressed this problem. We propose a novel discriminative detection method for precisely identifying transcription factor binding sites and their functional variants from both positive and negative samples (sets of upstream sequences of genes bound and not bound by a transcription factor), based on genome-wide location data. Our goal is to find discriminative substrings that best explain the location data, in the sense that the substrings precisely discriminate the positive samples from the negative ones, rather than substrings that are simply over-represented among the positive ones. Our method consists of two steps. First, we apply a decision tree learning method to discover discriminative substrings and a hierarchical relationship among them. Second, we extract a main motif, and then a second motif as a cis-regulatory variant, by utilizing functional annotations. Our genome-wide experimental results on the yeast Saccharomyces cerevisiae show that our method performs significantly better at detecting experimentally verified consensus sequences than current motif detection methods. In addition, our method successfully discovered second motifs of putative functional cis-regulatory variants that are associated with genes of different functional annotations, and the correctness of those variants has been verified by expression profile analyses.",
author = "Yuji Kawada and Yasubumi Sakakibara",
year = "2006",
language = "English",
isbn = "1860946232",
volume = "3",
pages = "89--98",
booktitle = "Series on Advances in Bioinformatics and Computational Biology",

}

TY - GEN

T1 - Discriminative detection of cis-acting regulatory variation from location data

AU - Kawada, Yuji

AU - Sakakibara, Yasubumi

PY - 2006

Y1 - 2006

AB - The interaction between transcription factors and their DNA binding sites is key to understanding gene regulation mechanisms. Recent studies have revealed the presence of "functional polymorphism" in genes, defined as regulatory variation in transcription levels caused by cis-acting sequence differences. These regulatory variants are assumed to contribute to modulating gene function. However, computational identification of such functional cis-regulatory variants is a much greater challenge than identifying consensus sequences, because cis-regulatory variants differ from the main consensus sequence by only a few bases yet have important consequences for organismal phenotype. No previous study has directly addressed this problem. We propose a novel discriminative detection method for precisely identifying transcription factor binding sites and their functional variants from both positive and negative samples (sets of upstream sequences of genes bound and not bound by a transcription factor), based on genome-wide location data. Our goal is to find discriminative substrings that best explain the location data, in the sense that the substrings precisely discriminate the positive samples from the negative ones, rather than substrings that are simply over-represented among the positive ones. Our method consists of two steps. First, we apply a decision tree learning method to discover discriminative substrings and a hierarchical relationship among them. Second, we extract a main motif, and then a second motif as a cis-regulatory variant, by utilizing functional annotations. Our genome-wide experimental results on the yeast Saccharomyces cerevisiae show that our method performs significantly better at detecting experimentally verified consensus sequences than current motif detection methods. In addition, our method successfully discovered second motifs of putative functional cis-regulatory variants that are associated with genes of different functional annotations, and the correctness of those variants has been verified by expression profile analyses.

UR - http://www.scopus.com/inward/record.url?scp=84856993101&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84856993101&partnerID=8YFLogxK

M3 - Conference contribution

SN - 1860946232

SN - 9781860946233

VL - 3

SP - 89

EP - 98

BT - Series on Advances in Bioinformatics and Computational Biology

ER -