High throughput multiple combination extraction from large scale polymorphism data by exact tree method

Koichi Miyaki, Kazuyuki Omae, Mitsuru Murata, Norio Tanahashi, Ikuo Saito, Kiyoaki Watanabe

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

Single nucleotide polymorphisms (SNPs) are becoming increasingly important in clinical settings as genetic markers. To evaluate genetic risk factors for multifactorial diseases, it is not sufficient to focus on individual SNPs; it is preferable to evaluate combinations of multiple markers, because doing so allows interactions between factors to be examined. If all possible combinations were evaluated exhaustively, however, the number of calculations would grow explosively as the number of markers increased. To overcome this limitation, we devised the exact tree method, based on decision tree analysis, and applied it to data on 14 SNPs from 68 Japanese stroke patients and 189 healthy controls. From the resulting tree models, we extracted multiple statistically significant combinations that elevate the risk of stroke. From this result, we inferred that the method would work more efficiently in whole-genome studies, which handle thousands of genetic markers. This exploratory data mining method will facilitate the extraction of combinations from large-scale genetic data and provide a good foothold for subsequent confirmatory research.
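
The combinatorial growth that the abstract refers to can be illustrated with a short count. This is only a sketch of the exhaustive search space, not the authors' exact tree method; the function name `n_marker_subsets` and the cap of three markers per combination are assumptions chosen for illustration, and genotype levels within each SNP are ignored.

```python
from math import comb


def n_marker_subsets(n_markers: int, max_size: int) -> int:
    """Count the marker subsets of size 1..max_size drawn from n_markers.

    Hypothetical helper: it counts which markers are combined, not which
    genotype each marker carries, so the true search space is larger.
    """
    return sum(comb(n_markers, k) for k in range(1, max_size + 1))


# Compare the 14-SNP study size with larger, genome-scale marker panels.
for n in (14, 100, 1000):
    print(f"{n:>5} markers: {n_marker_subsets(n, 3):>12,} subsets of up to 3")
```

With 14 markers the count of subsets of up to three markers is 469, whereas with 1,000 markers it already exceeds 166 million, which is why an exhaustive evaluation quickly becomes impractical.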

Original language: English
Pages (from-to): 455-462
Number of pages: 8
Journal: Journal of Human Genetics
Volume: 49
Issue number: 9
DOI: 10.1007/s10038-004-0174-z
Publication status: Published - 2004

Fingerprint

Single Nucleotide Polymorphism
Genetic Markers
Stroke
Decision Trees
Songbirds
Data Mining
Decision Support Techniques
Genome
Research

Keywords

  • Combination
  • Data mining
  • Exact tree
  • Genetic polymorphism
  • Interaction
  • Multifactorial disease
  • Multiple factor
  • SNP

ASJC Scopus subject areas

  • Genetics (clinical)
  • Genetics

Cite this

High throughput multiple combination extraction from large scale polymorphism data by exact tree method. / Miyaki, Koichi; Omae, Kazuyuki; Murata, Mitsuru; Tanahashi, Norio; Saito, Ikuo; Watanabe, Kiyoaki.

In: Journal of Human Genetics, Vol. 49, No. 9, 2004, p. 455-462.

@article{b3961e0b1fe2412db6ffe4849058e1ef,
title = "High throughput multiple combination extraction from large scale polymorphism data by exact tree method",
keywords = "Combination, Data mining, Exact tree, Genetic polymorphism, Interaction, Multifactorial disease, Multiple factor, SNP",
author = "Koichi Miyaki and Kazuyuki Omae and Mitsuru Murata and Norio Tanahashi and Ikuo Saito and Kiyoaki Watanabe",
year = "2004",
doi = "10.1007/s10038-004-0174-z",
language = "English",
volume = "49",
pages = "455--462",
journal = "Journal of Human Genetics",
issn = "1434-5161",
publisher = "Nature Publishing Group",
number = "9",

}

TY - JOUR

T1 - High throughput multiple combination extraction from large scale polymorphism data by exact tree method

AU - Miyaki, Koichi

AU - Omae, Kazuyuki

AU - Murata, Mitsuru

AU - Tanahashi, Norio

AU - Saito, Ikuo

AU - Watanabe, Kiyoaki

PY - 2004

Y1 - 2004

KW - Combination

KW - Data mining

KW - Exact tree

KW - Genetic polymorphism

KW - Interaction

KW - Multifactorial disease

KW - Multiple factor

KW - SNP

UR - http://www.scopus.com/inward/record.url?scp=4744345258&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=4744345258&partnerID=8YFLogxK

U2 - 10.1007/s10038-004-0174-z

DO - 10.1007/s10038-004-0174-z

M3 - Article

C2 - 15309679

AN - SCOPUS:4744345258

VL - 49

SP - 455

EP - 462

JO - Journal of Human Genetics

JF - Journal of Human Genetics

SN - 1434-5161

IS - 9

ER -