Estimating the false discovery rate using mixed normal distribution for identifying differentially expressed genes in microarray data analysis

Akihiro Hirakawa, Yasunori Sato, Takashi Sozu, Chikuma Hamada, Isao Yoshimura

Research output: Contribution to journal (Article)

7 Citations (Scopus)

Abstract

The recent development of DNA microarray technology allows us to measure the expression levels of thousands of genes simultaneously and to identify, from many candidate genes, those truly correlated with anticancer drug response (differentially expressed genes). Significance Analysis of Microarray (SAM) is often used to estimate the false discovery rate (FDR), an index for optimizing the identifiability of differentially expressed genes, but the accuracy of the FDR estimated by SAM has not necessarily been confirmed. We propose a new method for estimating the FDR that assumes a mixed normal distribution for the test statistic, and we examine the performance of the proposed method and SAM using simulated data. The simulation results indicate that the accuracy of the FDR estimated by the proposed method and by SAM varied depending on the experimental conditions. We applied both methods to actual data comprising the expression levels of 12,625 genes in 10 responders and 14 non-responders to docetaxel for breast cancer. Using a cut-off value chosen to achieve FDR < 0.01 and so limit false-positive genes, the proposed method identified 280 differentially expressed genes correlated with docetaxel response, whereas 92 genes were previously thought to be correlated with docetaxel response.
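
The abstract does not spell out the form of the mixed normal distribution, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes a two-component mixture for the test statistic: a null component N(0, 1) with weight `pi0` and a zero-mean alternative component N(0, tau^2) with tau > 1. The function names, starting values, simulated data, and the cut-off of 3.0 are all hypothetical choices made for illustration. Under these assumptions the FDR at cut-off c is estimated as pi0 * P0(|T| > c) / P(|T| > c).

```python
import math
import numpy as np


def normal_pdf(x, sigma):
    """Density of N(0, sigma^2) evaluated at x (array or scalar)."""
    return np.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def two_sided_tail(c, sigma):
    """P(|X| > c) for X ~ N(0, sigma^2) and a scalar cut-off c >= 0."""
    return math.erfc(c / (sigma * math.sqrt(2.0)))


def fit_mixture(t, n_iter=500):
    """EM fit of pi0 * N(0, 1) + (1 - pi0) * N(0, tau^2).

    The null component is held fixed at N(0, 1); this is an assumption of
    the sketch, not something stated in the abstract.
    """
    t = np.asarray(t, dtype=float)
    pi0, tau = 0.5, 3.0  # arbitrary starting values
    for _ in range(n_iter):
        # E-step: posterior probability that each statistic is null.
        null = pi0 * normal_pdf(t, 1.0)
        alt = (1.0 - pi0) * normal_pdf(t, tau)
        resp = null / (null + alt)
        # M-step: update the null weight and the alternative scale.
        pi0 = float(resp.mean())
        tau = math.sqrt(float(np.sum((1.0 - resp) * t ** 2) / np.sum(1.0 - resp)))
    return pi0, tau


def estimated_fdr(c, pi0, tau):
    """Model-based FDR when genes with |t| > c are called significant."""
    null_mass = pi0 * two_sided_tail(c, 1.0)
    alt_mass = (1.0 - pi0) * two_sided_tail(c, tau)
    return null_mass / (null_mass + alt_mass)


# Simulated statistics (hypothetical): 9,000 null genes, 1,000 non-null genes.
rng = np.random.default_rng(0)
is_null = np.arange(10_000) < 9_000
t = np.where(is_null, rng.normal(0.0, 1.0, 10_000), rng.normal(0.0, 4.0, 10_000))

pi0_hat, tau_hat = fit_mixture(t)
cutoff = 3.0
fdr_hat = estimated_fdr(cutoff, pi0_hat, tau_hat)

# Realized false discovery proportion, known only because the data are simulated.
called = np.abs(t) > cutoff
fdp = float(np.sum(called & is_null) / np.sum(called))
print(f"pi0={pi0_hat:.3f} tau={tau_hat:.3f} estimated FDR={fdr_hat:.3f} realized FDP={fdp:.3f}")
```

Because the simulated labels are known, the estimated FDR can be compared directly with the realized proportion of false discoveries, which mirrors in spirit the simulation check described in the abstract. With real data, the cut-off would instead be raised until the estimated FDR falls below the target (for example 0.01).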

Original language: English
Pages (from-to): 140-148
Number of pages: 9
Journal: Cancer Informatics
Volume: 3
Publication status: Published - 2007 Dec 1
Externally published: Yes

Fingerprint

Normal Distribution
Microarray Analysis
docetaxel
Genes
Oligonucleotide Array Sequence Analysis
Breast Neoplasms
Technology

Keywords

  • Differentially expressed genes
  • False discovery rate
  • Microarray
  • Mixed normal distribution
  • Significance analysis of microarray

ASJC Scopus subject areas

  • Oncology
  • Cancer Research

Cite this

Estimating the false discovery rate using mixed normal distribution for identifying differentially expressed genes in microarray data analysis. / Hirakawa, Akihiro; Sato, Yasunori; Sozu, Takashi; Hamada, Chikuma; Yoshimura, Isao.

In: Cancer Informatics, Vol. 3, 01.12.2007, p. 140-148.

@article{ff0e8c4da4914c56b6b43afe33358ce0,
title = "Estimating the false discovery rate using mixed normal distribution for identifying differentially expressed genes in microarray data analysis",
keywords = "Differentially expressed genes, False discovery rate, Microarray, Mixed normal distribution, Significance analysis of microarray",
author = "Akihiro Hirakawa and Yasunori Sato and Takashi Sozu and Chikuma Hamada and Isao Yoshimura",
year = "2007",
month = "12",
day = "1",
language = "English",
volume = "3",
pages = "140--148",
journal = "Cancer Informatics",
issn = "1176-9351",
publisher = "Libertas Academica Ltd.",

}

TY - JOUR

T1 - Estimating the false discovery rate using mixed normal distribution for identifying differentially expressed genes in microarray data analysis

AU - Hirakawa, Akihiro

AU - Sato, Yasunori

AU - Sozu, Takashi

AU - Hamada, Chikuma

AU - Yoshimura, Isao

PY - 2007/12/1

Y1 - 2007/12/1

KW - Differentially expressed genes

KW - False discovery rate

KW - Microarray

KW - Mixed normal distribution

KW - Significance analysis of microarray

UR - http://www.scopus.com/inward/record.url?scp=49649093310&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=49649093310&partnerID=8YFLogxK

M3 - Article

C2 - 19455258

AN - SCOPUS:49649093310

VL - 3

SP - 140

EP - 148

JO - Cancer Informatics

JF - Cancer Informatics

SN - 1176-9351

ER -