Salivary metabolomics with alternative decision tree-based machine learning methods for breast cancer discrimination

Takeshi Murata, Takako Yanagisawa, Toshiaki Kurihara, Miku Kaneko, Sana Ota, Ayame Enomoto, Masaru Tomita, Masahiro Sugimoto, Makoto Sunamura, Tetsu Hayashida, Yuukou Kitagawa, Hiromitsu Jinno

Research output: Contribution to journal › Article

Abstract

Purpose: The aim of this study is to explore new salivary biomarkers to discriminate breast cancer patients from healthy controls.

Methods: Saliva samples were collected after 9 h fasting and were immediately stored at −80 °C. Capillary electrophoresis and liquid chromatography with mass spectrometry were used to quantify hundreds of hydrophilic metabolites. Conventional statistical analyses and artificial intelligence-based methods were used to assess the discrimination abilities of the quantified metabolites. A multiple logistic regression (MLR) model and an alternative decision tree (ADTree)-based machine learning method were used. The generalization abilities of these mathematical models were validated in various computational tests, such as cross-validation and resampling methods.

Results: One hundred sixty-six unstimulated saliva samples were collected from 101 patients with invasive carcinoma of the breast (IC), 23 patients with ductal carcinoma in situ (DCIS), and 42 healthy controls (C). Of the 260 quantified metabolites, polyamines were significantly elevated in the saliva of patients with breast cancer. Spermine showed the highest area under the receiver operating characteristic curves [0.766; 95% confidence interval (CI) 0.671–0.840, P < 0.0001] to discriminate IC from C. In addition to spermine, polyamines and their acetylated forms were elevated in IC only. Two hundred each of two-fold, five-fold, and ten-fold cross-validation using different random values were conducted and the MLR model had slightly better accuracy. The ADTree with an ensemble approach showed higher accuracy (0.912; 95% CI 0.838–0.961, P < 0.0001). These prediction models also included spermine as a predictive factor.

Conclusions: These data indicated that combinations of salivary metabolomics with the ADTree-based machine learning methods show potential for non-invasive screening of breast cancer.
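The repeated cross-validation mentioned in the Results (two hundred runs each of two-fold, five-fold, and ten-fold, with a different random value per run) can be pictured with the minimal sketch below. It only generates the train/test index splits. The function names, the use of the run number as the random seed, the unstratified shuffling, and the sample count of 143 (101 IC plus 42 C) are assumptions made for illustration; the abstract does not state the software used, whether folds were stratified, or which samples entered each test.

```python
import random


def k_fold_indices(n_samples, k, seed):
    """Shuffle sample indices with the given seed and split them into k folds.

    Folds differ in size by at most one sample. No stratification by class
    is applied (an assumption; the abstract does not say either way).
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]


def repeated_k_fold(n_samples, ks=(2, 5, 10), repeats=200):
    """Yield (k, seed, train, test) for every fold of every repetition."""
    for k in ks:
        for seed in range(repeats):  # a different random value per run
            folds = k_fold_indices(n_samples, k, seed)
            for i, test in enumerate(folds):
                # Training set is every fold except the held-out one.
                train = [idx for j, fold in enumerate(folds) if j != i
                         for idx in fold]
                yield k, seed, train, test


# Hypothetical sample count: 101 IC + 42 C from the abstract.
splits = list(repeated_k_fold(143))
print(len(splits))  # 200 * (2 + 5 + 10) = 3400 train/test pairs
```

In each pair a model would be fitted on `train` and scored on `test`; averaging the scores over all runs for a given k gives one accuracy estimate per fold setting.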

Original language: English
Journal: Breast Cancer Research and Treatment
DOI: 10.1007/s10549-019-05330-9
Publication status: Published - 2019 Jan 1

Fingerprint

Decision Trees
Metabolomics
Spermine
Breast Neoplasms
Logistic Models
Saliva
Aptitude
Polyamines
Confidence Intervals
Carcinoma, Intraductal, Noninfiltrating
Artificial Intelligence
Capillary Electrophoresis
ROC Curve
Liquid Chromatography
Fasting
Mass Spectrometry
Theoretical Models
Biomarkers
Machine Learning

Keywords

  • Alternative decision tree
  • Biomarker
  • Breast cancer
  • Metabolomics
  • Polyamines
  • Saliva

ASJC Scopus subject areas

  • Oncology
  • Cancer Research

Cite this

Salivary metabolomics with alternative decision tree-based machine learning methods for breast cancer discrimination. / Murata, Takeshi; Yanagisawa, Takako; Kurihara, Toshiaki; Kaneko, Miku; Ota, Sana; Enomoto, Ayame; Tomita, Masaru; Sugimoto, Masahiro; Sunamura, Makoto; Hayashida, Tetsu; Kitagawa, Yuukou; Jinno, Hiromitsu.

In: Breast Cancer Research and Treatment, 01.01.2019.

@article{8a9856dbc51745549cd705499c50f0aa,
title = "Salivary metabolomics with alternative decision tree-based machine learning methods for breast cancer discrimination",
abstract = "Purpose: The aim of this study is to explore new salivary biomarkers to discriminate breast cancer patients from healthy controls. Methods: Saliva samples were collected after 9 h fasting and were immediately stored at − 80 °C. Capillary electrophoresis and liquid chromatography with mass spectrometry were used to quantify hundreds of hydrophilic metabolites. Conventional statistical analyses and artificial intelligence-based methods were used to assess the discrimination abilities of the quantified metabolites. A multiple logistic regression (MLR) model and an alternative decision tree (ADTree)-based machine learning method were used. The generalization abilities of these mathematical models were validated in various computational tests, such as cross-validation and resampling methods. Results: One hundred sixty-six unstimulated saliva samples were collected from 101 patients with invasive carcinoma of the breast (IC), 23 patients with ductal carcinoma in situ (DCIS), and 42 healthy controls (C). Of the 260 quantified metabolites, polyamines were significantly elevated in the saliva of patients with breast cancer. Spermine showed the highest area under the receiver operating characteristic curves [0.766; 95{\%} confidence interval (CI) 0.671–0.840, P < 0.0001] to discriminate IC from C. In addition to spermine, polyamines and their acetylated forms were elevated in IC only. Two hundred each of two-fold, five-fold, and ten-fold cross-validation using different random values were conducted and the MLR model had slightly better accuracy. The ADTree with an ensemble approach showed higher accuracy (0.912; 95{\%} CI 0.838–0.961, P < 0.0001). These prediction models also included spermine as a predictive factor. Conclusions: These data indicated that combinations of salivary metabolomics with the ADTree-based machine learning methods show potential for non-invasive screening of breast cancer.",
keywords = "Alternative decision tree, Biomarker, Breast cancer, Metabolomics, Polyamines, Saliva",
author = "Takeshi Murata and Takako Yanagisawa and Toshiaki Kurihara and Miku Kaneko and Sana Ota and Ayame Enomoto and Masaru Tomita and Masahiro Sugimoto and Makoto Sunamura and Tetsu Hayashida and Yuukou Kitagawa and Hiromitsu Jinno",
year = "2019",
month = jan,
day = "1",
doi = "10.1007/s10549-019-05330-9",
language = "English",
journal = "Breast Cancer Research and Treatment",
issn = "0167-6806",
publisher = "Springer New York",
}

TY - JOUR
T1 - Salivary metabolomics with alternative decision tree-based machine learning methods for breast cancer discrimination
AU - Murata, Takeshi
AU - Yanagisawa, Takako
AU - Kurihara, Toshiaki
AU - Kaneko, Miku
AU - Ota, Sana
AU - Enomoto, Ayame
AU - Tomita, Masaru
AU - Sugimoto, Masahiro
AU - Sunamura, Makoto
AU - Hayashida, Tetsu
AU - Kitagawa, Yuukou
AU - Jinno, Hiromitsu
PY - 2019/1/1
Y1 - 2019/1/1
AB - Purpose: The aim of this study is to explore new salivary biomarkers to discriminate breast cancer patients from healthy controls. Methods: Saliva samples were collected after 9 h fasting and were immediately stored at − 80 °C. Capillary electrophoresis and liquid chromatography with mass spectrometry were used to quantify hundreds of hydrophilic metabolites. Conventional statistical analyses and artificial intelligence-based methods were used to assess the discrimination abilities of the quantified metabolites. A multiple logistic regression (MLR) model and an alternative decision tree (ADTree)-based machine learning method were used. The generalization abilities of these mathematical models were validated in various computational tests, such as cross-validation and resampling methods. Results: One hundred sixty-six unstimulated saliva samples were collected from 101 patients with invasive carcinoma of the breast (IC), 23 patients with ductal carcinoma in situ (DCIS), and 42 healthy controls (C). Of the 260 quantified metabolites, polyamines were significantly elevated in the saliva of patients with breast cancer. Spermine showed the highest area under the receiver operating characteristic curves [0.766; 95% confidence interval (CI) 0.671–0.840, P < 0.0001] to discriminate IC from C. In addition to spermine, polyamines and their acetylated forms were elevated in IC only. Two hundred each of two-fold, five-fold, and ten-fold cross-validation using different random values were conducted and the MLR model had slightly better accuracy. The ADTree with an ensemble approach showed higher accuracy (0.912; 95% CI 0.838–0.961, P < 0.0001). These prediction models also included spermine as a predictive factor. Conclusions: These data indicated that combinations of salivary metabolomics with the ADTree-based machine learning methods show potential for non-invasive screening of breast cancer.

KW - Alternative decision tree
KW - Biomarker
KW - Breast cancer
KW - Metabolomics
KW - Polyamines
KW - Saliva
UR - http://www.scopus.com/inward/record.url?scp=85068836365&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85068836365&partnerID=8YFLogxK
U2 - 10.1007/s10549-019-05330-9
DO - 10.1007/s10549-019-05330-9
M3 - Article
AN - SCOPUS:85068836365
JO - Breast Cancer Research and Treatment
JF - Breast Cancer Research and Treatment
SN - 0167-6806
ER -