TY - JOUR
T1 - Predictions of the pathological response to neoadjuvant chemotherapy in patients with primary breast cancer using a data mining technique
AU - Takada, M.
AU - Sugimoto, M.
AU - Ohno, S.
AU - Kuroi, K.
AU - Sato, N.
AU - Bando, H.
AU - Masuda, N.
AU - Iwata, H.
AU - Kondo, M.
AU - Sasano, H.
AU - Chow, L. W.C.
AU - Inamoto, T.
AU - Naito, Y.
AU - Tomita, M.
AU - Toi, M.
N1 - Funding Information:
Acknowledgments: We thank the doctors and data managers for data collection. We also thank the patients who participated in this study. This study was funded by research grants from the Ministry of Health, Labour and Welfare (“A study on the construction of an algorithm for multimodal therapy with biomarkers for primary breast cancer by formulation of a decision-making process”, led by MT, No. H18-3JIGAN-IPPAN-007 and “Reduction and lowering of recurrence risk, toxicity and pharmacoeconomic cost by prediction of efficacy for anticancer agents in breast cancer patients”, led by MT; No. H22-GANRINSHO-IPPAN-039), research funds from the Yamagata Prefectural Government and Tsuruoka City, and an International Internship Grant from the Global COE project “Centre for Frontier Medicine”, Kyoto University. This study was also supported by the program “Raising Proficient Oncologists” administered by the Japanese Ministry of Education, Culture, Sports, Science and Technology.
PY - 2012/7
Y1 - 2012/7
N2 - Nomogram, a standard technique that utilizes multiple characteristics to predict efficacy of treatment and likelihood of a specific status of an individual patient, has been used for prediction of response to neoadjuvant chemotherapy (NAC) in breast cancer patients. The aim of this study was to develop a novel computational technique to predict the pathological complete response (pCR) to NAC in primary breast cancer patients. A mathematical model using alternating decision trees, an epigone of decision tree, was developed using 28 clinicopathological variables that were retrospectively collected from patients treated with NAC (n = 150), and validated using an independent dataset from a randomized controlled trial (n = 173). The model selected 15 variables to predict the pCR with yielding area under the receiver operating characteristics curve (AUC) values of 0.766 [95 % confidence interval (CI) 0.671-0.861, P value < 0.0001] in cross-validation using training dataset and 0.787 (95 % CI 0.716-0.858, P value < 0.0001) in the validation dataset. Among three subtypes of breast cancer, the luminal subgroup showed the best discrimination (AUC = 0.779, 95 % CI 0.641-0.917, P value = 0.0059). The developed model (AUC = 0.805, 95 % CI 0.716-0.894, P value < 0.0001) outperformed multivariate logistic regression (AUC = 0.754, 95 % CI 0.651-0.858, P value = 0.00019) of validation datasets without missing values (n = 127). Several analyses, e.g. bootstrap analysis, revealed that the developed model was insensitive to missing values and also tolerant to distribution bias among the datasets. Our model based on clinicopathological variables showed high predictive ability for pCR. This model might improve the prediction of the response to NAC in primary breast cancer patients.
AB - Nomogram, a standard technique that utilizes multiple characteristics to predict efficacy of treatment and likelihood of a specific status of an individual patient, has been used for prediction of response to neoadjuvant chemotherapy (NAC) in breast cancer patients. The aim of this study was to develop a novel computational technique to predict the pathological complete response (pCR) to NAC in primary breast cancer patients. A mathematical model using alternating decision trees, an epigone of decision tree, was developed using 28 clinicopathological variables that were retrospectively collected from patients treated with NAC (n = 150), and validated using an independent dataset from a randomized controlled trial (n = 173). The model selected 15 variables to predict the pCR with yielding area under the receiver operating characteristics curve (AUC) values of 0.766 [95 % confidence interval (CI) 0.671-0.861, P value < 0.0001] in cross-validation using training dataset and 0.787 (95 % CI 0.716-0.858, P value < 0.0001) in the validation dataset. Among three subtypes of breast cancer, the luminal subgroup showed the best discrimination (AUC = 0.779, 95 % CI 0.641-0.917, P value = 0.0059). The developed model (AUC = 0.805, 95 % CI 0.716-0.894, P value < 0.0001) outperformed multivariate logistic regression (AUC = 0.754, 95 % CI 0.651-0.858, P value = 0.00019) of validation datasets without missing values (n = 127). Several analyses, e.g. bootstrap analysis, revealed that the developed model was insensitive to missing values and also tolerant to distribution bias among the datasets. Our model based on clinicopathological variables showed high predictive ability for pCR. This model might improve the prediction of the response to NAC in primary breast cancer patients.
KW - Breast cancer
KW - Data mining
KW - Neoadjuvant chemotherapy
KW - Nomogram
KW - Prediction model
UR - http://www.scopus.com/inward/record.url?scp=84868209381&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84868209381&partnerID=8YFLogxK
U2 - 10.1007/s10549-012-2109-2
DO - 10.1007/s10549-012-2109-2
M3 - Article
C2 - 22689089
AN - SCOPUS:84868209381
SN - 0167-6806
VL - 134
SP - 661
EP - 670
JO - Breast Cancer Research and Treatment
JF - Breast Cancer Research and Treatment
IS - 2
ER -