Abstract
Purpose
The Oncotype DX (ODX) test is a commercially available molecular assay that provides prognostic and predictive information on breast cancer recurrence for hormone receptor-positive, HER2-negative patients. The aim of this study is to propose a novel methodology to assist physicians in their decision-making.
Methods
A retrospective study was conducted on 333 cases that underwent an ODX assay between 2012 and 2020 at three hospitals in the Bourgogne Franche-Comté region (France). Data were collected from clinical and pathological reports. A methodology based on distributional random forests was developed to predict the ODX score class (ODX \(\le 25\) or ODX \(>25\)) from 9 clinico-pathological characteristics. In particular, this methodology can identify the patients of the training cohort who share similarities with a new patient and can estimate the distribution of the ODX score for that patient.
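To make these two outputs concrete, here is a minimal sketch. It assumes a fitted forest has already assigned each training patient a nonnegative weight for a given new patient; in forest-based methods such weights typically reflect how often a training case falls in the same leaf as the new case. The scores, weights, and function names are hypothetical illustrations, not the study's data or code, and the forest-fitting step itself is not shown.

```python
from typing import List, Sequence, Tuple


def normalize(weights: Sequence[float]) -> List[float]:
    """Rescale nonnegative weights so that they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def prob_above(scores: Sequence[float], weights: Sequence[float],
               threshold: float = 25.0) -> float:
    """Weighted empirical estimate of P(ODX > threshold) for the new patient."""
    w = normalize(weights)
    return sum(wi for s, wi in zip(scores, w) if s > threshold)


def weighted_quantile(scores: Sequence[float], weights: Sequence[float],
                      q: float) -> float:
    """Smallest training score whose cumulative weight reaches q."""
    pairs = sorted(zip(scores, normalize(weights)))
    cumulative = 0.0
    for score, wi in pairs:
        cumulative += wi
        if cumulative >= q - 1e-12:  # small tolerance for float rounding
            return score
    return pairs[-1][0]


def most_similar(weights: Sequence[float], k: int) -> List[Tuple[int, float]]:
    """Indices and normalized weights of the k most heavily weighted patients."""
    ranked = sorted(enumerate(normalize(weights)),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical training ODX scores and forest weights for one new patient.
train_scores = [8.0, 14.0, 19.0, 27.0, 33.0]
forest_weights = [1, 4, 2, 2, 1]  # e.g. leaf co-occurrence counts (assumed)

p_high = prob_above(train_scores, forest_weights)              # P(ODX > 25)
median = weighted_quantile(train_scores, forest_weights, 0.5)  # weighted median
neighbours = most_similar(forest_weights, k=2)                 # similar patients
```

Under these assumptions, the weighted empirical distribution gives both the class probability and any quantile of the score, and the same weights rank the training patients by similarity to the new patient.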
Results
The mean age of participants was 56.9 years. The model correctly classified \(92\%\) of low-risk patients and \(40.2\%\) of high-risk patients, for an overall accuracy of \(79.3\%\). Among patients predicted as low risk, \(82\%\) were truly low risk (PPV); among patients predicted as high risk, approximately \(62.3\%\) were truly high risk (NPV). The F1-score and the area under the curve (AUC) were 0.87 and 0.759, respectively.
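As a quick arithmetic check on how these figures relate, the sketch below treats low risk as the positive class, so the \(92\%\) is the recall and the \(82\%\) is the precision, and recomputes the F1-score as their harmonic mean. It also derives the share of low-risk patients implied by the reported accuracy; that share is a derived quantity under this reading, not a value reported in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def implied_positive_share(accuracy: float, recall_pos: float,
                           recall_neg: float) -> float:
    """Solve accuracy = recall_pos * p + recall_neg * (1 - p) for p."""
    return (accuracy - recall_neg) / (recall_pos - recall_neg)


# Reported values, with low risk taken as the positive class (assumption).
recall_low = 0.92    # low-risk patients correctly classified
recall_high = 0.402  # high-risk patients correctly classified
ppv_low = 0.82       # correct among predicted low risk
accuracy = 0.793

f1 = f1_score(ppv_low, recall_low)
low_share = implied_positive_share(accuracy, recall_low, recall_high)

print(round(f1, 2))         # 0.87, matching the reported F1-score
print(round(low_share, 3))  # derived share of low-risk patients
```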
Conclusion
The proposed methodology makes it possible to predict the distribution of the ODX score for a patient. This prediction is reinforced by identifying a family of known patients, with follow-up, who have identical scores. Combined with the pathologist's expertise on the different histological and immunohistochemical characteristics, this methodology can help oncologists in decision-making regarding breast cancer therapy.
Data availability
The data were used under permission for the current study, and so are not publicly available.
Funding
The authors declare that no funds, grants or other support were received during the preparation of this manuscript.
Author information
Contributions
All authors contributed to the study conception, design, data analysis and manuscript preparation. All authors reviewed and approved the final version of the manuscript.
Ethics declarations
Conflict of interest
All authors of this work declare that there are no conflicts of interest in the authorship or publication of this contribution.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Al Masry, Z., Pic, R., Dombry, C. et al. A new methodology to predict the oncotype scores based on clinico-pathological data with similar tumor profiles. Breast Cancer Res Treat 203, 587–598 (2024). https://doi.org/10.1007/s10549-023-07141-5
DOI: https://doi.org/10.1007/s10549-023-07141-5