Abstract
Objective
Breast cancer is the most prevalent cancer and the second leading cause of death from malignancy among women worldwide. Beyond tumor factors, host characteristics are receiving increasing attention from the medical community. This study aimed to develop a breast cancer prediction model for the Chinese population using clinical and biochemical characteristics.
Methods
This is a retrospective study. From 2012 to 2021, we selected 19,751 patients with breast diseases from the Guangdong Hospital of Traditional Chinese Medicine, comprising 5,660 patients with breast cancer and 14,091 patients with benign breast diseases. A total of 34 clinical and biochemical characteristics were used, and patients were randomly assigned to the training group (75%) or the test group (25%). Significant clinical signs were investigated, and logistic regression with recursive feature elimination (RFE) was used to develop a prediction model for distinguishing benign from malignant breast diseases. The prediction model's accuracy, precision, sensitivity, specificity, and area under the ROC curve (AUC) were calculated.
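The workflow described above (random 75/25 split, logistic regression with RFE, then accuracy, precision, sensitivity, specificity, and AUC) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the software, the solver, the elimination criterion, or the decision threshold. The sketch assumes synthetic standardized features, elimination of the feature with the smallest absolute coefficient at each step, and a threshold of 0.5. All function names are hypothetical.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.1, epochs=300):
    """Fit logistic regression by batch gradient descent; return (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            gb += err
            for j in range(d):
                gw[j] += err * xi[j]
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def predict_proba(X, w, b):
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]

def select(X, cols):
    # Keep only the listed feature columns.
    return [[row[j] for j in cols] for row in X]

def rfe(X, y, n_keep):
    """Refit, then drop the feature with the smallest |weight|, until n_keep remain.
    Assumes features are on comparable scales (e.g. standardized)."""
    kept = list(range(len(X[0])))
    while len(kept) > n_keep:
        w, _ = fit_logistic(select(X, kept), y)
        kept.pop(min(range(len(kept)), key=lambda k: abs(w[k])))
    return kept

def auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def evaluate(y_true, scores, threshold=0.5):
    pred = [1 if s >= threshold else 0 for s in scores]
    pairs = list(zip(pred, y_true))
    tp = sum(p == 1 and t == 1 for p, t in pairs)
    tn = sum(p == 0 and t == 0 for p, t in pairs)
    fp = sum(p == 1 and t == 0 for p, t in pairs)
    fn = sum(p == 0 and t == 1 for p, t in pairs)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "auc": auc(y_true, scores),
    }

# Synthetic stand-in data (NOT the study data): 4 standardized features,
# of which only features 0 and 2 determine the label.
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(4)] for _ in range(400)]
y = [1 if row[0] - row[2] > 0 else 0 for row in X]

# Random 75% / 25% split into training and test groups.
idx = list(range(len(X)))
random.shuffle(idx)
cut = int(0.75 * len(idx))
train, test = idx[:cut], idx[cut:]
X_train, y_train = [X[i] for i in train], [y[i] for i in train]
X_test, y_test = [X[i] for i in test], [y[i] for i in test]

kept = rfe(X_train, y_train, n_keep=2)
w, b = fit_logistic(select(X_train, kept), y_train)
metrics = evaluate(y_test, predict_proba(select(X_test, kept), w, b))
print("kept features:", kept)
print(metrics)
```

In the study itself the same pattern would start from 34 characteristics and end with 19. Ranking by absolute coefficient is only meaningful when features share a scale, which is why the synthetic features here are drawn already standardized.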
Results
Clinical statistics demonstrated that the prediction model, which comprised 19 clinical characteristics, showed statistical separability in both the training group and the test group, as well as good sensitivity and predictive performance.
Conclusions
This model, based on biochemical parameters, demonstrates a significant predictive effect for breast cancer and may be useful as a reference when considering invasive tissue biopsy in patients with BI-RADS 3 and 4A breast imaging findings.
Funding
This work was supported by the General Project of the National Natural Science Foundation of China (No. 82274513), which explores, based on the connection between central CRH neurons and peripheral autonomic nerves, a new mechanism of the pathogenesis of breast cancer "caused by depression" and the role of the method of soothing the liver and relieving depression (2023/01–2026/12, ongoing; presided over by Qianjun Chen).
Author information
Contributions
Conception and design of the research: QC, WZ. Acquisition of data: LG, QC, JH. Analysis and interpretation of the data: YX, WZ, XL. Statistical analysis: YX, XL, JH. Obtaining financing: QC. Writing of the manuscript: LG. Critical revision of the manuscript for intellectual content: WZ. All authors read and approved the final draft.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
About this article
Cite this article
Guo, L., Xie, Y., He, J. et al. Breast cancer prediction model based on clinical and biochemical characteristics: clinical data from patients with benign and malignant breast tumors from a single center in South China. J Cancer Res Clin Oncol 149, 13257–13269 (2023). https://doi.org/10.1007/s00432-023-05181-4
DOI: https://doi.org/10.1007/s00432-023-05181-4