Abstract
In this study, we classify breast cancer patients into four molecular subtypes (Luminal A, Luminal B, HER2, and triple negative) using six machine learning techniques: logistic regression (LR), naive Bayes (NB), k-nearest neighbors (KNN), support vector machine (SVM), decision tree (DT), and random forest (RF). We evaluate each model using accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC). The dataset was collected in real time from breast cancer patients and includes immunohistochemistry (IHC) marker reports. All six models achieved high accuracy and AUC scores, indicating that they are effective at classifying breast cancer patients into molecular subtypes. The random forest model performed best, with an AUC of 0.95, followed by logistic regression with an AUC of 0.91. These findings demonstrate the potential of machine learning techniques to classify breast cancer patients accurately into molecular subtypes, which could inform clinical decision-making and personalized treatment strategies.
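To make the evaluation metrics named in the abstract concrete, the following is a minimal sketch of how accuracy, precision, recall, F1-score, and AUC can be computed for a four-class subtype problem. It is not the authors' code: the function names, the macro averaging across subtypes, the one-vs-rest treatment of AUC, and the toy labels and scores are all assumptions made for illustration, and none of the numbers come from the study's dataset.

```python
from collections import Counter

# Subtype names as listed in the abstract; the label strings are illustrative.
SUBTYPES = ["Luminal A", "Luminal B", "HER2", "Triple negative"]


def macro_metrics(y_true, y_pred, labels=SUBTYPES):
    """Return accuracy plus macro-averaged precision, recall, and F1.

    Macro averaging (an assumption here) weights every subtype equally.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted subtype gains a false positive
            fn[truth] += 1  # true subtype gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted_pos = tp[label] + fp[label]
        actual_pos = tp[label] + fn[label]
        prec = tp[label] / predicted_pos if predicted_pos else 0.0
        rec = tp[label] / actual_pos if actual_pos else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


def binary_auc(scores, positives):
    """AUC for one subtype versus the rest.

    Computed as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case,
    with ties counted as half.
    """
    pos = [s for s, is_pos in zip(scores, positives) if is_pos]
    neg = [s for s, is_pos in zip(scores, positives) if not is_pos]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy example (invented data): two patients per subtype, two misclassified.
y_true = ["Luminal A", "Luminal A", "Luminal B", "Luminal B",
          "HER2", "HER2", "Triple negative", "Triple negative"]
y_pred = ["Luminal A", "Luminal B", "Luminal B", "Luminal B",
          "HER2", "HER2", "Triple negative", "HER2"]
metrics = macro_metrics(y_true, y_pred)

# Hypothetical model scores for the "Luminal A" class, one per patient.
luminal_a_scores = [0.9, 0.4, 0.3, 0.2, 0.1, 0.1, 0.2, 0.5]
is_luminal_a = [label == "Luminal A" for label in y_true]
luminal_a_auc = binary_auc(luminal_a_scores, is_luminal_a)

print(metrics)
print(round(luminal_a_auc, 3))
```

A multi-class AUC such as the ones reported above is typically obtained by averaging such one-vs-rest values over the four subtypes, although the abstract does not state which averaging scheme was used.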
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Aggarwal, A., Sharma, A. (2024). Predictive Modeling of Breast Cancer Subtypes Using Machine Learning Algorithms. In: García Márquez, F.P., Jamil, A., Ramirez, I.S., Eken, S., Hameed, A.A. (eds) Computing, Internet of Things and Data Analytics. ICCIDA 2023. Studies in Computational Intelligence, vol 1145. Springer, Cham. https://doi.org/10.1007/978-3-031-53717-2_34
DOI: https://doi.org/10.1007/978-3-031-53717-2_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-53716-5
Online ISBN: 978-3-031-53717-2
eBook Packages: Engineering, Engineering (R0)