Predicting the molecular subtype of breast cancer and identifying interpretable imaging features using machine learning algorithms

Ma, Mengwei; Liu, Renyi; Wen, Chanjuan; Xu, Weimin; Xu, Zeyuan; Wang, Sina; Wu, Jiefang; Pan, Derun; Zheng, Bowen; Qin, Genggeng; Chen, Weiguo

doi:10.1007/s00330-021-08271-4

Breast
Published: 13 October 2021

Volume 32, pages 1652–1662, (2022)

European Radiology

Mengwei Ma¹,
Renyi Liu¹,
Chanjuan Wen¹,
Weimin Xu¹,
Zeyuan Xu¹,
Sina Wang¹,
Jiefang Wu¹,
Derun Pan¹,
Bowen Zheng¹,
Genggeng Qin¹ (ORCID: orcid.org/0000-0002-7563-3924) &
Weiguo Chen¹

Abstract

Objectives

To evaluate the performance of interpretable machine learning models in predicting breast cancer molecular subtypes.

Methods

We retrospectively enrolled 600 patients with invasive breast carcinoma between 2012 and 2019. The patients were randomly divided into a training set (n = 450) and a testing set (n = 150). Five models were constructed and trained on clinical characteristics and imaging features from mammography and ultrasonography. Classification performance was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC), accuracy, sensitivity, and specificity. The Shapley additive explanations (SHAP) technique was used to interpret the output of the optimal model. The optimal model was then used as an assistive tool: four additional radiologists predicted the molecular subtype of breast cancer from mammography and ultrasound images, with and without model assistance, and their performance was compared.
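The random split and the four evaluation metrics named above can be illustrated with a minimal, standard-library-only sketch. It is not the authors' code: the seed, the 0.5 decision threshold, the toy labels and scores, and every function name are assumptions chosen for illustration. The AUC is computed with the rank-based definition (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case).

```python
import random

def split_train_test(items, n_train, seed=0):
    """Randomly partition items into a training list and a testing list."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn

def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity (assumes both classes are present)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

def roc_auc(y_true, scores):
    """AUC as P(positive score > negative score); ties count as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# 600 hypothetical patient identifiers split 450/150, mirroring the set sizes above.
patient_ids = list(range(600))
train_ids, test_ids = split_train_test(patient_ids, n_train=450, seed=42)

# Hypothetical test labels (1 = TNBC, 0 = other subtype) and model scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(len(train_ids), len(test_ids), metrics, round(auc, 3))
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is one reason the two kinds of metric are usually reported together.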

Results

The decision tree (DT) model performed best in distinguishing triple-negative breast cancer (TNBC) from other breast cancer subtypes, yielding an AUC of 0.971, an accuracy of 0.947, a sensitivity of 0.905, and a specificity of 0.941. With the assistance of the DT model, the accuracy, sensitivity, and specificity of all radiologists improved significantly in distinguishing TNBC from other molecular subtypes and Luminal breast cancer from other molecular subtypes. In the diagnosis of TNBC versus other subtypes, the average sensitivity, specificity, and accuracy increased by 0.090, 0.125, and 0.114 for less experienced radiologists and by 0.060, 0.090, and 0.083 for more experienced radiologists, respectively. In the diagnosis of Luminal versus other subtypes, the corresponding increases were 0.084, 0.152, and 0.159 for less experienced radiologists and 0.020, 0.100, and 0.048 for more experienced radiologists.

Conclusions

This study established an interpretable machine learning model to differentiate between breast cancer molecular subtypes, providing additional value for radiologists.

Key Points

• An interpretable machine learning model (MLM) could help clinicians and radiologists differentiate between breast cancer molecular subtypes.

• The Shapley additive explanations (SHAP) technique can select important features for predicting the molecular subtypes of breast cancer from a large number of imaging signs.

• A machine learning model can assist radiologists in evaluating the molecular subtype of breast cancer to some extent.
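To make the idea behind SHAP-based feature ranking concrete, the following sketch computes exact Shapley values for a tiny hand-written scoring function. It is an illustration only: the feature names, the score contributions, and the function names are hypothetical, and the sketch treats an absent feature as simply contributing nothing, whereas SHAP implementations for fitted models typically estimate the effect of a missing feature from background data.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, features):
    """Exact Shapley values for a payoff function defined on feature subsets."""
    n = len(features)
    result = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when added to coalition s.
                total += weight * (value_fn(s | {f}) - value_fn(s))
        result[f] = total
    return result

def toy_score(present):
    """Hypothetical model score given the set of features that are 'present'."""
    score = 0.10  # baseline score with no features
    if "calcification" in present:
        score += 0.20
    if "margin" in present:
        score += 0.30
    if "margin" in present and "echo_pattern" in present:
        score += 0.10  # interaction term, shared equally by the two features
    return score

features = ["calcification", "margin", "echo_pattern"]
phi = shapley_values(toy_score, features)

# Rank features by the magnitude of their attribution, largest first.
ranking = sorted(phi, key=lambda f: abs(phi[f]), reverse=True)
print(phi, ranking)
```

The attributions sum to the difference between the full-feature score and the baseline score, which is the additivity property that gives the method its name; ranking features by attribution magnitude is one way to shortlist the most influential imaging signs.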

Abbreviations

AUC: Area under curve
BI-RADS: Breast Imaging Reporting and Data System
CC: Craniocaudal
DT: Decision tree
ER: Estrogen receptor
FISH: Fluorescence in situ hybridization
HER2: Human epidermal growth factor receptor 2
ICC: Intraclass correlation coefficient
IHC: Immunohistochemistry
KNN: k-Nearest neighbor
LR: Logistic regression
ML: Machine learning
MLM: Machine learning model
MLO: Mediolateral oblique
NB: Naive Bayes
NFL: No free lunch
PR: Progesterone receptor
RF: Random forest
ROC: Receiver operating characteristic
SHAP: Shapley additive explanations
SVM: Support vector machine
TNBC: Triple-negative breast cancer
US: Ultrasonography

Funding

This work was supported by the National Natural Science Foundation of China (82171929), National Key R&D Program of China (2019YFC0121903, 2019YFC0117301) and National Natural Science Foundation of Guangdong Province, China (2019A1515011168, 2018A0303130215).

Author information

Mengwei Ma and Renyi Liu contributed equally to this work.

Authors and Affiliations

Department of Radiology, Nanfang Hospital, Southern Medical University, Guangzhou, 510515, Guangdong, China
Mengwei Ma, Renyi Liu, Chanjuan Wen, Weimin Xu, Zeyuan Xu, Sina Wang, Jiefang Wu, Derun Pan, Bowen Zheng, Genggeng Qin & Weiguo Chen

Corresponding authors

Correspondence to Genggeng Qin or Weiguo Chen.

Ethics declarations

Guarantor

The scientific guarantor of this publication is Weiguo Chen.

Conflict of interest

The authors declare no competing interests.

Statistics and biometry

No complex statistical methods were necessary for this paper.

Informed consent

This was a retrospective study, and the need for written informed consent was waived.

Ethical approval

Approval was obtained from the institutional review board of Nanfang Hospital, Southern Medical University.

Methodology

• retrospective

• observational

• performed at one institution

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 513 KB)

About this article

Cite this article

Ma, M., Liu, R., Wen, C. et al. Predicting the molecular subtype of breast cancer and identifying interpretable imaging features using machine learning algorithms. Eur Radiol 32, 1652–1662 (2022). https://doi.org/10.1007/s00330-021-08271-4

Received: 25 June 2021
Revised: 25 June 2021
Accepted: 12 August 2021
Published: 13 October 2021
Issue Date: March 2022
DOI: https://doi.org/10.1007/s00330-021-08271-4
