Predicting the molecular subtype of breast cancer and identifying interpretable imaging features using machine learning algorithms

  • Breast
European Radiology

Abstract

Objectives

To evaluate the performance of interpretable machine learning models in predicting breast cancer molecular subtypes.

Methods

We retrospectively enrolled 600 patients with invasive breast carcinoma between 2012 and 2019. The patients were randomly divided into a training set (n = 450) and a testing set (n = 150). Five models were constructed and trained on clinical characteristics and imaging features (mammography and ultrasonography). Classification performance was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC), accuracy, sensitivity, and specificity. The Shapley additive explanations (SHAP) technique was used to interpret the output of the optimal model. The optimal model was then used as an assistive tool: four radiologists predicted the molecular subtype of breast cancer from mammography and ultrasound images, with and without model assistance, and their performance was compared.
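The evaluation steps described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the fixed random seed, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration. It shows a random 450/150 split of 600 patient indices and how AUC, accuracy, sensitivity, and specificity can be computed for a binary task such as TNBC versus other subtypes.

```python
import random


def split_train_test(ids, n_train, seed=0):
    """Randomly partition ids into a training list and a testing list."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)  # seed is an assumption, for repeatability
    return shuffled[:n_train], shuffled[n_train:]


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion_metrics(labels, scores, threshold=0.5):
    """Accuracy, sensitivity, and specificity at a given score threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(labels),
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
    }


# 600 patient indices split 450/150, matching the cohort sizes in the abstract.
train_ids, test_ids = split_train_test(range(600), n_train=450)

# Toy data (hypothetical): 1 = TNBC, 0 = other subtype; scores are model outputs.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]
toy_auc = auc(toy_labels, toy_scores)
toy_metrics = confusion_metrics(toy_labels, toy_scores)
print(round(toy_auc, 3), toy_metrics)
```

The pairwise definition of AUC is used here because it needs no external library; for real cohorts a tested implementation from a statistics package would be the safer choice.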

Results

The decision tree (DT) model performed best in distinguishing triple-negative breast cancer (TNBC) from other breast cancer subtypes, yielding an AUC of 0.971; accuracy, 0.947; sensitivity, 0.905; and specificity, 0.941. With the assistance of the DT model, the accuracy, sensitivity, and specificity of all radiologists improved significantly in distinguishing TNBC from other molecular subtypes and Luminal breast cancer from other molecular subtypes. In the diagnosis of TNBC versus other subtypes, the average sensitivity, specificity, and accuracy increased by 0.090, 0.125, and 0.114, respectively, for less experienced radiologists and by 0.060, 0.090, and 0.083 for more experienced radiologists. In the diagnosis of Luminal versus other subtypes, the corresponding increases were 0.084, 0.152, and 0.159 for less experienced radiologists and 0.020, 0.100, and 0.048 for more experienced radiologists.

Conclusions

This study established an interpretable machine learning model to differentiate between breast cancer molecular subtypes, providing added value for radiologists.

Key Points

An interpretable machine learning model (MLM) could help clinicians and radiologists differentiate between breast cancer molecular subtypes.

The Shapley additive explanations (SHAP) technique can select important features for predicting the molecular subtypes of breast cancer from a large number of imaging signs.

A machine learning model can, to some extent, assist radiologists in evaluating the molecular subtype of breast cancer.
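To make the idea behind SHAP concrete, the sketch below computes exact Shapley values for a tiny scoring function by enumerating every feature subset. This is only an illustration of the underlying attribution principle, not the study's DT model or the SHAP library: `toy_model`, its three binary inputs, its coefficients, and the all-zero baseline are hypothetical. Features outside a subset are set to the baseline value, which is one common way to define a feature as "absent".

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley attribution of model(x) - model(baseline) to each feature."""
    n = len(x)

    def value(subset):
        # Features in `subset` take the case's value; the rest take the baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # subset sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                # Marginal contribution of feature i when added to subset s.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(features):
    """Hypothetical score with one additive term and one interaction term."""
    a, b, c = features
    return 0.2 * a + 0.5 * b * c


x = [1, 1, 1]         # hypothetical case: all three imaging signs present
baseline = [0, 0, 0]  # hypothetical reference: all signs absent
phi = shapley_values(toy_model, x, baseline)
print([round(p, 3) for p in phi])
```

The attributions sum to the difference between the score for the case and the score for the baseline, and the interaction term is shared equally between the two features involved. Exact enumeration grows exponentially with the number of features, so practical tools rely on model-specific or sampling-based approximations.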

Abbreviations

AUC: Area under curve
BI-RADS: Breast Imaging Reporting and Data System
CC: Craniocaudal
DT: Decision tree
ER: Estrogen receptor
FISH: Fluorescence in situ hybridization
HER2: Human epidermal growth factor receptor 2
ICC: Intraclass correlation coefficient
IHC: Immunohistochemistry
KNN: k-Nearest neighbor
LR: Logistic regression
ML: Machine learning
MLM: Machine learning model
MLO: Mediolateral oblique
NB: Naive Bayes
NFL: No free lunch
PR: Progesterone receptor
RF: Random forest
ROC: Receiver operating characteristic
SHAP: Shapley additive explanations
SVM: Support vector machine
TNBC: Triple-negative breast cancer
US: Ultrasonography

Funding

This work was supported by the National Natural Science Foundation of China (82171929), National Key R&D Program of China (2019YFC0121903, 2019YFC0117301) and National Natural Science Foundation of Guangdong Province, China (2019A1515011168, 2018A0303130215).

Author information

Corresponding authors

Correspondence to Genggeng Qin or Weiguo Chen.

Ethics declarations

Guarantor

The scientific guarantor of this publication is Weiguo Chen.

Conflict of interest

The authors declare no competing interests.

Statistics and biometry

No complex statistical methods were necessary for this paper.

Informed consent

The requirement for written informed consent was waived because of the retrospective nature of the study.

Ethical approval

Approval was obtained from the institutional review board of Nanfang Hospital, Southern Medical University.

Methodology

• retrospective

• observational

• performed at one institution

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 513 KB)

About this article

Cite this article

Ma, M., Liu, R., Wen, C. et al. Predicting the molecular subtype of breast cancer and identifying interpretable imaging features using machine learning algorithms. Eur Radiol 32, 1652–1662 (2022). https://doi.org/10.1007/s00330-021-08271-4

  • DOI: https://doi.org/10.1007/s00330-021-08271-4
