Inductive Machine Learning and Feature Selection for Knowledge Extraction from Medical Data: Detection of Breast Lesions in MRI

Advances in Assistive Technologies

Abstract

This paper presents an approach to breast cancer diagnosis based on the analysis of magnetic resonance mammography observations (MRi Data), developing hybrid models that classify patient cases into specific classes (e.g. Benign and Malignant). The aim is to bring machine learning into the diagnostic process for breast cancer by offering an intelligent tool that supports expert doctors in medical decision-making. The data were collected in collaboration with expert doctors and comprise 77 patient cases. The classification models combine inductive decision trees, clustering and feature selection techniques. Specifically, nine (9) different classification models were developed and evaluated using statistical criteria, medical expert knowledge and, where possible, the Chi-Square statistical test. The performance achieved is considered encouraging for application in real-world practice, while further research is underway to associate MR imaging data with data from invasive examinations (biopsies).
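As an illustration of the Chi-Square test mentioned in the abstract, the sketch below computes the Pearson chi-square statistic for a 2x2 table that cross-tabulates a rule against the class label. It is a minimal sketch, not the authors' procedure: the function name `chi_square_2x2` and the counts in `hypothetical_table` are assumptions made for this example and are not taken from the chapter.

```python
def chi_square_2x2(table):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rule outcome and class were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical counts, NOT from the chapter.
# Rows: rule fires / rule does not fire. Columns: malignant / benign.
hypothetical_table = [[10, 20], [30, 40]]

# Critical value of the chi-square distribution, 1 degree of freedom, alpha = 0.05.
CRITICAL_VALUE_DF1_ALPHA_05 = 3.841

stat = chi_square_2x2(hypothetical_table)
significant = stat > CRITICAL_VALUE_DF1_ALPHA_05
print(f"chi-square = {stat:.3f}, significant at 0.05: {significant}")
```

With only 77 cases, some expected counts can fall below 5, in which case the chi-square approximation is unreliable; this may be why the test is applied only "where possible".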

Author information

Correspondence to Georgios Dounias.

Appendices

Annex 6.1—Abbreviations

| Abbreviations | Descriptions |
| --- | --- |
| ACR BI-RADS | American College of Radiology Breast Imaging Reporting & Data System |
| ADC | Apparent Diffusion Coefficient |
| ANN | Artificial Neural Networks |
| AUC | Area Under the ROC Curve |
| B–ANN | Bayesian Artificial Neural Networks |
| CAD | Computer Aided Diagnostic systems |
| CHAID | Chi-squared Automatic Interaction Detection |
| CLi Data | Clinical image Data |
| CNN | Convolutional Neural Network |
| DCE–MRI | Dynamic Contrast-Enhanced Magnetic Resonance Imaging |
| DMi Data | Digital Mammography image Data |
| eBP–FNN | error Back Propagation Feed Forward Artificial Neural Networks |
| FCM | Fuzzy c-Means |
| FFDM | Full-Field Digital Mammography |
| FOV | Field of View |
| HiSS–MRI | High Spectral and Spatial resolution MRI |
| IRi Data | Infrared Thermography image Data |
| KNN | K Nearest Neighbor |
| LDA | Linear Discriminant Analysis |
| LS–SVM | Least-Squares Support Vector Machine |
| MARS | Multivariate Adaptive Regression Splines |
| MCi Data | Microscope image Data |
| MRi Data | Magnetic-resonance image Data |
| MSM | Multisurface method |
| PFK-SVM | Polynomial Kernel Function Support Vector Machine |
| RF | Random Forest |
| SA | Sparse Autoencoder |
| ST | Slice thickness |
| SVM | Support Vector Machines |
| TNM | T category describes the primary tumor site, N category describes the regional lymph node involvement and M category describes the presence or otherwise of distant metastatic spread (UICC, Union for International Cancer Control) |
| TSE | Turbo Spin Echo |

Annex 6.2—Variables Frequency Charts (Original Dataset)

[Figures a and b: frequency charts of the variables in the original dataset]

Annex 6.3—Variables’ Values Range

Variables’ values range

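| S. no. | Variable | Values | Group | Frequency (N) | Frequency (%) |
| --- | --- | --- | --- | --- | --- |
| 1 | AGE | [0, 49] | AGE_1 | 39 | 50.6 |
| | | [50, ∞) | AGE_2 | 38 | 49.4 |
| | | Total | | 77 | 100 |
| 2 | Morphology (MORPH) | MASS | MORPH_1 | 48 | 62.3 |
| | | MASS & NON MASS | MORPH_2 | 2 | 2.6 |
| | | NON MASS | MORPH_3 | 27 | 35.1 |
| | | Total | | 77 | 100 |
| 3 | Borders (BDS) | IRR | BDS_1 | 66 | 85.7 |
| | | SPIC | | | |
| | | SMH | BDS_2 | 11 | 14.3 |
| | | Total | | 77 | 100 |
| 4 | Tumor Size (TUMS) | [0, 1.3] | TUMS_1 | 28 | 36.4 |
| | | [1.4, 2.5] | TUMS_2 | 19 | 24.7 |
| | | Over 2.6 | TUMS_3 | 30 | 39.0 |
| | | Total | | 77 | 100 |
| 5 | Peritumoral Edema (PRED) | NO | PRED_1 | 50 | 64.9 |
| | | YES | PRED_2 | 27 | 35.1 |
| | | Total | | 77 | 100 |
| 6 | T2 Weighted image (T2-Wi) | NONE | | | |
| | | LOW | T2WI_1 | 70 | 90.9 |
| | | INTER | | | |
| | | HIGH | T2WI_2 | 7 | 9.1 |
| | | Total | | 77 | 100 |
| 7 | Curve Morphology (CRM) | TYPE_1 | CRM_1 | 14 | 18.2 |
| | | TYPE_2 | CRM_2 | 25 | 32.5 |
| | | TYPE_3 | CRM_3 | 38 | 49.4 |
| | | Total | | 77 | 100 |
| 8 | Breast Density (BD) | A | BD_1 | 28 | 36.4 |
| | | B | | | |
| | | B-C | BD_2 | 49 | 63.6 |
| | | C | | | |
| | | C-D | | | |
| | | Total | | 77 | 100 |
| 9 | Background Parenchymal Enhancement (BPE) | MIN | BPE_1 | 52 | 67.5 |
| | | MILD | | | |
| | | MOD | BPE_2 | 25 | 32.5 |
| | | MARK | | | |
| | | Total | | 77 | 100 |
| 10 | Feeding Vessel (FV) | NO | FV_1 | 46 | 59.7 |
| | | YES | FV_2 | 31 | 40.3 |
| | | Total | | 77 | 100 |
| 11 | Internal Enhancement (INTEN) | HOMOG | INTEN_1 | 16 | 20.8 |
| | | INHOMOG | INTEN_2 | 61 | 79.2 |
| | | HETER | | | |
| | | Total | | 77 | 100 |
| 12 | Diffusion (DF) | N/A | DF_1 | 4 | 5.2 |
| | | LOW | DF_2 | 15 | 19.5 |
| | | HIGH | DF_3 | 58 | 75.3 |
| | | Total | | 77 | 100 |
| 13 | Apparent Diffusion Coefficient (ADC) | N/A | ADC_1 | 4 | 5.2 |
| | | LOW | ADC_2 | 48 | 62.3 |
| | | HIGH | ADC_3 | 25 | 32.5 |
| | | Total | | 77 | 100 |
| 14 | Focality (FC) | U | FC_1 | 48 | 62.3 |
| | | MC | FC_2 | 4 | 5.2 |
| | | MF | FC_3 | 25 | 32.5 |
| | | Total | | 77 | 100 |
| 15 | Benign or Malignant (BOM) | BENIGN | BOM_1 | 19 | 24.7 |
| | | MALIGNANT | BOM_2 | 58 | 75.3 |
| | | Total | | 77 | 100 |
| 16 | Benign and Malignant (BAM) | BENIGN | BAM_1 | 19 | 24.7 |
| | | DCIS | BAM_2 | 6 | 7.8 |
| | | IDC | BAM_3 | 36 | 46.8 |
| | | DCIS & IDC | BAM_4 | 9 | 11.7 |
| | | ILC | BAM_5 | 6 | 7.8 |
| | | SOLID PAPILLARY | BAM_6 | 1 | 1.3 |
| | | Total | | 77 | 100 |
| 17 | Malignant (ML) | DCIS | ML_1 | 6 | 7.8 |
| | | IDC | ML_2 | 32 | 41.6 |
| | | DCIS & IDC | ML_3 | 13 | 16.9 |
| | | ILC | ML_4 | 6 | 7.8 |
| | | SOLID PAPILLARY | ML_5 | 1 | 1.3 |
| | | Total | | 58 | 75.3 |
| | | Missing | System | 19 | 24.7 |
| | | Total | | 77 | 100 |
| 18 | BIRADS | CATEGORY 3 | BIRADS_2 | 36 | 46.8 |
| | | CATEGORY 4 | | | |
| | | CATEGORY 5 | BIRADS_3 | 41 | 53.2 |
| | | CATEGORY 6 | | | |
| | | Total | | 77 | 100 |
| 19 | Tumor Grade (TUMG) | GRADE 1 | TUMG_1 | 3 | 3.9 |
| | | GRADE 2 | TUMG_2 | 25 | 32.5 |
| | | GRADE 3 | TUMG_3 | 28 | 36.4 |
| | | Total | | 56 | 72.7 |
| | | Missing | System | 21 | 27.3 |
| | | Total | | 77 | 100 |
| 20 | Estrogen Receptors (ER) | NO | ER_1 | 14 | 18.2 |
| | | YES | ER_2 | 43 | 55.8 |
| | | Total | | 57 | 74.0 |
| | | Missing | System | 20 | 26.0 |
| | | Total | | 77 | 100 |
| 21 | Progesterone Receptors (PR) | NO | PR_1 | 19 | 24.7 |
| | | YES | PR_2 | 38 | 49.4 |
| | | Total | | 57 | 74.0 |
| | | Missing | System | 20 | 26.0 |
| | | Total | | 77 | 100 |
| 22 | Cerb-B2 | NO | CERB2_1 | 29 | 37.7 |
| | | YES | CERB2_2 | 28 | 36.4 |
| | | Total | | 57 | 74.0 |
| | | Missing | System | 20 | 26.0 |
| | | Total | | 77 | 100 |
| 23 | Ki-67 | [0, 15] | KI67_1 | 22 | 28.6 |
| | | [16, 25] | KI67_2 | 8 | 10.4 |
| | | [26, 100] | KI67_3 | 23 | 29.9 |
| | | Total | | 53 | 68.8 |
| | | Missing | System | 24 | 31.2 |
| | | Total | | 77 | 100 |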
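The Frequency (%) column follows directly from the counts: each percentage is the group count divided by the 77 cases. The short sketch below recomputes it for the Morphology (MORPH) and Benign or Malignant (BOM) variables, using the counts listed in the table. It also derives the majority-class baseline, which is the accuracy obtained by always predicting the more frequent class. The helper name `percentages` is an assumption made for this example, and the baseline is a standard reference point rather than a figure reported in the chapter.

```python
def percentages(counts):
    """Convert group counts to percentages of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {group: round(100 * n / total, 1) for group, n in counts.items()}


# Counts as listed in the table (77 patient cases in total).
morph_counts = {"MORPH_1": 48, "MORPH_2": 2, "MORPH_3": 27}
bom_counts = {"BOM_1": 19, "BOM_2": 58}  # BOM_1 = benign, BOM_2 = malignant

morph_pct = percentages(morph_counts)
bom_pct = percentages(bom_counts)

# Accuracy of a trivial classifier that always predicts the majority class.
baseline_accuracy = max(bom_counts.values()) / sum(bom_counts.values())

print(morph_pct)
print(bom_pct)
print(f"majority-class baseline accuracy: {baseline_accuracy:.3f}")
```

Because about three quarters of the cases are malignant, accuracy alone can look high for an uninformative model, so sensitivity and specificity are worth reading alongside it.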

Annex 6.4—Classification Tree (Benign or Malignant)

[Figures c and d: classification tree (Benign or Malignant)]

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Karampotsis, E., Panourgias, E., Dounias, G. (2022). Inductive Machine Learning and Feature Selection for Knowledge Extraction from Medical Data: Detection of Breast Lesions in MRI. In: Tsihrintzis, G.A., Virvou, M., Esposito, A., Jain, L.C. (eds) Advances in Assistive Technologies. Learning and Analytics in Intelligent Systems, vol 28. Springer, Cham. https://doi.org/10.1007/978-3-030-87132-1_6
