Journal of Medical Systems, Volume 36, Issue 4, pp 2259–2269

Discovering Mammography-based Machine Learning Classifiers for Breast Cancer Diagnosis

  • Raúl Ramos-Pollán
  • Miguel Angel Guevara-López
  • Cesar Suárez-Ortega
  • Guillermo Díaz-Herrero
  • Jose Miguel Franco-Valiente
  • Manuel Rubio-del-Solar
  • Naimy González-de-Posada
  • Mario Augusto Pires Vaz
  • Joana Loureiro
  • Isabel Ramos
ORIGINAL PAPER

Abstract

This work explores the design of mammography-based machine learning classifiers (MLC) and proposes a new method to build MLC for breast cancer diagnosis. We massively evaluated MLC configurations to classify feature vectors extracted from segmented regions (pathological lesion or normal tissue) on craniocaudal (CC) and/or mediolateral oblique (MLO) mammography image views, providing a BI-RADS diagnosis. Beforehand, appropriate combinations of image processing and normalization techniques were applied to reduce image artifacts and enhance mammogram detail. The method can be used under different data acquisition circumstances and exploits computer clusters to select well-performing MLC configurations. We evaluated 286 cases extracted from the repository owned by HSJ-FMUP, where specialized radiologists segmented regions on CC and/or MLO images (biopsies provided the gold standard). Around 20,000 MLC configurations were evaluated, yielding classifiers with an area under the ROC curve of 0.996 when combining feature vectors extracted from the CC and MLO views of the same case.
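To make the selection criterion concrete, the following is a minimal sketch, not the authors' implementation, of how candidate configurations might be ranked by area under the ROC curve after concatenating the CC and MLO feature vectors of a case. The function names (`roc_auc`, `combine_views`, `select_best`) and the toy scores are hypothetical; the paper's actual toolchain and feature sets are not reproduced here.

```python
from itertools import product


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def combine_views(cc_features, mlo_features):
    """Concatenate the CC and MLO feature vectors of the same case."""
    return list(cc_features) + list(mlo_features)


def select_best(config_scores, labels):
    """Return the name of the configuration with the highest AUC.

    config_scores maps a (hypothetical) configuration name to the scores
    that configuration assigned to each case.
    """
    return max(config_scores, key=lambda name: roc_auc(labels, config_scores[name]))


if __name__ == "__main__":
    labels = [1, 1, 0, 0]  # 1 = pathological lesion, 0 = normal tissue (toy data)
    config_scores = {
        "config_a": [0.9, 0.8, 0.3, 0.1],
        "config_b": [0.9, 0.3, 0.8, 0.1],
    }
    print(select_best(config_scores, labels))
```

In a large-scale search, each configuration would be scored independently, which is what makes the evaluation easy to distribute across a computer cluster.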

Keywords

  • Breast cancer CAD
  • Machine learning classifiers
  • Mammography classifiers

Notes

Acknowledgements

This work is part of the GRIDMED research collaboration project between INEGI (Portugal) and CETA-CIEMAT (Spain). Prof. Guevara acknowledges POPH - QREN-Tipologia 4.2 – Promotion of scientific employment, funded by the ESF and MCTES, Portugal. CETA-CIEMAT acknowledges the support of the European Regional Development Fund.

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Raúl Ramos-Pollán (1)
  • Miguel Angel Guevara-López (2)
  • Cesar Suárez-Ortega (1)
  • Guillermo Díaz-Herrero (1)
  • Jose Miguel Franco-Valiente (1)
  • Manuel Rubio-del-Solar (1)
  • Naimy González-de-Posada (2)
  • Mario Augusto Pires Vaz (2)
  • Joana Loureiro (3)
  • Isabel Ramos (3)

  1. CETA-CIEMAT Center of Extremadura for Advanced Technologies, Trujillo, Spain
  2. INEGI-FEUP Institute of Mechanical Engineering and Industrial Management, Faculty of Engineering, University of Porto, Porto, Portugal
  3. HSJ-FMUP Hospital de São João - Faculty of Medicine, University of Porto, Porto, Portugal
