Image feature evaluation in two new mammography CAD prototypes

  • Alexander Hapfelmeier
  • Alexander Horsch
Original Article



Breast cancer is a common but treatable disease among adult women. Advances in detection and treatment have helped to lower mortality, but further improvement is still needed, particularly in computer-aided diagnosis (CADx) and computer-aided detection (CADe).


Two new CAD prototypes, one for CADx and one for CADe, were evaluated. Their core modules are lesion segmentation, feature extraction, and classification. The evaluation of microcalcifications and mass lesions is based on the Digital Database for Screening Mammography (DDSM), currently the largest publicly available database of digitized film mammograms, and on a smaller data source of high-quality mammograms from digital mammography devices. Two image analysis approaches used by the respective CAD prototypes were examined and compared: the ‘machine learning’ approach and the new ‘knowledge-driven’ approach. Particular emphasis is placed on a thorough discussion of statistical methods, with recommendations for their proper application in order to avoid common errors in feature selection, model fitting, and sampling schemes.
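One of the common errors in feature selection and sampling schemes is selection bias: choosing features on all cases before cross-validation lets the test cases influence the model. The following is a minimal sketch of that effect, not the authors' procedure. It assumes synthetic noise data, a nearest-centroid classifier as a simple stand-in, and hypothetical helper names (`select_features`, `nearest_centroid_predict`, `cv_accuracy`).

```python
import random


def mean(values):
    return sum(values) / len(values)


def select_features(X, y, k):
    """Rank features by absolute difference of class means; keep the top k."""
    scores = []
    for j in range(len(X[0])):
        m1 = mean([row[j] for row, lab in zip(X, y) if lab == 1])
        m0 = mean([row[j] for row, lab in zip(X, y) if lab == 0])
        scores.append((abs(m1 - m0), j))
    return [j for _, j in sorted(scores, reverse=True)[:k]]


def nearest_centroid_predict(X_train, y_train, x, features):
    """Assign x to the class whose centroid (on the chosen features) is closer."""
    dist = {}
    for lab in (0, 1):
        rows = [r for r, l in zip(X_train, y_train) if l == lab]
        dist[lab] = sum((x[j] - mean([r[j] for r in rows])) ** 2 for j in features)
    return 1 if dist[1] < dist[0] else 0


def cv_accuracy(X, y, k, n_folds, select_inside):
    """Cross-validated accuracy with selection inside or outside the folds."""
    n = len(X)
    folds = [list(range(f, n, n_folds)) for f in range(n_folds)]
    # Selection on ALL cases, including the later test cases (the biased scheme).
    global_features = select_features(X, y, k)
    correct = 0
    for test_idx in folds:
        test_set = set(test_idx)
        train_idx = [i for i in range(n) if i not in test_set]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        # Unbiased scheme: repeat the selection on the training part of each fold.
        feats = select_features(X_tr, y_tr, k) if select_inside else global_features
        for i in test_idx:
            correct += nearest_centroid_predict(X_tr, y_tr, X[i], feats) == y[i]
    return correct / n


# Synthetic data (assumption): pure noise, so no feature is truly informative.
rng = random.Random(0)
n_cases, n_features = 40, 500
X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_cases)]
y = [i % 2 for i in range(n_cases)]  # labels carry no information about X

biased = cv_accuracy(X, y, k=10, n_folds=5, select_inside=False)
unbiased = cv_accuracy(X, y, k=10, n_folds=5, select_inside=True)
print(f"selection outside CV: {biased:.2f}")
print(f"selection inside CV:  {unbiased:.2f}")
```

Because the labels are unrelated to the features, an honest estimate should be near chance level. Selecting features outside the folds typically reports a much higher accuracy, which is the optimism that repeating the selection within each training fold is meant to remove.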


For microcalcifications, the investigated CADx prototypes achieved a higher AUC with 44 machine learning features (AUC = .777) than with 10 knowledge-driven features (AUC = .657). Combining both feature sets (53 features) did not substantially raise classification performance (AUC = .771). These analyses were based on 1,347 and 1,359 ROIs, respectively. Evaluating the CADx prototype with 242 machine learning features on DDSM mass data resulted in an AUC of .862 using 1,934 ROIs. Furthermore, a CADe prototype was applied to three of the authors' own databases to assess the true positive detection rate for mass lesions. Depending on the definition of a true positive detection, it produced AUC values of .953, .818, and .954 using 12, 17, and 18 features, respectively.
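The AUC values reported here can be read as the probability that a randomly chosen positive case receives a higher classifier score than a randomly chosen negative case. The sketch below computes the AUC in that pairwise (Mann-Whitney) form. The scores and labels are made-up illustration data, and the label coding (1 = malignant, 0 = benign) is an assumption, not taken from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney pair count.

    scores: classifier outputs, higher meaning more suspicious (assumed).
    labels: 1 for positive cases, 0 for negative cases (hypothetical coding).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count pairs where the positive case scores higher; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: 4 positive and 4 negative cases.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
print(auc(scores, labels))  # 13 of 16 pairs are ordered correctly -> 0.8125
```

An AUC of .5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for comparing the values above.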


The comparison of CAD prototypes revealed that the quality of the results depends strongly on the correct use of statistical models, feature selection methods, and evaluation schemes.


Keywords: Mammography; Feature selection; Classification; DDSM; CAD; Sampling; Selection bias; Stepwise regression; SVM; LDA; Classification tree; AIC; BIC





Copyright information

© CARS 2011

Authors and Affiliations

  1. Institute for Medical Statistics and Epidemiology, Technische Universität München, München, Germany
  2. Department of Computer Science, University of Tromsø, Tromsø, Norway
