Abstract
Purpose
To develop and validate a decision support tool for mammographic mass lesions based on a standardized descriptor terminology (BI-RADS lexicon) to reduce variability of practice.
Materials and methods
We used separate training data (1,276 lesions, 138 malignant) and validation data (1,177 lesions, 175 malignant). We created naïve Bayes (NB) classifiers from the training data with tenfold cross-validation. Our “inclusive model” comprised BI-RADS categories, BI-RADS descriptors, and age as predictive variables; our “descriptor model” comprised BI-RADS descriptors and age only. The resulting NB classifiers were applied to the validation data. We evaluated and compared classifier performance with ROC analysis.
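A naïve Bayes classifier of the kind described here scores a lesion by combining a class prior with per-descriptor conditional probabilities under a conditional-independence assumption. The following is a minimal, self-contained Python sketch of that idea; the descriptor names ("margin", "shape") and the toy training cases are illustrative assumptions, not the study's data or the authors' implementation (which relied on standard statistical software).

```python
# Minimal naive Bayes over categorical descriptors with Laplace smoothing.
# Descriptor names and counts below are toy illustrations, not study data.
from collections import defaultdict
from math import log, exp

def train_nb(cases):
    """cases: list of (features_dict, label). Collects class counts and
    per-(feature, label) value counts; smoothing is applied at prediction."""
    priors = defaultdict(int)
    cond = defaultdict(lambda: defaultdict(int))  # (feature, label) -> value counts
    values = defaultdict(set)                     # feature -> observed values
    for feats, label in cases:
        priors[label] += 1
        for f, v in feats.items():
            cond[(f, label)][v] += 1
            values[f].add(v)
    return priors, cond, values

def predict_nb(model, feats):
    """Returns posterior probabilities over labels for one feature dict."""
    priors, cond, values = model
    total = sum(priors.values())
    scores = {}
    for label, n in priors.items():
        logp = log(n / total)                     # class prior
        for f, v in feats.items():
            num = cond[(f, label)].get(v, 0) + 1  # add-one (Laplace) smoothing
            den = n + len(values[f])
            logp += log(num / den)
        scores[label] = logp
    # normalize log-scores into posterior probabilities
    m = max(scores.values())
    z = sum(exp(s - m) for s in scores.values())
    return {label: exp(s - m) / z for label, s in scores.items()}

train = [
    ({"margin": "spiculated", "shape": "irregular"}, "malignant"),
    ({"margin": "circumscribed", "shape": "oval"}, "benign"),
    ({"margin": "circumscribed", "shape": "round"}, "benign"),
    ({"margin": "spiculated", "shape": "round"}, "malignant"),
]
model = train_nb(train)
post = predict_nb(model, {"margin": "spiculated", "shape": "irregular"})
```

The returned posterior (here, the probability of malignancy given the observed descriptor combination) is the quantity a decision support tool of this type would present to the reader.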
Results
In the training data, the inclusive model yields an AUC of 0.959 and the descriptor model an AUC of 0.910 (P < 0.001). The inclusive model is superior to clinical performance (BI-RADS categories alone, P < 0.001); the descriptor model performs comparably to it. When applied to the validation data, the inclusive model yields an AUC of 0.935 and the descriptor model an AUC of 0.876 (P < 0.001). Again, the inclusive model is superior to clinical performance (P < 0.001), whereas the descriptor model performs comparably.
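The AUC figures reported above are areas under empirical ROC curves, which equal the probability that a randomly chosen malignant lesion receives a higher classifier score than a randomly chosen benign one (the Mann-Whitney statistic, with ties counted as one half). A minimal Python sketch of that computation, on toy scores rather than the study's data:

```python
# AUC via the Mann-Whitney statistic: fraction of (positive, negative) score
# pairs where the positive scores higher, ties counted as 0.5.
# Labels and scores below are toy values, not the study's data.
def auc(labels, scores):
    """labels: 1 = malignant, 0 = benign; scores: classifier outputs."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

toy_auc = auc([1, 1, 0, 0, 0], [0.9, 0.6, 0.7, 0.3, 0.2])  # 5 of 6 pairs ordered correctly
```

Comparing two correlated AUCs on the same cases, as done between the inclusive and descriptor models here, additionally requires a paired test such as DeLong's method.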
Conclusion
We consider our classifier a step towards a more uniform interpretation of combinations of BI-RADS descriptors. We provide our classifier at www.ebm-radiology.com/nbmm/index.html.
Key Points
• We provide a decision support tool for mammographic masses at www.ebm-radiology.com/nbmm/index.html.
• Our tool may reduce variability of practice in BI-RADS category assignment.
• A formal analysis of BI-RADS descriptors may enhance radiologists’ diagnostic performance.
Acknowledgments
The scientific guarantor of this publication is Elizabeth Burnside. The authors of this manuscript declare no relationships with any companies whose products or services may be related to the subject matter of the article. This study has received funding: M. Benndorf received a grant from the DFG (Deutsche Forschungsgemeinschaft, BE5747/1-1) for conducting experiments in Madison, WI. The authors acknowledge the support of the National Institutes of Health (grants R01CA165229 and R01LM011028), the UW Institute for Clinical and Translational Research (UL1TR000427), and the UW Carbone Comprehensive Cancer Center (P30CA014520). One of the authors has significant statistical expertise. Institutional review board approval was obtained; written informed consent was waived by the institutional review board. Some study subjects or cohorts have been previously reported in: Burnside ES, Lin Y, Munoz del Rio A, Pickhardt PJ, Wu Y, Strigel RM, Elezaby MA, Kerr EA, Miglioretti DL, “Addressing the challenge of assessing physician-level screening performance: mammography as an example,” PLoS One 2014; and Wu Y, Vanness DJ, Burnside ES, “Using multidimensional mutual information to prioritize mammographic features for breast cancer diagnosis,” AMIA Annu Symp Proc 2013. Methodology: retrospective, diagnostic study, performed at one institution.
Cite this article
Benndorf, M., Kotter, E., Langer, M. et al. Development of an online, publicly accessible naive Bayesian decision support tool for mammographic mass lesions based on the American College of Radiology (ACR) BI-RADS lexicon. Eur Radiol 25, 1768–1775 (2015). https://doi.org/10.1007/s00330-014-3570-6