Breast Cancer Research and Treatment, Volume 120, Issue 3, pp 539–546

Predictors of interobserver agreement in breast imaging using the Breast Imaging Reporting and Data System



The Breast Imaging Reporting and Data System (BI-RADS) was introduced in 1993 to standardize the interpretation of mammograms. Though many studies have assessed the validity of the system, fewer have examined its reliability. Our objective was to identify predictors of reliability as measured by the kappa statistic. We identified studies conducted between 1993 and 2009 that reported kappa values for interpreting mammograms using any edition of BI-RADS. Bivariate and multivariate multilevel analyses were used to examine associations between potential predictors and kappa values. We identified ten eligible studies, which yielded 88 kappa values for the analysis. Potential predictors of kappa included: whether the study included negative cases, whether single- or two-view mammograms were used, whether mammograms were digital or screen-film, whether the fourth edition of BI-RADS was used, the BI-RADS category being evaluated, whether readers were trained, whether readers’ professional activities overlapped, the number of cases in the study, and the country in which the study was conducted. Our best multivariate model identified training, use of two-view mammograms, and BI-RADS categories (masses, calcifications, and final assessments) as predictors of kappa. Training, use of two-view mammograms, and focusing on mass description may be useful in increasing reliability in mammogram interpretation. Calcification and final assessment descriptors are areas for potential improvement. These findings can inform policies on BI-RADS use, both before the system is introduced in new settings and when current implementations are being improved.
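For readers unfamiliar with the kappa statistic used as the reliability measure above, the following is a minimal sketch of Cohen's kappa for two readers, assuming each reader assigns one category per case. The function name `cohen_kappa` and the ratings are hypothetical illustrations, not data or code from the study, and the included studies may have used other kappa variants (for example, weighted or multi-reader forms).

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two readers rating the same cases (illustrative helper)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of cases where both readers chose the same category.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the readers' marginal proportions, summed over categories.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    p_chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    # Kappa rescales observed agreement by the agreement expected from chance alone.
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical final assessment categories assigned by two readers to ten cases.
reader_1 = [1, 2, 2, 3, 4, 4, 5, 1, 2, 3]
reader_2 = [1, 2, 3, 3, 4, 5, 5, 1, 2, 2]
kappa = cohen_kappa(reader_1, reader_2)
print(round(kappa, 2))
```

In this made-up example the readers agree on 7 of 10 cases and chance agreement is 0.21, so kappa is (0.70 − 0.21) / (1 − 0.21), or about 0.62. A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance.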


Interobserver agreement · Kappa · Mammography · Breast cancer · BI-RADS



Copyright information

© Springer Science+Business Media, LLC. 2010

Authors and Affiliations

  • Anna Liza M. Antonio (1, 2)
  • Catherine M. Crespi (1, 3)
  1. Department of Biostatistics, UCLA School of Public Health, University of California, Los Angeles, USA
  2. VA Greater Los Angeles Healthcare System, Los Angeles, USA
  3. Division of Cancer Prevention and Control Research, Jonsson Comprehensive Cancer Center, University of California, Los Angeles, USA
