Inter- and intra-observer variability of qualitative visual breast-composition assessment in mammography among Japanese physicians: a first multi-institutional observer performance study in Japan

  • Original Article
Breast Cancer

Abstract

Background

Visual assessment remains the most common method of evaluating mammographic breast composition worldwide, although its subjectivity limits reproducibility. This study investigated inter- and intra-observer variability in the qualitative visual assessment of mammographic breast composition in the first multi-institutional observer performance study conducted in Japan.

Methods

This study enrolled 10 Japanese physicians from five institutions. Using the new Japanese breast-composition classification system, 4th edition, they subjectively evaluated breast composition in 200 pairs of normal right and left mediolateral oblique mammograms (median patient age: 59 years [range 40–69 years]; the number of pairs was determined by a sample size calculation). Each physician read the images twice, with a 1-month interval between readings. The primary endpoint was inter-observer variability, measured using the kappa (κ) value.
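For readers unfamiliar with the statistic, κ is in its general form a chance-corrected measure of agreement. The expression below is the standard textbook definition, not a formula quoted from the study; the symbols P_o and P_e are notation introduced here for illustration.

```latex
\kappa = \frac{P_o - P_e}{1 - P_e}
```

Here P_o is the observed proportion of agreement and P_e is the proportion of agreement expected by chance, so κ = 1 indicates perfect agreement and κ = 0 indicates agreement no better than chance.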

Results

Inter-observer agreement was moderate for the four-class assessment (Fleiss’ κ: 0.553 and 0.587 for the first and second readings, respectively) and substantial for the two-class assessment (Fleiss’ κ: 0.689 and 0.70, respectively). Intra-observer agreement was substantial for the four-class assessment (median Cohen’s κ = 0.758) and almost perfect for the two-class assessment (median Cohen’s κ = 0.813). Agreement between the consensus of the 10 physicians and the automated software Volpara® was slight (Cohen’s κ: 0.104 and 0.075 for the first and second readings, respectively).
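As an illustration of how statistics of this kind are commonly computed, the following is a minimal Python sketch of Fleiss' κ (several raters) and Cohen's κ (two sets of ratings). It is not the authors' analysis code. The function names are hypothetical, and the cut-offs in `agreement_label` assume the widely used Landis and Koch scale, which is consistent with the labels reported above but is not named in the abstract.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa. `ratings` holds one list per case, with one label per rater.

    Assumes every case is rated by the same number of raters (at least two)
    and that more than one category is actually used (otherwise 1 - p_e is 0).
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    # counts[i][c] = number of raters who put case i in category c
    counts = [Counter(case) for case in ratings]
    # Mean per-case agreement among all rater pairs
    p_bar = sum(
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ) / n_cases
    # Chance agreement from the overall category proportions
    p_e = sum(
        (sum(c[cat] for c in counts) / (n_cases * n_raters)) ** 2
        for cat in categories
    )
    return (p_bar - p_e) / (1 - p_e)


def cohen_kappa(first, second, categories):
    """Cohen's kappa for two equally long sequences of labels."""
    n = len(first)
    p_o = sum(a == b for a, b in zip(first, second)) / n
    c1, c2 = Counter(first), Counter(second)
    # Chance agreement from each reading's marginal proportions
    p_e = sum((c1[cat] / n) * (c2[cat] / n) for cat in categories)
    return (p_o - p_e) / (1 - p_e)


def agreement_label(kappa):
    """Verbal label for kappa, assuming the Landis and Koch cut-offs."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"
```

Under these assumed cut-offs, `agreement_label(0.553)` gives "moderate", `agreement_label(0.689)` gives "substantial", and `agreement_label(0.813)` gives "almost perfect", in line with the labels reported in the abstract.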

Conclusions

Qualitative visual assessment of mammographic breast composition using the new Japanese classification showed excellent intra-observer reproducibility. However, inter-observer variability persisted, which presents a challenge to establishing this assessment as the gold standard in Japan.

Availability of data and materials

All data supporting this article are included in this manuscript.

Abbreviations

BI-RADS: Breast Imaging-Reporting and Data System
MLO: Mediolateral oblique
PDF: Portable Document Format
VBD: Volumetric breast density
VDG: Volpara density grade
JCOQABCS: Japan Central Organization on Quality Assurance of Breast Cancer Screening

Acknowledgements

The authors would like to thank Editage for English language editing.

Funding

The authors received no funding for the study.

Author information

Contributions

All the authors were involved in designing this study. TU proposed the concept and idea of this study. KY and TU drafted the protocol design and manuscript. MT and SO developed and are responsible for the design of statistical analysis. KN developed and is responsible for the dataset of mammograms. HT, FK, NU, KB, YM, YK, AK, KT, and TI provided advice on the protocol design and manuscript. All authors have reviewed and approved the manuscript for submission.

Corresponding author

Correspondence to Takayoshi Uematsu.

Ethics declarations

Conflict of interest

None declared.

Ethics approval and consent to participation

Owing to the retrospective nature of the study, the requirement for written informed consent was waived. Ethical approval was provided by the Institutional Review Board of Shizuoka Cancer Center (Approval number: T2023-1–2023-1–4). The study protocol was approved by the institutional review board of each institution. In addition, all images used in this study were completely anonymized so that they contained no information that could identify the patients.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

12282_2024_1580_MOESM1_ESM.pptx

Supplementary Fig. 1 The new Japanese breast-composition classification, 4th edition. Color atlas of the new Japanese breast-composition classification, 4th edition. The definition of the mammary gland region is shown in (a), and the method of calculating the proportion of the mammary parenchyma area for classifying breast composition into four classes is shown in (b) and (c). (Originally cited in https://brestcs.org/study/achievement/page1.html ; a product of team Kasahara in Health and Labour Sciences Research)

12282_2024_1580_MOESM2_ESM.pptx

Supplementary Fig. 2 Mediolateral oblique (MLO) views of mammogram examples (a) and automated breast-composition assessment (b) by Volpara®. Two mammogram examples of the same patient taken at Shizuoka Cancer Hospital and the evaluation of breast composition using the automated software Volpara® are shown. The automated software measured a volumetric breast density of 10.2% and classified the breast composition as grade c.

12282_2024_1580_MOESM3_ESM.pptx

Supplementary Fig. 3 Mediolateral oblique (MLO)-view mammograms from two cases in the dataset. (a) All physicians agreed on “heterogeneously dense.” (b) The physicians' opinions were divided between “scattered” and “heterogeneously dense.”

Supplementary file 4 (XLSX 20 KB)

About this article

Cite this article

Koyama, Y., Nakashima, K., Orihara, S. et al. Inter- and intra-observer variability of qualitative visual breast-composition assessment in mammography among Japanese physicians: a first multi-institutional observer performance study in Japan. Breast Cancer (2024). https://doi.org/10.1007/s12282-024-01580-8

  • DOI: https://doi.org/10.1007/s12282-024-01580-8
