Abstract
Background
Visual assessment of mammographic breast composition remains the most common method worldwide, although subjective variability limits its reproducibility. This study investigated the inter- and intra-observer variability in qualitative visual assessment of mammographic breast composition in the first multi-institutional observer performance study conducted in Japan.
Methods
This study enrolled 10 Japanese physicians from five institutions. Using the new Japanese breast-composition classification system (4th edition), they subjectively evaluated breast composition in 200 pairs of right and left normal mediolateral oblique mammograms (median patient age: 59 years [range 40–69 years]) twice, with a 1-month interval between readings. The number of mammogram pairs was determined using precise sample size calculations. The primary endpoint was inter-observer variability, measured using the kappa (κ) value.
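To illustrate the primary endpoint, the following minimal Python sketch computes Fleiss’ κ for cases that are each rated by the same number of readers. The function name `fleiss_kappa`, the input layout, and the toy ratings are assumptions made for this example; they are not taken from the study's analysis.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of cases, each a list of category labels.

    Every inner list holds one label per reader, and all cases must be
    rated by the same number of readers (at least two).
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("each case needs the same number (>= 2) of ratings")

    # How many readers chose each category, per case.
    counts = [Counter(row) for row in ratings]
    categories = {label for row in ratings for label in row}

    # Observed agreement: mean proportion of agreeing reader pairs per case.
    per_case = [
        (sum(n * n for n in c.values()) - n_raters) / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_case) / n_cases

    # Chance agreement from the overall share of each category.
    total = n_cases * n_raters
    p_chance = sum((sum(c[cat] for c in counts) / total) ** 2 for cat in categories)

    if p_chance == 1.0:
        # Only one category was ever used, so kappa is undefined.
        return float("nan")
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical toy data: 4 cases, 3 readers, classes coded 1 to 4.
toy = [[2, 2, 3], [3, 3, 3], [1, 2, 2], [4, 4, 3]]
print(round(fleiss_kappa(toy), 3))
```

In a design like the one described here, each inner list would hold the 10 physicians' classes for one mammogram pair, and the function would be run once per reading session.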
Results
For inter-observer variability, the four-class breast-composition assessment showed moderate agreement (Fleiss’ κ: first and second reading = 0.553 and 0.587, respectively), and the two-class assessment showed substantial agreement (Fleiss’ κ: first and second reading = 0.689 and 0.70, respectively). For intra-observer variability, the four-class assessment showed substantial agreement (Cohen’s κ, median = 0.758), and the two-class assessment showed almost perfect agreement (Cohen’s κ, median = 0.813). Agreement between the consensus assessment of the 10 physicians and the automated software Volpara® was slight (Cohen’s κ; first and second reading: 0.104 and 0.075, respectively).
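The intra-observer figures compare one reader's first and second readings, which is a two-rating setting suited to Cohen’s κ. The sketch below shows that calculation together with a helper that maps a κ value to a verbal label. The label bands follow the widely used Landis and Koch scale, which matches the wording used for the values reported here; that the authors applied exactly this scale is an assumption. The function names and the toy readings are likewise hypothetical.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Cohen's kappa between two equally long sequences of category labels."""
    if len(first) != len(second) or not first:
        raise ValueError("both readings must be non-empty and of equal length")
    n = len(first)

    # Observed agreement: share of cases given the same class both times.
    p_observed = sum(a == b for a, b in zip(first, second)) / n

    # Chance agreement from the marginal class frequencies of each reading.
    c1, c2 = Counter(first), Counter(second)
    p_chance = sum(c1[cat] * c2[cat] for cat in set(c1) | set(c2)) / (n * n)

    if p_chance == 1.0:
        # Both readings used a single identical class; kappa is undefined.
        return float("nan")
    return (p_observed - p_chance) / (1.0 - p_chance)


def agreement_label(kappa):
    """Verbal label for kappa using the Landis and Koch bands (assumed scale)."""
    if kappa < 0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


# Hypothetical toy data: one reader's classes for 6 cases, read twice.
reading_1 = [2, 3, 3, 1, 4, 2]
reading_2 = [2, 3, 2, 1, 4, 3]
k = cohen_kappa(reading_1, reading_2)
print(round(k, 3), agreement_label(k))
```

With 10 readers, this would yield one κ per physician, and the median of those values would be the summary figure.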
Conclusions
Qualitative visual assessment of mammographic breast composition using the new Japanese classification showed excellent intra-observer reproducibility. However, inter-observer variability persists, presenting a challenge in establishing it as the gold standard in Japan.
Availability of data and materials
All data supporting this article are included in this manuscript.
Abbreviations
- BI-RADS: Breast Imaging-Reporting and Data System
- MLO: Mediolateral oblique
- PDF: Portable Document Format
- VBD: Volumetric breast density
- VDG: Volpara density grade
- JCOQABCS: Japan Central Organization on Quality Assurance of Breast Cancer Screening
Acknowledgements
The authors would like to thank Editage for English language editing.
Funding
The authors received no funding for the study.
Author information
Contributions
All the authors were involved in designing this study. TU proposed the concept and idea of this study. KY and TU drafted the protocol design and manuscript. MT and SO developed and are responsible for the design of statistical analysis. KN developed and is responsible for the dataset of mammograms. HT, FK, NU, KB, YM, YK, AK, KT, and TI provided advice on the protocol design and manuscript. All authors have reviewed and approved the manuscript for submission.
Ethics declarations
Conflict of interest
None declared.
Ethics approval and consent to participation
Owing to the retrospective nature of the study, the requirement for written informed consent was waived. Ethical approval was provided by the Institutional Review Board of Shizuoka Cancer Center (Approval number: T2023-1–2023-1–4). The study protocol was approved by the institutional review board of each institution. In addition, all images used in this study were completely anonymized so that no information that could identify a patient remained.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
12282_2024_1580_MOESM1_ESM.pptx
Supplementary Fig. 1 The new Japanese breast-composition classification, 4th edition. Color atlas of the new Japanese breast-composition classification, 4th edition. The definition of the mammary gland region is shown in (a), and the method of calculating the proportion of the mammary parenchyma area for classifying breast composition into four classes is shown in (b) and (c). (Originally cited in https://brestcs.org/study/achievement/page1.html; a product of team Kasahara in Health and Labour Sciences Research)
12282_2024_1580_MOESM2_ESM.pptx
Supplementary Fig. 2 Mediolateral oblique (MLO) views of mammogram examples (a) and automated breast composition assessment (b) by Volpara®. Two mammogram examples of the same patient taken at Shizuoka Cancer Hospital and the evaluation of breast composition using the automated software Volpara® are shown. The automated software measured a volumetric breast density of 10.2% and classified the breast composition as grade c.
12282_2024_1580_MOESM3_ESM.pptx
Supplementary Fig. 3 Mediolateral oblique (MLO)-view mammograms from two cases in the dataset. (a) The opinions of all physicians were consistent as “heterogeneously dense.” (b) The opinions of the physicians were divided between “scattered” and “heterogeneously dense.”
About this article
Cite this article
Koyama, Y., Nakashima, K., Orihara, S. et al. Inter- and intra-observer variability of qualitative visual breast-composition assessment in mammography among Japanese physicians: a first multi-institutional observer performance study in Japan. Breast Cancer (2024). https://doi.org/10.1007/s12282-024-01580-8
DOI: https://doi.org/10.1007/s12282-024-01580-8