Digital mammography dataset for breast cancer diagnosis research (DMID) with breast mass segmentation analysis

  • Original Article
  • Published in: Biomedical Engineering Letters

Abstract

Purpose: Over the last two decades, computer-aided detection and diagnosis (CAD) systems have been developed to help radiologists detect and diagnose lesions observed in breast imaging examinations. These systems can serve as a second-opinion tool for the radiologist. However, developing algorithms for identifying and diagnosing breast lesions relies heavily on mammographic datasets. Many existing databases do not provide everything that research requires, such as mammographic masks, radiology reports, and breast composition. This paper introduces and describes a new mammographic database.

Methods: The proposed dataset comprises mammograms with several types of lesions, such as masses, calcifications, architectural distortions, and asymmetries. In addition, a radiologist's report is provided for each mammogram, describing details such as breast density, the abnormality present, and the condition of the skin, nipple, and pectoral muscles.

Results: We present results of a commonly used segmentation framework trained on the proposed dataset. We used the class of abnormality (benign or malignant) and the breast tissue density provided with each mammogram to analyze the segmentation model’s performance with respect to these parameters.

Conclusion: The presented dataset provides diverse mammogram images for developing and training models for breast cancer diagnosis applications.
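The stratified analysis mentioned in the Results can be illustrated with a minimal sketch. It assumes the Dice coefficient as the overlap metric (the abstract does not state which metric the authors report) and hypothetical record fields `pred`, `mask`, `pathology`, and `density`; none of these names come from the dataset itself, and the toy masks are not real data.

```python
from collections import defaultdict

import numpy as np


def dice_score(pred, target, eps=1e-7):
    """Dice coefficient between two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    # eps keeps the ratio defined when both masks are empty
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)


def mean_dice_by_group(records, key):
    """Average Dice over all records that share the same value of `key`."""
    scores = defaultdict(list)
    for rec in records:
        scores[rec[key]].append(dice_score(rec["pred"], rec["mask"]))
    return {group: float(np.mean(vals)) for group, vals in scores.items()}


# Toy masks standing in for predictions and ground truth (assumed, not real data)
full = np.ones((4, 4), dtype=bool)
half = np.zeros((4, 4), dtype=bool)
half[:2] = True

# Field names and label values are hypothetical placeholders
records = [
    {"pred": full, "mask": full, "pathology": "benign", "density": "A"},
    {"pred": half, "mask": full, "pathology": "malignant", "density": "C"},
]

by_pathology = mean_dice_by_group(records, "pathology")
by_density = mean_dice_by_group(records, "density")
print(by_pathology)
print(by_density)
```

Grouping by a metadata key rather than hard-coding the categories lets the same helper report performance per abnormality class and per density category.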

Data availability

The dataset can be used to train ML or DL models for mammogram classification, BI-RADS classification, and breast composition classification. It can also be used to train segmentation models for breast lesions, and the radiology reports can be used to train report generation models. The dataset will be available, for research purposes only, at the link provided in the article.

Acknowledgements

We would like to thank Samved Hospital, Ahmedabad, India. Additionally, we would like to thank Dr. Dinesh Patel and Dr. Trupti Patel for their support.

Author information

Corresponding author

Correspondence to Parita Oza.

Ethics declarations

Conflict of interest

None

Ethical approval

The use of the dataset for the purpose of AI research has been approved by the Ethical Review Board.

Consent to participate

The hospital has provided anonymized images for the dataset.

Consent to publish

The authors affirm that this dataset is published with permission.

About this article

Cite this article

Oza, P., Oza, U., Oza, R. et al. Digital mammography dataset for breast cancer diagnosis research (DMID) with breast mass segmentation analysis. Biomed. Eng. Lett. 14, 317–330 (2024). https://doi.org/10.1007/s13534-023-00339-y

  • DOI: https://doi.org/10.1007/s13534-023-00339-y
