Automated Extraction of BI-RADS Final Assessment Categories from Radiology Reports with Natural Language Processing

Journal of Digital Imaging

Abstract

The objective of this study was to evaluate a natural language processing (NLP) algorithm that determines American College of Radiology Breast Imaging Reporting and Data System (BI-RADS) final assessment categories from radiology reports. This HIPAA-compliant study was granted institutional review board approval with waiver of informed consent. This cross-sectional study involved 1,165 breast imaging reports from 2009 in the electronic medical record (EMR) of a tertiary care academic breast imaging center. Reports included screening mammography, diagnostic mammography, breast ultrasound, combined diagnostic mammography and breast ultrasound, and breast magnetic resonance imaging studies. Over 220 reports were included from each study type. The recall (sensitivity) and precision (positive predictive value) of an NLP algorithm in extracting the BI-RADS final assessment categories stated in the final report text were evaluated against a reference standard of manual human review. For all breast imaging reports, the NLP algorithm demonstrated a recall of 100.0 % (95 % confidence interval (CI), 99.7, 100.0 %) and a precision of 96.6 % (95 % CI, 95.4, 97.5 %) for correct identification of BI-RADS final assessment categories. The NLP algorithm thus demonstrated high recall and precision for extraction of BI-RADS final assessment categories from the free text of breast imaging reports. NLP may provide an accurate, scalable mechanism for extracting data from reports within EMRs to create databases that track breast imaging performance measures and facilitate optimal breast cancer population management strategies.
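
The abstract does not describe how the algorithm works internally, so the following is only an illustrative sketch of the kind of evaluation it reports, not the authors' method. It assumes a simple regular-expression extractor (the pattern, the function names `extract_birads`, `count_matches`, and `wilson_interval`, and the sample report strings are all hypothetical), scores the extracted categories against a manually assigned reference, and computes a 95 % CI with the Wilson score method. The abstract does not state which interval method the study used.

```python
import math
import re

# Hypothetical pattern: matches phrases such as "BI-RADS 4", "BIRADS category: 0"
# or "BI-RADS Category 2". Real reports vary far more than this sketch covers.
BIRADS_PATTERN = re.compile(
    r"bi-?rads\s*(?:category|cat\.?)?\s*[:\-]?\s*([0-6])(?![0-9])",
    re.IGNORECASE,
)


def extract_birads(report_text):
    """Return every BI-RADS category (as an int) stated in the report text."""
    return [int(m.group(1)) for m in BIRADS_PATTERN.finditer(report_text)]


def count_matches(predicted, reference):
    """Count true positives, false positives and false negatives.

    Both arguments are lists with one list of categories per report.
    """
    tp = fp = fn = 0
    for pred, ref in zip(predicted, reference):
        remaining = list(ref)
        for category in pred:
            if category in remaining:
                remaining.remove(category)  # matched a reference category
                tp += 1
            else:
                fp += 1  # extracted but not in the manual reference
        fn += len(remaining)  # in the reference but never extracted
    return tp, fp, fn


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives about 95 %)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


# Toy data, invented for illustration only.
reports = [
    "IMPRESSION: No suspicious findings. BI-RADS Category 1: Negative.",
    "Final assessment: BIRADS 4, suspicious abnormality. Biopsy recommended.",
    "Left breast BI-RADS 2. Right breast BI-RADS: 0, additional imaging needed.",
]
manual_reference = [[1], [4], [2, 0]]

predicted = [extract_birads(text) for text in reports]
tp, fp, fn = count_matches(predicted, manual_reference)
recall = tp / (tp + fn)
precision = tp / (tp + fp)
print("recall", recall, wilson_interval(tp, tp + fn))
print("precision", precision, wilson_interval(tp, tp + fp))
```

Recall is computed over the categories present in the manual reference and precision over the categories the extractor returned, which mirrors the sensitivity and positive predictive value definitions given in the abstract.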

Acknowledgments

We would like to thank Drs. E. Francis Cook and E. John Orav for providing guidance for the statistical analysis.

The majority of the contributions of I Ikuta occurred while he was a fellow supported by the National Institutes of Health grant T15LM007092. The content of this manuscript is solely the responsibility of the authors and does not necessarily represent the official views of the National Library of Medicine or the National Institutes of Health.

Author information

Corresponding author

Correspondence to Dorothy A. Sippo.

About this article

Cite this article

Sippo, D.A., Warden, G.I., Andriole, K.P. et al. Automated Extraction of BI-RADS Final Assessment Categories from Radiology Reports with Natural Language Processing. J Digit Imaging 26, 989–994 (2013). https://doi.org/10.1007/s10278-013-9616-5

  • DOI: https://doi.org/10.1007/s10278-013-9616-5
