Abstract
The objective of this study is to evaluate a natural language processing (NLP) algorithm that determines American College of Radiology Breast Imaging Reporting and Data System (BI-RADS) final assessment categories from radiology reports. This HIPAA-compliant study was granted institutional review board approval with waiver of informed consent. This cross-sectional study involved 1,165 breast imaging reports in the electronic medical record (EMR) from a tertiary care academic breast imaging center from 2009. Reports included screening mammography, diagnostic mammography, breast ultrasound, combined diagnostic mammography and breast ultrasound, and breast magnetic resonance imaging studies. Over 220 reports were included from each study type. The recall (sensitivity) and precision (positive predictive value) of an NLP algorithm that collects the BI-RADS final assessment categories stated in the final report text were evaluated against a reference standard of manual human review. For all breast imaging reports, the NLP algorithm demonstrated a recall of 100.0 % (95 % confidence interval (CI), 99.7, 100.0 %) and a precision of 96.6 % (95 % CI, 95.4, 97.5 %) for correct identification of BI-RADS final assessment categories. The NLP algorithm thus demonstrated high recall and precision for extraction of BI-RADS final assessment categories from the free text of breast imaging reports. NLP may provide an accurate, scalable mechanism for extracting data from reports within EMRs to create databases that track breast imaging performance measures and facilitate optimal breast cancer population management strategies.
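The evaluation described above can be illustrated with a minimal Python sketch. It is not the study's algorithm: the regular expression, the function names (`extract_birads`, `evaluate`, `wilson_interval`), the rule for counting errors, and the choice of the Wilson score interval are all assumptions made for illustration, since the abstract does not state how the categories were matched or how the confidence intervals were computed.

```python
import math
import re

# Hypothetical pattern (an assumption, not the study's rule set): matches
# phrases such as "BI-RADS 2", "BIRADS category 0", or "BI-RADS: 4".
BIRADS_PATTERN = re.compile(
    r"\bBI-?RADS\b[\s:]*(?:category)?[\s:]*([0-6])\b",
    re.IGNORECASE,
)


def extract_birads(report_text):
    """Return the last BI-RADS category (0-6) stated in the text, or None."""
    matches = BIRADS_PATTERN.findall(report_text)
    # Assumption: the final assessment is the last category mentioned.
    return int(matches[-1]) if matches else None


def evaluate(reports, reference):
    """Count true positives, false positives, and false negatives.

    Assumed convention: a wrong or spurious category is a false positive;
    a category present in the reference but not extracted is a false negative.
    """
    tp = fp = fn = 0
    for text, truth in zip(reports, reference):
        predicted = extract_birads(text)
        if predicted is None:
            if truth is not None:
                fn += 1
        elif predicted == truth:
            tp += 1
        else:
            fp += 1
    return tp, fp, fn


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95 %)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Invented example reports with a manually assigned reference standard.
reports = [
    "IMPRESSION: Benign findings. BI-RADS 2.",
    "Final assessment: BI-RADS category 4: suspicious abnormality.",
    "Compared with prior BI-RADS 0 study. No assessment category stated.",
    "No suspicious mass. Routine screening recommended.",
]
reference = [2, 4, None, None]

tp, fp, fn = evaluate(reports, reference)
recall = tp / (tp + fn)
precision = tp / (tp + fp)
print("recall", recall, wilson_interval(tp, tp + fn))
print("precision", precision, wilson_interval(tp, tp + fp))
```

In this invented example the third report mentions a prior study's category rather than a final assessment, so the naive pattern produces a false positive. Errors of this kind lower precision while leaving recall unaffected.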
Acknowledgments
We would like to thank Drs. E. Francis Cook and E. John Orav for providing guidance for the statistical analysis.
The majority of the contributions of I Ikuta occurred while he was a fellow supported by the National Institutes of Health grant T15LM007092. The content of this manuscript is solely the responsibility of the authors and does not necessarily represent the official views of the National Library of Medicine or the National Institutes of Health.
Cite this article
Sippo, D.A., Warden, G.I., Andriole, K.P. et al. Automated Extraction of BI-RADS Final Assessment Categories from Radiology Reports with Natural Language Processing. J Digit Imaging 26, 989–994 (2013). https://doi.org/10.1007/s10278-013-9616-5
DOI: https://doi.org/10.1007/s10278-013-9616-5