Automated Radiology Report Summarization Using an Open-Source Natural Language Processing Pipeline
Diagnostic radiologists are expected to review and assimilate findings from prior studies when constructing their overall assessment of the current study. Radiology information systems facilitate this process by presenting the radiologist with a subset of prior studies that are more likely to be relevant to the current study, usually by comparing the anatomic coverage of the current and prior studies. It is incumbent on the radiologist to review the full-text report and/or images from those prior studies, a process that is time-consuming and carries a substantial risk of overlooking a relevant prior study or finding. This risk is compounded when patients have dozens or even hundreds of prior imaging studies. Our goal was to assess the feasibility of using natural language processing techniques to automatically extract asserted and negated disease entities from free-text radiology reports as a step towards automated report summarization. We compared automatically extracted disease mentions to a gold-standard set of manual annotations for 50 radiology reports from CT abdomen and pelvis examinations. The automated report summarization pipeline found perfect or overlapping partial matches for 86% of the manually annotated disease mentions (sensitivity 0.86, precision 0.66, accuracy 0.59, F1 score 0.74). The automated pipeline performed well, with an overall accuracy similar to the interobserver agreement between the two manual annotators.
Keywords: NLP, Report summarization, Data extraction, Radiology report
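The evaluation described in the abstract counts an extracted disease mention as correct when it matches a manual annotation exactly or overlaps it partially. The following is a minimal sketch of how such a comparison could be scored, not the authors' implementation. It assumes mentions are represented as hypothetical `(start, end)` character offsets, and it assumes accuracy is computed as TP / (TP + FP + FN), a common convention when true negatives are undefined for span extraction; the abstract does not state which formula was used. The function name `score` and the example offsets are illustrative only.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: (start, end) character offsets, end exclusive.
Span = Tuple[int, int]


def overlaps(a: Span, b: Span) -> bool:
    """True when two spans share at least one character (exact matches included)."""
    return a[0] < b[1] and b[0] < a[1]


def score(gold: List[Span], predicted: List[Span]) -> Dict[str, float]:
    """Score predicted mentions against gold annotations using overlap matching.

    Assumption: a gold mention is a true positive if any prediction overlaps it,
    and a prediction is a false positive if it overlaps no gold mention.
    """
    tp = sum(1 for g in gold if any(overlaps(g, p) for p in predicted))
    fn = len(gold) - tp
    fp = sum(1 for p in predicted if not any(overlaps(p, g) for g in gold))

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Assumed definition: no true negatives exist for free-text span extraction.
    accuracy = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0

    return {
        "tp": tp,
        "fp": fp,
        "fn": fn,
        "precision": precision,
        "sensitivity": sensitivity,
        "accuracy": accuracy,
        "f1": f1,
    }


if __name__ == "__main__":
    # Made-up offsets for illustration; these are not data from the study.
    gold_spans = [(0, 10), (20, 30), (40, 50), (60, 70)]
    predicted_spans = [(0, 10), (22, 28), (80, 90)]
    for name, value in score(gold_spans, predicted_spans).items():
        print(f"{name}: {value:.3f}")
```

Under the assumed accuracy definition, the rounded precision (0.66) and sensitivity (0.86) from the abstract give 1 / (1/0.66 + 1/0.86 - 1), or roughly 0.60, which is close to the reported 0.59. This suggests, but does not confirm, that a similar convention was used.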