Abstract
Radiology reports are permanent legal documents that serve as the official interpretation of imaging tests. Manual analysis of the textual information contained in these reports requires significant time and effort. This study describes the development and initial evaluation of a toolkit that enables automated identification of relevant information within these largely unstructured text reports. We developed and made publicly available a natural language processing toolkit, Information from Searching Content with an Ontology-Utilizing Toolkit (iSCOUT). Its core functions are provided by the following modules: the Data Loader, Header Extractor, Terminology Interface, Reviewer, and Analyzer. The toolkit enables searching for specific terms and retrieving radiology reports whose text contains exact term matches as well as similar or synonymous term matches. The Terminology Interface is the main component of the toolkit. It allows query expansion based on synonyms from a controlled terminology (e.g., RadLex or the National Cancer Institute Thesaurus [NCIT]). We evaluated iSCOUT document retrieval of radiology reports that documented liver cysts, and compared precision and recall with and without using NCIT synonyms for query expansion. Utilizing NCIT, iSCOUT retrieved radiology reports with documented liver cysts with a precision of 0.92 and a recall of 0.96. This recall, obtained with the Terminology Interface, was significantly better than the recall obtained with either of two search terms alone (0.72, p = 0.03 for "liver cyst" and 0.52, p = 0.0002 for "hepatic cyst"). iSCOUT reliably assembled relevant radiology reports for a cohort of patients with liver cysts, with significant improvement in document retrieval when utilizing controlled lexicons.
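The idea of synonym-based query expansion and its effect on precision and recall can be illustrated with a minimal sketch. This is not the iSCOUT implementation: the function names, the hand-written `SYNONYMS` table (standing in for a lookup against a controlled terminology such as NCIT), and the four toy reports are all assumptions made for illustration, and matching is plain case-insensitive substring search.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical synonym table; a real system would query a controlled
# terminology (e.g., NCIT or RadLex) instead of a hard-coded dictionary.
SYNONYMS: Dict[str, List[str]] = {
    "liver cyst": ["hepatic cyst"],
}

# Toy reports (invented for this sketch, not study data).
REPORTS: Dict[str, str] = {
    "r1": "Simple liver cyst in the right lobe.",
    "r2": "Hepatic cyst measuring 2 cm.",
    "r3": "No liver cyst is seen.",        # negated mention: a false positive
    "r4": "Normal chest radiograph.",
}

# Assumed manual reference standard: reports that truly document a liver cyst.
RELEVANT: Set[str] = {"r1", "r2"}


def expand_query(term: str, synonyms: Dict[str, List[str]]) -> List[str]:
    """Return the search term followed by its synonyms, all lowercased."""
    term = term.lower()
    return [term] + [s.lower() for s in synonyms.get(term, [])]


def retrieve(reports: Dict[str, str], terms: Iterable[str]) -> Set[str]:
    """Return IDs of reports whose text contains any of the terms."""
    term_list = [t.lower() for t in terms]
    return {
        rid for rid, text in reports.items()
        if any(t in text.lower() for t in term_list)
    }


def precision_recall(retrieved: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Compute precision and recall of a retrieved set against a reference set."""
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


single = retrieve(REPORTS, ["liver cyst"])
expanded = retrieve(REPORTS, expand_query("liver cyst", SYNONYMS))

print("single term:", precision_recall(single, RELEVANT))
print("expanded:   ", precision_recall(expanded, RELEVANT))
```

In this toy setting the expanded query recovers the report that uses the synonym, which raises recall, while the negated mention stays in both result sets and keeps precision below 1. Handling negation is outside the scope of this sketch.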
Acknowledgments
This work was partly funded by AHRQ grant 1R18HS019635.
Cite this article
Lacson, R., Andriole, K.P., Prevedello, L.M. et al. Information from Searching Content with an Ontology-Utilizing Toolkit (iSCOUT). J Digit Imaging 25, 512–519 (2012). https://doi.org/10.1007/s10278-012-9463-9
DOI: https://doi.org/10.1007/s10278-012-9463-9