A Case Study of the Incremental Utility for Disease Identification of Natural Language Processing in Electronic Medical Records
Healthcare databases contain information in the form of unstructured medical text. Such information is not routinely considered in safety surveillance, which typically relies solely on structured (coded) data. Natural language processing (NLP) may allow the capture of concepts from unstructured data and thus enhance safety surveillance capability.
We sought to assess the added contribution of unstructured data extracted from medical text by NLP for detecting acute liver dysfunction (ALD) in patients with inflammatory bowel disease (IBD).
Using a previously developed rule, we evaluated structured and unstructured NLP-extracted terms from a commercially available electronic medical record (EMR) system. The rule was intended to identify ALD diagnosis and timing of onset and was the result of three iterations of rule development using 150 ALD candidate cases. We evaluated the performance of the rule with and without NLP among all candidate cases and among 50 new cases with clinical adjudication.
NLP terms were necessary for the diagnosis of 9% of cases and for ruling out 3% of false-positive cases. Inclusion of NLP terms led to the identification of an additional 9% of ALD-onset dates, with consequent earlier recognition in 5%.
NLP-derived terms in one large commercially available EMR system modestly improved the sensitivity and specificity in the identification of ALD and identified earlier onset.
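The comparison described above, the same rule evaluated with and without NLP terms against clinical adjudication, can be illustrated with a minimal sketch. This is not the authors' rule or data: the `Case` fields, the way NLP terms are combined with structured data, and the four toy cases are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Case:
    adjudicated_ald: bool  # reference standard: clinical adjudication
    structured_hit: bool   # rule fires on structured (coded) data alone
    nlp_hit: bool          # hypothetical: an NLP-extracted term supports ALD
    nlp_rule_out: bool     # hypothetical: an NLP-extracted term argues against ALD


def rule_positive(case: Case, use_nlp: bool) -> bool:
    """Assumed combination logic, not the published rule."""
    if not use_nlp:
        return case.structured_hit
    return (case.structured_hit or case.nlp_hit) and not case.nlp_rule_out


def sensitivity_specificity(cases, use_nlp: bool):
    """Return (sensitivity, specificity) of the rule against adjudication."""
    tp = fn = tn = fp = 0
    for case in cases:
        predicted = rule_positive(case, use_nlp)
        if case.adjudicated_ald:
            if predicted:
                tp += 1
            else:
                fn += 1
        else:
            if predicted:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy cases only; these do not reproduce the study's data or results.
toy_cases = [
    Case(adjudicated_ald=True, structured_hit=True, nlp_hit=False, nlp_rule_out=False),
    Case(adjudicated_ald=True, structured_hit=False, nlp_hit=True, nlp_rule_out=False),
    Case(adjudicated_ald=False, structured_hit=True, nlp_hit=False, nlp_rule_out=True),
    Case(adjudicated_ald=False, structured_hit=False, nlp_hit=False, nlp_rule_out=False),
]

for use_nlp in (False, True):
    sens, spec = sensitivity_specificity(toy_cases, use_nlp)
    print(f"use_nlp={use_nlp}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy set, the second case is detected only through an NLP term and the third is a structured-data false positive that an NLP term rules out, which mirrors the two kinds of contribution reported in the results.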
Compliance with Ethical Standards
All patient and provider information was provided in the form of non-identifying study code numbers. The work did not require institutional review board approval.
This work was conducted using Pfizer, Inc., internal funds and under a research contract between Pfizer and World Health Information Science Consultants (AW and AA).
Conflicts of Interest
LSW, XZ, RS, RES, AB and RR are employees and may be shareholders of Pfizer, Inc. AW has worked under contract with Optum, which owns Humedica (whose data resource is being studied). AA has received consulting fees or honoraria for serving on scientific advisory boards for Abbvie, Takeda, and Merck. The views expressed herein are those of the authors and do not necessarily represent those of Pfizer, Inc.