Natural Language Processing for Understanding Contraceptive Use at the VA
Objective: To evaluate the potential of Natural Language Processing (NLP) for understanding contraceptive use among female Veterans seeking care at Veterans Administration (VA) healthcare facilities.
Design: Retrospective chart review of a subset of female Veterans enrolled in the Women Veterans Cohort Study (WVCS) who sought care at the VA Connecticut Healthcare facility (in West Haven, CT) in 2009 and completed a survey that included self-reported contraceptive use. In addition, only notes that had been annotated for contraceptive use in a prior study of 227 WVCS participants were selected.
Methods: A biomedical ontology of contraceptive terms and concepts was created that included both permanent methods (e.g., hysterectomy) and non-permanent methods (e.g., oral contraceptives). The new ontology, along with a section of the VA’s National Drug File, was used as the knowledge base for information extraction from the free-text medical records. Included were 208 annotated notes across 39 patients. The General Architecture for Text Engineering (GATE), an open-source application for developing NLP pipelines, was used. The ontology was added to GATE along with a processing resource developed to create an ontology-aware information extraction plugin for the pipeline. In addition, previously developed resources for detecting negated concepts (e.g., "The patient denies using an emergency contraceptive") were utilized.
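The lexicon lookup and negation handling described above can be illustrated with a minimal Python sketch. This is not the GATE plugin or the negation resources used in the study. The term list, concept labels, negation triggers, and 40-character context window are all hypothetical choices made for illustration.

```python
import re

# Hypothetical mini-lexicon: surface term -> concept label (illustrative only,
# not the study's ontology or the VA National Drug File).
LEXICON = {
    "oral contraceptive": "OralContraceptive",
    "emergency contraceptive": "EmergencyContraceptive",
    "hysterectomy": "Hysterectomy",
    "iud": "IntrauterineDevice",
}

# Hypothetical negation triggers, matched as whole words.
NEGATION = re.compile(r"\b(denies|no|not|without)\b")
WINDOW = 40  # characters of left context to inspect (an assumption)


def extract_concepts(note):
    """Return a list of (concept, negated) pairs found in one free-text note."""
    text = note.lower()
    found = []
    for term, concept in LEXICON.items():
        # Allow an optional plural "s" after the term.
        for match in re.finditer(r"\b" + re.escape(term) + r"s?\b", text):
            left = text[max(0, match.start() - WINDOW):match.start()]
            # Keep only the part of the window inside the current sentence.
            left = left.split(".")[-1]
            negated = NEGATION.search(left) is not None
            found.append((concept, negated))
    return found
```

Calling `extract_concepts("The patient denies using an emergency contraceptive.")` flags the emergency contraceptive concept as negated. A production pipeline would need richer handling, for example of historical or recommended use, which this sketch does not attempt.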
The NLP pipeline extracted contraceptives currently used by the patient, those not currently used (prior use or use recommended by the clinician), and those whose use was negated. A Boolean patient-by-concept matrix was produced as input to a decision tree classifier. Tenfold cross-validation generated training and testing sets for estimating active versus inactive contraceptive use. Responses to self-reported contraceptive use on the prior survey were used as the gold standard.
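A rough Python sketch of the two data-preparation steps named above, the Boolean patient-by-concept matrix and the tenfold partition, is given below. The function names, the concept label format (such as `"OralContraceptive:current"`), and the shuffling seed are assumptions introduced for illustration. The decision tree itself is omitted, since the abstract does not specify its configuration.

```python
import random


def build_boolean_matrix(patient_concepts):
    """Build a patient-by-concept Boolean matrix.

    patient_concepts: dict mapping a patient id to the set of concept labels
    extracted from that patient's notes (label format is hypothetical).
    Returns (patient_ids, concepts, matrix), where matrix[i][j] is True when
    patient i has concept j.
    """
    patient_ids = sorted(patient_concepts)
    concepts = sorted(set().union(*patient_concepts.values()))
    matrix = [[c in patient_concepts[p] for c in concepts] for p in patient_ids]
    return patient_ids, concepts, matrix


def k_fold_indices(n_rows, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs so each row is tested exactly once."""
    order = list(range(n_rows))
    random.Random(seed).shuffle(order)
    # Deal the shuffled rows into k folds of near-equal size.
    folds = [order[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield train_idx, test_idx
```

With the 39 patients reported here, `k_fold_indices(39, k=10)` produces nine test folds of four patients and one of three. A classifier would be fit on each `train_idx` and scored on the matching `test_idx`.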
Results: The combination of manual annotation, a biomedical ontology, and a natural language processing pipeline achieved high precision (0.83) and recall (0.84). The weighted F-measure was 0.83.
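The abstract does not state how the weighted F-measure was computed. One common convention, assumed here, is to compute precision, recall, and F-measure for each class (active and inactive) and average them weighted by the number of gold-standard instances in each class. The sketch below follows that assumption. The function name and the toy labels in the usage note are hypothetical, not study data.

```python
def weighted_prf(gold, predicted):
    """Return support-weighted (precision, recall, F-measure) over all labels.

    Assumes each class is weighted by its share of gold-standard instances.
    """
    pairs = list(zip(gold, predicted))
    total = len(gold)
    w_precision = w_recall = w_f = 0.0
    for label in sorted(set(gold)):
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        weight = (tp + fn) / total  # share of gold instances with this label
        w_precision += weight * precision
        w_recall += weight * recall
        w_f += weight * f_measure
    return w_precision, w_recall, w_f
```

For a toy input with gold labels `["active", "active", "active", "inactive"]` and predictions `["active", "active", "inactive", "inactive"]`, the weighted precision is 0.875 and the weighted recall is 0.75.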
Conclusion: Our combined approach utilized annotation of concepts, a biomedical ontology of contraceptives, and a natural language processing pipeline for information extraction. Our results highlight the potential for biomedical informatics to support research on contraceptive use among female Veterans at the VA. Additional research is needed to evaluate the accuracy of contraceptive information in the VA’s Electronic Health Record (EHR), considering both free text and semi-structured data such as pharmacy records.
Keywords: Natural language processing; Contraceptive agents; Veterans; Medical informatics
This work was supported in part by a Veterans Affairs Health Services Research & Development (HSR&D) grant HIR 09-007 and is a translational use case project within the VA-funded Consortium for Healthcare Informatics Research (CHIR). In addition, this work is supported in part by VA grant DHI 07-065-1 to CB. The views expressed in this article are those of the authors and do not necessarily reflect the position or policy of the Department of Veterans Affairs. The authors would like to thank colleagues from the Tampa VA, specifically Dr. James McCart, Mr. Jay Jarman, and Dr. Stephen Luther for providing GATE plugins. The authors would also like to thank Ms. Harini Bathulapalli for her database work and Mr. Brett South for providing code to export Knowtator annotations. Finally, the authors would like to thank Dr. Jyotishman Pathak for his feedback on the project.