Abstract
Large sets of electronic health record data are increasingly used in retrospective clinical studies and comparative effectiveness research. The desired patient cohort characteristics for such studies are best expressed as free-text descriptions. We present a syntactic-semantic approach to structuring these descriptions as frames. We developed the approach on 60 training topics (descriptions) and evaluated it on 35 test topics provided within the 2011 TREC Medical Record evaluation. We evaluated the accuracy of the frames, as well as the modifications needed to achieve near-perfect precision in identifying the top 10 eligible patients. Our automatic approach accurately captured 34 of the test descriptions; 25 automatic frames needed no modifications for finding eligible patients. Further evaluations of the overall average retrieval effectiveness showed that frames are not needed for simple descriptions containing one or two key terms. However, our training results suggest that frames are needed for more complex real-life cohort selection tasks.
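The abstract reports precision over the top 10 retrieved patients. As an illustration only, the sketch below shows how precision at rank k is commonly computed for a ranked result list. The function name `precision_at_k`, the identifiers, and the sample data are hypothetical and are not taken from the paper or from the TREC evaluation tools.

```python
from typing import Sequence, Set


def precision_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int = 10) -> float:
    """Return the fraction of the top-k ranked items that are relevant.

    Assumption: the denominator is always k, so a result list shorter
    than k is penalized. This is a common convention, not necessarily
    the one used in the paper.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    hits = sum(1 for item_id in top_k if item_id in relevant_ids)
    return hits / k


# Hypothetical ranked output for one cohort description (best match first).
ranked = [f"visit{i}" for i in range(1, 13)]
# Hypothetical set of identifiers judged eligible for that description.
eligible = {"visit1", "visit2", "visit3", "visit5", "visit8", "visit11"}

# Five of the first ten ranked items are in the eligible set.
print(precision_at_k(ranked, eligible))  # prints 0.5
```

In this toy example `visit11` is eligible but ranked outside the top 10, so it does not count toward the score.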
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Demner-Fushman, D., Abhyankar, S. (2012). Syntactic-Semantic Frames for Clinical Cohort Identification Queries. In: Bodenreider, O., Rance, B. (eds) Data Integration in the Life Sciences. DILS 2012. Lecture Notes in Computer Science, vol 7348. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31040-9_11
DOI: https://doi.org/10.1007/978-3-642-31040-9_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31039-3
Online ISBN: 978-3-642-31040-9
eBook Packages: Computer Science (R0)