Abstract
Precision medicine, or evidence-based medicine, relies on extracting knowledge from medical records so that each individual receives the appropriate treatment at the appropriate moment, according to the patient's features. Despite efforts to use clinical narratives for clinical decision support, many challenges remain, such as multilinguality, the diversity of terms and formats across different services, acronyms, and negation, to name but a few. The same problems arise when analyzing narratives in the literature, whose analysis would provide physicians and researchers with highlights. In this talk we discuss challenges, solutions, and open problems, and review several frameworks and tools that apply NLP to free text to extract medical entities by means of a Named Entity Recognition process. We also present a framework we have developed to extract and validate medical terms. In particular, we present two use cases: (i) extraction of medical entities from a set of infectious disease description texts provided by MedlinePlus and (ii) identification of stroke scales in clinical narratives written in Spanish.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Menasalvas, E., Rodriguez-Gonzalez, A., Costumero, R., Ambit, H., Gonzalo, C. (2016). Clinical Narrative Analytics Challenges. In: Flores, V., et al. Rough Sets. IJCRS 2016. Lecture Notes in Computer Science, vol 9920. Springer, Cham. https://doi.org/10.1007/978-3-319-47160-0_2
DOI: https://doi.org/10.1007/978-3-319-47160-0_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47159-4
Online ISBN: 978-3-319-47160-0
eBook Packages: Computer Science (R0)