Abstract
The massive quantity of clinical text in electronic medical records (EMRs) has created significant demand for clinical text processing and information extraction in health care and medical research. Detailed clinical observations of patients are typically recorded chronologically. Temporal information in such clinical texts consists of three elements: temporal expressions, temporal events, and temporal relations. Because temporal information is often expressed implicitly, writing quality is poor, and the language is domain-specific, extracting temporal information from clinical text is much more complex than from newswire text. Despite these difficulties, a few research works have reported rule-based, machine-learning, and hybrid methods that extract temporal information using annotated corpora. However, creating annotated corpora is expensive, time-consuming, and demands significant human effort, and processing quality is inevitably affected by the small size of the corpora. Motivated by this issue, we present a novel method to effectively extract temporal information from EMR clinical texts. The essential idea is first to build a feature set appropriate for clinical expressions, then to develop a semi-supervised framework for temporal event extraction, and finally to detect temporal relations among events with a newly formulated hypothesis. Comparative experimental evaluation on the I2B2 data set shows improved performance of the proposed methods. Specifically, temporal event and relation extraction achieve F-measures of 89.98% and 67.1%, respectively.
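To illustrate how an F-measure of the kind reported above is defined, the following minimal Python sketch computes precision, recall, and their harmonic mean from counts of correct, spurious, and missed extractions. It assumes the reported F-measure is the balanced F1 score; the function name and the example counts are hypothetical and are not taken from the paper's experiments.

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """Balanced F-measure (F1) from extraction counts.

    tp: items extracted correctly (true positives)
    fp: items extracted but not in the gold annotation (false positives)
    fn: gold items the system missed (false negatives)
    """
    # Guard against division by zero when nothing was extracted or annotated.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only.
score = f_measure(tp=90, fp=10, fn=10)
print(f"F-measure: {score:.2%}")
```

Because the harmonic mean is dominated by the lower of the two values, a system cannot reach a high F-measure by trading recall away for precision or the reverse.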
Acknowledgments
We are grateful to the Mayo Clinic and the Informatics for Integrating Biology and the Bedside (I2B2) organizers for providing access to the annotated I2B2 temporal relations corpus.
Funding
This work is partially supported by a Japan Ministry of Education, Culture, Sports, Science and Technology scholarship and by the Vietnam National University at Ho Chi Minh City under grant no. B2015-42-02.
Additional information
This article is part of the Topical Collection on Special Issue on Healthcare Knowledge Discovery and Management
About this article
Cite this article
Moharasan, G., Ho, TB. Extraction of Temporal Information from Clinical Narratives. J Healthc Inform Res 3, 220–244 (2019). https://doi.org/10.1007/s41666-019-00049-0
DOI: https://doi.org/10.1007/s41666-019-00049-0