MADEx: A System for Detecting Medications, Adverse Drug Events, and Their Relations from Clinical Notes
Early detection of adverse drug events (ADEs) from electronic health records is an important and challenging task that supports pharmacovigilance and drug safety surveillance. A well-known challenge in using clinical text to detect ADEs is that much of the detailed information is documented in narrative form. Clinical natural language processing (NLP) is the key technology for extracting information from unstructured clinical text.
We present MADEx, a machine learning-based clinical NLP system for detecting medications, ADEs, and their relations from clinical notes.
We developed a recurrent neural network (RNN) model based on long short-term memory (LSTM) for clinical named entity recognition (NER) and compared it with a baseline conditional random fields (CRFs) model. We also developed a modified training strategy for the RNN, which outperformed the widely used early stopping strategy. For relation extraction, we compared support vector machines (SVMs) and random forests on single-sentence relations and cross-sentence relations. In addition, we developed an integrated pipeline to extract entities and relations together by combining RNNs and SVMs.
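The split between single-sentence and cross-sentence relations can be illustrated with a minimal sketch of candidate pair generation. This is not the MADEx implementation: the `Entity` class, the `candidate_pairs` function, the `max_sentence_gap` parameter, and the example entities are all hypothetical, and the sketch assumes that each recognized entity carries the index of its sentence.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Entity:
    """A recognized entity (hypothetical structure, for illustration only)."""
    text: str
    label: str     # e.g. "Drug" or "ADE"; label names are illustrative
    sentence: int  # index of the sentence that contains the entity


def candidate_pairs(entities, max_sentence_gap=1):
    """Group entity pairs into single-sentence and cross-sentence candidates.

    Pairs further apart than max_sentence_gap sentences are dropped.
    The gap threshold is an assumption, not a value from the paper.
    """
    single, cross = [], []
    for a, b in combinations(entities, 2):
        gap = abs(a.sentence - b.sentence)
        if gap == 0:
            single.append((a, b))
        elif gap <= max_sentence_gap:
            cross.append((a, b))
    return single, cross


entities = [
    Entity("warfarin", "Drug", 0),
    Entity("bleeding", "ADE", 0),
    Entity("nausea", "ADE", 1),
    Entity("rash", "ADE", 3),
]
single, cross = candidate_pairs(entities)
print(len(single), "single-sentence candidates")
print(len(cross), "cross-sentence candidates")
```

Each group of candidate pairs could then be passed to its own classifier (for example an SVM or a random forest), which is one way to compare the two settings separately.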
MADEx ranked among the top three systems (F1 score of 0.8233) for clinical NER in the 2018 Medication and Adverse Drug Events (MADE1.0) challenge. The post-challenge evaluation showed that the relation extraction module and the integrated pipeline (which identifies entities and relations together) of MADEx are comparable with the best systems developed in this challenge.
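For readers unfamiliar with the metric, the following is a minimal sketch of an entity-level F1 computation. It assumes exact matching on (start, end, label) triples; the function name and the example spans are hypothetical, and the official challenge scorer may differ in its matching and averaging rules.

```python
def precision_recall_f1(gold, predicted):
    """Compute precision, recall, and F1 over exact-match entity triples."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative spans: (start offset, end offset, label)
gold = {(0, 8, "Drug"), (20, 28, "ADE"), (40, 46, "ADE")}
predicted = {(0, 8, "Drug"), (20, 28, "ADE"), (50, 54, "ADE"), (60, 65, "Drug")}

precision, recall, f1 = precision_recall_f1(gold, predicted)
print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

F1 is the harmonic mean of precision and recall, so a system must both find most gold entities and avoid spurious predictions to score well.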
This study demonstrated the efficiency of deep learning methods for automatic extraction of medications, ADEs, and their relations from clinical text to support pharmacovigilance and drug safety surveillance.
The authors would like to thank the organizers who provided the annotated corpus and word embeddings for this challenge, and gratefully acknowledge the support of the NVIDIA Corporation with the donation of the GPUs used for this research. The authors would also like to thank the anonymous reviewers for their helpful feedback.
Compliance with Ethical Standards
This study was supported in part by the University of Florida Clinical and Translational Science Institute, which is funded by the National Institutes of Health (NIH) National Center for Advancing Translational Sciences under award number UL1TR001427, and the OneFlorida Clinical Research Consortium, which is funded by the Patient-Centered Outcomes Research Institute (PCORI) under award number CDRN-1501-26692. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.
Conflict of Interest
Xi Yang, Jiang Bian, Yan Gong, William R. Hogan, and Yonghui Wu have no conflicts of interest to declare that are directly relevant to the contents of this study.
This study utilized de-identified clinical notes provided by the University of Massachusetts Medical School through the MADE1.0 challenge, and was approved by the University of Florida Institutional Review Board.