Combining Machine Learning with a Rule-Based Algorithm to Detect and Identify Related Entities of Documented Adverse Drug Reactions on Hospital Discharge Summaries

Tan, Hui Xing; Teo, Chun Hwee Desmond; Ang, Pei San; Loke, Wei Ping Celine; Tham, Mun Yee; Tan, Siew Har; Soh, Bee Leng Sally; Foo, Pei Qin Belinda; Ling, Zheng Jye; Yip, Wei Luen James; Tang, Yixuan; Yang, Jisong; Tung, Kum Hoe Anthony; Dorajoo, Sreemanee Raaj

doi:10.1007/s40264-022-01196-x

Original Research Article
Published: 06 July 2022

Drug Safety, Volume 45, pages 853–862 (2022)

Hui Xing Tan¹,
Chun Hwee Desmond Teo¹,
Pei San Ang¹,
Wei Ping Celine Loke¹,
Mun Yee Tham¹,
Siew Har Tan¹,
Bee Leng Sally Soh¹,
Pei Qin Belinda Foo¹,
Zheng Jye Ling²,
Wei Luen James Yip³,⁴,
Yixuan Tang⁵,
Jisong Yang⁵,
Kum Hoe Anthony Tung⁵ &
…
Sreemanee Raaj Dorajoo ORCID: orcid.org/0000-0002-9613-6994¹

Abstract

Introduction

Discharge summaries contain valuable information about adverse drug reactions, but their unstructured nature makes them challenging to analyse and use as a signal source for pharmacovigilance. Machine learning has shown promise in identifying discharge summaries that contain related drug-adverse event pairs but has performed comparatively poorly in entity extraction.

Methods

A hybrid model combining rule-based and machine learning algorithms is developed using discharge summaries, with the aim of maximising capture of related drug-adverse event pairs. The rule-based step first identifies text segments in which an adverse event entity lies within 100 characters of a drug term; the machine learning step then estimates the relatedness of the drug and adverse event entities in each segment. The approach is validated on four independent datasets that are temporally and geographically separated from the model development data. The impact on recall of restricting drug-adverse event pair detection is evaluated using the two validation datasets whose annotations are not subject to the rule-based restrictions.
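The rule-based step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the term lists, the function names (find_terms, candidate_pairs) and the choice to measure distance between the nearest edges of the two mentions are all assumptions made for illustration. The abstract states only that the adverse event entity must fall within a 100-character distance from a drug term.

```python
import re
from typing import List, NamedTuple, Tuple


class Candidate(NamedTuple):
    drug: str
    adverse_event: str
    gap: int        # characters between the nearest edges of the two mentions
    segment: str    # text spanning both mentions


def find_terms(text: str, terms: List[str]) -> List[Tuple[int, int, str]]:
    """Return (start, end, term) for each whole-word, case-insensitive match."""
    hits = []
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((m.start(), m.end(), term))
    return hits


def candidate_pairs(text: str, drugs: List[str], events: List[str],
                    max_gap: int = 100) -> List[Candidate]:
    """Pair every drug mention with every adverse event mention within max_gap."""
    pairs = []
    for d_start, d_end, drug in find_terms(text, drugs):
        for e_start, e_end, event in find_terms(text, events):
            # Assumed distance measure: gap between nearest edges, 0 if overlapping.
            gap = max(e_start - d_end, d_start - e_end, 0)
            if gap <= max_gap:
                lo, hi = min(d_start, e_start), max(d_end, e_end)
                pairs.append(Candidate(drug, event, gap, text[lo:hi]))
    return pairs


if __name__ == "__main__":
    # Hypothetical note text and term lists, for illustration only.
    note = "Patient developed rash after starting amoxicillin."
    for pair in candidate_pairs(note, ["amoxicillin"], ["rash"]):
        print(pair)
```

Each returned segment would then be passed to a classifier that scores whether the drug and the adverse event are related. That classifier is not sketched here because the abstract does not describe it.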

Results

The hybrid model achieves a recall of 0.80 (fivefold cross validation), 0.80 (temporal) and 0.76 (geographical) on validation using datasets containing only pre-identified target text segments that fulfil the rule-based algorithm criteria. When tested on datasets that additionally contained drug-adverse event pairs not restricted by the rule-based criteria, recall of the model declines to 0.68 and 0.62 on temporally and geographically separated datasets, respectively.

Conclusions

The proposed hybrid model demonstrates reasonable generalisability on external validation. Rule-based restriction of the detection space results in a reduction in recall of approximately 12–14 percentage points but improves identification of the related drug and adverse event terms.
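The size of this reduction can be checked against the recall values reported in the Results. The short Python sketch below (the function name recall_drop is an assumption for illustration) computes both the absolute and the relative drop. The 12–14 figure appears to correspond to the absolute difference in recall, while the relative drop is somewhat larger.

```python
def recall_drop(restricted: float, unrestricted: float) -> tuple:
    """Return (absolute drop in percentage points, relative drop in percent)."""
    absolute = round((restricted - unrestricted) * 100, 1)
    relative = round((restricted - unrestricted) / restricted * 100, 1)
    return absolute, relative


# Recall values taken from the Results section of the abstract.
reported = {
    "temporal": (0.80, 0.68),
    "geographical": (0.76, 0.62),
}

if __name__ == "__main__":
    for name, (restricted, unrestricted) in reported.items():
        absolute, relative = recall_drop(restricted, unrestricted)
        print(f"{name}: {absolute} points absolute, {relative}% relative")
```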

Acknowledgements

The authors thank their colleagues from the Information Management Department, Corporate Services Group, Health Sciences Authority for their assistance with the set-up required to run the machine learning models mentioned in this paper. We are grateful to the Academic Informatics Office, National University Health System for providing the data to enable the conduct of this study. We thank Prof. Cynthia Sung for her insightful comments that assisted the implementation of this study. We also thank A/Prof. Cheng Leng Chan, Group Director of the Health Products Regulation Group (HPRG), Dr. Dorothy Toh, Assistant Group Director, HPRG Vigilance, Compliance and Enforcement Cluster and Ms Jalene Poh, Director, Vigilance and Compliance Branch at the Health Sciences Authority for their support to pursue this research to advance the organisation’s pharmacovigilance mission as well as Michael Winther for his valuable programmatic coordination for the SAPhIRE (Surveillance and Pharmacogenomics Initiative for Adverse Drug Reactions) Project.

Author information

Authors and Affiliations

Vigilance and Compliance Branch, Health Products Regulation Group, Health Sciences Authority, Singapore, Singapore
Hui Xing Tan, Chun Hwee Desmond Teo, Pei San Ang, Wei Ping Celine Loke, Mun Yee Tham, Siew Har Tan, Bee Leng Sally Soh, Pei Qin Belinda Foo & Sreemanee Raaj Dorajoo
Regional Health System Office, National University of Singapore, National University Health System, Singapore, Singapore
Zheng Jye Ling
Department of Cardiology, National University Heart Centre, Singapore, Singapore
Wei Luen James Yip
Academic Informatics Office, National University Health System, Singapore, Singapore
Wei Luen James Yip
Department of Computer Science, School of Computing, National University of Singapore, Singapore, Singapore
Yixuan Tang, Jisong Yang & Kum Hoe Anthony Tung

Corresponding author

Correspondence to Sreemanee Raaj Dorajoo.

Ethics declarations

Funding

This study was conducted under the SAPhIRE Project, funded by a Strategic Positioning Fund grant from the Biomedical Research Council of the Agency for Science, Technology and Research of Singapore (SPF2014/001).

Conflict of interest

The authors have no conflicts of interest that are directly relevant to the content of this article. The views expressed in this article are the authors’ personal views and may not be understood or quoted as being made on behalf of, or as reflecting the positions of, the Health Sciences Authority, NUH and NUS.

Ethics approval

No ethics approval was required for this study.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Availability of data and material

The MIMIC-III data used in this study come from a freely accessible database.

Code availability

All data were processed and analysed using Python 3.7. The code is provided in the electronic supplementary material (ESM).

Authors’ contributions

THX, TCHD and DSR designed the research, models and computational framework to analyse the data. APS, LWPC, TMY, TSH, SBLS, FPQB, TCHD and DSR provided the domain expertise for manual annotation of discharge summaries and developed the list of trigger words, short forms and acronyms. LZJ and YWLJ coordinated the provision of NUH data and participated in discussions. TYX, YJ and TKHA set up the analytic architecture that allowed the package applications to run the code for the data analysis. All authors read and approved the final version of this manuscript.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 125 kb)

About this article

Cite this article

Tan, H.X., Teo, C.H.D., Ang, P.S. et al. Combining Machine Learning with a Rule-Based Algorithm to Detect and Identify Related Entities of Documented Adverse Drug Reactions on Hospital Discharge Summaries. Drug Saf 45, 853–862 (2022). https://doi.org/10.1007/s40264-022-01196-x

Accepted: 29 May 2022
Published: 06 July 2022
Issue Date: August 2022
DOI: https://doi.org/10.1007/s40264-022-01196-x
