Combining Machine Learning with a Rule-Based Algorithm to Detect and Identify Related Entities of Documented Adverse Drug Reactions on Hospital Discharge Summaries

  • Original Research Article
Drug Safety

Abstract

Introduction

Discharge summaries contain valuable information about adverse drug reactions, but their unstructured nature makes them challenging to analyse and use as a signal source for pharmacovigilance. Machine learning has shown promise in identifying discharge summaries that contain related drug-adverse event pairs but has performed relatively poorly at extracting the entities themselves.

Methods

A hybrid model combining rule-based and machine learning algorithms is developed on discharge summaries, with the aim of maximising capture of related drug-adverse event pairs. The rule-based algorithm first identifies text segments containing adverse event entities within a 100-character distance of a drug term; machine learning then estimates the relatedness of the drug and adverse event entities in each segment. The approach is validated on four independent datasets that are temporally and geographically separated from the model development data. The impact of restricting drug-adverse event pair detection on recall is evaluated using the two validation datasets whose annotations are not subject to the rule-based restrictions.
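
The rule-based step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation (their code is in the ESM): the function name `find_candidate_pairs`, the example term lists, and the choice to measure distance as the character gap between the two mentions are all assumptions made for illustration.

```python
import re

MAX_GAP = 100  # character distance stated in the Methods


def find_candidate_pairs(text, drug_terms, ae_terms, max_gap=MAX_GAP):
    """Return (drug, adverse_event, segment) triples for mentions within max_gap characters.

    Hypothetical helper: the term lists and the gap definition are assumptions.
    """
    def spans(terms):
        # Collect (start, end, term) for every whole-word, case-insensitive match.
        found = []
        for term in terms:
            pattern = r"\b" + re.escape(term) + r"\b"
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                found.append((match.start(), match.end(), term))
        return found

    pairs = []
    for d_start, d_end, drug in spans(drug_terms):
        for a_start, a_end, event in spans(ae_terms):
            # Gap between the end of one mention and the start of the other.
            gap = max(a_start - d_end, d_start - a_end, 0)
            if gap <= max_gap:
                segment = text[min(d_start, a_start):max(d_end, a_end)]
                pairs.append((drug, event, segment))
    return pairs


note = "Patient developed rash after starting amoxicillin."
print(find_candidate_pairs(note, ["amoxicillin"], ["rash"]))
```

In a pipeline of this shape, each returned segment would then be passed to a trained classifier that estimates whether the drug and adverse event are related; that second stage is not sketched here.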

Results

The hybrid model achieves a recall of 0.80 (fivefold cross validation), 0.80 (temporal) and 0.76 (geographical) on validation using datasets containing only pre-identified target text segments that fulfil the rule-based algorithm criteria. When tested on datasets that additionally contained drug-adverse event pairs not restricted by the rule-based criteria, recall of the model declines to 0.68 and 0.62 on temporally and geographically separated datasets, respectively.

Conclusions

The proposed hybrid model demonstrates reasonable generalisability on external validation. Rule-based restriction of the detection space results in a reduction in recall of approximately 12–14 percentage points (0.80 to 0.68 on the temporal dataset and 0.76 to 0.62 on the geographical dataset) but improves identification of the related drug and adverse event terms.
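
The size of the recall reduction can be checked against the figures reported in the Results. The short Python calculation below uses only those reported values; the function and variable names are illustrative and not taken from the authors' code.

```python
# Recall values reported in the Results section of the abstract.
restricted = {"temporal": 0.80, "geographical": 0.76}    # rule-compliant segments only
unrestricted = {"temporal": 0.68, "geographical": 0.62}  # also includes pairs outside the rule


def recall_drop(before, after):
    """Return the (absolute, relative) reduction in recall."""
    absolute = before - after
    relative = absolute / before
    return absolute, relative


for name in restricted:
    absolute, relative = recall_drop(restricted[name], unrestricted[name])
    print(f"{name}: absolute drop {absolute:.2f}, relative drop {relative:.1%}")
```

The absolute differences are 0.12 and 0.14, which matches the 12–14 figure when it is read as percentage points of recall; expressed relative to the restricted recall, the drops are somewhat larger.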

Acknowledgements

The authors thank their colleagues from the Information Management Department, Corporate Services Group, Health Sciences Authority for their assistance with the set-up required to run the machine learning models mentioned in this paper. We are grateful to the Academic Informatics Office, National University Health System for providing the data that enabled this study. We thank Prof. Cynthia Sung for her insightful comments, which assisted the implementation of this study. We also thank A/Prof. Cheng Leng Chan, Group Director of the Health Products Regulation Group (HPRG), Dr. Dorothy Toh, Assistant Group Director, HPRG Vigilance, Compliance and Enforcement Cluster, and Ms Jalene Poh, Director, Vigilance and Compliance Branch at the Health Sciences Authority, for their support in pursuing this research to advance the organisation’s pharmacovigilance mission, as well as Michael Winther for his valuable programmatic coordination of the SAPhIRE (Surveillance and Pharmacogenomics Initiative for Adverse Drug Reactions) Project.

Author information

Corresponding author

Correspondence to Sreemanee Raaj Dorajoo.

Ethics declarations

Funding

This study was conducted under the SAPhIRE Project, funded by a Strategic Positioning Fund grant from the Biomedical Research Council of the Agency for Science, Technology and Research of Singapore (SPF2014/001).

Conflict of interest

The authors have no conflicts of interest that are directly relevant to the content of this article. The views expressed in this article are the authors’ personal views and may not be understood or quoted as being made on behalf of, or as reflecting the positions of, the Health Sciences Authority, NUH and NUS.

Ethics approval

No ethics approval was required for this study.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Availability of data and material

The MIMIC-III data used in this study come from a freely accessible database.

Code availability

All data were processed and analysed using Python 3.7. The code is provided in the electronic supplementary material (ESM).

Authors’ contributions

THX, TCHD and DSR designed the research, models and computational framework to analyse the data. APS, LWPC, TMY, TSH, SBLS, FPQB, TCHD and DSR provided the domain expertise for manual annotation of discharge summaries and developed the list of trigger words, short forms and acronyms. LZJ and YWLJ coordinated the provision of NUH data and participated in discussions. TYX, YJ and TKHA set up the analytic architecture to allow the package applications to run the codes for the data analysis. All authors read and approved the final version of this manuscript.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 125 kb)

About this article

Cite this article

Tan, H.X., Teo, C.H.D., Ang, P.S. et al. Combining Machine Learning with a Rule-Based Algorithm to Detect and Identify Related Entities of Documented Adverse Drug Reactions on Hospital Discharge Summaries. Drug Saf 45, 853–862 (2022). https://doi.org/10.1007/s40264-022-01196-x

  • DOI: https://doi.org/10.1007/s40264-022-01196-x
