Comparison of algorithms that detect drug side effects using electronic healthcare databases


Electronic healthcare databases are becoming more readily available and are thought to have excellent potential for generating adverse drug reaction (ADR) signals. The Health Improvement Network (THIN) database is an electronic healthcare database containing medical information on over 11 million patients, and it likewise has strong potential for detecting ADRs. In this paper we apply four existing signal detection algorithms for electronic healthcare databases (MUTARA, HUNT, Temporal Pattern Discovery and modified ROR) to the THIN database for a selection of drugs from six chosen drug families. This is the first comparison of ADR signalling algorithms that includes MUTARA and HUNT, and it enabled us to set a benchmark for the ADR signalling ability of the THIN database. The drugs were chosen to enable a comparison with previous work and to provide variety. We found that no algorithm was generally superior and that the algorithms' natural thresholds act at variable stringencies. Furthermore, none of the algorithms performs well at detecting rare ADRs.
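To make the idea of a disproportionality measure with a "natural threshold" concrete, the following is a minimal sketch of the textbook reporting odds ratio (ROR) computed from a 2x2 contingency table. It is not the modified ROR evaluated in the paper, whose details are not given in this abstract. The function names, the cell labels `a`, `b`, `c`, `d`, and the rule "lower 95% confidence bound above 1" are illustrative assumptions.

```python
import math


def reporting_odds_ratio(a, b, c, d, z=1.96):
    """Textbook ROR with an approximate confidence interval.

    Hypothetical cell layout (an assumption for this sketch):
      a: records with the drug of interest and the event
      b: records with the drug of interest and without the event
      c: records with other drugs and the event
      d: records with other drugs and without the event
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    ror = (a * d) / (b * c)
    # Standard error of ln(ROR) under the usual large-sample approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - z * se)
    upper = math.exp(math.log(ror) + z * se)
    return ror, lower, upper


def is_signal(a, b, c, d):
    """Flag a drug-event pair when the lower confidence bound exceeds 1.

    This threshold is a common convention, assumed here for illustration.
    """
    _, lower, _ = reporting_odds_ratio(a, b, c, d)
    return lower > 1.0
```

A pair with a large point estimate can still fail such a threshold when counts are small, because the confidence interval widens as any cell count shrinks. This is one way a fixed threshold can behave with different stringency across drugs and events.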





Author information



Corresponding author

Correspondence to Jenna Marie Reps.

Additional information

Communicated by G. Acampora.


About this article

Cite this article

Reps, J.M., Garibaldi, J.M., Aickelin, U. et al. Comparison of algorithms that detect drug side effects using electronic healthcare databases. Soft Comput 17, 2381–2397 (2013).



  • Adverse drug event
  • Electronic healthcare database
  • Longitudinal observational database
  • HUNT
  • Temporal pattern discovery
  • Disproportionality methods