Drug Safety, Volume 28, Issue 11, pp 981–1007

Perspectives on the Use of Data Mining in Pharmacovigilance

  • June Almenoff
  • Joseph M. Tonning
  • A. Lawrence Gould
  • Ana Szarfman
  • Manfred Hauben
  • Rita Ouellet-Hellstrom
  • Robert Ball
  • Ken Hornbuckle
  • Louisa Walsh
  • Chuen Yee
  • Susan T. Sacks
  • Nancy Yuen
  • Vaishali Patadia
  • Michael Blum
  • Mike Johnston
  • Charles Gerrits
  • Harry Seifert
  • Karol LaCroix
Leading Article


In the last 5 years, regulatory agencies and drug monitoring centres have been developing computerised data-mining methods to better identify reporting relationships in spontaneous reporting databases that could signal possible adverse drug reactions. At present, there are no guidelines or standards for the use of these methods in routine pharmacovigilance. In 2003, a group of statisticians, pharmacoepidemiologists and pharmacovigilance professionals from the pharmaceutical industry and the US FDA formed the Pharmaceutical Research and Manufacturers of America-FDA Collaborative Working Group on Safety Evaluation Tools to review best practices for the use of these methods.

In this paper, we provide an overview of: (i) the statistical and operational attributes of several currently used methods and their strengths and limitations; (ii) information about the characteristics of various postmarketing safety databases with which these tools can be deployed; (iii) analytical considerations for using safety data-mining methods and interpreting the results; and (iv) points to consider in integration of safety data mining with traditional pharmacovigilance methods. Perspectives from both the FDA and the industry are provided.

Data mining is a potentially useful adjunct to traditional pharmacovigilance methods. The results of data mining should be viewed as hypothesis generating and should be evaluated in the context of other relevant data. The availability of a publicly accessible global safety database, which is updated on a frequent basis, would further enhance detection and communication about safety issues.
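To make the notion of a disproportionality measure concrete, the following is a minimal sketch, not taken from the paper, of the proportional reporting ratio (PRR) and the reporting odds ratio (ROR) computed from a 2×2 table of spontaneous report counts. The function names and the example counts are hypothetical and chosen only for illustration; a real analysis would also need confidence intervals, case-count thresholds and clinical review.

```python
def prr(a, b, c, d):
    """Proportional reporting ratio.

    a: reports with the drug of interest and the event of interest
    b: reports with the drug of interest and any other event
    c: reports with all other drugs and the event of interest
    d: reports with all other drugs and any other event
    """
    if (a + b) == 0 or (c + d) == 0 or c == 0:
        raise ValueError("PRR is undefined for these counts")
    return (a / (a + b)) / (c / (c + d))


def ror(a, b, c, d):
    """Reporting odds ratio for the same 2x2 table."""
    if b == 0 or c == 0:
        raise ValueError("ROR is undefined for these counts")
    return (a * d) / (b * c)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    a, b, c, d = 20, 380, 100, 9500
    print(f"PRR = {prr(a, b, c, d):.2f}, ROR = {ror(a, b, c, d):.2f}")
    # prints "PRR = 4.80, ROR = 5.00"
```

A value well above 1 indicates only that the drug-event pair is reported more often than the rest of the database would suggest. Consistent with the abstract, such a result is hypothesis generating and should be evaluated alongside other relevant data.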


Keywords: Adverse Event Reporting System; Reporting Odds Ratio; Proportional Reporting Ratio; Safety Database; Uppsala Monitoring Centre



The Working Group acknowledges the participation and help of Rosanne Ososki, Lesley-Anne Furlong, Cheryl Watton, Dionigi Maladorno and Min Chu Chen. The Working Group also thanks Miles Braun and Paul Seligman for their support of this collaboration and for their helpful reviews of the manuscript.

We acknowledge PhRMA for funding technical support for preparation of this manuscript. A number of the authors are employed by pharmaceutical companies, as described in their respective affiliations.



Copyright information

© Adis Data Information BV. 2005

Authors and Affiliations

  • June Almenoff (1)
  • Joseph M. Tonning (2)
  • A. Lawrence Gould (3)
  • Ana Szarfman (2)
  • Manfred Hauben (4, 5, 6)
  • Rita Ouellet-Hellstrom (2)
  • Robert Ball (2)
  • Ken Hornbuckle (7)
  • Louisa Walsh (8)
  • Chuen Yee (9)
  • Susan T. Sacks (10)
  • Nancy Yuen (1)
  • Vaishali Patadia (11)
  • Michael Blum (12)
  • Mike Johnston (2)
  • Charles Gerrits (13)
  • Harry Seifert (1)
  • Karol LaCroix (1)

  1. Global Clinical Safety and Pharmacovigilance, GlaxoSmithKline, Research Triangle Park, USA
  2. US Food & Drug Administration, Rockville, USA
  3. Merck Research Laboratories, West Point, USA
  4. Pfizer Inc., New York, USA
  5. Department of Medicine, NYU School of Medicine, New York, USA
  6. Departments of Pharmacology and Community and Preventive Medicine, New York Medical College, Valhalla, USA
  7. Eli Lilly and Company, Indianapolis, USA
  8. AstraZeneca LP, Wilmington, USA
  9. Johnson & Johnson Pharmaceutical Research & Development L.L.C., Titusville, USA
  10. Hoffmann-La Roche Inc., Nutley, USA
  11. Allergan Inc., Irvine, USA
  12. Wyeth Research, Collegeville, USA
  13. Schering-Plough Research Institute, Springfield, USA
