Abstract
Background: A phenomenon of ‘masking’ or ‘cloaking’ in pharmacovigilance data mining has been described, which can potentially cause signals of disproportionate reporting (SDRs) to be missed, particularly in pharmaceutical company databases. Masking has been predicted theoretically, observed anecdotally or studied to a limited extent in both pharmaceutical company and health authority databases, but no previous publication has systematically assessed its occurrence in a large health authority database.
Objective: To explore the nature, extent and possible consequences of masking in the US FDA Adverse Event Reporting System (AERS) database by applying various experimental unmasking protocols to a set of drugs and events representing realistic pharmacovigilance analysis conditions.
Methods: This study employed AERS data from 2001 through 2005. For a set of 63 Medical Dictionary for Regulatory Activities (MedDRA®) Preferred Terms (PTs), disproportionality analysis was carried out with respect to all drugs included in the AERS database, using a previously described urn-model-based algorithm. We specifically sought masking in which drug removal induced an increase in the statistical representation of a drug-event combination (DEC) that resulted in the emergence of a new SDR. We performed a series of unmasking experiments, selecting drugs for removal using rational statistical decision rules based on the requirement of a reporting ratio (RR) >1, top-ranked statistical unexpectedness (SU) and relatedness as reflected in the WHO Anatomical Therapeutic Chemical level 4 (ATC4) grouping. To assess the possible extent of residual masking, we performed two supplemental, purely empirical analyses on a limited subset of data. This entailed testing every drug and drug group to determine which was most influential in uncovering masked SDRs. We assessed the strength of external evidence for a causal association for a small number of masked SDRs involving a subset of 29 drugs for which level of evidence adjudication was available from a previous study.
Results: The original disproportionality analysis identified 8719 SDRs for the 63 PTs. The SU-based unmasking protocols generated variable numbers of masked SDRs ranging from 38 to 156, representing a 0.43–1.8% increase over the number of baseline SDRs. A substantial number of baseline SDRs were also lost in the course of our experiments. The trend in the number of gained SDRs per report removed was inversely related to the number of lost SDRs per protocol. Both the number and nature of the reports removed influenced the number of gained SDRs observed. The purely empirical protocols unmasked up to ten times as many SDRs. None of the masked SDRs had strong external evidence supporting a causal association. Most involved associations that either had no external supporting evidence or were already in the original product label. For two masked SDRs, there was external evidence of a possible causal association.
Conclusions: We documented masking in the FDA AERS database. Attempts at unmasking SDRs using practically implementable protocols produced only small changes in the output of SDRs in our analysis. This is undoubtedly related to the large size and diversity of the database, but the complex inter-dependencies between drugs and events in authentic spontaneous reporting system (SRS) databases, and the impact of measures of statistical variability that are typically used in real-world disproportionality analysis, may be additional factors that constrain the discovery of masked SDRs and that may also operate in pharmaceutical company databases. Empirical determination of the most influential drugs may uncover substantially more SDRs than protocols based on predetermined statistical selection rules, but it is impractical except possibly for evaluating specific events. Routine global exercises to elicit masking, especially in large health authority databases, are not justified based on results available to date. Exercises to elicit unmasking should be driven by prior knowledge or obvious data imbalances.
Notes
1 MedDRA® is a registered trademark of the International Federation of Pharmaceutical Manufacturers and Associations (IFPMA).
Acknowledgements
We thank the anonymous reviewers of this manuscript for their many helpful comments.
This work was partly funded by a grant from the Pharmaceutical Research and Manufacturers Association (PhRMA) to ProSanos Corporation. At the time of writing this manuscript, Alan Hochberg and Ronald Pearson were employed by ProSanos Corporation and, as employees of ProSanos, have patents pending on data mining visualization tools, assigned to ProSanos Corporation. Alan Hochberg is presently employed by F. Hoffmann-La Roche Ltd, Basel, Switzerland, and Ronald Pearson is presently employed by The Travelers Company, Inc., Hartford, CT, USA. Manfred Hauben and Hsin-wei Wang are full-time employees of Pfizer, Inc., which manufactures and/or markets drugs discussed in this article and/or potentially competing drugs in the same therapeutic classes as drugs discussed in this article. They own stock and stock options in Pfizer Inc. and own stock in other pharmaceutical companies for which the above-described competing interests may apply.
Appendix
Description of Masking
To illustrate the basis of masking, consider the contingency table below (table AI). From this table, we can compute the RR as a measure of disproportionality. Other data mining methods use different measures, but RR will serve here for illustrative purposes.
For this table, the RR for Drug A and the event of interest is calculated as RR = (500/20 500) ÷ (13 000/1 053 000) = 1.98. In this case, using the criterion that an RR ≥2.0 indicates an ‘SDR’, i.e. an indication of a statistical association between Drug A and the adverse event under study, we would say that there is no SDR for Drug A and the event under study.
Now consider the table below (table AII), which differs from the one above only in that we have separately listed reports for a particular drug, Drug B, for which the adverse event under study occurs frequently.
For simplicity in this example, let us assume that mentions of Drug A and Drug B are mutually exclusive. If we were to eliminate reports for Drug B, the RR for Drug A and the event of interest would be RR = (500/20 500) ÷ (10 500/1 030 500) = 2.39. This is above the commonly used detection threshold of RR ≥2.0, and therefore we would declare that there is an SDR for Drug A and the event under study. The SDR was ‘masked’ by the presence of Drug B. The reason for the masking is the large number of reports for Drug B that mention the event under consideration. This number is large enough to substantially influence the fraction of reports in the entire database that mention the event.
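The arithmetic of this illustration can be checked with a short Python sketch. It is only a minimal illustration of the simple RR used in this appendix, not the urn-model-based algorithm used in the study. The function name `reporting_ratio` and the variable names are hypothetical. The counts for Drug B are not quoted in the text; they are inferred here as the difference between the two reference denominators (13 000 − 10 500 event reports and 1 053 000 − 1 030 500 total reports), which is an assumption of this sketch.

```python
def reporting_ratio(drug_event, drug_total, ref_event, ref_total):
    """Simple RR: the drug's event fraction divided by the reference event fraction."""
    return (drug_event / drug_total) / (ref_event / ref_total)


THRESHOLD = 2.0  # detection threshold used in the illustration

# Counts quoted in the appendix text
drug_a_event, drug_a_total = 500, 20_500
ref_event, ref_total = 13_000, 1_053_000

# Assumption: Drug B counts inferred from the two reference denominators
drug_b_event = 13_000 - 10_500        # 2 500 reports mentioning the event
drug_b_total = 1_053_000 - 1_030_500  # 22 500 reports in total

# RR for Drug A with Drug B reports present in the reference
rr_before = reporting_ratio(drug_a_event, drug_a_total, ref_event, ref_total)

# RR for Drug A after removing Drug B reports from the reference
rr_after = reporting_ratio(
    drug_a_event,
    drug_a_total,
    ref_event - drug_b_event,
    ref_total - drug_b_total,
)

print(f"RR with Drug B present: {rr_before:.2f}")  # 1.98
print(f"RR with Drug B removed: {rr_after:.2f}")   # 2.39
print("Masked SDR uncovered:", rr_before < THRESHOLD <= rr_after)  # True
```

Under these inferred counts, Drug B contributes roughly 19% of all reports of the event but only about 2% of all reports, which is why its removal lowers the reference event fraction enough to move Drug A across the threshold.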
Cite this article
Wang, Hw., Hochberg, A.M., Pearson, R.K. et al. An Experimental Investigation of Masking in the US FDA Adverse Event Reporting System Database. Drug-Safety 33, 1117–1133 (2010). https://doi.org/10.2165/11584390-000000000-00000
DOI: https://doi.org/10.2165/11584390-000000000-00000