Empirical Performance of the Self-Controlled Case Series Design: Lessons for Developing a Risk Identification and Analysis System
The self-controlled case series (SCCS) offers potential as a statistical method for risk identification involving medical products from large-scale observational healthcare data. However, analytic design choices remain in encoding longitudinal health records into the SCCS framework, and its risk identification performance across real-world databases is unknown.
To evaluate the performance of SCCS and its design choices as a tool for risk identification in observational healthcare data.
We examined the risk identification performance of SCCS across five design choices using 399 drug-health outcome pairs in five real observational databases (four administrative claims databases and one electronic health record database). In these databases, the pairs comprise 165 positive controls and 234 negative controls. We also considered several synthetic databases with known relative risks between drug-outcome pairs.
We evaluated risk identification performance by estimating the area under the receiver operating characteristic curve (AUC) and, in the synthetic examples, bias and coverage probability.
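As an illustration of the AUC metric, the sketch below computes it as the probability that a randomly chosen positive control receives a higher estimate than a randomly chosen negative control, with ties counted as one half. This is a minimal sketch, not the study's code: the function name `auc` and the example scores are hypothetical, and the values are made up for illustration only.

```python
def auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (positive, negative) pairs in which the
    positive control scores higher; ties contribute one half.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical estimated log relative risks (illustrative values only).
positive_controls = [0.9, 0.4, 1.2, 0.1]
negative_controls = [0.0, 0.3, -0.2, 0.1]

example_auc = auc(positive_controls, negative_controls)
print(f"AUC = {example_auc:.3f}")
```

An AUC of 0.5 corresponds to ranking no better than chance, and 1.0 to perfect separation of positive from negative controls.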
The SCCS achieves strong predictive performance. Twelve of the twenty health outcome-database scenarios return AUCs >0.75 across all drugs. Including all adverse events instead of just the first per patient and applying a multivariate adjustment for concomitant drug use are the most important design choices. However, the SCCS as applied here returns relative risk point estimates that are biased towards the null value of 1 and have low coverage probability.
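Bias and coverage probability can be made concrete with a small sketch for the synthetic setting, where the true relative risk is known. The function name `bias_and_coverage`, the normal-approximation 95% interval, and all numbers below are assumptions for illustration, not values or procedures taken from the study.

```python
import math


def bias_and_coverage(estimates, true_log_rr, z=1.96):
    """Summarize estimator calibration against a known true effect.

    estimates: list of (log_rr_hat, standard_error) tuples.
    Returns (mean bias on the log scale, fraction of confidence
    intervals that contain the true log relative risk).
    """
    n = len(estimates)
    mean_bias = sum(est - true_log_rr for est, _ in estimates) / n
    covered = sum(
        1
        for est, se in estimates
        if est - z * se <= true_log_rr <= est + z * se
    )
    return mean_bias, covered / n


# Hypothetical synthetic scenario: true relative risk of 2.
true_log_rr = math.log(2.0)
estimates = [
    (math.log(1.5), 0.10),
    (math.log(1.8), 0.10),
    (math.log(1.2), 0.15),
    (math.log(2.1), 0.20),
]

mean_bias, coverage = bias_and_coverage(estimates, true_log_rr)
print(f"mean bias (log scale) = {mean_bias:.3f}, coverage = {coverage:.2f}")
```

In this made-up example the mean bias is negative, meaning the estimates sit closer to a relative risk of 1 than the truth does, and fewer than 95% of the nominal 95% intervals cover the true value.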
The SCCS, recently extended to apply a multivariate adjustment for concomitant drug use, offers promise as a statistical tool for risk identification in large-scale observational healthcare databases. Poor estimator calibration dampens enthusiasm, but ongoing work should correct this shortcoming.