Managing Data Quality for a Drug Safety Surveillance System
The objective of this study is to present a data quality assurance program for disparate data sources loaded into a Common Data Model and to highlight the data quality issues identified and the resolutions implemented.
The Observational Medical Outcomes Partnership is conducting methodological research to develop a system to monitor drug safety. Standard processes and tools are needed to ensure continuous data quality across a network of disparate databases, and to ensure that the extract-transform-load (ETL) procedures maintain data integrity. Currently, there is no consensus or standard approach for evaluating the quality of the source data or of the ETL procedures.
We propose a framework for a comprehensive process to ensure data quality throughout the steps used to process and analyze the data. The approach used to manage data anomalies includes: (1) characterization of data sources; (2) detection of data anomalies; (3) determination of the cause of data anomalies; and (4) remediation.
Data anomalies included incomplete raw datasets (no race or year of birth recorded) and implausible data (year of birth exceeding the current year, observation period end date preceding the start date, and suspicious data frequencies and proportions outside the normal range). Examples of errors found in the ETL process were zip codes incorrectly loaded, drug quantities rounded, drug exposure length incorrectly calculated, and condition length incorrectly programmed.
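The kinds of checks described above can be illustrated with a minimal Python sketch. It is not the tooling used in the study: the record layout (`PersonRecord`), the function names, and the tolerance and range parameters are all hypothetical, chosen only to show how missing demographics, implausible dates, out-of-range proportions, and rounded drug quantities might be flagged.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class PersonRecord:
    """Hypothetical, simplified person record (not the actual CDM schema)."""
    person_id: int
    year_of_birth: Optional[int]
    race: Optional[str]
    obs_start: date
    obs_end: date


def find_anomalies(rec: PersonRecord, current_year: int) -> List[str]:
    """Return a list of anomaly descriptions for one record."""
    issues: List[str] = []
    if rec.year_of_birth is None:
        issues.append("missing year_of_birth")
    elif rec.year_of_birth > current_year:
        issues.append("year_of_birth exceeds current year")
    if rec.race is None:
        issues.append("missing race")
    if rec.obs_end < rec.obs_start:
        issues.append("observation period ends before it starts")
    return issues


def proportion_out_of_range(count: int, total: int,
                            low: float, high: float) -> bool:
    """Flag a proportion outside an expected range (range is an assumption)."""
    if total == 0:
        return True  # nothing to compare against; treat as suspicious
    return not (low <= count / total <= high)


def quantity_mismatches(source: Dict[int, float], loaded: Dict[int, float],
                        tol: float = 1e-9) -> List[int]:
    """Return ids whose loaded drug quantity differs from the source value,
    e.g. because the ETL step rounded it, or that are missing after loading."""
    return sorted(
        rid for rid, qty in source.items()
        if rid not in loaded or abs(loaded[rid] - qty) > tol
    )
```

For example, `quantity_mismatches({1: 2.5, 2: 30.0}, {1: 3.0, 2: 30.0})` would flag record 1, whose quantity was rounded during loading. Comparing source and loaded values record by record is only one possible design; aggregate comparisons of counts and distributions before and after ETL are another.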
Complete and reliable observational data are difficult to obtain, and data quality assurance processes need to be continuous because data are regularly updated; consequently, processes to assess data quality should be ongoing and transparent.
Keywords: Target Concept; Data Anomaly; Common Data Model; Observational Medical Outcome Partnership; Data Quality Assurance
The Observational Medical Outcomes Partnership is funded by the Foundation for the National Institutes of Health (FNIH) through generous contributions from the following: Abbott, Amgen Inc., AstraZeneca, Bayer Healthcare Pharmaceuticals, Inc., Biogen Idec, Bristol-Myers Squibb, Eli Lilly & Company, GlaxoSmithKline, Janssen Research and Development, Lundbeck, Inc., Merck & Co., Inc., Novartis Pharmaceuticals Corporation, Pfizer Inc, Pharmaceutical Research and Manufacturers of America (PhRMA), Roche, Sanofi-aventis, Schering-Plough Corporation, and Takeda. Drs. Schuemie, Stang, and Ryan are employees of Janssen Research and Development. Dr. Schuemie received a fellowship from the Office of Medical Policy, Center for Drug Evaluation and Research, Food and Drug Administration. Dr. Reich is an employee of AstraZeneca. Drs. Schuemie, Madigan and Hartzema have received funding previously from FNIH. J. Marc Overhage and Emily Welebob have no conflicts of interest to declare.
The authors thank and acknowledge the contributions of the OMOP Distributed Research Partners in phases of this research, who were supported by a grant from FNIH. Assistance with writing and manuscript preparation was provided by Ken Scholz, PhD, with financial support from FNIH.
This article was published in a supplement sponsored by the Foundation for the National Institutes of Health (FNIH). The supplement was guest edited by Stephen J.W. Evans. It was peer reviewed by Olaf H. Klungel who received a small honorarium to cover out-of-pocket expenses. S.J.W.E has received travel funding from the FNIH to travel to the OMOP symposium and received a fee from FNIH for the review of a protocol for OMOP. O.H.K has received funding for the IMI-PROTECT project from the Innovative Medicines Initiative Joint Undertaking (http://www.imi.europa.eu) under Grant Agreement no 115004, resources of which are composed of financial contribution from the European Union’s Seventh Framework Programme (FP7/2007–2013) and EFPIA companies’ in kind contribution.