An Evaluation of the THIN Database in the OMOP Common Data Model for Active Drug Safety Surveillance

  • Original Research Article
Drug Safety

Abstract

Background

There has been increased interest in using multiple observational databases to understand the safety profile of medical products during the postmarketing period. However, it is challenging to perform analyses across these heterogeneous data sources. The Observational Medical Outcomes Partnership (OMOP) provides a Common Data Model (CDM) for organizing and standardizing databases. OMOP’s work with the CDM has primarily focused on US databases. As a participant in the OMOP Extended Consortium, we implemented the OMOP CDM on a UK Electronic Healthcare Record database, The Health Improvement Network (THIN).

Objective

The aim of the study was to evaluate the implementation of the THIN database in the OMOP CDM and explore its use for active drug safety surveillance.

Methods

Following the OMOP CDM specification, the raw THIN database was mapped into a CDM THIN database. Ten Drugs of Interest (DOIs) and nine Health Outcomes of Interest (HOIs), defined by OMOP as its focus, were created using the CDM THIN database. Raw THIN and CDM THIN were compared quantitatively by running and analysing the OMOP standardized reports, supplemented by additional analyses. The practical value of CDM THIN for drug safety and pharmacoepidemiological research was assessed by implementing three analysis methods: Proportional Reporting Ratio (PRR), Univariate Self-Controlled Case Series (USCCS) and High-Dimensional Propensity Score (HDPS). A published study using raw THIN data was selected to examine the external validity of CDM THIN.
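As an illustration of the first of these methods, the sketch below computes a PRR from a 2x2 table using the standard textbook definition. The abstract does not state the exact formulation or thresholds used in the study, so the function name, the argument layout and the counts are assumptions made for this example only.

```python
def proportional_reporting_ratio(a: int, b: int, c: int, d: int) -> float:
    """Textbook PRR from a 2x2 table (not necessarily the study's exact variant).

    a: records for the drug of interest with the outcome
    b: records for the drug of interest without the outcome
    c: records for all other drugs with the outcome
    d: records for all other drugs without the outcome
    """
    if a + b == 0 or c + d == 0:
        raise ValueError("each drug group needs at least one record")
    if c == 0:
        raise ValueError("PRR is undefined when the comparator has no outcome records")
    # Proportion with the outcome on the drug of interest,
    # divided by the same proportion on all other drugs.
    return (a / (a + b)) / (c / (c + d))


# Hypothetical counts, not taken from the study.
prr = proportional_reporting_ratio(a=20, b=980, c=100, d=9900)
print(f"PRR = {prr:.2f}")
```

A PRR above 1 means the outcome is recorded proportionally more often with the drug of interest than with the comparator drugs. Whether that counts as a signal depends on a threshold that this sketch does not set.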

Results

Overall demographic characteristics were the same in both databases. Mapping of medical and drug codes into the OMOP terminology dictionary was incomplete: 25 % of medical codes and 55 % of drug codes in raw THIN were not listed in the OMOP terminology dictionary, representing 6 % of condition occurrence counts, 4 % of procedure occurrence counts and 7 % of drug exposure counts in raw THIN. Seven DOIs had <0.3 % and three DOIs had 1 % of unmapped drug exposure counts; each HOI had at least one definition with no or minimal (≤0.2 %) issues with unmapped condition occurrence counts, except for the upper gastrointestinal (UGI) ulcer hospitalization cohort. The application of PRR, USCCS and HDPS found, respectively, a sensitivity of 67, 78 and 50 %, and a specificity of 68, 59 and 76 %, suggesting that safety issues defined as known by OMOP could be identified in CDM THIN, with imperfect performance. Similar PRR scores were produced using both CDM THIN and raw THIN, while execution was twice as fast on CDM THIN. The demographic distribution, death rate, and prescription patterns and trends of the published study population were closely replicated in the CDM THIN cohort.
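The gap between the share of unmapped codes and the share of affected records arises when unmapped codes are rarely used. The sketch below shows the two measures side by side. The helper name, the toy codes and the counts are hypothetical and are not drawn from THIN or from the OMOP vocabulary.

```python
from collections import Counter


def unmapped_shares(records, mapped_codes):
    """Return (share of distinct codes unmapped, share of records unmapped)."""
    counts = Counter(records)
    if not counts:
        raise ValueError("no records supplied")
    unmapped = [code for code in counts if code not in mapped_codes]
    # Code-level share: every distinct code counts once.
    code_share = len(unmapped) / len(counts)
    # Record-level share: every code is weighted by how often it occurs.
    record_share = sum(counts[code] for code in unmapped) / sum(counts.values())
    return code_share, record_share


# Hypothetical source codes; "X" stands in for a rarely used, unmapped code.
records = ["A"] * 50 + ["B"] * 30 + ["C"] * 15 + ["X"] * 5
code_share, record_share = unmapped_shares(records, mapped_codes={"A", "B", "C"})
print(f"unmapped codes: {code_share:.0%}, unmapped records: {record_share:.0%}")
```

In this toy data one code in four is unmapped, yet only 5 of 100 records are affected. The Results show the same pattern at a much larger scale.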

Conclusions

This research demonstrated that information loss due to incomplete mapping of medical and drug codes as well as data structure in the current CDM THIN limits its use for all possible epidemiological evaluation studies. Current HOIs and DOIs predefined by the OMOP were constructed with minimal loss of information and can be used for active surveillance methodological research. The OMOP CDM THIN can be a valuable tool for multiple aspects of pharmacoepidemiological research when the unique features of UK Electronic Health Records are incorporated in the OMOP library.

Acknowledgements

We would like to thank OMOP for designing a CDM and associated tools and algorithms, and for making the programs and associated detailed documentation readily available. In particular, we would like to thank Drs Patrick Ryan and Christian Reich for their advice on the implementation of the THIN database into the OMOP CDM format. We would also like to thank Dr. Manfred Hauben for his review and specific advice on the mapping of the THIN database into the OMOP CDM.

Funding and Conflicts of Interest

No sources of funding were used to conduct this study or prepare this manuscript. Xiaofeng Zhou, Sundaresan Murugesan, Qing Liu, Bing Cai and Andrew Bate are full-time employees of Pfizer and hold Pfizer stock options/stocks. Harshvinder Bhullar is an employee of CSD, the company that supplies the THIN database. Chuck Wentworth was a part-time contract employee at Pfizer during the time the study was conducted.

Author information

Corresponding author

Correspondence to Xiaofeng Zhou.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 400 kb)

About this article

Cite this article

Zhou, X., Murugesan, S., Bhullar, H. et al. An Evaluation of the THIN Database in the OMOP Common Data Model for Active Drug Safety Surveillance. Drug Saf 36, 119–134 (2013). https://doi.org/10.1007/s40264-012-0009-3

  • DOI: https://doi.org/10.1007/s40264-012-0009-3
