Performing studies using the UK Clinical Practice Research Datalink: to link or not to link?
The Clinical Practice Research Datalink (CPRD) is a repository of electronic medical records collected during routine primary care clinical practice in the UK, and is one of the most widely used sources of real-world data for healthcare research. Although CPRD provides access to comprehensive longitudinal patient records, the data do not fully capture mortality or the diagnoses and outcomes that occur in secondary care. We provide an overview of CPRD and of the bias that can arise when unlinked data are used in certain situations. Linkage of CPRD to other datasets can help to overcome these limitations. We discuss when to consider linkage to secondary care data, disease-specific data sources, or official mortality data when conducting research using CPRD.
Keywords: CPRD; HES; Data linkage; Epidemiology; Primary care; Secondary care
We are also grateful to staff at the CPRD for their work and research, which contributes to the continuous improvement of studies using CPRD data.
Laura McDonald (LM) and Sreeram Ramagopalan (SR) are full-time employees of Bristol-Myers Squibb. Anna Schultze (AS) and Robert Carroll (RC) are full-time employees of Evidera. The authors report no other competing interests. All authors have extensive experience of performing observational epidemiological studies using CPRD. After reading a number of publications not considering linked CPRD data, SR conceived the article to provide guidance to researchers. LM identified all research articles comparing linked and unlinked CPRD data and wrote the first draft of the manuscript. AS and RC worked on further manuscript drafts. All authors reviewed and approved the manuscript as submitted.