Introduction

Sodium-glucose co-transporter-2 (SGLT2) inhibitors are a relatively new class of oral antihyperglycemic agents for the treatment of people with type 2 diabetes (T2DM). They improve glycemic control by preventing the reabsorption of glucose by SGLT2 in the proximal convoluted tubule of the kidney [1]. This process occurs independently of the actions of insulin, which allows the drug to be used at any stage of diabetes progression while minimizing the risk of hypoglycemia [2, 3]. Clinical trials have demonstrated additional benefits of SGLT2 inhibitor therapy, including weight reduction and decreased blood pressure, both of which are thought to be due to increased excretion of glucose and sodium by the kidneys [4, 5]. In addition, cardiovascular outcome trials (CVOTs) have been undertaken to demonstrate the cardiovascular safety, as well as the efficacy, of these antihyperglycemic agents.

In recent years, two CVOTs have been conducted to explore whether SGLT2 inhibitors have cardioprotective effects in patients at high cardiovascular risk. In the Empagliflozin Cardiovascular Outcomes, and Mortality in Type 2 Diabetes Mellitus Patients—Removing Excess Glucose (EMPA-REG) outcome trial, time-to-event analysis confirmed that death due to cardiovascular causes, nonfatal myocardial infarction, or nonfatal stroke was less likely to occur in patients treated with empagliflozin than in those given placebo [6]. Similarly, the Canagliflozin Cardiovascular Assessment Study (CANVAS and CANVAS-R) Program showed that patients treated with canagliflozin and followed up for a mean of 3.6 years were at lower risk of a cardiovascular event than those assigned to the placebo group [7]. Both trials, therefore, confirmed non-inferiority with regard to the cardiovascular safety of each SGLT2 inhibitor, as well as superiority in terms of primary outcome events.

Further trials have been carried out more recently to explore whether other SGLT2 inhibitors used to treat T2DM patients have similar cardiovascular benefits. The multicenter trial to evaluate the effect of Dapagliflozin on the Incidence of Cardiovascular Events (DECLARE-TIMI 58; referred to hereafter as the DECLARE trial) demonstrated non-inferiority of dapagliflozin to placebo for major adverse cardiovascular events, as well as a significantly reduced rate of hospitalization for heart failure or cardiovascular death [8]. Another CVOT, the Cardiovascular Outcomes Following Ertugliflozin Treatment in Type 2 Diabetes Mellitus Participants with Vascular Disease (VERTIS CV) trial, is expected to be completed in September 2019 [9]. However, despite the cardiovascular safety and efficacy demonstrated by CVOTs, the real-world applicability of these findings to patients in clinical practice is uncertain.

We have previously compared the cardiovascular risk profile of T2DM patients in an English primary care dataset with that of trial participants using the inclusion criteria of the EMPA-REG trial [10]. Our findings showed that the results of the EMPA-REG trial were only applicable to a small proportion of people with T2DM, and to an even smaller proportion of those prescribed an SGLT2 inhibitor. The applicability of other SGLT2 inhibitor CVOTs to real-world clinical practice in a primary care setting is yet to be elucidated.

In this protocol we describe the methods that will be used in our study to compare the cardiovascular risk profile of patients in a real-world primary care setting with that of participants in each of the four SGLT2 inhibitor CVOTs mentioned above (CANVAS Program, DECLARE, EMPA-REG, and VERTIS CV). The results will indicate the extent to which the previous and upcoming findings of each trial can be generalized to a real-world setting.

Objectives

The aim of the study is to identify all adult patients with T2DM in the Royal College of General Practitioners (RCGP) Research and Surveillance Center (RSC) database that meet the inclusion criteria of each of the four CVOTs for treatment with an SGLT2 inhibitor. An additional aim is to compare the demographic and clinical characteristics of identified patients with participants included in these trials.

Primary Objectives

  1. To establish the number of people in the RCGP RSC population that meet the inclusion criteria of each CVOT for treatment with an SGLT2 inhibitor.

  2. To describe the characteristics of people eligible for each trial according to:

    (a) type of cardiovascular disease/risk factor;

    (b) duration of their diabetes; and

    (c) the number of concurrent oral antihyperglycemic medications or prescribed insulin.

  3. To describe the demographic (age, gender, ethnicity) and clinical characteristics (glycated hemoglobin [HbA1c], body mass index, blood pressure, and renal function) of people identified as eligible for each trial.

Secondary Objectives

To determine the number of people on an SGLT2 inhibitor who meet the inclusion criteria of each of the four CVOTs.

Methods

Study Design

The study will be a cross-sectional analysis of all adults with T2DM included in the RCGP RSC database, with the aim of identifying people whose cardiovascular risk profile is equivalent to that of participants in each of the SGLT2 inhibitor CVOTs: CANVAS Program, DECLARE, EMPA-REG, and VERTIS CV. We will use an updated version of the dataset from our previous comparison with the EMPA-REG trial.

Data Source

The RCGP RSC is a long-established primary care sentinel network [11], comprising computerized medical records (CMRs) for over 200 primary care practices across England and a population of over 2,000,000 registered patients. This nationally representative network set up a weekly returns service in 1964 for the surveillance of respiratory infections, including influenza [12, 13], but has more recently widened its remit to include research into long-term conditions, such as diabetes.

As with UK primary care more generally, the RCGP RSC data are registration based: every patient is registered with only one practice at a time. All patients have a unique patient identifier, the National Health Service (NHS) number. This identifier enables a patient’s medical record to be transferred to another practice when the patient moves to a different location, and allows patient data to be linked with other datasets, including secondary care datasets [14].

CMR data in UK primary care are captured using Read codes [15]. These codes are used to record diagnoses, processes of care (such as care pathways and referrals), prescriptions, and laboratory results. Data quality in UK primary care has been high since 2004, when a pay-for-performance scheme, the Quality and Outcomes Framework (QOF), was introduced to encourage clinicians to achieve set targets for the management of chronic diseases [16].

We will analyze data extracted from primary care practices up to 31 December 2016, which will include all patients with a T2DM diagnosis and aged > 18 years. From this sample we will identify and report the proportion of people with cardiovascular diseases or risk factors similar to those of participants in each of the four CVOTs (CANVAS Program, DECLARE, EMPA-REG, and VERTIS CV). Demographic and clinical characteristics for each identified sample will be reported and subsequently compared with those previously published for the CVOTs. In addition, the extent of missing data for each variable will be reported.

To protect patient data, the RCGP RSC data are pseudonymized by NHS number. This study was classified as an “Audit of current practice”, so specific ethical approval was not required.

Data Analysis

We will identify people with T2DM using a two-step process, which we have described previously [17]. First, people with diabetes are identified according to the presence of diagnostic codes, two or more results for HbA1c or plasma glucose that confirm diabetes, and antihyperglycemic medications (not including metformin). Second, people are categorized by diabetes type (type 1 DM [T1DM], T2DM, undetermined) via a seven-step algorithm, which considers medications, diagnosis codes specific to diabetes type, and other clinical characteristics specific to T1DM or T2DM.
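The first step of this process can be illustrated with a minimal sketch. It is not the published algorithm [17]: the field names are hypothetical rather than RCGP RSC variables, the sketch assumes that meeting any one of the three criteria is sufficient to flag a record, and it assumes that test results have already been marked as confirming diabetes upstream. The seven-step categorization by diabetes type is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class PatientRecord:
    # Hypothetical fields for illustration; not RCGP RSC column names.
    has_diabetes_code: bool       # any diagnostic code for diabetes is present
    confirming_results: int       # HbA1c or plasma glucose results already
                                  # judged to confirm diabetes
    on_non_metformin_agent: bool  # antihyperglycemic drug other than metformin


def has_diabetes(record: PatientRecord) -> bool:
    """Step 1 case-finding, ASSUMING any single criterion is sufficient."""
    return (
        record.has_diabetes_code
        or record.confirming_results >= 2
        or record.on_non_metformin_agent
    )


records = [
    PatientRecord(True, 0, False),   # diagnostic code only
    PatientRecord(False, 2, False),  # two confirming results
    PatientRecord(False, 1, False),  # one result only: not flagged
    PatientRecord(False, 0, True),   # non-metformin medication only
]
flagged = [has_diabetes(r) for r in records]
```

Metformin alone does not flag a record in this sketch, which mirrors the exclusion stated above.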

To calculate prevalence within the T2DM cohort, we will use the high cardiovascular risk inclusion criteria for each SGLT2 inhibitor CVOT (Table 1). To identify people by cardiovascular risk, we will use the closest matching diagnosis codes or other codes available to define diagnosis of each risk factor (Electronic Supplementary Material Appendix Tables A1–A9).

Table 1 Inclusion criteria for the four published sodium-glucose co-transporter-2 inhibitor trials compared in this study
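As a rough illustration of how inclusion criteria might be applied to the T2DM cohort, the sketch below counts eligible patients per trial. The trial labels, field names, and criteria are placeholders invented for the example; they are not the criteria in Table 1, and in the study each criterion would be derived from the matched clinical codes.

```python
from typing import Any, Callable, Dict, List

Patient = Dict[str, Any]
Criterion = Callable[[Patient], bool]

# Placeholder criteria for two made-up trials; NOT the real CVOT criteria.
CRITERIA: Dict[str, List[Criterion]] = {
    "TRIAL_A": [
        lambda p: p["age"] >= 18,
        lambda p: p["established_cvd"],
    ],
    "TRIAL_B": [
        lambda p: p["age"] >= 40,
        lambda p: p["established_cvd"] or p["risk_factor_count"] >= 2,
    ],
}


def eligible(patient: Patient, criteria: List[Criterion]) -> bool:
    """A patient is eligible only if every criterion for the trial holds."""
    return all(criterion(patient) for criterion in criteria)


def eligibility_counts(
    cohort: List[Patient], criteria_by_trial: Dict[str, List[Criterion]]
) -> Dict[str, int]:
    """Number of eligible patients in the cohort for each trial."""
    return {
        trial: sum(eligible(p, criteria) for p in cohort)
        for trial, criteria in criteria_by_trial.items()
    }


cohort = [
    {"age": 65, "established_cvd": True, "risk_factor_count": 1},
    {"age": 45, "established_cvd": False, "risk_factor_count": 2},
    {"age": 30, "established_cvd": True, "risk_factor_count": 0},
    {"age": 70, "established_cvd": False, "risk_factor_count": 3},
]
counts = eligibility_counts(cohort, CRITERIA)
```

Dividing each count by the cohort size gives the prevalence of eligibility for that trial. Keeping the criteria as separate predicates makes it straightforward to report how many people meet each individual criterion.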

Statistical Methods

Descriptive statistics will be used to report the findings. We will calculate the proportion of patients eligible for each trial. To describe the characteristics of each cohort, we will report categorical data as percentages and continuous data as means (with standard deviations) and medians (with interquartile ranges). Differences between crude rates will be explored using 95% confidence intervals.
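These summaries could be computed along the following lines. The protocol does not specify how the confidence intervals will be calculated; the sketch assumes a normal-approximation (Wald) interval for a proportion, and the function names are illustrative only.

```python
import math
import statistics
from typing import Dict, Sequence, Tuple


def proportion_with_ci(
    events: int, total: int, z: float = 1.96
) -> Tuple[float, float, float]:
    """Proportion with a 95% CI, ASSUMING the Wald (normal) approximation."""
    p = events / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    # Clamp the interval to the valid range for a proportion.
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def describe(values: Sequence[float]) -> Dict[str, object]:
    """Mean (SD) and median (IQR) for a continuous variable."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample standard deviation
        "median": statistics.median(values),
        "iqr": (q1, q3),
    }


p, lower, upper = proportion_with_ci(events=50, total=100)
summary = describe([1, 2, 3, 4, 5])
```

Two crude rates whose 95% confidence intervals do not overlap would be taken to differ. For proportions close to 0 or 1, an interval such as the Wilson score interval may behave better than the Wald approximation.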

Compliance with Ethics Guidelines

Consent will not be required for these data. We will not process data for people where opt-out codes are present; these account for just over 2% of the RCGP RSC population [18]. The data will be pseudonymized and encrypted before uploading to the Clinical Informatics Research Group secure server. Personal data will not be identifiable. This study is considered to be an “Audit of current practice” when tested against the Health Research Authority (HRA)/Medical Research Council (MRC) “Is my study research” tool and therefore does not require specific ethical approval [19]. Approval for use of the data was acquired from the RCGP RSC Study Approval Committee.

Strengths and Limitations

As mentioned in the Data Source section, the large sample size and the high level of data completeness are particular strengths of the nationally representative RCGP RSC dataset. Furthermore, our previous study comparing real-world use of empagliflozin with data from a trial demonstrated that this type of study is feasible using the RCGP RSC dataset [10]. However, primary care data are associated with some limitations.

Practices participate in the RCGP RSC network on a voluntary basis, and there is slight underrepresentation of practices with more deprived patients compared to the national population [12]. Therefore, the sample is subject to some selection bias. In addition, the data collected are dependent on data entry into a patient’s medical record, so data for particular conditions could be missing from some patients’ records. Nonetheless, improved management of chronic diseases since the introduction of QOF will have minimized such an effect for this particular study on people with cardiovascular risk factors and T2DM [16].

Identification of patients according to trial inclusion criteria will also be restricted by primary care clinical codes, i.e., Read codes, which do not align directly with those used in the trials. Although we will use codes that most closely match those in the trials, this may lead to over- or underestimation of the number of people meeting each of the criteria. We will report additional strengths and limitations identified while undertaking the study in the final manuscript.

Conclusions

Our real-world evidence-based cross-sectional analysis will report the proportion of people with T2DM in a national primary care population that meet the cardiovascular risk inclusion criteria of each of the four drug-specific SGLT2 inhibitor CVOTs, with the aim of determining how many would be deemed suitable for treatment as per each trial. The clinical characteristics of the identified patients in the RCGP RSC dataset will also be reported and compared with published findings from the trials to determine the generalizability of those findings to real-world clinical practice.