
Introduction

Sodium–glucose cotransporter 2 (SGLT2) inhibitors block SGLT2 within the proximal renal tubule, reducing glucose and sodium reabsorption and increasing glycosuria and fluid loss. Dapagliflozin is a new SGLT2 inhibitor indicated alongside diet and exercise for improving glycaemic control in adults with type 2 diabetes (licensed in Europe in 2012 [1] and the USA in 2014 [2]). In randomised controlled trials (RCTs) [3,4,5,6,7,8,9,10], dapagliflozin was found to improve glycaemic control, with mean difference in HbA1c of ~5.5 mmol/mol (0.52%) vs control groups [11, 12]. Although not an indication for use, RCTs of dapagliflozin have demonstrated weight loss and improved systolic blood pressure (SBP) [3, 5, 6, 9, 10, 13]. In large-scale placebo-controlled cardiovascular disease (CVD) outcome trials, other SGLT2 inhibitors (empagliflozin [14] and canagliflozin [15]) were shown to reduce major CVD events. Although the results for the dapagliflozin DECLARE CVD outcome trial have not yet been published, it has been reported that the primary safety endpoint of non-inferiority for major adverse cardiovascular events was met and that there was a significant reduction in one of two primary efficacy CVD endpoints [16, 17].

Over 3 years of real-world observational data are available for dapagliflozin users in a large national electronic healthcare record-derived dataset of individuals with type 2 diabetes in Scotland, allowing effects on the continuously distributed outcomes HbA1c, BMI, body weight, SBP and kidney function (as eGFR) to be evaluated. First, we aimed to determine whether the effects of dapagliflozin on HbA1c and other variables in RCTs are obtained in real-world practice. Second, since safety concerns about SGLT2 inhibitors exist, we aimed to undertake safety event outcome analyses to establish whether an increased rate of such events could be observed in dapagliflozin users. Specifically, the Canagliflozin Cardiovascular Assessment Study Programme demonstrated an unexpected increased risk of lower-limb amputation (LLA) in patients treated with canagliflozin [15] and the US Food and Drug Administration (FDA)’s Adverse Event Reporting System showed a disproportionately increased reporting ratio for canagliflozin and LLA. It is unclear whether increased LLA risk is a class effect of SGLT2 inhibitors, is restricted to canagliflozin or is a chance effect [18]. Case reports exist detailing the development of (often euglycaemic) diabetic ketoacidosis (DKA) in individuals with type 2 diabetes following initiation of SGLT2 inhibitor therapy, with increased disproportionality signalling in both European Medicines Agency (EMA) and FDA pharmacovigilance databases [19, 20]. It is unclear whether this is a true drug effect.

It is important to understand the extent to which drug effects in RCTs are achieved in real clinical recipients who may have a wider range of characteristics [21, 22]. Dapagliflozin is licensed for those between 18 and 75 years of age, with an eGFR ≥60 ml min−1 [1.73 m]−2 and not receiving pioglitazone or loop diuretics. Some individuals not meeting these criteria are nevertheless prescribed the drug.

We focus on dapagliflozin use in Scotland, as there are sufficient dapagliflozin users to adequately power our analyses (8566 dapagliflozin users, 1782 canagliflozin users and 2385 empagliflozin users in the current data extract). As data accrue, other SGLT2 inhibitors will be evaluated.

Methods

Data sources

Anonymised data were extracted from the Scottish Care Information-Diabetes (SCI-Diabetes) collaboration database. This database has been described in detail previously [23, 24] and, in brief, comprises a nationwide register of e-health-records containing extensive clinical data and issued prescriptions for 99% of Scottish diabetes patients. These data are linked using the Community Health Index, an identifier used in all Scottish records, to mortality data from the General Registrar and hospitalisation records available from the Informatics Service Division of the National Health Service in Scotland.

Study period and population

Data were available from 2004 until mid-2016 for all analyses. Those eligible for inclusion into the study met the following criteria: (1) alive with a diagnosis of type 2 diabetes at any time since the introduction of dapagliflozin; (2) had no diagnosis of type 1 diabetes; and (3) were aged 18–80 years upon study entry. For the safety analyses, since we focus on cumulative drug effect, a further criterion was imposed that persons had to be fully evaluable for drug exposure since the date of introduction of dapagliflozin or date of onset of diabetes, whichever was later. For both analyses, individuals contributed person-time to the study from the latest of study start date, date of diagnosis of type 2 diabetes or becoming observable within the dataset. Individuals ceased contributing person-time to the study upon the earliest of death, becoming unobservable within the dataset (e.g. lost due to emigration) or study end date. For the safety analysis, individuals were censored following exposure to other SGLT2 inhibitors.
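The entry and exit rules above can be illustrated with a minimal sketch. The function name, argument names and the example dates are hypothetical and chosen only for illustration; they are not taken from the study's code or data.

```python
from datetime import date
from typing import Optional, Tuple


def person_time_window(
    study_start: date,
    study_end: date,
    diagnosis_date: date,
    observable_from: date,
    death_date: Optional[date] = None,
    unobservable_from: Optional[date] = None,
) -> Optional[Tuple[date, date]]:
    """Return (entry, exit) for one individual, or None if no time is contributed."""
    # Entry: the latest of study start, type 2 diabetes diagnosis and first observability.
    entry = max(study_start, diagnosis_date, observable_from)
    # Exit: the earliest of death, loss from the dataset and study end.
    exit_candidates = [study_end]
    if death_date is not None:
        exit_candidates.append(death_date)
    if unobservable_from is not None:
        exit_candidates.append(unobservable_from)
    exit_ = min(exit_candidates)
    return (entry, exit_) if entry < exit_ else None


# Hypothetical individual: diagnosed after study start, lost to follow-up in 2015.
window = person_time_window(
    study_start=date(2012, 11, 1),    # assumed stand-in for the study start date
    study_end=date(2016, 6, 30),      # assumed stand-in for "mid-2016"
    diagnosis_date=date(2013, 3, 15),
    observable_from=date(2010, 1, 1),
    unobservable_from=date(2015, 9, 1),
)
print(window)
```

Here the individual would contribute person-time from the diagnosis date until the date of becoming unobservable.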

Informed consent

The study was carried out in accordance with the ethical principles in the Declaration of Helsinki as revised in 2008.

Drug exposure

Issued prescription data were used to define drug exposures. All prescriptions were assigned Anatomical Therapeutic Chemical Classification System (ATC) codes; dapagliflozin exposure was defined as ATC code A10BX09/A10BK01. Dapagliflozin ever-users were those with any initiation of dapagliflozin between November 2012 (the first date of dapagliflozin availability) and study end date. Drug exposure start date was defined as the date of initial prescription and drug exposure end dates were extrapolated based on dosage, frequency and directions. Dapagliflozin users were stratified to those receiving dapagliflozin ‘on-licence’ (defined as age 18–75 years, eGFR ≥60 ml min−1 [1.73 m]−2, not receiving pioglitazone and not receiving loop diuretics) and ‘off-licence’ for individuals not fulfilling these criteria. Never-users were those who never received a prescription for dapagliflozin throughout the study period.
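As a small illustration of the stratification described above, the on-licence definition can be written as a single predicate. The class and field names are hypothetical; only the four criteria themselves come from the text.

```python
from dataclasses import dataclass


@dataclass
class UserAtInitiation:
    """Characteristics of one user at dapagliflozin initiation (illustrative fields)."""
    age_years: float
    egfr: float               # ml min^-1 [1.73 m]^-2
    on_pioglitazone: bool
    on_loop_diuretic: bool


def is_on_licence(user: UserAtInitiation) -> bool:
    """True if all four on-licence criteria stated in the text are met."""
    return (
        18 <= user.age_years <= 75
        and user.egfr >= 60
        and not user.on_pioglitazone
        and not user.on_loop_diuretic
    )


print(is_on_licence(UserAtInitiation(55, 82, False, False)))  # meets all criteria
print(is_on_licence(UserAtInitiation(78, 82, False, False)))  # outside the age range
```

Any user failing at least one criterion would fall into the off-licence stratum.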

Clinical measures including outcome measures

The SCI-Diabetes database contains demographic data, captures all HbA1c, serum creatinine and other biochemical results, as well as all routine clinical measures such as blood pressure, height and body weight. For baseline comparisons of demographic and clinical characteristics of dapagliflozin users vs never-users, measurements for users were taken as those closest to (but no earlier than 24 months before) dapagliflozin initiation. For never-users, equivalent measurements were taken as those closest to (but no earlier than 24 months before) the median initiation date among users.

CVD, DKA and LLA were captured using linkage to national hospitalisation records and death data. ICD-10 codes (http://apps.who.int/classifications/icd10/browse/2016/en) for cause of admission and operative codes for amputations and revascularisation surgeries were used to define events. CVD codes included chronic ischaemic heart disease, cerebrovascular disease, heart failure, cardiac arrhythmia or coronary revascularisation. See electronic supplementary material (ESM) Table 1 for details.

Statistical methods

Simple descriptive statistics and linear or logistic regressions adjusted for age, sex and diabetes duration were used to compare characteristics of users and never-users. To evaluate the effect of dapagliflozin on continuous clinical outcomes of interest, we first described the distribution of within-person absolute and percentage changes following dapagliflozin initiation at regular intervals of 3 months throughout follow-up among users. For this analysis, clinical outcomes were assigned to time windows by applying a caliper of ±1.5 months (e.g. the 3 month time point contained measurements observed between 1.5 and 4.5 months). For continuous variable analyses, person-time was right-censored when dapagliflozin was ceased, a diabetes drug that was co-prescribed at dapagliflozin initiation was ceased or a new diabetes drug was started that was not already being received at the dapagliflozin initiation date. Where another diabetes drug was dropped at the time of initiation of dapagliflozin, the record was included since that will be conservative with respect to the apparent dapagliflozin effect. For analyses with outcomes of SBP and for CVD events, person-time was also right-censored upon initiating a new drug for CVD (all drugs with first-level ATC code C) that was not received at dapagliflozin initiation or ceasing a CVD drug that was co-prescribed at dapagliflozin initiation.
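The assignment of measurements to 3 month time points with a ±1.5 month caliper can be sketched as follows. The function name is hypothetical, and two details are assumptions not stated in the text: windows are treated as half-open (a measurement at exactly 4.5 months goes to the 6 month point), and measurements earlier than 1.5 months are left unassigned.

```python
from typing import Optional


def assign_time_point(
    months_since_initiation: float,
    interval: float = 3.0,
    caliper: float = 1.5,
) -> Optional[int]:
    """Map a measurement time to its nominal follow-up time point in months.

    With the defaults, the 3 month point covers [1.5, 4.5), the 6 month
    point covers [4.5, 7.5), and so on.
    """
    if months_since_initiation < interval - caliper:
        return None  # too early to belong to any post-initiation window
    index = int((months_since_initiation + caliper) // interval)
    return int(index * interval)


for months in (1.0, 1.5, 3.0, 4.4, 4.5, 12.2):
    print(months, "->", assign_time_point(months))
```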

Mixed-effects regression models of continuous outcomes

Simple analyses of pre/post-drug-initiation comparisons in clinical outcomes in observational studies can provide misleading estimates of apparent long-term efficacy if the underlying trend in that outcome in the absence of drug exposure is not considered. Therefore, to assess the change in clinical outcomes of interest following dapagliflozin exposure while taking into consideration the underlying calendar time trend, we constructed linear mixed-effects regression models (ESM Methods) [25], which utilise pre-exposure data to control for the expected within-person trajectories in the outcome of interest in the absence of the drug. Clinical measurements up to 24 months before dapagliflozin initiation, and measurements throughout the entire follow-up time until right-censoring, were used.
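The exact model specification is given in ESM Methods and is not reproduced here. Purely as a hedged sketch, a generic linear mixed-effects model consistent with the description above might take the following form; every symbol is an illustrative assumption rather than the authors' notation.

```latex
y_{ij} = \beta_0 + \beta_1 t_{ij}
       + \sum_{k} \delta_k \, w_{k}(i,j)
       + \boldsymbol{\gamma}^{\top} \mathbf{x}_{i}
       + b_{0i} + b_{1i} t_{ij} + \varepsilon_{ij}
```

In this sketch, y_{ij} is the j-th measurement of the outcome for person i, t_{ij} is time, w_k(i,j) equals 1 if the measurement falls in post-initiation window k and 0 otherwise, x_i collects covariates, b_{0i} and b_{1i} are a person-level random intercept and slope, and ε_{ij} is residual error, possibly autocorrelated. Under these assumptions β_1 would describe the underlying trend in the absence of the drug and each δ_k the estimated departure from that trend in window k.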

To examine the likely magnitude of regression-to-the-mean effects we constructed mixed regression models of the deviation of within-person observed HbA1c from the expected HbA1c at the time of drug initiation. For this analysis, we used data from up to a maximum of 3 years prior to dapagliflozin initiation. Fixed effects were specified as age, sex, duration of diabetes, number of diabetes drug classes and month of observation. Random effects and autocorrelation structure were specified as for the primary analysis.

Cox regression models for event outcomes

As we have described in detail elsewhere, detecting drug effects on events is subject to allocation bias if simple comparisons of event rates in those ever vs never exposed are made [26]. Such bias is not removed by adjustment for differences in observed risk factors for the events between ever-users and never-users. We have argued that for CVD, evaluation of cumulative effects on outcomes is a more valid way to infer causality [26]. More specifically, Cox regression models for time-to-event were specified to include a time-updated term for ever-exposure vs never-exposure (this term in fact captures the allocation bias) and also to include a term for cumulative exposure. Person-time was split into intervals of 28 days and each interval was updated for exposure. Models were constructed with and without adjustment for baseline clinical risk factors. Imputation was used where risk factor data were missing (see ESM Table 2). For events such as DKA, it might be argued that if sudden rather than cumulative drug effects occur then this effect would be captured in the ever vs never term but it cannot be differentiated from allocation bias effects.
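The person-time splitting described above can be illustrated with a minimal sketch that builds 28 day intervals and updates an ever-exposure indicator and a cumulative exposure term at the start of each interval. The names, the choice to evaluate exposure at interval start, and the example dates are all assumptions; the sketch does not fit a Cox model, which would require a survival analysis library.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class Interval:
    start: date
    end: date
    ever_exposed: int               # 1 once the drug has been started, else 0
    cumulative_exposure_days: int   # exposed days accrued before interval start


def split_person_time(
    entry: date,
    exit_: date,
    drug_start: Optional[date],
    drug_end: Optional[date],
    width_days: int = 28,
) -> List[Interval]:
    """Split [entry, exit_) into consecutive intervals with time-updated exposure."""
    intervals: List[Interval] = []
    start = entry
    while start < exit_:
        end = min(start + timedelta(days=width_days), exit_)
        if drug_start is not None and drug_start <= start:
            ever = 1
            exposed_until = start if drug_end is None else min(start, drug_end)
            cumulative = max((exposed_until - drug_start).days, 0)
        else:
            ever, cumulative = 0, 0
        intervals.append(Interval(start, end, ever, cumulative))
        start = end
    return intervals


# Hypothetical individual: 119 days of follow-up, exposed from day 28 to day 84.
entry = date(2013, 1, 1)
rows = split_person_time(
    entry=entry,
    exit_=entry + timedelta(days=119),
    drug_start=entry + timedelta(days=28),
    drug_end=entry + timedelta(days=84),
)
for row in rows:
    print(row.start, row.end, row.ever_exposed, row.cumulative_exposure_days)
```

In a table of this shape, the ever-exposed column stays at 1 after the drug is stopped while the cumulative column stops growing, which is what allows the two terms to be entered separately in a time-to-event model.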

Results

Cohort descriptive statistics

In total, 8566 dapagliflozin users (of whom 7231 were considered ‘on-licence’) and 230,310 never-users met inclusion criteria for this analysis (Table 1 and ESM Table 3). In total, 2782 users (32.48%) had ceased dapagliflozin before their last follow-up date and mean within-person persistence (i.e. proportion of all available follow-up time in which dapagliflozin continued to be received) was 0.81 (SD 0.32). During their observable follow-up, 2576 users (30.07%) initiated at least one additional diabetes drug they were not receiving at dapagliflozin initiation (median time until additional diabetes drug initiation was 214 days [interquartile range (IQR) 103–381 days]) and 1963 users (22.92%) ceased a non-dapagliflozin diabetes drug that they had continued to receive when dapagliflozin had been initiated (median time until cessation of concurrent diabetes drug was 265 days [IQR 141–453 days]). Thirty-two per cent of users discontinued another diabetes drug at time of initiation of dapagliflozin and their inclusion here is conservative with regard to estimating the effect of dapagliflozin. Altogether, there were 6674 person-years of follow-up time available post-initiation for evaluating treatment effects and the median observation time post-initiation (i.e. follow-up time) was 210 days (IQR 91–421 days). Dapagliflozin was prescribed mostly as an add-on therapy on top of existing monotherapy or dual therapy (Table 1). Baseline characteristics adjusted for age, sex and diabetes duration differed considerably between users and never-users (Table 1).

Table 1 Baseline characteristics of dapagliflozin users and never-users

Crude within-person absolute changes in clinical measures with dapagliflozin exposure

The mean (SD) numbers of measurements per user pre-initiation were as follows: HbA1c 4.92 (1.99); BMI 3.58 (2.81); body weight 3.44 (2.81); SBP 5.44 (3.28) and eGFR 5.07 (3.62). The mean (SD) numbers of measurements per user post-initiation were as follows: HbA1c 2.42 (1.71); BMI 2.13 (1.77); body weight 2.06 (1.70); SBP 2.68 (2.27) and eGFR 2.94 (3.12).

Crude within-person absolute changes in clinical outcomes throughout follow-up are shown in Table 2 (with per cent changes shown in ESM Table 4). Note that unlike in a clinical trial where all measurements will occur at the same regularly spaced intervals, our real-world observational dataset reflects whatever clinical measures were made. Thus, different individuals contribute data within the different 3 month windows being evaluated and there are of course fewer persons observed at increasingly longer durations of follow-up. With these caveats in mind, at 3 months the mean change in HbA1c was −10.41 mmol/mol (−0.95%), with the largest change observed at 12 months where it was −12.99 mmol/mol (−1.19%). The reduction in HbA1c from baseline generally remained greater than 10 mmol/mol (0.91%) throughout follow-up (Table 2). In terms of target achievement by 6 months after initiation, 26.0% of users reached HbA1c ≤58 mmol/mol (7.5%) and 13.1% reached HbA1c ≤53 mmol/mol (7.0%) compared with 5.4% and 2.3% of users at baseline, respectively. SBP, BMI and body weight had all decreased by 3 months post-initiation (crude within-person changes of −4.32 mmHg, −0.74 kg/m2 and −2.10 kg, respectively) and these changes persisted thereafter.

Table 2 Within-person changes in clinical outcomes over time among all dapagliflozin users

Variability in effects by 24 months’ exposure and effects of baseline characteristics

HbA1c was reduced in the majority of dapagliflozin recipients but, as shown in a quadrant plot (ESM Fig. 1), the magnitude of effect varied considerably. Using the most recently available treatment measure up to 24 months post-initiation, much larger mean within-person absolute reductions were observed for users in the highest two tertiles for baseline values of HbA1c (ESM Table 5). Mean within-person HbA1c reductions were also marginally more pronounced for users with a higher baseline kidney function and shorter duration of diabetes. No clear sex difference was observed. Similarly, those in the top tertile for body weight, BMI and SBP had the highest absolute decline in these outcomes. All observed subgroup effects persisted when within-person changes were examined on a proportional scale.

Crude effects by on-licence status

Mean within-person changes in the 84.4% of individuals considered to be on-licence users of dapagliflozin were similar to the overall effect. Effects on HbA1c in the 15.6% of off-licence-users were clearly observed and substantial but were of slightly lower magnitude than those in the on-licence-users (see ESM Tables 6, 7).

Effects from mixed-effects regression models

As shown in Table 2, crude absolute changes in clinical outcomes compared with baseline were fairly stable over follow-up time. Fitted mean trajectories of clinical outcomes from final covariate-adjusted mixed regression models suggested that before initiating dapagliflozin, HbA1c was increasing by 0.40% per year (Table 3), SBP was increasing by 0.59 mmHg per year, BMI was decreasing by 0.03 kg/m2 per year, body weight was decreasing by 0.12 kg per year and eGFR was decreasing by 1.21 ml min−1 [1.73 m]−2 per year in these users. However, there was substantial individual random variation around these mean slopes.

Table 3 Estimated effects of time and dapagliflozin exposure from final covariate-adjusted linear mixed regression models predicting clinical outcomes of interest

In a mixed regression model taking account of time trends in the clinical outcomes, the estimate for the apparent effect of dapagliflozin on HbA1c at 3 months was similar to the simple crude comparison but the effect sizes yielded by the model were greater than the crude estimates at longer follow-up. Thus, the crude effect on HbA1c at 12 months was −12.99 mmol/mol (−1.19%) whereas the estimate from the model taking into consideration the upward change in HbA1c that would have been expected in the absence of the drug at that time point was −15.14 mmol/mol (95% CI −15.87, −14.41) (−1.39% [95% CI −1.45, −1.32]) (Table 3 and Fig. 1a). Model effect estimates for BMI and body weight showed stabilisation by 6 months, at a change of −0.82 kg/m2 (95% CI −0.87, −0.77) and −2.20 kg (95% CI −2.34, −2.06), respectively (Table 3 and Fig. 1b, c). For SBP, dapagliflozin was associated with a change of −4.32 mmHg (95% CI −4.84, −3.79) within the first 3 months of use, which persisted throughout follow-up (Table 3 and Fig. 1d). The pattern of apparent effect of dapagliflozin upon kidney function was less clear. An initial change in eGFR of −1.81 ml min−1 [1.73 m]−2 (95% CI −2.10, −1.52) was observed within the first 3 months of dapagliflozin use (Table 3 and Fig. 1e) but by 12 months the effect was no greater than the expected decline in eGFR in the absence of drug. Estimates of treatment effects were consistent when mixed regression model procedures were restricted to data for on-licence-users only (data not shown).

Fig. 1

Estimates of treatment effect through time from final covariate-adjusted mixed linear regression models, for clinical outcomes of interest: HbA1c (a), BMI (b), body weight (c), SBP (d) and eGFR (e). Time post dapagliflozin initiation is specified as categories at 3 month intervals. As models are adjusted for baseline value and pre-initiation trajectory, points represent estimated mean difference from expected trajectory in the absence of dapagliflozin exposure. The dashed line denotes value under null hypothesis (i.e. zero; no treatment effect). Vertical bars represent 95% CIs. Size of points and bars denotes number of observations within the respective time category (see key). Time windows are calculated by applying a caliper of ±1.5 months (e.g. the 3 month time point contained measurements observed between 1.5 and 4.5 months)

Estimating potential magnitude of regression to the mean on apparent drug-associated changes

Residual values from mixed regression models showed that the closest prior measurements to dapagliflozin initiation appeared systematically greater than expected given the respective individual HbA1c trajectories. This difference was approximately 10% on average but with considerable variability in this estimate. Thus, for the change in HbA1c at 12 months of −15.14 mmol/mol (−1.39%), approximately 1.5 mmol/mol (0.14%) might be attributable to this bias.
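The arithmetic behind this estimate can be checked with a short sketch. It assumes that a difference in HbA1c is converted from mmol/mol to percentage units using only the slope of the IFCC-NGSP master equation (0.09148), since the intercept of that equation applies to absolute HbA1c levels and cancels when two levels are subtracted. The function and variable names are hypothetical.

```python
IFCC_TO_NGSP_SLOPE = 0.09148  # % units per mmol/mol (slope of the master equation)


def hba1c_difference_to_percent(diff_mmol_mol: float) -> float:
    """Convert a change in HbA1c from mmol/mol to NGSP percentage units."""
    return diff_mmol_mol * IFCC_TO_NGSP_SLOPE


modelled_change = -15.14   # mmol/mol at 12 months, from the text
rtm_fraction = 0.10        # approximate share attributed to regression to the mean

rtm_share = abs(modelled_change) * rtm_fraction
print(round(rtm_share, 1))                                      # 1.5 mmol/mol
print(round(hba1c_difference_to_percent(rtm_share), 2))         # 0.14 % units
print(round(hba1c_difference_to_percent(modelled_change), 2))   # -1.39 % units
```

The same slope reproduces the other paired values quoted for changes in HbA1c, for example −12.99 mmol/mol corresponding to about −1.19%.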

Safety event analysis

ESM Table 2 shows the results of fitting Cox proportional hazards models for CVD, DKA and LLA. For this duration of follow-up there were very few cases of CVD (n = 111) and even fewer cases of DKA (n = 13) and LLA (n = 28) in the exposed subgroup, such that power to detect effects was limited. Power was further limited for cumulative effects analysis in that duration of exposure was short overall. Nonetheless we show these data for completeness and have right-censored for exposure to other SGLT2 inhibitors to ensure no negative confounding effect balancing the non-significant effect observed in the dapagliflozin-exposed group. As shown, there was no significant effect of cumulative exposure on any of these outcomes. In the comparison of ever-users vs never-users there was a significantly lower rate of CVD (HR 0.71, p = 0.02), which was unchanged by adjustment for additional clinical covariates for CVD events. While this is consistent with a protective effect of the drug it is not proof of this since this effect could also be due to allocation bias.

Discussion

We describe usage trends of dapagliflozin in individuals with type 2 diabetes in Scotland. Most (84.4%) of those prescribed dapagliflozin were on-licence users. Dapagliflozin was largely prescribed as add-on therapy on top of one or two drugs and was continued throughout follow-up for the majority of people. As expected, dapagliflozin appeared to be preferentially prescribed for younger individuals who had poorer glycaemic control and longer duration of diabetes and who were on more than one additional oral glucose-lowering medication. Dapagliflozin use was associated with substantial improvements in HbA1c and with slight improvements in BMI, body weight and SBP. Based on follow-up values, the greatest absolute improvements in glycaemic control were observed for users with poorer baseline glycaemic control, as well as users with shorter duration of diabetes and higher kidney function. In addition to producing an initial reduction, dapagliflozin appeared to stabilise HbA1c and SBP so that expected rises in these through time were prevented across this median of 210 days of follow-up.

Real-world evidence can help corroborate findings from RCTs by testing the generalisability of their reported treatment effects and conclusions within a broader and more heterogeneous population who are less supervised in their healthcare management. At 3–6 months, the crude and the modelled estimated treatment effects on HbA1c were slightly higher than the effect sizes observed in previous RCTs. For example, at 3 months and 6 months, the observed crude mean change in HbA1c from baseline was −10 to −12 mmol/mol (or −1.0 to −1.1% units), compared with RCT estimates of −0.61% to −0.85% at 3 months and −0.5% to −1.4% at 6 months, subject to dosage and additional drug therapies [3, 6, 27,28,29,30,31]. The US FDA estimates the treatment effect of dapagliflozin on HbA1c to be in the range −0.40% to −0.84%, which is smaller than our findings [2, 32]. The National Institute for Health and Care Excellence (NICE) in the UK estimates the same to be −0.39% to −0.84%, which is also smaller than our findings and similar to the US FDA estimate [33]. However, our estimated effects of dapagliflozin upon HbA1c at 3–6 months were consistent with reported follow-up values from a recent UK-wide real-world retrospective study [34].

One potential reason for apparently higher treatment effects in an observational study is regression to the mean. Where there is considerable short-term within-person variation (or ‘noise’) in clinical measures, whether due to true short-term biological variability or to measurement error, a new drug is more likely to be prescribed in response to extreme or outlying clinical observations such as unusually high HbA1c for the respective individual. Even where a drug is ineffective, post- vs pre-initiation comparisons of data might show an apparent treatment effect, as after a given extreme observation subsequent measurements might be expected to regress to the mean. It is also the case that true biological worsening of a clinical measure such as HbA1c will precede new drug intervention. The combined effect of these two phenomena is that HbA1c at the time of drug initiation is likely to be systematically higher than expected given an individual’s prior measurements, their current age, sex and diabetes duration, and other characteristics relevant to expected HbA1c. Precisely how much of the observed treatment effect is attributable to regression-to-the-mean effects is not directly estimable. By examining the residuals of the last measurements prior to dapagliflozin initiation in a model of pre-initiation trajectories, we have provided a crude estimate of the magnitude of such effects as being an apparent reduction of about 10% of the apparent treatment effect. The treatment effects we observed on HbA1c at 3–6 months were 0.15–0.30% units higher than in clinical trials but half of this difference might be explained by regression to the mean. The apparently greater effects could also be because any changes in lifestyle that reduce HbA1c could co-occur upon drug initiation.
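The regression-to-the-mean mechanism described above can be demonstrated with a small simulation in which the 'drug' has no effect at all. This is an illustrative sketch only: the population mean, the between-person and within-person standard deviations and the initiation threshold are arbitrary assumptions, not estimates from the study data.

```python
import random
import statistics
from typing import Tuple


def simulate_regression_to_mean(
    n_people: int = 20000,
    threshold: float = 75.0,   # assumed HbA1c (mmol/mol) that triggers a new drug
    seed: int = 1,
) -> Tuple[float, int]:
    """Mean post-minus-pre change among 'initiators' when the drug does nothing."""
    rng = random.Random(seed)
    changes = []
    for _ in range(n_people):
        true_level = rng.gauss(60.0, 10.0)         # stable underlying HbA1c
        before = true_level + rng.gauss(0.0, 5.0)  # noisy measurement pre-initiation
        after = true_level + rng.gauss(0.0, 5.0)   # noisy measurement post-initiation
        if before > threshold:                     # drug started only after a high value
            changes.append(after - before)
    return statistics.mean(changes), len(changes)


mean_change, n_initiators = simulate_regression_to_mean()
print(n_initiators, round(mean_change, 2))
```

Because initiation is conditioned on an unusually high measurement, the mean change among initiators is negative even though the underlying level never changes, which is why a crude pre/post comparison can overstate a treatment effect.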

Our observed treatment effects at 3–6 months upon BMI, body weight and SBP were highly consistent with those from RCTs wherein reported effects on body weight were −1.50 kg to −3.2 kg and effects on SBP were −2.19 to −3.9 mmHg [3, 5, 6, 27,28,29, 31]. The effects on SBP are consistent with the mechanisms of action of the drug, which encourages renal sodium and glucose loss.

An important aspect of our analyses is the persistence of the apparent drug effect. Since HbA1c and SBP tend to worsen over time in diabetes in the absence of drug exposure, a stable absolute difference from baseline over longer follow-up is consistent with the drug not only improving HbA1c and SBP but also preventing their worsening over time. This is illustrated in the mixed model, where a much larger net effect of dapagliflozin on HbA1c of −16 mmol/mol (−1.47%), given its underlying time trend, was estimated at 24 months of exposure.

The vast majority of people receiving dapagliflozin respond to the drug but there is considerable variation in the magnitude of the response. We are not able to evaluate in this study to what extent such variation reflects true biological variation in response vs differences in compliance. Some of the variation on an absolute scale reflects that the largest reductions in HbA1c during follow-up were seen for those users in the highest tertile for baseline HbA1c (ESM Table 5). Individuals also generally exhibited wide variation in responses in clinical trials (SD for HbA1c effect ranging from 0.61% to 0.92%), though the wider variation seen here may reflect the broader diversity of individuals’ characteristics in our real-world dataset as well as more diverse compliance [3,4,5,6,7, 10].

Currently, the most commonly reported adverse effect in RCTs of dapagliflozin is a higher risk of urinary and genital tract infections [9, 10, 35]. There is also some evidence that dapagliflozin is associated with a risk of decline in kidney function, though this association did not persist in subgroups with long-term treatment (>24 months), consistent with our observation of an initial decline in eGFR, which by 12 months was consistent with the annual decline expected in the absence of drug (Fig. 1e) [36]. Dapagliflozin treatment is not currently recommended for individuals with an eGFR below 60 ml min−1 [1.73 m]−2 [1, 14, 15]. Our focus here was on effects on continuous outcomes for which there is adequate power rather than on CVD and other safety-events analyses for which power is very low. Nonetheless, we included such analyses for completeness. No significant safety signals were found. While there was a significantly lower CVD event rate in those ever vs never exposed, there was no significant cumulative effect of exposure on CVD. As we have described previously, such ever vs never comparisons, while providing some reassurance, cannot be interpreted as proof of a protective causal effect since they remain subject to allocation bias. As further follow-up data accrue in this dataset, we will be able to test for cumulative effects on events with more power. In addition, we intend to explore effects of other SGLT2 inhibitors that were licensed later as further data accrue.

We acknowledge the limitations of our analysis, the most important of which is that unbiased control comparisons cannot be achieved as they are in clinical trials. In addition, as described, within-person analyses can fail to take into account regression to the mean and underlying calendar time trends. Nevertheless, we have made extensive efforts to estimate the likely magnitude of these latter two biases, going well beyond many observational studies of this nature.

In conclusion, the effectiveness of dapagliflozin on HbA1c and other clinical outcomes observed in clinical trials was apparent in this real-world effectiveness study; treatment effect estimates were at least as large as in clinical trials even when likely observational analysis biases are considered. Dapagliflozin lowers HbA1c and SBP shortly after treatment initiation but also appears to prevent worsening of these outcomes over the ensuing 2 years. Dapagliflozin also lowered BMI and body weight.