Background

In patients with type 2 diabetes (T2D) and established cardiovascular diseases (CVD), treatment with empagliflozin, a sodium-glucose cotransporter 2 inhibitor (SGLT2i), has demonstrated reductions in the risks of major adverse cardiovascular events (MACE), mortality, hospitalization for heart failure (HHF), and kidney-related outcomes relative to placebo [1]. Glucagon-like peptide-1 receptor agonists (GLP-1RA) have also demonstrated efficacy against MACE and kidney-related outcomes relative to placebo in patients with T2D with or at risk for atherosclerotic CVD (ASCVD) [2,3,4].

The demonstrated benefits of empagliflozin and GLP-1RA in placebo-controlled trials raise the question of how their benefits compare in broader populations of patients with T2D. To date, no large cardiovascular outcome trials have directly compared empagliflozin vs. GLP-1RA to help guide treatment prescribing. Previous observational studies that compared the effectiveness of SGLT2i and GLP-1RA included only a small number or proportion of patients on empagliflozin [5,6,7,8,9,10], were based on a single healthcare setting with limited generalizability [9, 11,12,13,14,15], or were too small to evaluate CVD outcomes with reasonable precision [8, 11, 12, 14,15,16,17,18]. Since not all SGLT2i and GLP-1RA agents demonstrated cardiorenal benefits in placebo-controlled trials, prior evidence may not apply to the effectiveness of empagliflozin relative to GLP-1RA agents. Only two studies have compared empagliflozin with GLP-1RA, and they either focused on older patients only [19] or did not characterize individual cardiovascular outcomes [20]. Neither study evaluated cardiovascular mortality or kidney outcomes, which are approved indications for empagliflozin [19, 20].

The EMPagliflozin compaRative effectIveness and SafEty (EMPRISE) study is a sequentially built, population-based monitoring program designed to evaluate the effectiveness and safety of empagliflozin [21]. We present the final-year results from the EMPRISE study comparing the cardiorenal effectiveness of empagliflozin vs. GLP-1RA (restricted to agents with demonstrated cardioprotective effects) across broad population subgroups.

Methods

We conducted an active-comparator, new-user cohort study [22] using three data sources: two US commercial claims databases (Optum’s de-identified Clinformatics® Data Mart Database and IBM MarketScan) and Medicare federal insurance data. These databases contain de-identified, longitudinal patient-level data on all reimbursed medical services, including inpatient and outpatient diagnoses and procedures, along with pharmacy dispensing records. The study protocol [ENCePP (EUPAS20677) and ClinicalTrials.gov (NCT03363464)] was approved by the Institutional Review Board at Mass General Brigham. Data use agreements were in place.

Study population and exposure assessment

The study population included patients with T2D aged 18 years or older (65 years or older in Medicare), who initiated empagliflozin or cardioprotective GLP-1RA (liraglutide, albiglutide, dulaglutide, injection semaglutide) from 8/2014 to 9/2019 (Additional file 1: sFigure 1). Exenatide and lixisenatide were not considered in the GLP-1RA group due to the lack of demonstrated cardiovascular benefits, and oral semaglutide was not yet approved during the timeframe of this study. Cohort entry was the date of initiation of empagliflozin or a comparator, with no SGLT2i or GLP-1RA prescriptions during the preceding 12 months, which was defined as the baseline period. We required patients to have continuous insurance coverage and a recorded diagnosis of T2D during this baseline period. We excluded patients with recorded diagnoses of type 1 or secondary diabetes, malignancy, end-stage kidney disease (ESKD) or kidney replacement therapy, human immunodeficiency virus, solid organ transplant, or a nursing home admission during baseline (Additional file 1: sTable 1). Patients who initiated both empagliflozin and GLP-1RA on the cohort entry date were excluded.

To identify study outcomes, patients were followed from one day after cohort entry until the earliest of: discontinuation of the index drug, switching to the comparator drug class, switching from an initial to an alternative agent within the same class, gap in insurance coverage (> 30 days), death, or end of the study (September 30, 2019). We considered patients to be on medications until 60 days after the end of the last prescription’s supply.
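The follow-up rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's implementation: the function name `follow_up_window`, its parameters, and the idea that each censoring event has already been reduced to a single date are hypothetical. The 60-day grace period and the study end date are taken from the text.

```python
from datetime import date, timedelta
from typing import Optional, Tuple

GRACE_DAYS = 60                  # days on treatment assumed after the last supply ends (from the text)
STUDY_END = date(2019, 9, 30)    # administrative end of the study (from the text)


def follow_up_window(
    cohort_entry: date,
    last_supply_end: date,
    switch_date: Optional[date] = None,        # hypothetical: any switch (other class or other agent)
    coverage_gap_date: Optional[date] = None,  # hypothetical: censoring date for a > 30-day coverage gap
    death_date: Optional[date] = None,
) -> Tuple[date, date, str]:
    """Return (start, end, reason) for one patient's as-treated follow-up."""
    # Follow-up starts one day after cohort entry.
    start = cohort_entry + timedelta(days=1)

    # Discontinuation and end of study always apply.
    candidates = {
        "discontinuation": last_supply_end + timedelta(days=GRACE_DAYS),
        "end_of_study": STUDY_END,
    }
    # The remaining censoring events apply only if they occurred.
    optional = {"switch": switch_date, "coverage_gap": coverage_gap_date, "death": death_date}
    for reason, when in optional.items():
        if when is not None:
            candidates[reason] = when

    # Follow-up ends at the earliest candidate date.
    reason, end = min(candidates.items(), key=lambda item: item[1])
    return start, end, reason


if __name__ == "__main__":
    print(follow_up_window(date(2018, 1, 1), date(2018, 3, 1)))
```

Changing `GRACE_DAYS` from 60 to 30 in such a sketch corresponds to the kind of exposure-window sensitivity analysis described later in the Methods.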

Outcome definitions

Primary outcomes were (i) a composite of myocardial infarction (MI) or stroke, (ii) hospitalization for heart failure (HHF), defined as an HF diagnosis in the primary discharge position, (iii) MACE, defined as a composite of MI, stroke, or cardiovascular mortality, and (iv) a composite of cardiovascular mortality or HHF. Secondary outcomes included the composite of all primary outcomes (MACE or HHF) (Medicare data only), the composite of MI, stroke, or HHF, HHF defined as diagnosis in all discharge positions (HHF-broad), individual components of MACE, unstable angina, coronary revascularization, all-cause mortality, and ESKD. Because progression to ESKD takes time, we restricted the ESKD analysis to patients with chronic kidney disease (CKD) stages 3–4, defined using a validated claims-based algorithm [23]. We report detailed outcome definitions in Additional file 1: sTable 2.

We defined primary outcomes using validated claims-based algorithms, with high specificity (93–98%) and positive predictive value (PPV > 98%) [24,25,26]. Date of death was ascertained from the Vital Status files, with linkage to the Social Security Administration (SSA) data, which has been validated and captures > 95% of deaths in older adults aged > 65 years in the US [27, 28]. Cause of death was ascertained through linkage with the National Death Index data considering only diagnoses in the primary position, and was available only in Medicare data [29].

Patient characteristics

We identified 143 covariates a priori based on literature review and clinical knowledge: demographics, census region, calendar time of cohort entry, modified Charlson/Elixhauser combined comorbidity score [30], validated claims-based frailty index [31], diabetes complications, glucose-lowering medication use on cohort entry and during baseline, cardiovascular diseases, systemic comorbidities, chronic disease medications, and measures of healthcare utilization across different healthcare settings as a proxy for the general health status and the intensity of care. Patient characteristics were measured during the baseline period using administrative enrollment data, diagnosis or procedure codes, and pharmacy National Drug Codes. Data on laboratory results were available in Clinformatics® (~ 45%) and MarketScan (~ 5–10%). We provided the complete list of covariates in Additional file 1: sTable 3.

Statistical analyses

To improve comparability of the two treatment groups, we 1:1 matched patients initiating empagliflozin with those initiating GLP-1RA based on the estimated propensity score (PS). We estimated PS as the predicted probability of initiating empagliflozin relative to cardioprotective GLP-1RA, separately within each database, after controlling for 143 baseline characteristics using multivariable logistic regression [32]. Since laboratory test results were only available in a subset of the population, they were not included in the PS model. We matched using the nearest neighbor approach without replacement [33], and the maximum allowed difference (caliper) in PS between empagliflozin and GLP-1RA was 0.01 [33]. Balance in covariates, including that of laboratory results, was assessed using absolute standardized mean differences (SMD) (lower values indicate better balance) [34] and the post-matched c-statistic of the model predicting the exposure conditional on baseline covariates (values closer to 0.5 indicate better balance) [35].
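As an illustration of the matching and balance checks described above, the following Python sketch implements a simple greedy 1:1 nearest-neighbor match without replacement using a caliper of 0.01, together with an absolute standardized mean difference. It assumes the propensity scores have already been estimated. The function names, the greedy processing order, and the toy values are assumptions made for illustration; the text does not specify the exact matching algorithm used in the study.

```python
import math
from statistics import mean, variance
from typing import Dict, List, Sequence, Tuple


def match_nearest_neighbor(
    treated_ps: Dict[str, float],
    comparator_ps: Dict[str, float],
    caliper: float = 0.01,
) -> List[Tuple[str, str]]:
    """Greedy 1:1 nearest-neighbor matching on the PS, without replacement."""
    available = dict(comparator_ps)  # comparators not yet used
    pairs = []
    for t_id, t_ps in treated_ps.items():
        if not available:
            break
        # Closest remaining comparator on the propensity score.
        c_id = min(available, key=lambda c: abs(available[c] - t_ps))
        # Accept the pair only if it falls within the caliper.
        if abs(available[c_id] - t_ps) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # without replacement
    return pairs


def standardized_mean_difference(x_treated: Sequence[float], x_comparator: Sequence[float]) -> float:
    """Absolute SMD using the pooled standard deviation of the two groups."""
    pooled_sd = math.sqrt((variance(x_treated) + variance(x_comparator)) / 2)
    if pooled_sd == 0:
        return 0.0
    return abs(mean(x_treated) - mean(x_comparator)) / pooled_sd


if __name__ == "__main__":
    # Hypothetical propensity scores, for illustration only.
    treated = {"t1": 0.30, "t2": 0.55, "t3": 0.90}
    comparators = {"c1": 0.305, "c2": 0.52, "c3": 0.552, "c4": 0.70}
    print(match_nearest_neighbor(treated, comparators))
```

In this toy input, the third treated patient has no comparator within the caliper and is therefore left unmatched, which is how a caliper trades sample size for closer comparability.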

To allow tight control of baseline CVD and of risk factors related to evolving treatment indications over time, we conducted PS estimation and matching separately within each baseline CVD subgroup [ASCVD or HF (yes/no)] and each calendar time block (before and after 2018), across a total of 4 strata, within each database. The year 2018 was chosen to better capture the period before and after the shift in treatment guideline recommendations for SGLT2i and GLP-1RA [36,37,38]. After matching, we pooled all the PS-matched databases, and estimated treatment effects in the final pooled database using stratified likelihood [39]. We estimated hazard ratios (HR) using Cox proportional hazards models, and rate differences (RD) using Mantel-Haenszel methods [39]. We present the cumulative risk of outcomes over the follow-up period as cumulative incidence function (CIF) plots estimated with the Kaplan-Meier method.
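To make the rate-difference calculation concrete, the sketch below computes an incidence rate per 1,000 person-years and a Mantel-Haenszel pooled rate difference across strata, using the standard person-time weights T1·T0/(T1+T0). The function names and all event counts and person-years are hypothetical; confidence intervals are omitted, and the study's actual estimation may differ in detail.

```python
from typing import Iterable, Tuple

# One stratum: (events_treated, person_years_treated, events_comparator, person_years_comparator)
Stratum = Tuple[float, float, float, float]


def rate_per_1000_py(events: float, person_years: float) -> float:
    """Crude incidence rate per 1,000 person-years."""
    return 1000.0 * events / person_years


def mantel_haenszel_rate_difference(strata: Iterable[Stratum]) -> float:
    """Mantel-Haenszel pooled rate difference per 1,000 person-years."""
    numerator = 0.0
    denominator = 0.0
    for a, t1, b, t0 in strata:
        total = t1 + t0
        # Stratum rate difference (a/t1 - b/t0) times the weight t1*t0/total.
        numerator += (a * t0 - b * t1) / total
        denominator += (t1 * t0) / total
    return 1000.0 * numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts for two strata, for illustration only.
    strata = [(10, 1000, 20, 1000), (30, 2000, 60, 2000)]
    print(round(mantel_haenszel_rate_difference(strata), 2))
```

With a single stratum the pooled estimate reduces to the crude difference between the two rates, which is a quick sanity check on the weights.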

We estimated treatment effects within the following subgroups: (i) age (≥ vs. <65 years), (ii) sex (male vs. female), (iii) history of baseline ASCVD (defined as a diagnosis for any major ASCVD, including MI, angina, coronary atherosclerosis or other forms of chronic ischemic heart disease, coronary procedure, ischemic stroke, peripheral arterial disease or surgery, or lower extremity amputation), and (iv) baseline HF. Within each subgroup, the PS was re-estimated, and patients were re-matched on the newly estimated PS. The heterogeneity of estimates across the subgroups was evaluated using the Wald test for homogeneity [39].
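For two subgroups, a Wald-type test of homogeneity can be sketched as follows: each HR and its 95% confidence interval are converted to a log-HR and standard error, and the difference between the log-HRs is compared with its standard error. This is a minimal sketch assuming independent subgroup estimates and normal-approximation confidence intervals; the function names and the example HRs are hypothetical, and the study's own implementation is not described in the text.

```python
import math
from typing import Tuple

Z_95 = 1.96  # normal quantile assumed to underlie the 95% confidence intervals


def log_hr_and_se(hr: float, lower: float, upper: float) -> Tuple[float, float]:
    """Recover the log-HR and its standard error from a 95% confidence interval."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return math.log(hr), se


def wald_homogeneity_test(
    subgroup_a: Tuple[float, float, float],
    subgroup_b: Tuple[float, float, float],
) -> Tuple[float, float]:
    """Return (z, two-sided p) for the null that both subgroups share one HR."""
    log_a, se_a = log_hr_and_se(*subgroup_a)
    log_b, se_b = log_hr_and_se(*subgroup_b)
    z = (log_a - log_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical subgroup estimates (HR, lower, upper), for illustration only.
    z, p = wald_homogeneity_test((0.70, 0.60, 0.82), (1.00, 0.85, 1.18))
    print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests that the relative effect differs between the subgroups; with identical subgroup estimates the statistic is zero and the p-value is 1.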

Sensitivity analyses

We undertook several steps to mitigate the potential for unmeasured confounding. First, to reduce unmeasured confounding by kidney function, we restricted the study cohort to patients with at least two dispensed prescriptions for metformin (recommended first line therapy for patients without severely compromised kidney function) during the 6 months prior to cohort entry [40], and without any insulin prescriptions during the baseline period. Second, we restricted the study population to patients with laboratory results data available [i.e., patients with non-missing hemoglobin A1c (HbA1c) and estimated glomerular filtration rate (eGFR)] in Clinformatics® or Marketscan, and matching was re-performed using claims-based variables, HbA1c, and eGFR. Third, we performed 1:1 high-dimensional PS matching, which enriched the original PS with 100 additional empirically identified covariates, based on thousands of candidate covariates in different care settings [41]. The algorithm automatically selects covariates based on their confounding potential and has been shown to improve adjustment for unmeasured confounding [41]. Fourth, we conducted bias analyses in which we re-estimated treatment effects after adjusting for HbA1c or eGFR to check if our estimates were robust even under assumptions of extreme imbalance in these unmeasured confounders between treatment groups [42].
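The text does not give the formula used in the bias analyses. One widely used option, shown here purely as an assumption, is external adjustment for a single binary unmeasured confounder: the observed relative estimate is divided by a bias factor that depends on the confounder's prevalence in each treatment group and on its association with the outcome. The function name, parameter names, and all input values below are hypothetical.

```python
def externally_adjusted_rr(
    observed_rr: float,
    prevalence_treated: float,     # prevalence of the binary confounder among treated patients
    prevalence_comparator: float,  # prevalence of the binary confounder among comparators
    rr_confounder_outcome: float,  # assumed confounder-outcome relative risk
) -> float:
    """Adjust an observed relative estimate for one binary unmeasured confounder."""
    bias = (prevalence_treated * (rr_confounder_outcome - 1) + 1) / (
        prevalence_comparator * (rr_confounder_outcome - 1) + 1
    )
    return observed_rr / bias


if __name__ == "__main__":
    # Hypothetical scenario: the confounder (e.g., poor kidney function) is half as
    # common among treated patients and doubles the risk of the outcome.
    print(round(externally_adjusted_rr(0.69, 0.20, 0.40, 2.0), 3))
```

Sweeping the prevalence imbalance and the confounder-outcome association over a grid shows how extreme the imbalance would need to be to move an estimate to the null.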

To account for potential informative censoring, we conducted further analyses: (i) intent-to-treat (ITT) analyses, which do not censor for treatment discontinuation or switching, with a maximum follow-up of two years, and (ii) censoring-weighted analyses, which create pseudo-populations in which treatment discontinuation/switching was independent of baseline covariates [39]. Other sensitivity analyses included addressing potential exposure misclassification by varying the exposure assessment window from 60 to 30 days before censoring for treatment discontinuation, and restricting analyses to patients with at least 1 and 2 years of follow-up to account for the longer follow-up time necessary for the development of cardiorenal outcomes. In these analyses, follow-up started at 1 and 2 years post-index, respectively, and continued until the end of available follow-up. We also compared empagliflozin with each individual GLP-1RA agent (liraglutide or dulaglutide) after re-estimating the PS and re-matching for each pair.

All analyses were performed using the Aetion Evidence Platform® (2023), a software for real-world data analysis validated for a range of studies (Aetion, Inc.) [43], with R version 4.2 (R Foundation for Statistical Computing) and SAS 9.4 Statistical Software (SAS Institute Inc., Cary, NC).

Results

The study population included 169,599 patients with T2D initiating empagliflozin and 298,298 initiating GLP-1RA prior to matching. After 1:1 matching, 141,541 patients remained in each group (Additional file 1: sFigure 2).

Mean ages of empagliflozin and GLP-1RA initiators were similar even prior to matching: 63 vs. 62 years. Empagliflozin initiators were less likely to be female (43% vs. 54%) and white (71% vs. 75%). The proportion of patients with baseline CVD was approximately similar (35% vs. 33%). Empagliflozin initiators were slightly more likely to have a history of metformin use at baseline (82% vs. 74%) and less likely to have a history of insulin use (21% vs. 38%) relative to GLP-1RA initiators. Empagliflozin initiators were also less likely to have a history of diabetes complications and CKD. All these differences were removed after PS matching (Table 1). Laboratory results, available in a subset of the populations, were also balanced even prior to PS matching (Table 1 and Additional file 1: sTable 3). Laboratory results were still balanced after restricting analyses to patients with non-missing HbA1c and eGFR (Additional file 1: sTable 4). Approximately 52% of the matched population were older adults ≥ 65 years (Table 1). The c-statistic of the model predicting treatment as a function of covariates in the post-matched database was ~ 0.5 indicating satisfactory balance.

Table 1 Selected baseline characteristics of patients with type 2 diabetes initiating empagliflozin or GLP-1RA

The median follow-up time after matching was 5 months (interquartile range: 3–10 months) for both empagliflozin and GLP-1RA initiators. Approximately 21–22% of the original PS-matched cohort had follow-up of at least one year and 6–7% remained at two years. The most common reason for censoring was treatment discontinuation (40–45%) for both treatment groups (Additional file 1: sTable 5).

Cardiovascular effectiveness outcomes

After matching, rates of the composite of MI or stroke were similar between empagliflozin and GLP-1RA initiators [13.1 and 13.4 events per 1,000 person-years (PY), respectively, with corresponding HR of 0.99 (0.92, 1.07) and RD of -0.23 (-1.25, 0.79) per 1,000 PY]. Empagliflozin was associated with lower rates of HHF relative to GLP-1RA, with rates of 5.0 and 7.3 per 1,000 PY, respectively, corresponding to HR of 0.69 (0.62, 0.77) and RD of -2.28 (-2.98, -1.59) per 1,000 PY.

In analyses restricted to Medicare, empagliflozin was associated with a slightly lower risk of MACE compared with GLP-1RA initiators, based on rates of 22.6 and 25.1 per 1,000 PY respectively, HR of 0.90 (0.82, 0.99), and RD of -2.54 (-4.76, -0.32) per 1,000 PY. Rates of the composite of cardiovascular mortality or HHF were 14.2 and 18.3 events per 1,000 PY among empagliflozin and GLP-1RA initiators, respectively [HR: 0.77 (0.69, 0.86), RD: -4.11 (-5.95, -2.29) per 1,000 PY]. In a subgroup of patients with history of baseline CKD stages 3–4 (10,837 PS-matched pairs), empagliflozin was associated with a lower risk of ESKD compared with GLP-1RA initiators [HR: 0.75 (0.60, 0.94), RD: -6.77 (-11.97, -1.61) per 1,000 PY] (Table 2).

Table 2 Comparative risk of cardiorenal outcomes among 1:1 PS-matched initiators of empagliflozin vs. GLP-1RA

Estimates for secondary outcomes were overall consistent, with similar risks of MI, stroke, unstable angina, and coronary revascularization between the groups, while the risks of the composite of primary outcomes, HHF (defined more broadly), all-cause mortality, and cardiovascular mortality were lower in empagliflozin vs. GLP-1RA initiators (Table 2). Empagliflozin was also associated with lower risks of the composite of MI, stroke, or HHF in all patients, and the composite of MACE or HHF in older Medicare patients relative to GLP-1RA.

Database-specific estimates were overall consistent except for small differences in commercial claims databases due to the small numbers of events (Additional file 1: sTable 6).

Consistent with HR and RD estimates, CIF curves showed similar risks of the composite of MI or stroke, and lower risk of MACE among patients initiating empagliflozin relative to GLP-1RA. The risks of HHF and the composite outcome of cardiovascular mortality or HHF were also lower in empagliflozin relative to GLP-1RA initiators (Fig. 1).

Fig. 1 Cumulative risk of primary outcomes among PS-matched initiators of empagliflozin vs. GLP-1RA

CAPTION: The risks of cardiovascular mortality, MACE, and HHF were lower in the empagliflozin vs. GLP-1RA initiators. Cardiovascular mortality data was only available in the Medicare database

CV: cardiovascular; GLP-1RA: Glucagon-like peptide-1 receptor agonists; MACE: major adverse cardiovascular events; PS: propensity score

Subgroup analyses

On the relative scale, estimates for the composite outcome of MI or stroke and for MACE were similar in patients with and without baseline history of ASCVD or HF. Relative risk reductions in HHF and the composite of HHF and cardiovascular mortality were consistently observed regardless of baseline ASCVD and HF. For all outcomes, absolute RDs were larger in patients with baseline ASCVD or HF than in those without these conditions (Fig. 2).

Fig. 2 Subgroup analyses for primary outcomes by atherosclerotic cardiovascular disease or heart failure

CAPTION: On the relative scale, HRs were consistent across all subgroups examined for all outcomes. On the absolute scale, for all outcomes, RDs were larger in patients with ASCVD compared to those without it, and in those with HF compared to those without it

Stratified analyses by age showed that the relative risk of the composite outcome of MI or stroke was lower in older vs. younger patients, while the relative risk reductions for HHF were similar across age categories. Estimates for the remaining outcomes did not differ by age subgroups. Stratified analyses by sex produced similar relative hazards between male and female patients across all outcomes. Absolute RDs were larger in older vs. younger patients (Fig. 3). Subgroup analyses for secondary outcomes provided similar findings (Additional file 1: sTable 7).

Fig. 3 Subgroup analyses for primary outcomes by age and sex

CAPTION: On the relative scale, empagliflozin was associated with a lower risk of MI/stroke in patients 65 years or older, while it was not associated with MI/stroke in patients younger than 65 years. The HR estimates were consistent across other subgroups for all outcomes. For all outcomes, RD estimates were larger in older than in younger patients, while they did not differ by sex

Sensitivity analyses

Sensitivity analyses were overall consistent with primary analytical findings (Additional file 1: sTable 8). Rematching populations using laboratory results, high-dimensional PS matching (Additional file 1: sTables 9, 10), and bias analyses (Additional file 1: sFigures 3, 4) supported the robustness of the primary findings to unmeasured confounding. Analyses restricted to patients with at least 1 or 2 years of follow-up, and analyses comparing empagliflozin vs. each individual GLP-1RA agent (liraglutide or dulaglutide), revealed similar findings (Additional file 1: sTables 11, 12).

Discussion

In this comparative effectiveness cohort study, empagliflozin was associated with a similar risk of a composite of MI or stroke, and a lower risk of HHF, MACE, and a composite of HHF or cardiovascular mortality, when compared with cardioprotective GLP-1RA agents (i.e., liraglutide, albiglutide, dulaglutide, or semaglutide). In a subgroup of patients with T2D and baseline CKD stages 3–4, empagliflozin was associated with a lower risk of ESKD relative to GLP-1RA. Regarding the secondary outcomes, patients initiating empagliflozin had a lower risk of all-cause and cardiovascular mortality, compared to cardioprotective GLP-1RA, whereas the risks of MI and stroke (individually considered) were comparable between exposure groups. These analyses remained robust to unmeasured confounding due to eGFR and HbA1c in analyses restricted to patients with available laboratory results and in bias analyses quantifying the potential impact of unmeasured confounding. Across the pre-specified subgroups, similar results were found, but absolute benefits of empagliflozin were larger in patients with history of ASCVD or HF and in older adults.

Overall, our findings were in line with previous studies comparing empagliflozin and GLP-1RA agents [19, 20], although our study focused on GLP-1RA agents that demonstrated cardioprotective effects in trials (without considering exenatide and lixisenatide). Unlike previous studies, we evaluated the risk of cardiovascular mortality, an outcome that remains relatively unexplored in clinical practice and an approved indication for empagliflozin.

The relative effect of empagliflozin and GLP-1RA on MI and stroke outcomes is an area of debate in the literature. GLP-1RA agents, with the exception of exenatide and lixisenatide, offered risk reductions relative to placebo for MI and stroke by ~ 10–17% in patients with T2D, with estimates varying across different trials [2]. Empagliflozin, on the other hand, showed a 13% risk reduction for MI relative to placebo which did not reach statistical significance, and a numerical increase in the risk of stroke in patients with T2D and ASCVD [1]. Observational studies comparing empagliflozin with GLP-1RA also reported HRs ranging from 0.9 to 1.0 with larger risk reductions in patients with history of cardiovascular events [19, 20]. In our study, we found no association between empagliflozin vs. cardioprotective GLP-1RA and the risk of MI or stroke outcomes (either as a composite outcome or individually considered), in line with the placebo-controlled trials and prior evidence [2, 3, 19, 20]. We observed that the absolute risk reductions of empagliflozin for MI and stroke were larger in patients with history of HF and in patients 65 years or older, highlighting the fact that these patient subgroups could potentially have greater benefit from empagliflozin relative to cardioprotective GLP-1RA agents [40].

While the benefits of empagliflozin on HHF outcomes have been well-demonstrated, the effect of GLP-1RA on HHF from placebo-controlled trials was modest, at 11% relative risk reduction, and not uniformly established across different trials [1, 2]. We found a 31% relative risk reduction of empagliflozin relative to GLP-1RA, which was consistent across multiple pre-defined subgroups of patients and in line with prior evidence [19, 20]. Absolute risk reductions were larger in patients with history of ASCVD or HF and in older adults, again highlighting the role of empagliflozin in reducing the risk of HHF in these patients [40].

In our study, empagliflozin was associated with a 10% relative risk reduction for MACE compared to GLP-1RA. Absolute risk reductions were larger in patients with history of ASCVD or HF than in patients without it. This is consistent with current evidence from placebo-controlled trials and prior observational studies [19, 20]. This small benefit of empagliflozin towards MACE was mainly driven by the 19% reduction in the risk of cardiovascular mortality. While we have no trial evidence on the relative benefits of empagliflozin vs. GLP-1RA initiators with respect to cardiovascular mortality, in placebo-controlled trials empagliflozin offered apparently larger relative risk reductions than GLP-1RA versus placebo (38% vs. 22% risk reductions respectively), mainly in patients with T2D and history of ASCVD [1, 2]. The absolute risk reductions in MACE observed in the present study were larger in patients with history of ASCVD or HF, suggesting that patients in these high-risk subgroups could benefit more from empagliflozin relative to cardioprotective GLP-1RA agents.

Our study is one of the few to compare the renal benefits of empagliflozin and GLP-1RA. In placebo-controlled trials, empagliflozin demonstrated a 46% relative risk reduction for a composite kidney outcome (nephropathy including macroalbuminuria and ESKD) in patients with T2D and ASCVD, and a 28% risk reduction for kidney disease progression (ESKD) or cardiovascular mortality in patients with established CKD, consistently across subgroups defined by baseline eGFR [1, 3, 44]. GLP-1RA also demonstrated benefits on a composite nephropathy outcome [HR: 0.79 (0.73, 0.87)] in patients with T2D, with HR estimates ranging from 0.64 to 0.85 with varying degrees of precision across trials [2]. Consistent with evidence from trials suggesting a strong kidney benefit of SGLT2i and a weaker benefit of GLP-1RA, in an analysis restricted to patients with history of CKD stages 3–4, we observed a 25% lower risk of ESKD with empagliflozin relative to GLP-1RA [1,2,3, 40]. However, kidney outcome trials of GLP-1RA agents (e.g., injection semaglutide) are ongoing (FLOW trial) [45].

Limitations

Several limitations should be considered. First, we cannot exclude unmeasured confounding. However, after PS matching that included laboratory results in the subset of the cohort with such results available, findings remained consistent with the primary analyses. Our 1:1 PS-matched design incorporated a rich set of claims-based variables that have been shown to balance laboratory results and clinical parameters typically only available in electronic health records [46]. Second, our outcome definitions relied on claims-based algorithms previously validated to have high specificity and PPV but low sensitivity [39]. Third, the median follow-up was short because treatment persistence is lower in routine clinical practice than in randomized controlled trials. Fourth, the primary analysis may suffer from informative censoring, which we addressed using ITT and censoring-weighted analyses. Finally, beneficial effects of GLP-1RA may require longer follow-up time to become apparent, especially the effects mediated by atherosclerosis progression and weight loss. Analyses restricted to patients with at least 1 or 2 years of available follow-up provided consistent findings.

Conclusion

In this final-year report from the EMPRISE study, after extensive confounding control, empagliflozin was associated with similar risks of MI/stroke and lower risks of MACE, HHF, a composite of cardiovascular mortality or HHF, all-cause mortality, and ESKD (in patients with CKD), when compared with selected GLP-1RA agents that demonstrated cardioprotective effects. Cardiovascular benefits of empagliflozin were larger in older patients and in patients with ASCVD or HF on the absolute scale. Our findings complement existing trial evidence by directly comparing empagliflozin with alternative cardioprotective agents and incorporating broad patient populations in clinical practice using robust and generalizable methodology.