Key Points

The results of these two post-authorization safety studies of dapagliflozin to assess the risk of hospitalization for acute liver injury (hALI) and the risk of severe complications of urinary tract infection (sUTI), including pyelonephritis and urosepsis, did not suggest increased risks of hALI and sUTI in patients initiating dapagliflozin compared with those initiating comparator glucose-lowering drugs.

These findings contributed to the removal of liver injury and urinary tract infection as important identified risks in the dapagliflozin risk-management plan in Europe.

1 Introduction

In clinical trials, dapagliflozin has been shown to be effective in lowering blood glucose levels, body weight, and blood pressure in patients with type 2 diabetes mellitus (T2DM) compared with placebo [1,2,3,4,5]. Dapagliflozin promotes urinary glucose excretion by selectively inhibiting human renal sodium-glucose cotransporter 2 (SGLT2), the major transporter responsible for renal glucose reabsorption. Dapagliflozin was initially approved for the treatment of T2DM by the European Medicines Agency (EMA) in November 2012 and by the US Food and Drug Administration (FDA) in January 2014 [6, 7]. At the time of the initial approval of dapagliflozin to treat T2DM in Europe, four EMA-endorsed pharmacoepidemiological post-authorization safety (PAS) studies were initiated to monitor the safety of dapagliflozin in real-world use [8]. In this report, we describe the results of two of these studies: an evaluation of the risk of acute liver injury (EUPAS 12110 [9], NCT02695095 [10]), and an evaluation of the risk of severe complications of urinary tract infection (sUTI) including urosepsis and pyelonephritis (EUPAS 12113 [11], NCT02695173 [12]). Other PAS studies include an assessment of the risk of acute kidney injury [13,14,15], and an assessment of cancer, including specific evaluations of bladder cancer and female invasive breast cancer, which is ongoing at the time of this report [16, 17].

As with other SGLT2 inhibitors, dapagliflozin is minimally metabolized in the liver [18]. In the Phase 3 clinical program for dapagliflozin, the proportions of patients with elevated liver function tests or adverse events of hepatic disorder were similar between the dapagliflozin and placebo groups; however, one patient in the dapagliflozin arm experienced a case of “possible” drug-induced liver injury (DILI) [8]. While this case of possible DILI was later revealed to be a case of autoimmune hepatitis (after the approval of dapagliflozin by the EMA) [19], at the time of regulatory review, the signal was sufficient to prompt investigation into DILI in dapagliflozin initiators in a larger population. Drug-induced liver injury occurs infrequently and therefore is rarely detected in clinical studies, although it is one of the most common forms of drug toxicity and the most common single cause of safety-related drug marketing withdrawals in the last several decades [20]. In addition, dapagliflozin was the first SGLT2 inhibitor to be approved worldwide and therefore the safety profile of SGLT2 inhibitors as a class was not yet characterized. These reasons, and the concerns raised with the one “possible” case of DILI during the Phase 3 trial prior to dapagliflozin approval, led to the inclusion of liver injury in the post-marketing safety monitoring program for dapagliflozin.

Patients with T2DM experience a higher incidence of urinary tract infections than patients without T2DM, with women experiencing higher incidence than men [21]. Possible mechanisms for the increased risk of urinary tract infections in patients with diabetes include glucosuria, diabetes-associated bladder dysfunction, and immune dysfunction related to hyperglycemia [22]. Serious complications of urinary tract infections, such as pyelonephritis and urosepsis, may also be more common in patients with diabetes [22]. Because dapagliflozin’s mechanism of action results in glucosuria, urinary tract infections have been rigorously evaluated in clinical trials. Pooled results of safety data from 12 clinical trials of dapagliflozin showed a small increased incidence of any urinary tract infection in patients treated with 5 mg dapagliflozin (5.7%) or 10 mg dapagliflozin (4.3%) compared with placebo (3.7%) [23]. In longer-term pooled analyses of these trials, there was no increased incidence of pyelonephritis [24]. Due to the increased risk of urinary tract infection among those with T2DM and also among users of dapagliflozin, the post-marketing safety monitoring program for dapagliflozin included an assessment of pyelonephritis and urosepsis, serious conditions that may develop as complications of urinary tract infection, which often require hospitalization and may be life threatening.

We present the results from two large PAS studies, one assessing hospitalization for acute liver injury (hALI) and the other assessing sUTI (hospitalization or emergency department visit for pyelonephritis and/or urosepsis), that were conducted to address potential safety concerns at the time of approval of dapagliflozin in Europe. Specifically, the study objectives were to assess the incidence of hALI or sUTI in patients with T2DM newly treated with dapagliflozin compared with those initiating another glucose-lowering drug (GLD) in a real-world setting.

2 Methods

2.1 Study Population and Study Design

Two noninterventional, population-based studies were conducted concurrently, each using secondary data from three real-world longitudinal healthcare databases. The Clinical Practice Research Datalink (CPRD) in the UK is an electronic primary healthcare medical records database with linkage to hospital data through the Hospital Episode Statistics database; the HealthCore Integrated Research Database (HIRD) is an administrative claims database in the USA including commercially insured individuals; and the US Medicare research database includes information on federally funded fee-for-service insurance claims. The study period varied across the data sources: it started on the date that dapagliflozin became available in each country after regulatory approval and ended on the date of the most recently available data at the time of data extraction. It ran from November 2012 through December 2018 for CPRD, January 2014 through February 2019 for the HIRD, and January 2014 through December 2017 for Medicare.

The study design, with inclusion and exclusion criteria for each PAS study cohort, is illustrated in Fig. 1. The study population comprised adult patients initiating dapagliflozin or an eligible comparator GLD (eligible comparator GLD classes and drug substances are listed in Table 1), with or without concomitant use of other GLDs (other than non-dapagliflozin SGLT2 inhibitors) or insulin. Because dapagliflozin was recommended as a second-line therapy for T2DM at the time of the study [25, 26], only drugs that were also considered second-line GLD treatment were eligible comparator GLDs. Metformin monotherapy and monotherapy with a sulfonylurea were not considered eligible comparators, as these are typically used as first-line therapy. Additionally, monotherapy with insulin was not considered as a comparator. However, combination therapy of an eligible comparator GLD with metformin, a sulfonylurea, or insulin was considered an eligible comparator.

Fig. 1

Study design, cohort eligibility, and inclusion criteria. CPRD Clinical Practice Research Datalink, GLD glucose-lowering drug, GOLD General Practitioner Online Database of the CPRD, hALI hospitalization for acute liver injury, HIRD HealthCore Integrated Research Database, SGLT2 sodium-glucose cotransporter 2, sUTI severe complications of urinary tract infection (pyelonephritis or urosepsis), T1DM type 1 diabetes mellitus. Schematic based on the framework of graphical representation for visualizing longitudinal study designs proposed by Schneeweiss et al. [50]. The light blue box represents the assessment window for use of the index study medication, the orange boxes represent the assessment windows for the exclusion criteria, the dark blue boxes represent the assessment windows for covariate variables, and the green box represents the follow-up period. aCPRD, registered in an up-to-standard participating general medical practice; HIRD, complete pharmacy and medical coverage in a health insurance plan with no enrollment gaps greater than 30 days; Medicare, enrolled in fee-for-service insurance in Parts A (hospital insurance), B (medical insurance), and D (prescription drug coverage). bCPRD, ages < 18 years; HIRD, ages < 18 or > 64 years; Medicare, ages < 65 years. cMedicare, enrolled because of disability or end-stage renal disease; nonresident of a US state or the District of Columbia; enrolled in managed care coverage. dApplicable only to the cohorts assessing the hALI outcome. eApplicable only to the cohorts assessing the sUTI outcome. fBody mass index, smoking history, and alcohol use were available in CPRD GOLD data but were not available in HIRD or Medicare. gDiscontinuation of the study medication (30 days after the end of the days’ supply of the last consecutive prescription or dispensing), hALI or sUTI event, death, end of patient-specific data or eligibility in each data source or end of the study period, initiation of an SGLT2 inhibitor, diagnosis of T1DM.

Table 1 Glucose-lowering drugs eligible for the comparator GLD group.

This study used an active-comparator, new-user design, which enhances comparability of treatment groups by selecting patients at comparable stages in the disease trajectory [27]. “New use” was defined as the first recorded prescription or dispensing of dapagliflozin or an eligible comparator GLD within the study period, with no prior recorded prescription for that medication when all available lookback data before the first prescription or dispensing were assessed (a minimum lookback period of 180 days was required for study eligibility). The date of new use of an eligible treatment was considered the index date, and the period of time a patient continuously remained on the eligible treatment was defined as a treatment episode. A given patient could contribute more than one non-overlapping treatment episode within the study period for different eligible medications. Eligible comparator GLD treatment episodes were matched to dapagliflozin treatment episodes based on calendar year of the index date, age, sex, and geographic region. The comparator GLD to dapagliflozin matching ratio was 6:1 in CPRD and 15:1 in the HIRD and Medicare.

2.2 Exposure Assessment

The primary exposure of interest for each PAS study was the initiation of dapagliflozin or an eligible comparator GLD. Medication use was identified from written prescription records in CPRD GOLD (General Practitioner Online Database) by using Gemscript codes and from pharmacy dispensing records in the HIRD and Medicare by using Generic Product Identifiers (GPIs) and National Drug Codes (NDCs), respectively.

Exposure time at risk was defined as the period starting the day after the index date until 30 days after the end of the days’ supply of the last consecutive prescription/dispensing in the treatment episode; this definition is based upon the assumption that any potential risk of each of the outcomes of interest would increase shortly after therapy initiation, remain increased during treatment, and then decrease gradually after treatment discontinuation. The duration of each prescription/dispensing was defined as the days’ supply of the medication plus 30 days. If there was more than one consecutive prescription or dispensing for the index medication separated by gaps of 30 days or fewer, the prescriptions/dispensings were concatenated into one treatment episode; the duration of the treatment episode included the gaps between the prescriptions/dispensings and ended 30 days after the end of the days’ supply of the last prescription/dispensing. A detailed description of the treatment episodes is available in ‘Methods and Results’, Section 1.1, of Online Resource 1.
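The concatenation rule described above can be illustrated with a minimal Python sketch. It is not the study's code: the function name `build_episodes`, the input layout, and the convention that the days' supply ends on the dispensing date plus the supply are assumptions made for illustration. Only the 30-day gap and the 30-day extension come from the text.

```python
from datetime import date, timedelta

GRACE_DAYS = 30  # permitted gap and risk-window extension, as stated in the text


def build_episodes(dispensings, grace_days=GRACE_DAYS):
    """Concatenate dispensings of one medication into treatment episodes.

    dispensings: iterable of (dispensing_date, days_supply) tuples (hypothetical layout).
    Returns a list of (index_date, risk_start, risk_end) tuples.
    """
    episodes = []  # each entry: [first dispensing date, end of current days' supply]
    for start, supply in sorted(dispensings):
        supply_end = start + timedelta(days=supply)  # assumed end-of-supply convention
        if episodes and (start - episodes[-1][1]).days <= grace_days:
            # Gap of 30 days or fewer: extend the current episode.
            episodes[-1][1] = max(episodes[-1][1], supply_end)
        else:
            # First dispensing, or a gap longer than 30 days: start a new episode.
            episodes.append([start, supply_end])
    # Time at risk starts the day after the index date and ends 30 days
    # after the end of the days' supply of the last dispensing.
    return [
        (first, first + timedelta(days=1), last_end + timedelta(days=grace_days))
        for first, last_end in episodes
    ]


example = build_episodes([
    (date(2015, 1, 1), 30),
    (date(2015, 2, 10), 30),  # 10-day gap: same episode
    (date(2015, 6, 1), 30),   # long gap: separate episode
])
```

The sketch returns every run of dispensings. Under the new-user definition in Section 2.1, only the first run for a given medication would qualify as a treatment episode, so a later restart of the same drug would need to be filtered out separately.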

2.3 Outcome Assessment

Electronic case-finding algorithms tailored to each data source were used to identify each of the outcomes, hALI and sUTI (Table 2). Case-finding algorithms were validated through clinical review of a sample of up to 125 cases for each outcome. In the CPRD electronic medical records database, validation was performed by clinical review of chronological patient profiles and by completed questionnaires from general practitioners. In the HIRD and Medicare claims databases, validation was performed by clinician review of medical records. The algorithm-identified cases in the validation samples were classified as either confirmed cases or non-cases according to guidance published by an FDA Working Group [20] and criteria proposed by Navarro and Senior [28] for hALI cases and according to clinical definitions for pyelonephritis [29] or urosepsis [30] for sUTI cases.

Table 2 Case-finding algorithms for hospitalization for acute liver injury and severe complications of urinary tract infection in CPRD, the HIRD, and Medicare

2.4 Baseline Characteristics and Covariates

Baseline characteristics included demographic and lifestyle characteristics, comorbidities, comedications, and healthcare resource utilization and were assessed on or before the index date for each treatment episode. Baseline characteristics were assessed using all available lookback data, unless otherwise specified. Healthcare resource utilization variables (number of outpatient encounters to a general practice or outpatient clinic, number of hospitalizations, number of emergency department visits, and number of specialty care visits) and comedications were assessed in the 180 days before the index date. All measured baseline characteristics were considered for inclusion in propensity score (PS) modeling as covariates for confounding adjustment, as described below.

2.5 Statistical Analysis

Each outcome, hALI and sUTI, was analyzed in a separate outcome-specific cohort in each data source. The sUTI outcome was analyzed separately for females and males. Data from CPRD and Medicare were analyzed by RTI Health Solutions, and the HIRD data were analyzed by HealthCore, Inc. After the inclusion and exclusion criteria and the matching were applied to generate each of the study cohorts, descriptive analyses of each cohort were conducted by treatment group, with frequencies and percentages calculated for categorical variables and means and standard deviations or medians, interquartile ranges, or minimum and maximum values calculated for continuous variables. The absolute standardized difference was used to assess the balance of baseline characteristics between the dapagliflozin and comparator GLD groups [31].
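For readers unfamiliar with the balance metric, the following Python sketch shows the usual form of the absolute standardized difference for a binary and a continuous covariate. The function names are illustrative, and the formulas are the commonly used pooled-variance versions, which we assume correspond to the approach cited above.

```python
import math


def asd_binary(p_treated, p_comparator):
    """Absolute standardized difference for a binary covariate (proportions in 0..1)."""
    pooled_var = (p_treated * (1 - p_treated) + p_comparator * (1 - p_comparator)) / 2
    return abs(p_treated - p_comparator) / math.sqrt(pooled_var)


def asd_continuous(mean_t, sd_t, mean_c, sd_c):
    """Absolute standardized difference for a continuous covariate."""
    return abs(mean_t - mean_c) / math.sqrt((sd_t ** 2 + sd_c ** 2) / 2)


# Obesity prevalence in the CPRD hALI cohort before PS trimming (74% vs 61%)
obesity_asd = asd_binary(0.74, 0.61)
```

With the proportions reported in Section 3.1.1, `obesity_asd` is roughly 0.28, well above the 0.10 threshold conventionally taken to indicate a small difference, which illustrates why further confounding control was needed after matching.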

Potential confounding was addressed by using PS trimming and stratification to identify dapagliflozin and comparator GLD groups with balanced characteristics [32, 33]. The PS modeling approach is described in detail in ‘Methods and Results’, Section 1.2, of Online Resource 1. Briefly, the PSs, or the predicted probability of initiating treatment with dapagliflozin given the observed characteristics, were estimated for each treatment episode by fitting a multivariable logistic regression model, which included (1) dapagliflozin versus comparator GLD exposure as the dependent variable and (2) all baseline variables identified as potential confounders of the association between dapagliflozin and the outcome (hALI or sUTI) as independent variables [31]. All baseline variables were considered for inclusion in the PS models, and included demographic characteristics, lifestyle characteristics, calendar year of the index date, type of index medication (see ‘Methods and Results’, Section 1.5, of Online Resource 1, for a description of the type of index medication categories), number of years since the initial T2DM diagnosis, diabetes severity indicators, comedications, comorbidities, and healthcare utilization variables. Patients with an extreme PS (i.e., below the 2.5th percentile value of the dapagliflozin-exposed distribution or above the 97.5th percentile of the comparator GLD-exposed distribution) were trimmed (i.e., excluded from the analytic cohort). The remaining treatment episodes were ranked by PS value and divided into equally sized strata. Confounding control was assessed by evaluating the balance of covariates between treatment groups within each PS stratum using the absolute standardized difference values.
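The trimming and stratification step can be sketched in Python as follows. This is an illustration rather than the study's implementation: the PS values are taken as already estimated, the helper names and the linear-interpolation percentile are assumptions, and the default of 10 strata is hypothetical because the number of strata is not stated here.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in 0..100) of a non-empty list."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def trim_and_stratify(episodes, n_strata=10):
    """Trim extreme propensity scores, then rank and split into equal-sized strata.

    episodes: list of (ps, treated) pairs; treated is True for dapagliflozin.
    n_strata: hypothetical default, not taken from the source.
    Returns a list of (ps, treated, stratum) for the episodes kept.
    """
    treated_ps = [ps for ps, t in episodes if t]
    comparator_ps = [ps for ps, t in episodes if not t]
    lower = percentile(treated_ps, 2.5)      # 2.5th percentile, dapagliflozin-exposed
    upper = percentile(comparator_ps, 97.5)  # 97.5th percentile, comparator-exposed
    # An episode is trimmed if its PS is below the lower bound or above the upper bound.
    kept = sorted((ps, t) for ps, t in episodes if lower <= ps <= upper)
    # Rank by PS and assign strata whose sizes differ by at most one episode.
    return [(ps, t, i * n_strata // len(kept)) for i, (ps, t) in enumerate(kept)]
```

Both treatment groups are trimmed against the same two bounds, so the retained episodes share a common PS range before covariate balance is checked within each stratum.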

Incidence and comparative analyses were performed by using algorithm-identified hALI or sUTI events. Propensity score-adjusted incidence rates (IRs) of hALI and sUTI were estimated by first calculating crude IRs across the PS strata within each exposure group; then for each exposure group, the stratum-specific IR was standardized by using the person-years in the dapagliflozin group to estimate the standardized IR and variance, with the 95% confidence intervals (CIs) estimated by using the exact limits method [34].
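As a rough illustration of the standardization step, the sketch below weights each stratum-specific IR by the share of dapagliflozin person-years in that stratum. The function name and the example counts are hypothetical, and the exact confidence limits [34] are not reproduced.

```python
def standardized_ir(events, person_years, reference_py, per=1000):
    """Directly standardized incidence rate across PS strata.

    events, person_years: stratum-level counts for one exposure group.
    reference_py: dapagliflozin person-years per stratum, used as weights.
    Returns the rate per `per` person-years.
    """
    total_ref = sum(reference_py)
    return per * sum(
        (e / py) * (ref / total_ref)
        for e, py, ref in zip(events, person_years, reference_py)
    )


# Hypothetical two-stratum example (not study data)
dapa_py = [300, 100]
comparator_ir = standardized_ir([4, 6], [2000, 4000], dapa_py)
dapa_ir = standardized_ir([1, 1], dapa_py, dapa_py)
```

Because the weights are the dapagliflozin person-years, the standardized IR for the dapagliflozin group equals its crude IR, while the comparator IR is re-weighted to the PS distribution of the dapagliflozin group.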

Incidence rate ratios (IRRs) were calculated in each data source for each outcome by dividing the IR in the dapagliflozin group by the IR in the comparator GLD group, with the 95% CI estimated by using a Poisson distribution. Adjusted IRRs were estimated by calculating the IRR in each PS stratum and then combining the stratum-specific IRRs by using the Mantel-Haenszel method [35]. Database-specific adjusted IRR estimates were pooled to generate an overall adjusted IRR estimate and 95% CI also by using the Mantel-Haenszel method [35] (the detailed methods for the pooled analysis are provided in ‘Methods and Results’, Section 1.3, of Online Resource 1).
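The Mantel-Haenszel summary rate ratio for person-time data can be sketched as below. This shows the standard point estimator only; the variance and CI calculations, and the details of the pooled analysis in Online Resource 1, are not reproduced. The function name and the input layout are illustrative assumptions.

```python
def mantel_haenszel_irr(strata):
    """Mantel-Haenszel incidence rate ratio across strata.

    strata: iterable of (events_exposed, py_exposed, events_reference, py_reference),
    one tuple per PS stratum (or per database when pooling).
    """
    numerator = denominator = 0.0
    for a, t1, b, t0 in strata:
        total = t1 + t0
        numerator += a * t0 / total    # exposed events weighted by reference person-time
        denominator += b * t1 / total  # reference events weighted by exposed person-time
    return numerator / denominator


# Hypothetical counts (not study data)
irr = mantel_haenszel_irr([(2, 1000, 10, 4000), (3, 2000, 12, 6000)])
```

With a single stratum the estimator reduces to the crude IRR, which is a convenient sanity check for any implementation.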

Sensitivity analyses were conducted and comprised (1) an extension of the risk window after the end of the medication's days' supply from 30 days to 90 days; (2) inclusion of only dipeptidyl peptidase-4 (DPP-4) inhibitors in the comparator GLD group; (3) inclusion of only glucagon-like peptide-1 (GLP-1) receptor agonists in the comparator GLD group; (4) inclusion of only patients new to the comparator GLD class (i.e., no previous use of a drug within the GLD class); and (5) inclusion of only the first treatment episode for an individual patient. A quantitative bias analysis was performed to assess the possible effect of potential unmeasured confounding variables of various strengths and prevalence on the effect estimate (the quantitative bias analysis is described in detail in ‘Methods and Results’, Section 1.4, of Online Resource 1) [36].
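One common way to carry out this kind of quantitative bias analysis is the external-adjustment formula for a single unmeasured binary confounder, sketched below. Whether this is the exact formulation used in these studies is an assumption; the authors' method is described in Online Resource 1. All parameter names and example values are hypothetical, apart from the pooled hALI IRR of 0.85 reported in Section 3.1.2.

```python
def bias_adjusted_rr(observed_rr, prev_exposed, prev_comparator, rr_confounder_outcome):
    """Adjust an observed rate ratio for one unmeasured binary confounder.

    prev_exposed / prev_comparator: confounder prevalence in each treatment group.
    rr_confounder_outcome: assumed confounder-outcome rate ratio.
    """
    bias = (prev_exposed * (rr_confounder_outcome - 1) + 1) / (
        prev_comparator * (rr_confounder_outcome - 1) + 1
    )
    return observed_rr / bias


# Grid over hypothetical confounder strengths and prevalence imbalances
grid = {
    (rr_cd, p1, p0): bias_adjusted_rr(0.85, p1, p0, rr_cd)
    for rr_cd in (1.5, 2.0, 3.0)
    for p1, p0 in ((0.1, 0.3), (0.2, 0.2), (0.3, 0.1))
}
```

In this hypothetical grid, a confounder that doubles the outcome rate and is three times as prevalent among comparator initiators (10% vs 30%) would move the estimate of 0.85 to approximately 1.0, whereas equal prevalence in both groups leaves it unchanged.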

The study protocols for both PAS studies were reviewed and approved by the RTI International Institutional Review Board. For the CPRD and Medicare aspects, the UK Medicines and Healthcare products Regulatory Agency’s Independent Scientific Advisory Committee (ISAC) and the US Centers for Medicare and Medicaid Services (CMS) Privacy Board approved the use of the respective data for these studies. The HealthCore-specific components were reviewed and approved by the New England Institutional Review Board. A waiver of informed consent was obtained, as data used in these studies were obtained from databases of anonymized medical records, claims, and pharmacy records and not directly from human subjects.

3 Results

3.1 Hospitalization for Acute Liver Injury Outcome

3.1.1 Baseline Characteristics

There were 129,520 potentially eligible treatment episodes (dapagliflozin or an eligible comparator GLD) in CPRD, 1,060,582 in the HIRD, and 2,474,817 in Medicare. After all inclusion and exclusion criteria and treatment-episode matching were applied, the final number of treatment episodes was 49,639 (dapagliflozin, 10,466; comparator GLD, 39,173) in CPRD, 212,580 (dapagliflozin, 17,187; comparator GLD, 195,393) in the HIRD, and 212,473 (dapagliflozin, 13,280; comparator GLD, 199,193) in Medicare (Table S1 in Online Resource 1).

Baseline characteristics for the treatment groups in the full hALI cohort (i.e., after matching and before PS trimming) are presented in Table 3. The mean age of patients at the index date was 57 years for the dapagliflozin group and 59 years for the comparator GLD group in CPRD; it was 52 years in the HIRD and 70 years in Medicare for both treatment groups. In CPRD (the only data source with available information on body mass index), more dapagliflozin users were obese or severely obese (74%) than comparator GLD users (61%). In all three data sources, a higher proportion of dapagliflozin users than comparator GLD users had concomitant insulin use on the index date, and dapagliflozin initiators were more likely to have used three or more other classes of GLD medications in the year before the index date. The proportion with one or more dispensing/prescription for a drug with a known association with liver injury (for the list of drugs, see Table S2 in Online Resource 1) as well as the prevalence of most baseline medical conditions were similar across the dapagliflozin and comparator GLD groups within each data source.

Table 3 Selected baseline characteristics of cohorts to assess hospitalization for acute liver injury, full sample before propensity score trimming

Propensity score trimming and stratification were effective in achieving balance between the treatment groups for the variables included in the PS models (Fig. S1 in Online Resource 1). For the included covariates, most absolute standardized differences were below 0.10, corresponding to small differences in the distribution of the variable between the dapagliflozin and comparator GLD groups. The baseline characteristics for the hALI cohort after PS trimming are presented in Table S3 in Online Resource 1.

3.1.2 Incidence and Comparative Analyses

The incidence and comparative analyses were performed by using hALI events identified by the case-finding algorithm, which, in CPRD, included more than 20 but fewer than 25 events combined across both treatment groups (CPRD policy does not allow values of 1 to 4 to be reported, which in this case, applies to the small number of hALI events identified in the dapagliflozin group; a range is given to prevent back calculation of the small number of events in the dapagliflozin group). There were 186 and 202 total hALI events in the HIRD and Medicare, respectively. Table 4 presents the number of hALI events, exposure time, and adjusted IRs by exposure group for each data source. The estimated PS-adjusted IR of hALI per 1000 person-years for dapagliflozin and comparator GLD treatment episodes, respectively, was 0.37 (95% CI 0.10–0.93) and 0.62 (95% CI 0.27–1.11) in CPRD, 1.36 (95% CI 0.74–2.28) and 1.83 (95% CI 1.55–2.15) in the HIRD, and 1.92 (95% CI 1.02–3.29) and 1.70 (95% CI 1.42–2.02) in Medicare.

Table 4 Adjusted incidence rates of hospitalization for acute liver injury

The adjusted IRR estimates were below the null value of 1.0 in CPRD and the HIRD and slightly above the null value in Medicare (Fig. 2). The overall adjusted hALI IRR estimate (pooled across all data sources) was 0.85 (95% CI 0.59–1.24). The 95% CIs were wide for all adjusted IRR estimates because of a small number of hALI events.

Fig. 2

Adjusted IRRs of hospitalization for acute liver injury. CI confidence interval, CPRD Clinical Practice Research Datalink, HIRD HealthCore Integrated Research Database, IRR incidence rate ratio

3.1.3 Sensitivity and Bias Analyses

The adjusted IRR results from all sensitivity analyses for hALI in the HIRD and Medicare were generally similar to the primary analysis and, in CPRD, were variable because of a small number of hALI events (Fig. S2 in Online Resource 1). The assessment of the potential impact of possible unmeasured confounders indicates that it is unlikely that an unmeasured hypothetical confounder would mask a harmful association of hALI with dapagliflozin (the results from this analysis are described in detail in ‘Methods and Results’, Section 1.4 and Fig. S7, of Online Resource 1).

The electronic algorithms had low to moderate validity in identifying true cases of hALI in each of the three data sources [37,38,39]; however, simulation analyses to assess the impact of potential outcome misclassification indicated that it is unlikely that an increased risk of hALI associated with dapagliflozin, if it exists, is being masked by outcome misclassification that differs between the treatment groups (data on file with the corresponding author).

3.2 Severe Complications of Urinary Tract Infection Outcome

3.2.1 Baseline Characteristics

After all inclusion and exclusion criteria and treatment-episode matching were applied, the final number of treatment episodes in the sUTI female cohort was 26,315 (dapagliflozin, 5508; comparator GLD, 20,807) in CPRD, 135,299 (dapagliflozin, 10,544; comparator GLD, 124,755) in the HIRD, and 200,976 (dapagliflozin, 12,561; comparator GLD, 188,415) in Medicare; the final number of treatment episodes selected into the male cohort was 36,805 (dapagliflozin, 7610; comparator GLD, 29,195) in CPRD, 160,828 (dapagliflozin, 13,091; comparator GLD, 147,737) in the HIRD, and 204,519 (dapagliflozin, 12,783; comparator GLD, 191,736) in Medicare (Table S4 in Online Resource 1).

Baseline characteristics for the full sample of patients in the sUTI cohorts (i.e., after matching and before PS trimming) are presented in Table 5. The mean age at the index date was similar among males and females within each data source. Among both females and males, the mean age in CPRD was approximately 57 years in the dapagliflozin group and approximately 59 years in the comparator GLD group; the mean age in the HIRD was approximately 52 years for both treatment groups, and in Medicare it was approximately 72 years for both treatment groups. In CPRD, the proportion of users who were obese or severely obese was higher in dapagliflozin users (79% of females, 72% of males) than in comparator GLD users (67% of females, 58% of males). In all data sources for females and males, the prevalence of insulin use at the index date was higher for the dapagliflozin group than the comparator GLD group, and dapagliflozin initiators were more likely to have used 3 or more other classes of GLD medications in the year before the index date. For females and males, a history of kidney disease of all types (acute and chronic) was more common in the comparator GLD group than the dapagliflozin group, but the prevalence of chronic or recurring urinary tract infections was similar between the two treatment groups. Other medical conditions were similarly distributed between the dapagliflozin and comparator GLD groups.

Table 5 Selected baseline characteristics of cohorts to assess severe complications of urinary tract infection, full sample before propensity score trimming, by sex

Similar to the hALI cohorts, PS trimming and stratification were very effective in achieving balance between the treatment groups for the variables in the PS models in both the female and male sUTI cohorts in all the data sources; most PS-stratified values of the absolute standardized difference were below 0.10 (Fig. S3 in Online Resource 1). The baseline characteristics for the sUTI sample after PS trimming are presented in Table S5 in Online Resource 1.

3.2.2 Incidence and Comparative Analyses

The total number of algorithm-identified sUTI events for the incidence and comparative analyses in females was 27 in CPRD, 450 in the HIRD, and 904 in Medicare, and in males it was 25 in CPRD, 230 in the HIRD, and 584 in Medicare. Across the data sources in both females and males, the PS-adjusted IR estimates per 1000 person-years were consistently lower in the dapagliflozin group than in the comparator GLD group (Table 6).

Table 6 Adjusted incidence rates of severe complications of urinary tract infection, by sex

For both females and males, when the IRs of sUTI were compared between the dapagliflozin group and the comparator GLD group, the adjusted IRR estimates were below the null value of 1.0 for all data sources, although the CIs were wide in CPRD due to the small number of sUTI events (Fig. 3). The overall adjusted sUTI IRR estimate pooled across all data sources was similar for females (0.76 [95% CI 0.60–0.96]) and males (0.74 [95% CI 0.56–1.00]).

Fig. 3

Adjusted IRRs for severe complications of urinary tract infection, by sex. CPRD Clinical Practice Research Datalink, HIRD HealthCore Integrated Research Database, IRR incidence rate ratio

3.2.3 Sensitivity and Bias Analyses

The adjusted IRR results of most sensitivity analyses were consistent with those of the primary analyses of sUTI for both males and females (Fig. S4 in Online Resource 1). The assessment of potential unmeasured confounders for the sUTI analysis had similar results to those for the hALI analysis (the results from this analysis are described in detail in ‘Methods and Results’, Section 1.4 and Fig. S8, in Online Resource 1).

The case-finding algorithms had moderate validity in identifying true cases of sUTI in each of the three data sources [37,38,39], and, similar to hALI, simulation analyses indicate that it is unlikely that a potential increased risk of sUTI associated with dapagliflozin, if it exists, is being masked by misclassification of sUTI that differs between the treatment groups (data on file with the corresponding author).

4 Discussion

Safety labeling for new prescription drugs is often based on a limited number of events observed in the relatively short time frame of clinical trials. Compared with clinical trials, PAS studies, such as those reported here, can be conducted over longer time spans and with larger base populations that are more representative of the target population and can clarify the magnitude of these outcomes in the treated populations in a real-world setting.

The two large observational PAS studies reported here, comprising three healthcare databases with 3–6 years of observation per database, found no increased risk of hALI or sUTI associated with dapagliflozin exposure compared with other GLD medications. The assessment of hALI included over 28,000 person-years of dapagliflozin exposure, and the assessment of sUTI included over 17,000 and 22,000 person-years of dapagliflozin exposure in females and males, respectively; however, because of a small number of hALI and sUTI events, the database-specific IRR estimates were imprecise, with wide CIs. For hALI, the pooled adjusted IRR estimate was 0.85 (95% CI 0.59–1.24), while for sUTI, the pooled adjusted IRR estimate was 0.76 (95% CI 0.60–0.96) for females and 0.74 (95% CI 0.56–1.00) for males. The pooled adjusted IRR point estimates for both outcomes were below the null value of 1.0, suggesting a decrease in risk associated with dapagliflozin exposure compared with other GLD medications. However, the 95% CIs are wide, and those for hALI and for sUTI in males include the null value.

There is scant published literature reporting risk of acute liver injury associated with SGLT2 inhibitors. In prelicensure clinical studies, no SGLT2 inhibitor medications (canagliflozin, dapagliflozin, empagliflozin, ertugliflozin) were associated with acute liver injury [18]. The Dapagliflozin Effect on CardiovascuLAR Events (DECLARE) trial – a randomized, double-blind, placebo-controlled, Phase 3 trial of dapagliflozin in patients with T2DM and with either known cardiovascular disease or two or more risk factors for cardiovascular disease in addition to T2DM – assessed 17,160 patients during a median of 4.2 years with about 30,000 person-years of exposure to dapagliflozin. As part of the evaluation of secondary outcomes, the occurrence of hepatic events was reported to be similar in the dapagliflozin group compared with the placebo group (hazard ratio [HR], 0.92; 95% CI 0.68–1.25) [40, 41]. This estimate is similar to the pooled adjusted IRR estimate for hALI in the current study (0.85; 95% CI 0.59–1.24). The metabolism of SGLT2 inhibitors in the liver is minimal and could partially explain the relative lack of reporting of hepatotoxicity with dapagliflozin [18].

Overall, for both males and females, we found no increased risk of severe complications of urinary tract infection (pyelonephritis and/or urosepsis) associated with dapagliflozin; instead, the results suggested a decreased risk compared with other GLDs. In a previous analysis of safety data pooled from 13 placebo-controlled trials of dapagliflozin of up to 24 weeks’ duration, the occurrence of urinary tract infections (as adverse events or serious adverse events) was slightly higher with dapagliflozin (4.7%) than with placebo (3.5%), with a higher frequency in women than men in both treatment groups [42]. Similarly, a recent meta-analysis of clinical trials assessing safety outcomes in users of SGLT2 inhibitors reported that dapagliflozin users had an increased risk of urinary tract infection compared with users of non-SGLT2 inhibitor comparator treatments (relative risk: 1.42; 95% CI 1.07–1.87) [43]. However, urinary tract infections associated with dapagliflozin use are typically mild and respond well to standard-of-care treatment [22, 23, 44]. In the large, Phase 3 DECLARE trial, the incidence of urinary tract infections that occurred as serious adverse events or led to treatment discontinuation was similar in the dapagliflozin (1.5%) and placebo (1.6%) groups (HR, 0.93; 95% CI 0.73–1.18) [40, 41]. Published population-based observational studies using real-world data have demonstrated similar or lower rates of urinary tract infection outcomes with SGLT2 inhibitors as a group compared with other second-line or later-line therapies (DPP-4 inhibitors or GLP-1 receptor agonists) [45,46,47,48,49]. One study, performed in two US healthcare claims databases, assessed a composite outcome (hospitalization for primary urinary tract infection, sepsis with urinary tract infection, or pyelonephritis) similar to our outcome of sUTI, and found no increased risk associated with SGLT2 inhibitors compared with DPP-4 inhibitors (HR, 0.57; 95% CI 0.29–1.14) or GLP-1 receptor agonists (HR, 0.52; 95% CI 0.28–0.97) [45]. Similarly, another study that used data from Canadian healthcare administrative databases and CPRD reported no increased risk of urosepsis when comparing new users of SGLT2 inhibitors with new users of DPP-4 inhibitors (pooled adjusted HR: 0.58; 95% CI 0.42–0.80), and this finding was consistent when dapagliflozin was assessed alone (HR, 0.63; 95% CI 0.36–1.12) [46]. Our study builds on these previous studies by examining the effects of dapagliflozin alone compared with other GLDs on pyelonephritis and urosepsis, providing separate estimates for females and males across three large population-based data sources in the UK and the USA. Mild and uncomplicated urinary tract infections were not the focus of our study.

An important strength of these studies is the large, population-based sample of dapagliflozin users, which comprised over 40,000 and 62,000 new dapagliflozin treatment episodes across the data sources for assessing hALI and sUTI, respectively. Matching, PS trimming, and stratification were used to address confounding by observed covariates and resulted in well-balanced treatment groups; however, in observational studies that use data collected for other purposes (e.g., electronic medical records or health insurance billing claims), bias can arise from confounding factors that are not measured in the data source. Lifestyle variables (i.e., body mass index, smoking status, and alcohol use), which are potential confounders, were available only in the CPRD data and thus could not be directly accounted for in the HIRD or Medicare data. However, results from CPRD, which included lifestyle variables, were largely similar to those of the other two databases, and quantitative bias analyses suggested that unmeasured confounders would need to be either quite strong or highly imbalanced between the treatment groups to mask a truly elevated association between dapagliflozin and either of the study outcomes. In CPRD, full prescription information, including prescriptions from specialists, is not available; therefore, if initial prescriptions were issued by specialist physicians rather than general practitioners, the first use of dapagliflozin or a comparator GLD would not have been properly captured. Similarly, inpatient administration of medications was not captured in the HIRD or Medicare data. None of the databases recorded the use of over-the-counter medications. Furthermore, IRR estimates for each outcome in the individual data sources were imprecise because of the low number of hALI and sUTI events, particularly among dapagliflozin users.

5 Conclusions

Results from two large, robust multi-year real-world studies do not suggest an increased risk of hospitalization for acute liver injury (> 28,000 person-years of dapagliflozin exposure) or severe complications of urinary tract infection (about 40,000 person-years of dapagliflozin exposure) associated with new exposure to dapagliflozin compared with new exposure to other GLDs. The results from these PAS studies, combined with evidence from the DECLARE trial on dapagliflozin [40, 41], contributed to the removal of liver injury and urinary tract infection as important identified risks in the dapagliflozin risk management plan in Europe.