The rapid evolution of diabetes pharmacotherapy and the availability of many glucose-lowering medication (GLM) classes for the treatment of type 2 diabetes (T2D) have made the choice of second-line agents after metformin a difficult task [1]. Ideally, patient characteristics should be matched with drug modes of action, favorable effects, and side effects. However, this is not always possible because of restrictions in drug indications, availability, reimbursement, and contraindications. Furthermore, translating results of phase III randomized controlled trials (RCTs) to clinical practice may be problematic owing to the many differences between trial and routine care settings [2].

In many countries, the two most popular second-line GLMs for the treatment of T2D are sulfonylureas (SUs) and dipeptidyl peptidase 4 inhibitors (DPP4i). RCTs that directly compared drugs of these classes showed that SUs tend to be more effective than DPP4i in reducing HbA1c over the short term, but that such differences are mostly lost in the long run [3]. Rather, compared with SUs, DPP4i have been associated with a markedly lower risk of hypoglycemia and a mild benefit in the control of body weight [4]. In addition, while three large placebo-controlled RCTs support the cardiovascular safety of DPP4i [5,6,7], SUs have been linked with an increased risk of cardiovascular events and mortality, although such data derive mainly from observational studies [8, 9]. Gliclazide is often considered the preferred SU, because it has been associated with a more than 50% lower risk of hypoglycemia [10] and a safer cardiovascular risk profile [11] compared with other SUs. Furthermore, gliclazide is by far the most widely used SU in Italy [12] and its relative cardiovascular safety has been recently confirmed in the Italian TOSCA.IT study [13].

Despite SUs and DPP4i being very popular GLMs, there is a striking paucity of real-world studies comparing the effectiveness of these drugs in routine clinical practice. This is particularly important since SUs are still perceived as highly effective drugs for the control of hyperglycemia and their low cost makes them particularly attractive for healthcare systems with limited resources. Thus, comparative real-world studies on these drugs may complement information from RCTs and inform on therapeutic appropriateness.

DARWIN-T2D was a multicenter retrospective study of clinical data from electronic medical records, performed at 46 diabetes specialist outpatient clinics in Italy [14]. Here we report the results of a subanalysis comparing the effectiveness of DPP4i versus gliclazide on glycemic and extra-glycemic endpoints.


Data Source

The main objective of the DARWIN-T2D study was to describe the clinical characteristics and the changes from baseline in glycemic and extra-glycemic effectiveness parameters in patients newly treated with the SGLT2 inhibitor dapagliflozin, a DPP4i, gliclazide, or a GLP-1 receptor agonist [14]. The study was conducted at 46 Italian diabetes outpatient clinics. The detailed study protocol and the primary results have been previously published [14]. Results of the study indicated a significant channelling of different patients towards different GLMs, and an overall low common support between patients receiving dapagliflozin and other GLMs [12]. The largest common support of propensity scores was detected for patients starting DPP4i and patients starting gliclazide, thereby providing a rationale for comparing effectiveness of such drugs.

Patients were retrospectively included if they were aged 18–80 years, had a diagnosis of T2D for at least 1 year, and were newly prescribed a full-dose DPP4i (per protocol, linagliptin was excluded [14]) or gliclazide extended release at a daily dose of 30 mg or higher. Exclusion criteria were a diagnosis of type 1 diabetes and age less than 18 or greater than 80 years.

Dedicated software automatically extracted all relevant clinical data (demographics, anthropometrics, blood pressure, HbA1c, fasting plasma glucose, lipid values, liver enzymes, renal function, history of complications, and medications) at baseline and at the first available follow-up visit, 3–12 months after baseline. LDL cholesterol levels were calculated using Friedewald’s equation [15], whereas eGFR (estimated glomerular filtration rate) was computed using the CKD-EPI equation [16]. Microangiopathy was defined as the presence of an albumin excretion rate greater than 30 mg/24 h or mg/g of creatinine, an eGFR less than 60 ml/min/1.73 m², diabetic neuropathy (either somatic or autonomic), diabetic retinopathy (any grade), or maculopathy. Macroangiopathy was defined as the presence of a history of myocardial infarction or stroke/transient ischemic attack, peripheral arterial disease, surgical or endovascular revascularization (any site), or a diagnosis of asymptomatic atherosclerosis. We retrieved information on all concomitant medications and on the entire history of GLM use to define whether patients were being prescribed DPP4i or gliclazide as second-line agents after metformin (i.e., had been treated only with metformin) or after failure of at least one other GLM besides metformin.
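The two derived laboratory variables can be illustrated with a short sketch. It is not the study's extraction software: the function names are hypothetical, and the eGFR function assumes the 2009 creatinine-based CKD-EPI equation with serum creatinine in mg/dl, since the text does not state which version of the equation was applied.

```python
def ldl_friedewald(total_chol, hdl, triglycerides):
    """LDL cholesterol (mg/dl) by Friedewald's equation; all inputs in mg/dl."""
    return total_chol - hdl - triglycerides / 5.0


def egfr_ckd_epi_2009(creatinine_mg_dl, age_years, female, black=False):
    """eGFR (ml/min/1.73 m^2), assuming the 2009 CKD-EPI creatinine equation."""
    kappa = 0.7 if female else 0.9          # sex-specific creatinine threshold
    alpha = -0.329 if female else -0.411    # sex-specific exponent below threshold
    ratio = creatinine_mg_dl / kappa
    egfr = (141.0
            * min(ratio, 1.0) ** alpha
            * max(ratio, 1.0) ** -1.209
            * 0.993 ** age_years)
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159
    return egfr


# Hypothetical patient, for illustration only
ldl = ldl_friedewald(total_chol=200, hdl=50, triglycerides=150)
egfr = egfr_ckd_epi_2009(creatinine_mg_dl=0.9, age_years=60, female=False)
reduced_egfr = egfr < 60  # one of the microangiopathy criteria listed above
```

For this hypothetical patient the eGFR is above 60 ml/min/1.73 m², so the renal criterion for microangiopathy would not be met.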

The primary effectiveness endpoint was the change from baseline in HbA1c. Secondary endpoints were changes from baseline in fasting plasma glucose, body weight, and systolic blood pressure. We excluded patients without a follow-up examination, those with missing data for the primary outcome at baseline or follow-up, and those initiating DPP4i and gliclazide at the same visit (because the effect could not be attributed to one or the other). All data were extracted automatically from the same electronic chart system (MyStar Connect, Me.Te.Da.).

Multiple Imputation and Propensity Score Matching

For a comparative analysis of effectiveness, we used propensity score matching (PSM), one of the most popular methods to estimate treatment effects in observational studies [17]. In a trade-off between unconfoundedness and precision, the following baseline covariates were chosen for PSM as they are expected to affect outcomes and therapy assignment: age, gender, diabetes duration, BMI, body weight, systolic and diastolic blood pressure, FPG, HbA1c, total cholesterol, HDL cholesterol, triglycerides, aspartate aminotransferase, alanine aminotransferase, eGFR, insulin as associated therapy, metformin as associated therapy, use of DPP4i or gliclazide as second-line therapy, microangiopathy, and macroangiopathy. Missing data were handled with multiple imputation (MI), as previously described [18]. Outcomes and selected variables were used as predictors in MI models [19]. A PS model was fitted on each imputed data set and the final individual PS value was computed as the average of the subject's PS values obtained across the imputed data sets. Then, PS values were used to create a matched set of individuals from the original non-imputed data set. Matching was performed with a 1:1 ratio, i.e., each subject treated with DPP4i was matched with only one subject treated with gliclazide, using a genetic algorithm, without replacement. Covariate balance after matching was evaluated using the standardized mean difference across treatment groups and the standardized mean difference of the square of continuous variables. Balance was considered achieved if the standardized difference was less than 0.1. Outcome analysis was conducted on the matched set of individuals obtained after PSM. The effect of treatment on outcomes was evaluated with adjusted linear regression models, with confidence intervals computed using a robust sandwich estimator. More details on MI and PSM can be found in the Online Supplementary Material.
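The main steps of this procedure (averaging PS values across imputed data sets, 1:1 matching without replacement, and checking balance against the 0.1 threshold) can be sketched as follows. This is a simplified illustration, not the study's implementation: the genetic matching algorithm is replaced here by a plain greedy nearest-neighbour rule, and all function names and data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def average_ps(ps_by_imputation):
    """Average each subject's propensity score over the imputed data sets.

    ps_by_imputation: one list of scores per imputed data set, same subject order.
    """
    return [mean(scores) for scores in zip(*ps_by_imputation)]


def greedy_match(ps_treated, ps_control):
    """1:1 nearest-neighbour matching on the PS, without replacement.

    Greedy stand-in for the genetic algorithm named in the text.
    Returns (treated_index, control_index) pairs.
    """
    available = list(range(len(ps_control)))
    pairs = []
    for i, p in enumerate(ps_treated):
        if not available:
            break
        j = min(available, key=lambda k: abs(ps_control[k] - p))
        available.remove(j)  # each control can be used only once
        pairs.append((i, j))
    return pairs


def standardized_mean_difference(x_treated, x_control):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = sqrt((variance(x_treated) + variance(x_control)) / 2.0)
    return (mean(x_treated) - mean(x_control)) / pooled_sd


def is_balanced(x_treated, x_control, threshold=0.1):
    """Balance check on a continuous covariate and on its square."""
    smd = standardized_mean_difference(x_treated, x_control)
    smd_sq = standardized_mean_difference([x * x for x in x_treated],
                                          [x * x for x in x_control])
    return abs(smd) < threshold and abs(smd_sq) < threshold


# Hypothetical PS values from two imputed data sets (first three subjects treated)
ps = average_ps([[0.62, 0.40, 0.55, 0.30, 0.45, 0.70],
                 [0.58, 0.44, 0.53, 0.34, 0.47, 0.66]])
pairs = greedy_match(ps[:3], ps[3:])
```

A greedy rule depends on the order in which treated subjects are processed, which is one reason optimisation-based approaches such as genetic matching may be preferred in practice.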

Statistical Analysis

Except where otherwise specified, data are presented as mean ± standard deviation or as percentage, as appropriate. Comparisons between the two groups of patients (i.e., those receiving DPP4i and those receiving gliclazide) were performed using the two-tailed unpaired Student’s t test for continuous variables, or the chi-square test for categorical variables. Differences in clinical characteristics between matched cohorts were analyzed using standardized bias rather than p values, as previously suggested [13]. Comparisons of continuous variables among more than two groups were performed using ANOVA with Bonferroni correction. Evaluation of within-group changes in outcome variables was performed using the two-tailed paired Student’s t test. Changes from baseline in outcome variables were calculated for each group as data collected at follow-up minus data collected at baseline, and compared using the two-tailed unpaired Student’s t test. To analyze the time trend of HbA1c reduction in the two groups, we divided the 9-month observation window (3–12 months after baseline) into five equal periods and assigned each patient to the relevant follow-up duration. Statistical significance was accepted at p < 0.05.
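As an illustration of the time-trend step, the sketch below assigns a follow-up duration to one of five equal periods of the 3–12 month window (1.8 months each) and computes a change from baseline. The function names are hypothetical, and the handling of values falling exactly on a period boundary is an assumption, as the text does not specify it.

```python
def follow_up_period(months, start=3.0, end=12.0, n_periods=5):
    """Return the 1-based index of the equal-width period containing `months`."""
    if not start <= months <= end:
        raise ValueError("follow-up outside the observation window")
    width = (end - start) / n_periods            # 1.8 months with the defaults
    index = int((months - start) // width) + 1
    return min(index, n_periods)                 # 12 months falls in the last period


def change_from_baseline(baseline, follow_up):
    """Follow-up value minus baseline value, as defined in the text."""
    return follow_up - baseline


# Hypothetical patient: HbA1c 8.0% at baseline, 7.4% after 6.1 months
period = follow_up_period(6.1)
delta_hba1c = change_from_baseline(8.0, 7.4)
```

With this convention a negative change indicates a reduction, consistent with how the results are reported.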

Compliance with Ethics Guidelines

The study was approved by the ethical committee of each participating center. All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards. Since the study was performed retrospectively on an anonymized database, no patient consent was required.


Study Population

The study flowchart is shown in Fig. 1. Between 15 March 2015 and 31 December 2016, we collected baseline data on 6594 T2D patients who initiated therapy with a DPP4i (53.2% sitagliptin, 22.9% alogliptin, 20.6% vildagliptin, 3.3% saxagliptin) and 5960 patients who initiated therapy with gliclazide. Of these, 2999 patients treated with DPP4i (45.5%) and 2111 patients treated with gliclazide (35.4%) had a follow-up visit available between 3 and 12 months after baseline; 589 patients treated with DPP4i and 521 treated with gliclazide were excluded for missing data for the primary outcome or because they initiated both drugs at the same time (n = 151). Data on the remaining 2410 DPP4i users and 1590 gliclazide users are shown in Table 1. Compared with patients newly treated with DPP4i, those newly treated with gliclazide had longer disease duration, higher body weight, BMI, systolic blood pressure, HbA1c, fasting plasma glucose, triglycerides, and liver enzymes, and lower HDL cholesterol and eGFR. Patients starting gliclazide also had a higher prevalence of microangiopathy and less frequent use of metformin than patients starting a DPP4i.

Fig. 1 Study flowchart. DPP4i dipeptidyl peptidase 4 inhibitors, PSM propensity score matching

Table 1 Baseline clinical characteristics of study patients

Within-Group Effectiveness Analysis

After a median follow-up of 6.1 months (IQR 5.5–6.7), in patients who received a DPP4i, HbA1c declined by 0.6%, fasting plasma glucose declined by 11.4 mg/dl, body weight declined by 0.5 kg, with no significant change in systolic blood pressure (Table S1). Among DPP4i, no significant difference was observed in the change from baseline in HbA1c, fasting plasma glucose, and systolic blood pressure, while reductions in body weight were larger for sitagliptin and alogliptin than for vildagliptin (Fig. S1).

After a median follow-up of 6.2 months (IQR 4.8–7.1), in patients who received gliclazide, HbA1c declined by 0.6% and fasting plasma glucose declined by 14.5 mg/dl, while no significant change was observed for body weight and blood pressure (Table S1).

Comparison of Propensity Score Matched Groups

PSM was performed on a predefined set of variables, which were considered to be clinically relevant for the outcome and therapy assignment. After MI, PSM identified 1316 patients in each group, who were well balanced for all clinical variables except concomitant use of basal insulin, which was significantly more common in patients starting a DPP4i (17.4%) versus those starting gliclazide (13.2%) (Fig. S2).

In matched cohorts (Fig. 2), the reduction from baseline in HbA1c was significantly greater in patients starting DPP4i than in those starting gliclazide (−0.6 ± 1.1% versus −0.4 ± 1.2%; p < 0.001). The same was true for fasting plasma glucose (−14.1 ± 43.5 mg/dl versus −8.8 ± 46.2 mg/dl; p = 0.007) and body weight (−0.4 ± 3.3 kg versus −0.1 ± 2.9 kg; p = 0.006), while the between-group difference in the change from baseline in systolic blood pressure did not reach statistical significance (−1.5 ± 19.8 mmHg versus 0.3 ± 18.9 mmHg; p = 0.056).
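As a rough consistency check, the unpaired t statistic for the primary outcome can be approximated from the reported summary statistics. The sketch below assumes 1316 patients per group with available HbA1c, uses the rounded means and standard deviations given above, and replaces the t distribution with a normal approximation (reasonable at this sample size); it therefore cannot reproduce the exact p value, and the function names are hypothetical.

```python
from math import erfc, sqrt


def unpaired_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Student's unpaired t statistic (pooled variance) from summary statistics."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    se = sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / se


def two_tailed_p_normal(t):
    """Two-tailed p value under a standard normal approximation."""
    return erfc(abs(t) / sqrt(2.0))


# Change in HbA1c: DPP4i versus gliclazide, rounded values from the text
t_hba1c = unpaired_t_from_summary(-0.6, 1.1, 1316, -0.4, 1.2, 1316)
p_hba1c = two_tailed_p_normal(t_hba1c)
```

Under these assumptions the approximate p value falls below 0.001, in line with the reported result. The same check is less reliable for the secondary outcomes, because fewer matched patients had complete data for them.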

Fig. 2 Comparative effectiveness in matched cohorts. Baseline, follow-up data, and the change from baseline are shown for the primary outcome (HbA1c, a) and for secondary outcome measures: fasting plasma glucose (FPG, b), body weight (c), and systolic blood pressure (SBP, d). *p < 0.05 for the indicated comparisons. Bars indicate standard error

The use of concomitant GLM did not significantly change at follow-up compared to baseline in either group (Fig. S3).

By dividing patients according to the time elapsed between baseline and follow-up, we simulated a time course of HbA1c reduction: while patients who received gliclazide showed a progressively smaller glycemic effect with longer follow-up, such loss of effectiveness was not observed with DPP4i (Fig. S4).

After we adjusted for concomitant insulin use, patients starting DPP4i still showed greater reductions in HbA1c and body weight than those starting gliclazide (Fig. S5). In models fully adjusted for basal insulin, each variable at baseline (either linearly or non-linearly modelled), and interaction terms, DPP4i proved superior to gliclazide in reducing HbA1c, FPG, body weight, and systolic blood pressure (Table S2).

Figure 3 shows the changes in HbA1c, FPG, body weight, and systolic blood pressure in the matched cohorts of patients with or without concomitant insulin therapy and in those initiating DPP4i or gliclazide as second-line therapy after metformin or as a more advanced line of therapy. Patients starting DPP4i experienced a stronger HbA1c reduction than patients starting gliclazide, irrespective of background insulin therapy, but the between-group difference was significantly larger in insulin-treated patients. DPP4i reduced fasting plasma glucose more than gliclazide only in patients who were on insulin therapy. Conversely, DPP4i reduced body weight more than gliclazide only in patients who were not on insulin. An interaction between type of new prescription (DPP4i vs gliclazide) and background insulin therapy in determining the change from baseline in HbA1c, fasting plasma glucose, and body weight was confirmed in a multivariable analysis (Table S2).

Fig. 3 Comparison of effectiveness according to concomitant and previous therapy. The changes in HbA1c (a), fasting plasma glucose (FPG, b), body weight (c), and systolic blood pressure (SBP, d) are shown for the entire cohorts of matched patients (all) or according to the presence or absence of concomitant basal insulin therapy, and whether DPP4i or gliclazide was being used as second-line therapy after metformin. *p < 0.05 for the indicated comparisons. Bars indicate standard error

In patients who started DPP4i or gliclazide as second-line therapy after metformin, no significant difference was noted in the changes from baseline in HbA1c, fasting plasma glucose, body weight, and systolic blood pressure. By contrast, in patients starting DPP4i or gliclazide as a third or more advanced line of therapy, reductions in HbA1c, fasting plasma glucose, body weight, and systolic blood pressure were greater with DPP4i than with gliclazide.


This large retrospective real-world study shows that, in routine diabetes outpatient clinical practice, addition of a DPP4i to the ongoing therapy improved glucose control more than addition of gliclazide. This was particularly true in patients who received DPP4i or gliclazide after failure of at least one other GLM besides metformin and in those who were on basal insulin.

These findings contrast with results of phase III RCTs comparing DPP4i with SUs, which show that DPP4i are less effective than SUs in reducing HbA1c in the short term and are non-inferior to SUs in the long term. Several reasons may explain this discrepancy. First, the broader population of patients included in this real-world study differs from that of phase III RCTs, especially in terms of age, history of GLM use, complication burden, and overall heterogeneity. Second, phase III RCT protocols require that SUs are uptitrated to the maximal tolerated dose, which is rarely reached in clinical practice, especially in older patients with chronic complications. While we do not have full information about dose titration in the DARWIN-T2D study, gliclazide extended release is usually prescribed at an initial daily dose of 30–60 mg and not uptitrated until the following visit. Indeed, cross-sectional data from pilot study centers indicate that the average daily gliclazide dose was 49 mg. Thus, it is possible that DPP4i allowed a better improvement in glucose control because gliclazide was being used at a relatively low dose. As doses of gliclazide, but not of DPP4i, can be uptitrated over time, it is possible that a longer observation would have produced different results. It is nonetheless remarkable that, at doses used in routine clinical practice, initiation of DPP4i provided better glycemic control than initiation of gliclazide at the first follow-up visit after about 6 months.

Of note, even in the larger cohort of patients before PSM, the various DPP4i provided similar improvements in HbA1c and fasting glucose and differed only in the change in body weight. These data indicate that, from a practical perspective, DPP4i do not differ in their glycemic effectiveness.

Interestingly, by dividing study patients according to the time elapsed from baseline to follow-up visits, we simulated a time course of HbA1c change. Although this approach cannot be equated to a longitudinal observation in the same patients, it suggests that gliclazide lost effectiveness over time more than DPP4i, a trend present also in long-term phase III RCTs.

Remarkably, the history of previous glucose-lowering therapy had an impact on the comparative effectiveness. In patients who were prescribed DPP4i or gliclazide as second-line drugs after metformin, i.e., who had received no GLM other than metformin, the improvement in glucose control was similar with the two treatment regimens. This is more in line with results of phase III RCTs, wherein patients with T2D, usually of short duration, uncontrolled on metformin monotherapy, were randomized to DPP4i or SUs. However, in patients in whom at least one other GLM had failed, DPP4i were superior to gliclazide in improving HbA1c, FPG, body weight, and blood pressure. Furthermore, DPP4i retained a significant glucose-lowering effect when added to a combination of basal insulin and oral therapy, which was greater than the effect of gliclazide. The previous history of GLM use may be more reflective of residual beta-cell function than disease duration. Although both DPP4i and gliclazide stimulate endogenous insulin secretion, DPP4i exert a more physiological meal-dependent action and may be better able to improve beta and alpha cell function [20].

These findings have clinical implications for individualization of therapy based on patients’ history, indicating that DPP4i can be effective also in a more advanced disease stage.

Although DPP4i are associated with a markedly lower risk of hypoglycemia than SUs, information on hypoglycemic events is not yet available in the DARWIN-T2D database and it is therefore impossible to weigh the benefits of glucose control against the risk of hypoglycemia. Future real-world studies combining clinical and administrative data on hospital discharge codes will be useful to address this issue.

In addition to the glucose-lowering potency and the risk of hypoglycemia, there is great focus on cardiovascular effects of drugs for the treatment of T2D. The ongoing CAROLINA trial, comparing linagliptin versus glimepiride, will shed light on cardiovascular outcomes with DPP4i versus SUs [21].

This study has limitations inherent to its retrospective and non-randomized design. Therefore, the level of evidence arising from these data cannot be equated to that of RCTs. The risk of confounding by indication and reverse causality always limits interpretation of the comparisons between therapeutic strategies in observational studies. To address this issue, we used PSM to obtain matched cohorts of patients and simulate a quasi-experimental design. With this tool, it is possible to emulate the conditions of an RCT with respect to the observed baseline characteristics. By PSM, we were able to obtain well-balanced groups, except for a residual difference in the rate of concomitant basal insulin use, at a magnitude that may not be clinically relevant. Nonetheless, to account for this residual imbalance, the outcome analysis was adjusted for insulin use or presented separately for insulin users and non-users. Importantly, however good PSM may be, it does not guarantee equal distribution of unmeasured variables, leaving the issue of residual confounding unresolved. Data missingness was addressed with MI, but we decided not to impute missing outcome variables. Since some patients had to be excluded from the matched cohorts because of missing values in secondary outcome variables, results for FPG, body weight, and blood pressure have to be considered with more caution than results for HbA1c. Finally, since only a fraction of the initial patient cohort could be matched, results apply only to patients with the baseline clinical characteristics obtained after PSM.

On the other hand, the study has remarkable strengths. These include the large sample size, the extensive patient characterization, the multicenter nature with nationwide distribution, the rigorous consideration of biases, and the automatic data extraction from the same electronic chart, which guarantees reproducibility, uniform data coding, and low reporting bias.


Addition of a DPP4i to an ongoing glucose-lowering regimen in Italian diabetes specialist outpatient clinical practice improved glucose control more than addition of gliclazide. Although gliclazide was being used at submaximal doses and confounding cannot be definitively ruled out, these data challenge the general belief that initiation of SUs is highly effective in reducing HbA1c and provide a rationale for pragmatic trials comparing DPP4i and SUs in a routine clinical setting.