Introduction

Patients with type 2 diabetes mellitus (T2DM) often require multiple therapies to achieve glycemic control. Combination therapy with a dipeptidyl peptidase-4 (DPP-4) inhibitor and metformin or sulfonylurea (i.e., dual therapy) results in substantial and additive glucose-lowering effects in patients with T2DM. Alogliptin (Vipidia®) is the latest DPP-4 inhibitor to be licensed in the UK and the fifth agent of its class to be licensed. Alogliptin is licensed for the treatment of T2DM in combination with other glucose-lowering agents including insulin. The safety and efficacy of alogliptin as monotherapy and combination therapy in patients with T2DM have been evaluated in numerous clinical trials [1]. In a 26-week, multicenter, randomized, double-blind, placebo-controlled study, Nauck et al. [2] assessed the efficacy of adding alogliptin to metformin therapy in patients with T2DM and inadequate glycemic control. The addition of alogliptin at the Summary of Product Characteristics (SmPC) recommended dose of 25 mg once daily (qd) produced a significantly greater decrease in HbA1c (−0.60%) than placebo (P < 0.001). A rapid and significant reduction from baseline in fasting plasma glucose (FPG) was also observed as early as week 1 and continued throughout the trial to week 26 for alogliptin 25 mg qd versus placebo [2]. Pratley et al. [3] evaluated alogliptin added to glyburide (a sulfonylurea, SU) in 500 patients with T2DM inadequately controlled on SU monotherapy and showed significant HbA1c reductions with alogliptin 25 mg (−0.53%) compared with placebo (+0.01%; P < 0.001). Reductions were seen as early as 4 weeks and continued through the 26-week period. More patients in the alogliptin 25 mg group achieved HbA1c reductions >0.5% (26.3% with placebo and 50.5% with 25 mg of alogliptin; P < 0.001).

DPP-4 inhibitors also exert clinically relevant glucose-lowering effects as oral triple therapy with a good tolerability profile when added to metformin plus SU as shown in randomized clinical trials. Sitagliptin 100 mg once daily (qd) significantly improved glycemic control and β-cell function in patients with T2DM who had inadequate glycemic control with glimepiride plus metformin therapy [4]. Similarly adding linagliptin 5 mg qd [5], vildagliptin 50 mg twice daily (bid) [6], and saxagliptin 5 mg qd [7] to metformin in combination with SU significantly improved glycemic control in T2DM patients and all were well tolerated. Adding alogliptin 25 mg qd to a metformin–pioglitazone regimen provided superior glycemic control and potentially improved β-cell function versus uptitrating pioglitazone in T2DM patients, with no clinically important differences in safety [8].

There has been no study specifically evaluating alogliptin in triple therapy when added to metformin and SU. The EXAMINE trial was a phase 3, multicenter, randomized, double-blind, placebo-controlled study designed to demonstrate non-inferiority of alogliptin versus placebo with respect to a composite of major adverse cardiac events (MACE) in high-risk patients with T2DM. A total of 5380 patients were randomized to either alogliptin (N = 2701) or placebo (N = 2679) [9]. A substantial population in EXAMINE entered on dual therapy with metformin and SU with alogliptin or placebo added to this dual therapy (N = 1398; alogliptin = 693, placebo = 705) and were followed for up to 40 months (median 18 months) [10]. A post hoc analysis of this subgroup data has been performed [10].

For all patients on metformin + SU at baseline, characteristics were similar for the alogliptin and placebo groups (mean HbA1c, 8.14%). By the end of the study period, the least square (LS) mean difference for change from baseline of HbA1c was −0.52% (P < 0.001) [10]. The alogliptin and placebo groups did not differ in the percentage of patients with at least one adverse event (75.2% alogliptin and 79.6% placebo) or serious adverse events (28.3% alogliptin and 32.1% placebo). There was no significant difference in the incidence of any report of hypoglycemia (8.8% alogliptin and 6.7% placebo, P = 0.161) or serious hypoglycemia (1.30% alogliptin and 0.43% placebo, P = 0.088) [10]. These data demonstrate that triple therapy with alogliptin, metformin, and SU was effective and well tolerated.

Head-to-head comparisons between the DPP-4 inhibitors are uncommon. Only one such trial has been published to date: it compared the efficacy of saxagliptin 5 mg qd and sitagliptin 100 mg qd added to metformin in T2DM patients (i.e., dual therapy) and demonstrated non-inferiority in the primary efficacy endpoint of change in HbA1c from baseline. The safety profile was similar for the two DPP-4 inhibitors, with modest weight loss and almost no increase in the incidence of reported or documented hypoglycemic episodes [11]. There currently exists no comparative trial evidence assessing the relative efficacy and safety of alogliptin as a third-line treatment option added to dual therapy with metformin and SU. Several network meta-analyses (NMA) have been performed of different classes of drug treatment for T2DM in triple therapy (following failure with metformin + SU) including DPP-4 inhibitors, although none have compared the relative efficacy of each DPP-4 treatment or included alogliptin within the DPP-4 inhibitor class [12–15]. This is because of a previous lack of available published triple therapy data for alogliptin. However, the recent availability of new triple therapy subgroup analysis of the alogliptin EXAMINE trial has enabled an NMA of the relative efficacy and safety of alogliptin versus comparator DPP-4 inhibitors added to metformin and SU to be performed.

Hence, the objective of this study was to perform a systematic review and NMA using Bayesian methods of the relative efficacy and safety of alogliptin 25 mg qd added to metformin and SU dual therapy, compared to other DPP-4 inhibitors added to metformin and SU dual therapy. The analysis has been performed using a clinical decision-focused approach and so covers those individual comparators that are most likely to be displaced by the introduction of a new DPP-4 agent, alogliptin, for use in triple therapy, which are the current alternative DPP-4 inhibitors used in clinical practice in the UK (i.e., linagliptin, saxagliptin, sitagliptin, and vildagliptin), rather than including other classes of anti-T2DM therapies less likely to be displaced by alogliptin in clinical practice such as GLP-1, SGLT-2, or TZD agents. From a methods perspective this approach minimizes noise and heterogeneity in the network by focusing on the comparator evidence of direct relevance to clinical decision-making in a UK context.

Methods

This article is based on previously conducted studies and does not involve any new studies of human or animal subjects performed by any authors.

Systematic Literature Search: Identification of Trials

A systematic literature review (SLR) was conducted relevant to a UK clinical and health technology assessment (HTA) decision-making context for assessing the relative efficacy and safety of alogliptin versus other DPP-4 inhibitors in combination with metformin and sulfonylurea. Hence, the SLR covered studies of alogliptin and other DPP-4 inhibitors (linagliptin, saxagliptin, sitagliptin, and vildagliptin) used in triple therapy (i.e., in combination with metformin and sulfonylurea) in the treatment of patients with T2DM in UK clinical practice. The SLR was conducted in accordance with guidance recommended by the National Institute for Health and Care Excellence (NICE) [16] as well as the methodological principles recommended by the University of York Centre for Reviews and Dissemination [17], and is reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [18]. The following databases were searched: MEDLINE, EMBASE, Medline-In-Process and other non-indexed citations (including PubMed records), Cochrane Central Register of Controlled Trials, Cochrane Database of Systematic Reviews (CDSR), Database of Abstracts of Reviews of Effectiveness (DARE), the Health Technology Assessment (HTA) Database, and ISI Web of Science-Proceedings. In addition, bibliographies of reviews, retrieved articles, key conference abstracts, and other relevant sources (e.g., ClinicalTrials.gov, EU Clinical Trials Register) were searched for other studies. The full search strategies are available in the supplementary material (Tables S1.1–S1.4). All electronic databases were searched between 29 and 31 January 2016. All searches were limited to English-language publications. The EXAMINE triple therapy subgroup analysis data [10] were identified and provided by Takeda, as the abstract had been accepted for presentation at the American Diabetes Association (ADA) meeting in New Orleans, LA, USA, from 10 to 14 June 2016.

The search strategy was built around the PICOS framework (Population, Interventions, Comparators, Outcomes, and Study designs). The PICOS framework allows for all possible combinations of search terms. Key inclusion criteria were as follows: RCTs reporting adult patients (≥18 years) with T2DM treated with alogliptin + metformin + SU or a DPP-4 + metformin + SU triple therapy only; RCTs reporting one or more of the following outcomes: mean change from baseline in HbA1c, FPG, body weight, BMI, the occurrence of at least one hypoglycemic event, and treatment discontinuation due to any adverse event; RCTs that contained the SmPC recommended daily dose of alogliptin and the other DPP-4 inhibitors of interest: alogliptin 25 mg qd, saxagliptin and linagliptin 5 mg qd, sitagliptin 100 mg qd, and vildagliptin 50 mg bid; RCTs that include comparisons between the DPP-4 agents of interest and placebo + metformin + SU, or form head-to-head trials versus other DPP-4 agents. Outcomes were to be reported at 24 ± 6 weeks. This time period was sufficiently long to include all studies that reported outcomes at 24 weeks, as well as the alogliptin EXAMINE study which reported efficacy assessment at 6 months (26 weeks) [10].

The study selection consisted of two stages: a first-pass screening of titles and abstracts according to the PICOS criteria, followed by a second-pass screening of full-text articles according to the inclusion criteria for consideration in the NMA. Both stages were conducted by two reviewers with a third reviewer checking the selection of full-text articles. Any disagreements regarding the inclusion/exclusion of articles were discussed until a consensus was reached. Full-text articles were obtained, and the reference lists of these studies and systematic reviews were also hand searched to check and identify any further publications likely to be of relevance. The study selection process was documented detailing reasons for inclusion and exclusion for consideration in the NMA. Data extraction was performed by one reviewer and verification by a second. The relevance of each study was assessed according to the inclusion/exclusion criteria set out previously in the systematic literature review protocol. Study quality was assessed using a question checklist adapted from that used in the NICE Single Technology Appraisal specifications checklist for manufacturer/sponsor submission of evidence for assessing the internal validity and the quality of reporting in the published RCT study (see Table S2 in the supplementary material) [19].

Outcome Measures

The primary efficacy outcome considered was HbA1c change from baseline. In addition, body weight change from baseline and change in fasting plasma glucose (FPG) from baseline were included, along with two safety outcomes: incidence of hypoglycemia, and discontinuations due to adverse events. According to recommendations from the American Diabetes Association (ADA), World Health Organization (WHO), and American College of Endocrinologists (ACE), HbA1c level is considered the “gold standard” in assessment of metabolic control and the specific level of HbA1c constitutes the target at which treatment of both type 1 and 2 diabetes mellitus should be aimed [20]. All three organizations also underline the importance of normalizing FPG, and the relationship this endpoint has with HbA1c to improve glycemic control. In addition, a relationship between weight loss and glycemic improvement has been seen in several observational studies [21–23].

Mean values and associated measures of variability [standard deviation (SD), standard error (SE), confidence intervals (CIs), and P values] were extracted for all endpoints. Where studies did not explicitly report standard errors for continuous endpoints these were imputed according to the recommendations detailed in the Cochrane handbook for systematic reviews of interventions [24]. Within the study by Lukashevich et al. [6], an image depicting the adjusted mean change (and s.e.) of HbA1c for the trial treatments and their difference was scanned. This was then inputted into the software package Engauge Digitizer V6.2 which extracted the standard errors (prudent as the only available information related to effect variability was “P < 0.001” for the difference between treatments).
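As a rough illustration of the kind of imputation described above, the following Python sketch recovers a standard error from a reported 95% CI, from an exact two-sided P value, or from an SD and sample size, using standard normal-theory relationships of the type set out in the Cochrane Handbook. The function names are hypothetical and are not taken from the study's analysis code; the worked value uses the alogliptin versus placebo estimate reported in the Results (−0.62%, 95% CI −0.76 to −0.48%).

```python
from statistics import NormalDist


def se_from_ci(lower, upper, level=0.95):
    """Standard error implied by a symmetric normal-theory confidence interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% CI
    return (upper - lower) / (2 * z)


def se_from_p(mean_diff, p_two_sided):
    """Standard error implied by a mean difference and its exact two-sided P value."""
    z = NormalDist().inv_cdf(1 - p_two_sided / 2)
    return abs(mean_diff) / z


def se_from_sd(sd, n):
    """Standard error of a mean from a standard deviation and a sample size."""
    return sd / n ** 0.5


# Alogliptin versus placebo, HbA1c change: 95% CI of -0.76 to -0.48
se_alogliptin = se_from_ci(-0.76, -0.48)  # about 0.071
```

Note that a threshold such as “P < 0.001” only yields an upper bound on the standard error when passed to `se_from_p`, which is consistent with the need to digitize the figure from Lukashevich et al. rather than rely on the reported P value.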

Network Meta-Analysis: Statistical Methodology

A Bayesian NMA combining both indirect and direct evidence synthesis was conducted to enable individual pairwise treatment comparisons between the DPP-4 inhibitors. The Bayesian framework used Markov Chain Monte Carlo (MCMC) methods implementing the guidelines laid down in the Evidence Synthesis Technical Support Document Series produced by the NICE Decision Support Unit [25–28].

For each outcome, the protocol-driven intention was that both fixed and random effects models were to be run with model goodness of fit statistics used to choose between them. However random effects models require trials that repeat pairwise comparisons. For the main analysis set only two trials repeated the same comparison (involving sitagliptin versus placebo). Hence, there was insufficient data to estimate the between-study contrast variance within the random effects model. Therefore, fixed effects modelling was used for pairwise comparisons, and random effects modelling was also used for comparisons between alogliptin and the comparator DPP-4 inhibitors grouped as a category. In order to retain the advantages of random effects modelling for pairwise comparisons (including alogliptin vs. each comparator DPP-4 inhibitor), a “quasi-random” effects modelling approach was used, based on inputs from the grouped DPP-4 inhibitor comparisons (utilizing five different prior distributions).

The various implications of the different models attempted on the perceived efficacy of alogliptin versus other DPP-4 inhibitors could then be assessed. If there was a qualitative difference in this perceived efficacy between different models, then deviance information criteria (DIC) could be used to judge amongst them. Lower DIC scores indicate better fit with differences greater than 5 points providing confidence that the difference could not be due to chance [29]. Total residual deviance was also calculated: a high total residual deviance relative to data points (total number of trial arms) indicates poor fit. Such a measure is an important adjunct to DIC which can only make comparisons between models.
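The model-selection rule described above can be sketched in a few lines of Python. This is a minimal illustration only: it assumes the usual definition DIC = D̄ + pD with pD = D̄ − D(θ̄), and the function names and the "inconclusive" label are hypothetical rather than part of the study's analysis code. The 5-point threshold is the one stated in the text.

```python
def dic_from_samples(deviance_samples, deviance_at_posterior_mean):
    """Return (DIC, pD) from MCMC deviance samples.

    Assumes DIC = Dbar + pD, where Dbar is the posterior mean deviance and
    pD = Dbar - D(theta_bar) is the effective number of parameters.
    """
    dbar = sum(deviance_samples) / len(deviance_samples)
    p_d = dbar - deviance_at_posterior_mean
    return dbar + p_d, p_d


def compare_dic(dic_a, dic_b, threshold=5.0):
    """Prefer the model with the lower DIC only if the gap exceeds the threshold."""
    if abs(dic_a - dic_b) <= threshold:
        return "inconclusive"
    return "a" if dic_a < dic_b else "b"


def residual_deviance_ratio(total_residual_deviance, n_data_points):
    """Ratio of total residual deviance to data points; values well above 1 suggest poor fit."""
    return total_residual_deviance / n_data_points
```

For example, `compare_dic(24.2, 31.9)` applies the rule to the two DIC values reported for the HbA1c main analysis set (quasi-random versus fixed effects) and returns "a", since the gap of 7.7 points exceeds the threshold.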

In addition to the Bayesian analyses, fixed and random effect frequentist meta-analysis models were undertaken grouping all DPP-4 inhibitors (including alogliptin) against placebo (both sets in combination with metformin + SU).

The base case (main analysis set) involved all the studies where data was presented for the full analysis set (FAS)/intention to treat (ITT) set, thereby excluding per-protocol (PP) studies and studies where it was unclear which data set was presented. Within the Bayesian NMAs treatment effects were assessed as follows: for continuous variables, analysis was performed on the estimated difference between treatments in mean change from baseline to study end, together with 95% confidence/credible intervals. For dichotomous variables, analysis was performed on the treatment log odds ratio, together with 95% confidence/credible intervals. In addition, the probability that alogliptin was non-inferior to grouped DPP-4 inhibitors was derived. Non-inferiority was defined as a difference in treatment contrast between alogliptin and comparator of less than 0.3% HbA1c in the comparator's favor, on the basis that a difference of 0.3% in HbA1c change from baseline is considered clinically meaningful [30].
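The non-inferiority probability defined above can be read directly off the posterior distribution of the treatment contrast. The sketch below is illustrative only. It assumes the contrast is expressed as alogliptin minus comparator in HbA1c change from baseline, so that positive values favor the comparator, and the posterior mean and SD in the example are hypothetical rather than taken from the study.

```python
from statistics import NormalDist


def prob_noninferior_from_draws(draws, margin=0.3):
    """Share of posterior draws in which the contrast stays below the margin."""
    return sum(d < margin for d in draws) / len(draws)


def prob_noninferior_normal(post_mean, post_sd, margin=0.3):
    """Same probability under a normal approximation to the posterior."""
    return NormalDist(post_mean, post_sd).cdf(margin)


# Hypothetical posterior for (alogliptin - comparator): mean 0.04, SD 0.10
simulated_draws = NormalDist(0.04, 0.10).samples(20000, seed=1)
p_mc = prob_noninferior_from_draws(simulated_draws)
p_exact = prob_noninferior_normal(0.04, 0.10)
```

In a full MCMC analysis the draws would come from the sampler rather than from `NormalDist.samples`; the counting step is the same.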

Heterogeneity between trials was assessed in both frequentist and Bayesian paradigms. Within the frequentist meta-analysis, the value associated with the Q test, the between-trial variance of the treatment effect under random effects, and the I-squared statistics were all calculated. I-squared provides an estimate of the percentage of variability in effect estimates that are due to heterogeneity rather than sampling error, and the values from this were interpreted according to the Cochrane Handbook [24]. An I-squared below 40% “might not be important” whilst 50% to 90% “may represent substantial heterogeneity” [24]. These results are displayed within the frequentist forest plots.
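To make the heterogeneity statistics above concrete, the following Python sketch computes an inverse-variance fixed effect estimate, Cochran's Q, I-squared, and a between-trial variance. The DerSimonian-Laird estimator is used for the between-trial variance as an assumption, since the text does not name the estimator; the function name and the example inputs are hypothetical.

```python
def meta_analysis(effects, ses):
    """Inverse-variance meta-analysis with Q, I-squared, and DerSimonian-Laird tau-squared."""
    w = [1 / se ** 2 for se in ses]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw

    # Cochran's Q: weighted squared deviations from the fixed effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # DerSimonian-Laird moment estimator (an assumption, not stated in the text)
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    w_re = [1 / (se ** 2 + tau2) for se in ses]
    random = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    return {
        "fixed": fixed, "se_fixed": (1 / sw) ** 0.5,
        "random": random, "se_random": (1 / sum(w_re)) ** 0.5,
        "Q": q, "df": df, "I2": i2, "tau2": tau2,
    }


# Two hypothetical trials with divergent effects and equal precision
result = meta_analysis([-0.4, -0.8], [0.1, 0.1])
```

With these illustrative inputs the two trials disagree by far more than their sampling error, so I-squared is high and the random effects standard error is wider than the fixed effect one.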

From the Bayesian analysis, leverage plots were used to identify any specific trials that appear as outliers (in terms of being either influential or poorly fitted). Trials that are heterogeneous in terms of key parameters are likely to be either influential or poorly fitted and hence will stand out on such a plot. Inconsistency checks were also carried out where necessary.
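A leverage plot of this kind is commonly read against parabolas of the form x² + y = c, where x is the signed deviance residual of a trial arm and y is its leverage. The sketch below flags arms that fall outside the c = 3 curve, the constant quoted in the caption of Fig. 5. The function names and the example arms are hypothetical.

```python
def outside_parabola(signed_deviance_residual, leverage, c=3.0):
    """True if a trial arm lies outside the curve x^2 + y = c on a leverage plot."""
    return signed_deviance_residual ** 2 + leverage > c


def flag_outliers(arms, c=3.0):
    """Return labels of arms whose residual deviance plus leverage exceeds c.

    `arms` is a list of (label, signed_deviance_residual, leverage) tuples.
    """
    return [label for label, resid, lev in arms if outside_parabola(resid, lev, c)]


# Hypothetical arms: one poorly fitted, one well fitted
example_arms = [("trial A, active arm", 1.8, 0.9), ("trial B, placebo arm", 0.5, 0.8)]
flagged = flag_outliers(example_arms)
```

Because the squared deviance residual is the arm's contribution to residual deviance, the rule amounts to checking whether each arm's contribution to the DIC exceeds the chosen constant.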

NMA Sensitivity Analyses

Sensitivity analysis was conducted on the primary endpoint, change in HbA1c from baseline, for the main population analysis set involving: addition of non-FAS (ITT) studies, i.e., PP studies, and studies where it was unclear which data set was presented; removal of studies with baseline HbA1c values that potentially differed from the rest; exclusion of alogliptin comparator DPP-4 studies defined as moderate or poor quality.

Within the main analysis set, sensitivity to trial heterogeneity was performed by examining the effect on results of removing trials identified as outliers on the leverage plots.

NMA covariate regression techniques were also implemented, examining sensitivity to between-trial differences in age and gender for the endpoints of change in HbA1c, change in body weight, and the incidence of hypoglycemic episodes. In addition, baseline HbA1c and baseline body weight were also examined as covariates in the HbA1c and body weight analyses, respectively.

All the above sensitivity analyses were prespecified in the NMA protocol. However, a post hoc HbA1c sensitivity analysis was attempted using the hierarchical “partial pooling” method that may obviate the need to perform formal multiple testing correction procedures such as Bonferroni [31]. This method was applied to one sensitivity analysis that previously produced a statistically significant (uncorrected) difference using a Bayesian fixed effect individual treatment NMA.

More in-depth discussion of the statistical methodology is presented in the technical appendix of the supplementary material.

Results

Systematic Review Results

The systematic literature search identified a total of 2186 hits before deduplication. After deduplication and abstract screening, 94 full-text articles were assessed for eligibility in the NMA. To reduce heterogeneity among the RCTs identified in the systematic review and to improve generalizability of the NMA estimates, additional criteria were defined for inclusion of studies in the NMA. Six studies were initially selected for inclusion into the NMA from the systematic search [4–7, 10, 32]. In addition to these six studies, an abstract in support of the efficacy and safety of alogliptin evaluating subjects treated with only metformin and SU at baseline from the EXAMINE trial [10] was provided by Takeda, along with a non-indexed study identified from ClinicalTrials.gov (Identifier: NCT01590771) [33], taking the total number of studies to eight. A PRISMA flow diagram summarizing search results and study selection is provided in the supplementary material (Fig. S1). Table S3 in the supplementary material presents study and patient characteristics from the eight RCTs included in the NMA.

The patient characteristics were generally similar across the eight studies: the average age ranged from 55.0 to 62.9 years; the proportion of male patients ranged from 45.0% to 70.9% and, where reported, the average duration of T2DM ranged from 7.0 to 13.3 years. Baseline HbA1c ranged from 8.15% to 8.80% across treatment arms in the studies. In terms of other characteristics, the majority of patients were Caucasian in each study apart from Chen et al. [34] and NCT01590771 [33] where the patients were Chinese, and Hong et al. [32] (ClinicalTrials.gov: NCT01099137) where the patients were Korean. All patients were reported to be uncontrolled on metformin + SU dual therapy in line with inclusion criteria for the systematic review and NMA. Where reported, there were some differences in the type of SU given (glimepiride or gliclazide) and its dose; the dose of metformin; and the duration of prior metformin or SU. However, it is unlikely that the variation observed introduces significant bias in the efficacy and safety outcomes across studies. Study design was similar in all of the eight studies included in the NMA. The treatment duration was 24 weeks (26 weeks for the alogliptin EXAMINE study), and the majority were placebo controlled. All studies were in the patient population of interest: T2DM patients with inadequate glycemic control despite treatment with metformin + SU dual therapy. The endpoint data for the individual trials included in the NMA are presented in the supplementary material: Table S4 (HbA1c), Table S5 (body weight), Table S6 (FPG), Table S7 (hypoglycemia), and Table S8 (discontinuation due to adverse events).

Network Meta-Analysis Results

HbA1c Change from Baseline: Base Case Analysis

The base case analysis for HbA1c change from baseline was performed on the FAS/ITT data for six of the eight studies (EXAMINE, Heller et al. [10], Hermansen et al. [4], Lukashevich et al. [6], Moses et al. [7], NCT01590771 [33], and Owens et al. [5], ClinicalTrials.gov NCT00602472). The relevant network plot is shown in Fig. 1. The two studies excluded from the main analysis group were those in which analysis was carried out per protocol (Chen et al. [34] and Hong et al. [32]); these were included within a sensitivity analysis. In the quality assessment of each study for the primary outcome, the same two studies were also classed as poor quality.

Fig. 1 HbA1c (%) change from baseline: main analysis set, network plot

The forest plot in Fig. 2 shows the mean change in HbA1c (%) recorded in the main analysis set [4–7, 10, 32, 34]. The mean difference (MD) for each DPP-4 inhibitor against placebo is shown on the right and also plotted. With a mean difference against placebo of −0.62% (95% CI of −0.76 to −0.48%) alogliptin falls in the middle of the spread of the individual trial results with means ranging from −0.41 to −0.89%. The two trials that provide these range endpoints both refer to the same treatment (sitagliptin) against placebo [4, 33].

Fig. 2 HbA1c (%) change from baseline: main analysis set, forest plot of frequentist meta-analysis showing individual trial results and grouped DPP-4 treatments against placebo under fixed and random effect models (all with metformin + SU). MD mean difference

All Bayesian models converged without problems. For the primary analysis, Fig. 3 presents a forest plot of all possible pairwise contrast comparisons between the individual DPP-4 treatments under both the fixed and quasi-random effect models. For the comparisons involving alogliptin, none approached statistical significance and all point estimates were within 0.15% HbA1c of zero difference (three of the four within 0.05%). Differences for all other DPP-4 comparisons were also not statistically significant. The grouped DPP-4 analysis against alogliptin under various fixed and random effect models produced the comparison results shown in Fig. 4; all model definitions produced contrasts that also show non-inferiority (all estimates around 0.04% HbA1c in absolute value). The posterior estimates of the between-trial standard deviations from these random effect models were in close agreement, with medians at or close to 0.17 (see Fig. S2 in the supplementary material). Hence this value was used as the input to the individual DPP-4 treatment quasi-random effect model. The mean estimates from this model were very similar to the fixed effect mean results as shown in Fig. 3, but with an approximate doubling of the width of the confidence intervals.
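The widening of the intervals under the quasi-random effects model can be illustrated with a simplified indirect comparison through placebo. This is a sketch under stated assumptions, not the Bayesian model used in the study: it treats each contrast against placebo as informed by a single trial, adds the fixed between-trial variance (SD 0.17) to each trial's sampling variance, and uses a hypothetical comparator estimate. Only the alogliptin inputs (−0.62%, with a standard error of about 0.0714 implied by the 95% CI of −0.76 to −0.48%) come from the text.

```python
def indirect_contrast(d_a, se_a, d_b, se_b, tau=0.0):
    """Indirect comparison of A versus B through a common placebo arm.

    d_a and d_b are mean differences versus placebo with standard errors
    se_a and se_b. tau is a fixed between-trial SD added to each trial's
    variance (tau = 0 gives the fixed effect result).
    """
    diff = d_a - d_b
    se = (se_a ** 2 + se_b ** 2 + 2 * tau ** 2) ** 0.5
    return diff, se


# Alogliptin from the text; the comparator values are hypothetical
fixed_diff, fixed_se = indirect_contrast(-0.62, 0.0714, -0.60, 0.08)
quasi_diff, quasi_se = indirect_contrast(-0.62, 0.0714, -0.60, 0.08, tau=0.17)
```

The point estimate is unchanged between the two calls, while the standard error grows because the between-trial variance enters once for each of the two trials in the comparison.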

Fig. 3 HbA1c (%) change from baseline: main analysis set fixed and quasi-random (between-trial SD fixed at 0.17) effects models forest plot, pairwise differences between DPP-4 treatments (with metformin + SU)

Fig. 4 HbA1c (%) change from baseline: main analysis set forest plot, mean differences between alogliptin and other grouped DPP-4 treatments (with metformin + SU) under various random and fixed effects models

The non-inferiority probabilities for alogliptin against individual comparator DPP-4 treatments are shown in Table 1. Indeed, the probability of alogliptin being non-inferior to at least one individual DPP-4 treatment is 1 under the fixed effects model and 0.98 under the quasi-random effects model. Grouping the comparator DPP-4 inhibitors shows similar results for the non-inferiority of alogliptin HbA1c efficacy (Table 2).

Table 1 HbA1c change from baseline: main analysis set—non-inferiority (<0.3% in competitor’s favor) of alogliptin 25 mg under fixed effects and quasi-random effects models (all regimens with metformin + SU)
Table 2 HbA1c change from baseline: main analysis set—non-inferiority (<0.3% in competitor’s favor) of alogliptin 25 mg against remaining DPP-4 inhibitors grouped under various fixed and random effect models (with goodness of fit statistics)

Model fit and heterogeneity assessments indicate that the quasi-random effects model (DIC = 24.2) is superior to the fixed effects model (DIC = 31.9). The poor fit of the fixed effect model is displayed in the leverage plot shown in Fig. 5 [27]. The circles in this figure that lie well outside the red parabola (considered outliers) are the trial arms in the two studies involving sitagliptin [4, 33]. On the equivalent figure drawn for the quasi-random effects model (Fig. S3 in the supplementary material) these arms fall well within this parabola.

Fig. 5 HbA1c change from baseline: main analysis set (fixed effects model), leverage versus deviance residual plot incorporating model fit statistics. Values that lie outside the drawn smooth parabola with a constant of 3 (the red curves) can generally be identified as contributing to the model’s poor fit

Frequentist meta-analysis (combining the DPP-4 treatments and contrasting against placebo) supports the Bayesian findings as shown by the bottom rows of Fig. 2: alogliptin falls well within the middle of the confidence ranges produced by these fixed and random effect models and heterogeneity problems are indicated by the I-squared of 55%.

HbA1c Change from Baseline: Sensitivity Analysis

To explore the impact of study heterogeneity, all analyses were repeated eliminating the two sitagliptin trials (Hermansen et al. [4] and NCT01590771 [33]). As both these studies were the only alogliptin comparator DPP-4 treatment studies that were deemed to be of moderate quality, this trial set also serves as sensitivity analysis relating to study quality.

The elimination of the two studies had no qualitative impact: individual treatment results under the Bayesian fixed effect NMA remained the same as displayed in Fig. 3 (omitting the comparisons involving sitagliptin) and the non-inferiority probabilities of alogliptin against individual DPP-4 treatments remained high (increasing for the quasi-random effects models where the between-trial standard deviation fell to 0.1). For alogliptin versus other DPP-4 inhibitors grouped comparisons, the non-inferiority probabilities remain high: 0.999 probability that alogliptin is non-inferior to other DPP-4 inhibitors based on the fixed effects model (Table 3). In addition, the goodness of fit statistics for the fixed effects model were the best (although not decisively).

Table 3 HbA1c change from baseline: main analysis set removing two outlier studies—non-inferiority of alogliptin 25 mg against remaining DPP-4 inhibitors grouped under various fixed and random effects models (with goodness of fit statistics)

All other sensitivity analysis results involving adding or removing trials produced the same strong qualitative assessment for the change in HbA1c endpoint: the results indicate no difference between the DPP-4 inhibitors. These results are presented in the supplementary material (Figs. S4–S16, Tables S9–S14), with the exception of the sensitivity analysis removing studies with potential outlier baseline HbA1c values (shown below).

From the leverage plot analysis, the potential outlier trials identified were Lukashevich et al. [6] and NCT01590771 [33], which had relatively high baseline HbA1c values of 8.75% and 8.61%, respectively, contrasting with the other studies in the main analysis set (range 8.15% to 8.37%). Under the fixed effects model only, excluding these trials resulted in a marginally statistically significant advantage for sitagliptin over alogliptin, decreasing HbA1c by an additional 0.27% (95% CI of 0.02% to 0.52%). Figure 6 displays all the fixed effect contrasts in dark blue. The “partial pooling” estimates in yellow, which perform a moderate adjustment for multiple testing considerations [31], eliminate this statistical difference. This lack of significance is reinforced by the quasi-random effects estimates shown in green, which do not attempt to adjust for multiple testing considerations but allow for between-trial variability in treatment effects. Goodness of fit statistics for these various models have no power to differentiate between them, as there are insufficient trials and each treatment contrast comparison occurs only once. However, grouping the DPP-4 inhibitors suggests that between-trial variability should be allowed for (worst DIC and significantly worst residual deviance observed for the fixed effects model). None of the comparisons between grouped DPP-4 inhibitors and alogliptin approached statistical significance. Non-inferiority results and the goodness of fit statistics are presented in Table 4 for the grouped comparison.

Fig. 6 HbA1c (%) change from baseline: main analysis set excluding two outlier studies with higher baseline HbA1c values, pairwise differences between DPP-4 treatments (with metformin + SU) under fixed, partial pooling, and quasi-random effects (SD = 0.2) model assumptions

Table 4 HbA1c change from baseline: main analysis set excluding two outlier studies with higher baseline HbA1c values—non-inferiority of alogliptin 25 mg against remaining DPP-4 inhibitors grouped under various fixed and random effects models (with goodness of fit statistics)

The meta-regressions examining the sensitivity of HbA1c treatment efficacy to disparities in baseline HbA1c, age, and male/female ratio also produced statistically non-significant results. Table 5 presents the results for baseline HbA1c and indicates that the sensitivity analysis that eliminated the two higher baseline HbA1c trials was inappropriate. Results for the other covariates are shown in Tables S15–S17 in the supplementary material.

Table 5 Baseline HbA1c covariate: coefficient details and goodness of fit statistics—main analysis set (alogliptin trial omitted) other DPP-4 treatments grouped

Further evidence is presented in the supplementary material to support the claim that alogliptin is equivalent to other DPP-4 treatments in terms of HbA1c change from baseline. This is based on the P value of the meta-analysis Q statistic and the nature of the study network (see technical appendix for methodological explanation and text below Fig. S4 for results/interpretation). Figure S17 of the supplement contains a forest plot of vildagliptin 100 mg versus all other DPP-4 inhibitors grouped. Vildagliptin achieved the best point estimate on this outcome measure, so it is reassuring that this figure shows the differences to be statistically non-significant under all model scenarios examined.

Body Weight Change from Baseline

In the frequentist analysis, alogliptin (based on data from the EXAMINE trial [10]) was associated with the least mean body weight gain compared to placebo of all the DPP-4 inhibitors (0.14 kg over 26 weeks), as shown in the forest plot of Fig. 7. This resulted in favorable NMA pairwise DPP-4 inhibitor comparisons as shown in Fig. 8. Two of these comparisons (against saxagliptin and sitagliptin) border on statistical significance in the fixed effects representation. However, the range of the 95% confidence intervals across all the pairwise comparisons, together with the quasi-random effects results, suggests that in reality this apparent advantage is likely to be spurious. This is reinforced by grouping the non-alogliptin DPP-4 inhibitors and comparing alogliptin against the group: results under all NMA fixed and random effect models produced point estimates in alogliptin’s favor but with confidence intervals overlapping zero. Further diagnostic figures related to weight change are provided in Figs. S18–S21 of the supplementary material.

Fig. 7

Body weight (kg) change from baseline: forest plot of frequentist meta-analysis showing individual trial results and grouped DPP-4 treatments against placebo fixed and random effects models (all with metformin + SU)

Fig. 8

Body weight (kg) change from baseline: main analysis set—fixed and quasi-random (between-trial SD fixed at 0.25 kg) effects models forest plot. Pairwise differences between DPP-4 treatments (with metformin + SU)

Bayesian covariate regression analysis for gender, age, and baseline body weight indicated that none of these covariates had any explanatory power (all 95% confidence intervals widely overlap zero), and the models that excluded any covariates had better DIC scores than corresponding models that included them (Tables S18–S21 in the supplementary material). This supports a finding of comparability between alogliptin and the DPP-4 inhibitors, and between the DPP-4 inhibitors, with regard to change in body weight.

Incidence of Hypoglycemic Events

The individual trial results for the number of hypoglycemic events in Fig. 9 show that alogliptin achieves a relatively low risk ratio against placebo (point estimate of 1.54, almost joint best with linagliptin) compared to other DPP-4 inhibitors. This is reflected in the Bayesian DPP-4 individual treatment NMA comparison results shown in Fig. 10. For the fixed effect model, the comparison against sitagliptin is statistically significant. Again, though, this is likely to be spurious: the quasi-random effects model fitted significantly better (DIC more than 5 better) and produced confidence intervals overlapping zero for the contrasts (as did all alogliptin versus other DPP-4 inhibitor grouped NMA analyses). The results for the number of hypoglycemic events do not appear to be sensitive to the covariates age or gender. These predictors had no explanatory power in the covariate regression equations (all 95% CIs overlap zero) and DIC scores all slightly improve on their removal. Thus, comparable outcomes for alogliptin versus other DPP-4 inhibitors and between DPP-4 inhibitors for the incidence of hypoglycemic events are supported by the analyses.
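The pairwise comparisons for event outcomes are reported on the log odds ratio scale (Fig. 10), where an interval that includes zero indicates no statistically significant difference. The following minimal sketch shows how a log odds ratio and its Wald 95% interval are obtained from a two-arm event table. The counts are hypothetical and are not taken from any of the included trials, and the function name is an illustrative assumption; the NMA itself is Bayesian and would not compute contrasts in this simple way.

```python
import math


def log_odds_ratio(events_a, total_a, events_b, total_b, z=1.959964):
    """Log odds ratio of arm A versus arm B with a Wald 95% interval.

    Adds 0.5 to every cell if any cell is zero (a common continuity correction).
    """
    a, b = events_a, total_a - events_a
    c, d = events_b, total_b - events_b
    if min(a, b, c, d) == 0:
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    estimate = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return estimate, estimate - z * se, estimate + z * se


# Hypothetical counts: 30/200 events in arm A, 20/200 events in arm B
est, low, high = log_odds_ratio(30, 200, 20, 200)
print(f"log OR = {est:.3f}, 95% interval ({low:.3f}, {high:.3f})")
```

With these illustrative counts the point estimate favors arm B, but the interval spans zero, so the difference would not be regarded as statistically significant.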

Fig. 9

Incidence of hypoglycemic events: forest plot of frequentist meta-analysis showing individual trial results and grouped DPP-4 treatments against placebo, fixed and random effect models (all with metformin + SU)

Fig. 10

Incidence of hypoglycemic events: main analysis set; fixed and quasi-random (between-trial SD fixed at 1) effects models forest plot. Pairwise comparisons between DPP-4 treatments (with metformin + SU) measured by log odds ratios

Adverse Events Leading to Study Discontinuation

The trial results used in the analysis of adverse events leading to study discontinuation are shown in Fig. 11. The results from this forest plot support that alogliptin has a similar safety profile to the other DPP-4 inhibitors when compared against placebo. This is supported by all Bayesian NMA analyses with all contrasts having confidence intervals crossing zero. The results for pairwise DPP-4 inhibitor comparisons are shown in Fig. 12.

Fig. 11

Adverse events leading to study discontinuation: forest plot of frequentist meta-analysis showing individual trial results and grouped DPP-4 treatments against placebo, fixed and random effects models (all with metformin + SU)

Fig. 12

Adverse events leading to treatment discontinuation; main analysis set; fixed and quasi-random (between-trial SD fixed at 0.5) effects models. Forest plot of pairwise comparisons between DPP-4 treatments (with metformin + SU) measured by log odds ratios

Discussion

The current NICE guidelines (NG28) for the management of T2DM in adults recommend: “triple therapy with metformin, a DPP-4 inhibitor and a sulfonylurea as an option if dual therapy with metformin and another oral drug (including SU) has not continued to control HbA1c to below the person’s individually agreed threshold” [35]. There is no comparative data on the efficacy and safety of alogliptin against other DPP-4 inhibitors in triple therapy added to metformin and sulfonylurea, and until recently no data was available to investigate the clinical effectiveness of alogliptin in a triple therapy context having previously received metformin and sulfonylurea alone. However, data has now become available for alogliptin in this patient population from a large subgroup of the EXAMINE study [10]. The EXAMINE study provides efficacy and safety evidence for alogliptin triple therapy against dual therapy with metformin and sulfonylurea. Using data from the EXAMINE subgroup, the current study reported in this paper presents the methods and results of a decision-focused SLR and NMA to investigate the relative efficacy and safety of alogliptin 25 mg in triple therapy compared with the other DPP-4 inhibitors currently used in UK clinical practice.

The NMA of DPP-4 inhibitors within triple therapy (with metformin + SU) has indicated that there is no evidence of a difference for alogliptin compared to other DPP-4 inhibitors (sitagliptin, saxagliptin, vildagliptin, and linagliptin) in terms of key efficacy and safety outcomes: HbA1c mean change from baseline, mean change in body weight from baseline, number of patients experiencing hypoglycemic events, and number of patients experiencing adverse events leading to study discontinuation. This similarity of the efficacy and safety of the DPP-4 inhibitors used in triple therapy has been established by multiple complementary analysis methods including simple frequentist forest plots, Bayesian NMA analysis at the individual DPP-4 treatment level, and Bayesian NMA analysis of alogliptin versus all other DPP-4 inhibitors grouped. A range of sensitivity analyses and meta regression covariate adjustment techniques for the primary outcome of HbA1c change from baseline are also all supportive of a finding of comparable efficacy. A simple non-technical illustration of this is provided in Fig. 2, which shows that the EXAMINE trial [10] falls in the middle when visually comparing the confidence (error) bars for each of the six trials included (all comparisons against placebo), and that the greatest difference in the results involves the same treatment, sitagliptin. Therefore, any differences between DPP-4 treatments in terms of HbA1c efficacy appear less than the variability within the same treatment comparison between separate trials.

The comparable efficacy of the DPP-4 inhibitors, along with other anti-diabetes treatments (SGLT-2 inhibitors, GLP-1 agonists, and the TZDs), in triple therapy has previously been reported, and the NMA performed here has further confirmed the expectation of no significant differences in key outcomes such as change in HbA1c in a clinical trial setting when comparing alogliptin with other DPP-4 inhibitors [12–14, 36–40]. For example, the mean change (%) in HbA1c for the grouped DPP-4 inhibitors compared to placebo was −0.64 and −0.65 for fixed effects and random effects models, respectively. Previously published meta-analyses of treatments for T2DM in triple therapy (following failure with metformin + SU) have reported similar results for mean change in HbA1c: −0.68 [14], −0.62 [12], and −0.69 [13]. The most recent NMA from Lee et al. [15], which is the first study to estimate and compare the effectiveness of all triple therapy combinations that have been studied in randomized trials, not limited to those that included both metformin and SU, in terms of HbA1c and the associated effect on body weight and hypoglycemia, showed the mean change (%) in HbA1c for the grouped DPP-4 inhibitors compared to placebo was 0.56, which is a clinically relevant reduction [15].

Strengths and Limitations

A strength of this study was the use of a decision-focused approach ensuring that the systematic search and the selected studies for the NMA relate directly to the populations of interest and the drug comparators of interest that alogliptin is intended to displace in clinical practice in the UK (i.e., other DPP-4 inhibitors). The comparisons presented for alogliptin 25 mg are for each comparator DPP-4 inhibitor at its recommended daily dose to enhance the clinical relevance of the results. The systematic search and subsequent statistical analysis were conducted according to a protocol specified in advance, using transparent, reproducible methods to identify evidence, perform data abstraction, and conduct the analysis. Necessary analytical deviations from the protocol (impossible to conduct specified random effect models at the individual DPP-4 treatment level) were handled in a transparent way: for example, quasi-random effects models were created and used as fully outlined in the methods section and technical appendix.

The range of analyses performed for each endpoint is another key strength. Specifically, not only was alogliptin compared to each individual DPP-4 comparator but these other comparators were individually compared against each other. Visual comparisons from forest plots across all pairwise comparisons were crucial here to provide a context to assess the alogliptin results. Even if all treatments are equivalent, random chance (stochastic uncertainty) will lead to non-identical estimates. By showing all pairwise contrasts it is possible to assess whether alogliptin comparisons have a similar or different trend to the rest. The analyses combining the other DPP-4 inhibitors into one group to compare against alogliptin rest on an assumption of comparable efficacy and safety between the DPP-4 inhibitors, which is borne out by the results of the individual comparisons. Furthermore, this grouped analysis was handled in a credible way, by imposing various different priors for the between-study standard deviation and assessing whether this had any effect at all on the results. The HbA1c non-inferiority probability analysis recognizes the importance of clinically meaningful differences in HbA1c and also demonstrates that there is a very high probability that alogliptin is non-inferior in efficacy to other DPP-4 inhibitors. Final assessments concerning non-inferiority on an endpoint were only made after looking across the totality of results.
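To give a sense of scale for why all pairwise contrasts need to be viewed together, the short calculation below counts the pairwise comparisons among the five DPP-4 inhibitors and estimates how often at least one would appear nominally significant by chance at the conventional 0.05 level. It assumes the tests are independent, which pairwise contrasts within one network are not, so the figures are indicative only; the function name is hypothetical.

```python
import math


def chance_findings(n_tests, alpha=0.05):
    """Expected number of false positives and probability of at least one,

    assuming independent tests of truly equivalent treatments.
    """
    expected = n_tests * alpha
    p_at_least_one = 1.0 - (1.0 - alpha) ** n_tests
    return expected, p_at_least_one


# Alogliptin, sitagliptin, saxagliptin, vildagliptin, linagliptin
n_pairs = math.comb(5, 2)
expected, p_any = chance_findings(n_pairs)
print(f"{n_pairs} pairwise contrasts")
print(f"expected chance findings: {expected:.2f}")
print(f"probability of at least one: {p_any:.2f}")
```

Repeating the same set of contrasts across several endpoints and sensitivity analyses increases these figures further.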

A further strength of the study was the introduction of “partial pooling” techniques for the purpose of addressing multiple testing concerns. Having many statistical comparisons performed across a range of sensitivity analyses leads to an expectation of a number of statistically significant results for non-inferior treatments: 1 in every 20 is expected for independent tests over non-inferior treatments at the standard 0.05 significance level (95% confidence/credible intervals). Standard techniques to adjust for multiple comparisons, such as Bonferroni, often increase confidence intervals to unrealistic extents. One method to deal with multiple testing concerns in many instances is actually to make no direct adjustment but simply show all comparisons together and judge visually whether any obvious pattern involving a treatment emerges. Hence, the one borderline (unadjusted for multiple testing) statistically significant difference involving alogliptin against another DPP-4 inhibitor that arose in one sensitivity analysis (Fig. 6 fixed effect sitagliptin comparison) is not thought to be credible. A hierarchical “partial pooling” approach to this fixed effect model was also implemented. This assumed that each individual DPP-4 treatment effect could be different from the rest but that they all arose from the same underlying normal distribution (whose mean and variance priors were uninformative). Such an assumption is plausible for treatments within a class and for which there is no biological reason to expect a particular named treatment to do better than any other before the results are known, which were the conditions that arose in the NMA. The implementation adopted was a form of random effects model but importantly it did not allow for between-trial heterogeneity for the same treatment effect and so is equivalent to the fixed effect model in this regard. Confidence intervals are not increased with this approach. Instead, estimates are pulled towards the common effect, with a greater pull given to the more extreme effects recorded with higher uncertainty. As shown in Fig. 6, this shifted the estimate to statistical insignificance.
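The shrinkage behavior described here can be sketched with a simple normal-normal model. The code below is an empirical Bayes approximation (method-of-moments estimate of the between-treatment variance), not the fully Bayesian model with uninformative priors used in the study, and the treatment effects and standard errors are hypothetical values chosen only to show the mechanics.

```python
def partial_pool(estimates, std_errors):
    """Shrink treatment effects towards a common mean (empirical Bayes sketch).

    Assumes each true effect is drawn from one normal distribution whose
    variance (tau2) is estimated by the method of moments.
    """
    k = len(estimates)
    w = [1.0 / se ** 2 for se in std_errors]
    mu_fixed = sum(wi * y for wi, y in zip(w, estimates)) / sum(w)
    q = sum(wi * (y - mu_fixed) ** 2 for wi, y in zip(w, estimates))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Common mean re-estimated with weights that include tau2
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    mu = sum(wi * y for wi, y in zip(w_star, estimates)) / sum(w_star)

    # Each estimate moves towards mu by the fraction se^2 / (se^2 + tau2)
    shrunk = [
        (tau2 * y + se ** 2 * mu) / (tau2 + se ** 2)
        for y, se in zip(estimates, std_errors)
    ]
    return mu, tau2, shrunk


# Hypothetical HbA1c effects versus placebo (%) and their standard errors
effects = [-0.45, -0.65, -0.70, -1.10]
ses = [0.08, 0.09, 0.10, 0.20]
mu, tau2, shrunk = partial_pool(effects, ses)
for raw, se, s in zip(effects, ses, shrunk):
    print(f"raw {raw:+.2f} (SE {se:.2f}) -> pooled {s:+.2f}")
```

In this illustration the most extreme and least precise effect is moved furthest towards the common mean, while the precisely estimated effects change little.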

There are inherent limitations with the study. Firstly, the data for alogliptin is from a post hoc subgroup analysis from the EXAMINE trial, an event-driven cardiovascular outcomes study, so patients enrolled were diagnosed with type 2 diabetes and acute coronary syndrome, and therefore could be considered “high risk”. In contrast, information on the cardiovascular status of patients in the comparator DPP-4 inhibitor studies is limited, with only one other study [32] reporting data. As expected, there were fewer patients with hypertension in that study than in the EXAMINE trial (51.2% vs. 83.6%). Similarly, baseline renal function was recorded for patients in the EXAMINE trial (GFR <60 ml/min/1.73 m²), and was comparable with that in the two comparator DPP-4 inhibitor studies where this was recorded (Lukashevich et al. [29.6% vs. 34.8%] [6] and Owens et al. [5] [29.6% vs. 31.6%]). The study did not have as its primary objective to evaluate the effects of alogliptin on glycemic control [9] and so was not powered to investigate the relative efficacy of MET + SU + alogliptin vs. MET + SU alone. Patients were also permitted to have changes in glycemic therapies according to local standard of care including changes in the metformin, SU, or existing dose. Thus background therapy at baseline was intended to be indicative of “real-world” treatment, so doses of metformin and type/dose of SU were neither standardized nor controlled [10]. In addition, not all patients who entered on metformin stayed on metformin and SU throughout the study follow-up, although a large proportion did (84.9%) [10]. This is in contrast to the DPP-4 inhibitor comparator studies, whereby all patients were intended to be on the maximum tolerated and stable dose of MET + SU. Further, whilst all patients in the DPP-4 inhibitor comparator studies were defined as having failed on MET + SU (i.e., baseline HbA1c >7% or 7.5%, depending on study), there was no formal definition of failure in the EXAMINE study and the inclusion criterion was HbA1c >6.5–11% [9] (although the range for patients studied in the post hoc subgroup was 4.9–11%). Overall, the proportion of patients in the EXAMINE subgroup who had HbA1c >7.5% and >7% was 64.9% and 81.3%, respectively; hence a large proportion matched the patient population in the other DPP-4 studies. So whilst the NMA performed can only directly address the question of the relative efficacy of alogliptin added to MET + SU compared to patients receiving MET + SU alone, on the basis of the similarity of patient characteristics for EXAMINE versus other studies it could be considered sufficient to use this data and NMA as representative to address the question of the relative efficacy of alogliptin to MET + SU in patients who have failed on MET + SU (i.e., have inadequate glycemic control). This is relevant for HTA in countries such as the UK where this is the specific decision problem faced for new drug therapies in triple therapy.

Despite the differences in study design and patient inclusion criteria, the baseline characteristics of patients in the metformin + SU dual therapy subgroup of EXAMINE were comparable with patients in the other studies (e.g., baseline mean HbA1c of 8.15%, and mean duration of diabetes of 8.5 years; see Table S3 in the supplementary material). Hence, it is unlikely that these differences impact on the findings of comparable efficacy and safety between alogliptin and the other DPP-4 inhibitors in triple therapy. Baseline HbA1c and duration of T2DM are patient disease characteristics known to potentially impact the efficacy of glucose-lowering treatments [41]. In a recent meta-regression analysis by Esposito et al. which aimed to predict the HbA1c response to DPP-4 inhibitors including alogliptin, results showed that a greater absolute reduction of baseline HbA1c is seen in patients with higher baseline HbA1c and lower fasting glucose level [42]. This is consistent with previous reports for other non-insulin glucose-lowering agents, including SGLT-2 inhibitors [41, 43], sulfonylureas, metformin, and TZDs [44]. As the selected studies in the NMA had similar mean HbA1c at baseline, the result of comparable efficacy in the reduction of HbA1c from baseline between alogliptin and the other DPP-4 inhibitors (sitagliptin, saxagliptin, linagliptin, and vildagliptin) was as expected.

As a result of the limited availability of triple therapy studies of a DPP-4 inhibitor in combination with metformin + SU, studies were restricted to 24-week follow-up only, whereas dual therapy studies of a DPP-4 inhibitor in combination with either metformin or SU, of which there are many more, include studies with a longer follow-up of 52 weeks. Despite these limitations of the data, this post hoc analysis does provide clinical evidence for the triple use of alogliptin in combination with metformin + SU over a reasonable time period.

Because the NMA is a decision-focused approach that concentrates on a comparison of alternative DPP-4 inhibitors in triple therapy, the studies selected are more homogeneous than other broader NMAs in triple therapy that have previously been reported [13, 14, 39], and a relatively narrow network will reduce heterogeneity and noise that might be seen with larger networks. The lack of studies, which prevented running NMA random effects models at the individual DPP-4 treatment level, has to be acknowledged as a limitation. However, the formulation of the quasi-random effects models as a replacement, together with the strength of results from all the combined analyses, provides reassurance that such omission is not a major concern.

Two features are worth pointing out concerning these comparisons:

  • The lower end definition of 0.3% has been set as a clinically meaningful difference. The upper end of 0.4% would actually show alogliptin in a more favorable light (i.e., easier to achieve alogliptin non-inferiority). Hence, in this context the estimates could be considered conservative.

  • Within random effect models as the between-trial study contrast variance estimates increase, then no matter what the true difference is between the treatments, the proportion of non-inferiority will fall (if the true estimate is indeed high). This is a result of higher between-trial variances dominating true differences.
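Both points can be illustrated with a normal approximation to the posterior of the treatment difference. In the sketch below the difference is defined as alogliptin minus comparator in HbA1c change (%), so positive values disfavor alogliptin and non-inferiority means the difference lies below the margin. The mean difference and standard error are hypothetical, the function names are illustrative, and adding the between-trial variance directly to the posterior variance is a simplifying assumption rather than the study's Bayesian calculation.

```python
import math


def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_non_inferior(mean_diff, se, margin=0.3, between_trial_sd=0.0):
    """Probability that the true difference lies below the margin."""
    total_sd = math.sqrt(se ** 2 + between_trial_sd ** 2)
    return normal_cdf((margin - mean_diff) / total_sd)


# Hypothetical contrast: alogliptin 0.05% worse, standard error 0.10
for tau in (0.0, 0.2, 0.5):
    p = prob_non_inferior(0.05, 0.10, margin=0.3, between_trial_sd=tau)
    print(f"between-trial SD {tau:.1f}: P(non-inferior) = {p:.3f}")

p_03 = prob_non_inferior(0.05, 0.10, margin=0.3)
p_04 = prob_non_inferior(0.05, 0.10, margin=0.4)
print(f"margin 0.3%: {p_03:.3f}, margin 0.4%: {p_04:.3f}")
```

When the difference sits inside the margin, the probability starts high and falls towards 0.5 as the between-trial SD grows, and the wider 0.4% margin always gives a probability at least as high as the 0.3% margin.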

The aim of performing a decision-focused NMA was to potentially reduce between-study heterogeneity and uncertainty as well as to narrow credible intervals without any major loss of precision. This approach is suitable for HTA-based decision-making in the UK, based on feedback received from UK HTA bodies, namely the Scottish Medicines Consortium (SMC) and All Wales Medicines Strategy Group (AWMSG) [45, 46].

Conclusion

Triple therapy with a DPP-4 inhibitor in combination with metformin and a sulfonylurea is one of the current recommendations from NICE if patients with T2DM have inadequate glycemic control on dual therapy with metformin and an SU [35]. This systematic review and extensive analysis of DPP-4 treatments within T2DM triple therapy settings (combined with metformin + SU) provides evidence of comparability between alogliptin and other DPP-4 inhibitors in terms of key efficacy and safety outcomes. In addition, this study has used a number of techniques that may be considered novel relative to traditional NMA methods: quasi-random effects models (when random effects models cannot be run) and “partial pooling” methods (seeking to address multiple testing issues), both of which could be of use in future NMA research. Local health service decision-makers will need to consider the available evidence on efficacy and safety, as well as cost and individual patient factors, when developing guidance and making formulary decisions about the use of alogliptin. Alogliptin has the lowest acquisition cost of all the DPP-4 inhibitors used in UK clinical practice [47]; hence, it could be considered to have relative cost-effectiveness based on the finding of comparable efficacy and safety from this NMA. The current NMA should help local decision-makers have a clearer understanding of the drug’s place in therapy. Further RCT evidence from additional clinical studies investigating alogliptin added to metformin plus a sulfonylurea and further active comparator studies comparing alogliptin with other DPP-4 inhibitors would be useful to further strengthen the evidence base for alogliptin in triple therapy.