Key Summary Points

Why carry out this study?

The growing number of available advanced therapy options for ankylosing spondylitis (AS) calls for assessment of comparative efficacy across therapy options and populations using stringent and clinically relevant outcome criteria.

This study assessed via multiple indirect treatment comparison methods the comparative clinical efficacy of approved treatments for AS, among patients with AS who were naïve or had demonstrated inadequate response/intolerance to tumor necrosis factor inhibitors (TNFi-naïve and TNFi-IR, respectively) using the Ankylosing Spondylitis Disease Activity Score low disease activity (ASDAS LDA) outcome.

What was learned from this study?

Rates of ASDAS LDA, as well as relative effect measures such as odds ratios and numbers needed to treat, were favorable for patients receiving upadacitinib in both the TNFi-naïve and TNFi-IR populations.

This study is the first to utilize ASDAS LDA to compare the clinical efficacy of advanced therapies for the treatment of AS across TNFi-naïve and TNFi-IR populations.

The findings of these indirect treatment comparisons suggest upadacitinib may provide greater clinical benefit in AS compared to other advanced therapies regardless of prior TNFi treatment status.

Introduction

Ankylosing spondylitis (AS), also referred to as radiographic axial spondyloarthritis (r-axSpA), is a chronic, progressive inflammatory disease largely involving the spine and sacroiliac joints but may also produce peripheral and/or extra-musculoskeletal manifestations [1]. In addition to chronic pain and stiffness, AS is characterized by radiographically identified structural changes in and damage to the sacroiliac joints [1].

Along with non-pharmacological interventions such as physiotherapy and lifestyle changes, non-steroidal anti-inflammatory drugs (NSAIDs) are generally recommended as first-line pharmacologic therapies in the treatment of AS [2]. If symptom control remains inadequate, glucocorticoids and conventional synthetic disease-modifying antirheumatic drugs (DMARDs) may be considered; however, these therapies are discouraged for long-term use or in purely axial disease [3]. Thus, advanced biologic/targeted synthetic DMARDs (b/tsDMARDs) may be considered for patients with an inadequate response to conventional first-line therapies [3, 4].

Tumor necrosis factor inhibitors (TNFis) have historically been utilized as first-line advanced therapies; however, more than half of patients may experience non-response or loss of response to initial TNFi therapy, suggesting therapy efficacy should be considered at both short-term and long-term timepoints [5,6,7]. Given this, AS therapy guidelines now recommend consideration of b/tsDMARDs utilizing other mechanisms of action, such as interleukin-17A inhibitors (IL-17Ais) and Janus kinase inhibitors (JAKis), alongside TNFis as first-line advanced therapies; recently updated European Alliance of Associations for Rheumatology (EULAR) guidelines note that current practice is often to begin treatment with a TNFi or IL-17Ai therapy [3].

With the increasing number of b/tsDMARDs available for consideration as first-line advanced therapy, there is likewise an increasing importance for the assessment of comparative efficacy across these advanced therapies. Furthermore, as many patients with AS have been previously exposed to and subsequently failed TNFis, it is important to assess such comparative efficacy both among patients who have not had prior exposure to TNFis (TNFi-naïve) and among patients with prior exposure and inadequate response or intolerance to TNFis (TNFi-IR).

As there are limited head-to-head clinical trials of b/tsDMARDs in AS, indirect treatment comparisons (ITCs) have been utilized to address this. Prior ITCs in AS have largely assessed the comparative efficacy of b/tsDMARDs in terms of Assessment of Spondyloarthritis International Society (ASAS) outcomes, including a greater than 20% (ASAS20) or greater than 40% (ASAS40) improvement from baseline. In contrast, recent AS management recommendations identify the Ankylosing Spondylitis Disease Activity Score (ASDAS) as the appropriate measure with which to define active AS, with an ASDAS threshold of 1.3 signifying inactive disease and 2.1 signifying low disease activity status [4]. Patients who achieve such ASDAS thresholds have previously been shown to have substantial improvements in disease control and in patient-reported outcomes relative to patients who do not achieve said thresholds, suggesting ASDAS thresholds to be an important target in managing AS [8]. As such, this study aims to further address the gap in comparative efficacy among advanced therapies for AS by conducting ITC utilizing the ASDAS low disease activity (ASDAS LDA) outcome, defined as achieving an ASDAS threshold of 2.1 or less, separately in TNFi-naïve and TNFi-IR patient populations.

Methods

Trial Identification

Phase 3 randomized clinical trials (RCTs) of b/tsDMARDs (e.g., TNFi, IL-17Ai, and JAKi) in adult patients with active AS were identified by a clinical systematic literature review (SLR; May 2021) as described in Walsh et al. [9]. Trials eligible for inclusion in this analysis were conducted among adult patients with active AS who fulfilled the modified New York criteria for AS; patients with non-radiographic axial spondyloarthritis (nr-axSpA) were excluded [10]. Eligible trials were of US Food and Drug Administration (FDA)-approved therapies and doses for AS, and were either placebo-controlled or reported results from head-to-head comparisons. Moreover, trials were required to report the proportion of patients achieving ASDAS LDA between weeks 12 and 16 among a TNFi-naïve or TNFi-IR population (i.e., trials reporting ASDAS LDA only among a mixed TNFi-naïve and TNFi-IR population were excluded). Trials included in this analysis were selected by two independent researchers; any disagreements were resolved by a third researcher. As such, this publication is based on previously conducted clinical trials and no new clinical trials with human participants or animals were performed by any of the authors.

For the TNFi-naïve and TNFi-IR ITC, the evidence base consisted of data extracted from included trial publications by two independent researchers using a standard data abstraction form, as well as the individual patient-level data (IPD) of SELECT-AXIS 1 and 2 trials, respectively, for the JAKi upadacitinib in active AS. Additionally, patients in SELECT-AXIS 2 with prior exposure to IL-17Ai therapies were excluded in ad hoc analysis to better align patient inclusion criteria across trials and ensure only patients who were TNFi-IR were included in TNFi-IR analyses.

Outcome Measures

ITCs were conducted on the ASDAS LDA outcome as evaluated between weeks 12 and 16. ITC was additionally conducted on the ASDAS LDA outcome as evaluated at week 52 where feasible as outlined in the Statistical Methods. ITC-estimated rates of ASDAS LDA, adjusted for placebo as feasible, were cross-tabulated with RCT-reported rates of discontinuation due to adverse events (AEs), likewise adjusted for placebo as feasible, to approximate the efficacy-safety profiles of included therapies.

Statistical Methods

The ITCs conducted and approaches taken were determined by the number of active treatment comparators, in addition to upadacitinib, and the presence of a common comparator linking them.

In instances where more than one comparator to upadacitinib was identified, with a common comparator such as placebo linking them, Bayesian network meta-analysis (NMA) was utilized. NMA allows for simultaneous efficacy comparisons to be made across multiple therapies using a probabilistic framework. Both fixed effects (FE) and random effects (RE) models were assessed using a logit link. Non-informative prior distributions were utilized, and model goodness-of-fit was thoroughly assessed via review of a model’s residual deviance, effective number of parameters, deviance information criterion (DIC), and leverage plots, as well as the posterior distribution of the between-study heterogeneity parameter in RE models [11]. Estimated values were reported as posterior medians and 95% credible intervals (CrIs).
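For orientation, the logit-link models described above can be written out as follows. This is a sketch that assumes the standard binomial-likelihood formulation commonly used for Bayesian NMA; the symbols (r, n, p, mu, delta, d, tau) are notation introduced here for illustration and are not taken from the study's code.

```latex
% Study i, arm k: r_{ik} responders (ASDAS LDA) out of n_{ik} patients
r_{ik} \sim \mathrm{Binomial}(n_{ik},\, p_{ik}), \qquad
\operatorname{logit}(p_{ik}) = \mu_i + \delta_{ik}, \qquad \delta_{i1} = 0

% Fixed effects (FE): one common relative effect per treatment contrast
\delta_{ik} = d_{t_{ik}} - d_{t_{i1}}

% Random effects (RE): study-specific effects with between-study SD \tau
\delta_{ik} \sim \mathcal{N}\!\left(d_{t_{ik}} - d_{t_{i1}},\; \tau^{2}\right)

% Placebo as reference (d_{\mathrm{PBO}} = 0); odds ratio versus placebo
\mathrm{OR}_{k\ \mathrm{vs}\ \mathrm{PBO}} = \exp(d_k)

% Model comparison: posterior mean residual deviance plus effective number of parameters
\mathrm{DIC} = \bar{D} + p_D
```

Under this formulation, similar DIC values for the FE and RE models would favor the FE model on grounds of parsimony, since it has no between-study heterogeneity parameter to estimate.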

If only one comparator to upadacitinib was identified, matching-adjusted indirect comparison (MAIC) was utilized. MAIC allows for a robust comparison between two comparators of interest by weighting the IPD of one comparator, i.e., upadacitinib from the SELECT-AXIS trials in the present study, to match the average baseline characteristics reported for the other comparator. Baseline characteristics included in the present matching were informed by a prior MAIC in the AS literature: age, gender (percent male), Bath Ankylosing Spondylitis Disease Activity Index (BASDAI) score, Bath Ankylosing Spondylitis Functional Index (BASFI) score, C-reactive protein (CRP; percent > 5 mg/L), and number of prior TNFis (percent with exposure to 1 vs. > 1) [12]. Pre- and post-match baseline characteristics and sample sizes (N and effective sample size [ESS]) were reviewed to ensure that a sufficient match was made between the two populations [13]. Additionally, MAIC allows for comparisons to be made between two comparators of interest when a connection between them is lacking, such as when comparators are studied in single-arm trials; this is known as unanchored MAIC. Thus, in instances where only one comparator to upadacitinib was identified, single-arm trial extensions were considered for analysis to assess ASDAS LDA at week 52. Estimated values were reported as point estimates and 95% confidence intervals (CIs).
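The weighting step of MAIC can be sketched as below. This is a minimal illustration, not the NICE DSU-derived R code used in the study: it assumes the usual method-of-moments approach (weights of the form exp(x·a), with covariates centered on the comparator's reported means), and the patient data, the two covariates, and the target means are all synthetic and hypothetical. Only the Python standard library is used.

```python
import math
import random

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def maic_weights(ipd, target_means, iters=50, tol=1e-8):
    """Return weights w_i = exp(x_i . a), where x_i is centered on the target means
    and a minimizes sum_i exp(x_i . a) (damped Newton iterations)."""
    p = len(target_means)
    X = [[row[j] - target_means[j] for j in range(p)] for row in ipd]

    def weights(a):
        return [math.exp(sum(xj * aj for xj, aj in zip(x, a))) for x in X]

    a = [0.0] * p
    for _ in range(iters):
        w = weights(a)
        grad = [sum(wi * x[j] for wi, x in zip(w, X)) for j in range(p)]
        if max(abs(g) for g in grad) < tol:
            break
        hess = [[sum(wi * x[j] * x[k] for wi, x in zip(w, X)) for k in range(p)]
                for j in range(p)]
        step = solve_linear(hess, grad)
        t, q0 = 1.0, sum(w)
        # Halve the step while the objective would increase
        while sum(weights([ai - t * si for ai, si in zip(a, step)])) > q0 and t > 1e-8:
            t /= 2
        a = [ai - t * si for ai, si in zip(a, step)]
    return weights(a)

def effective_sample_size(w):
    """ESS = (sum of weights)^2 / (sum of squared weights)."""
    return sum(w) ** 2 / sum(wi * wi for wi in w)

def weighted_means(ipd, w):
    total = sum(w)
    return [sum(wi * row[j] for wi, row in zip(w, ipd)) / total
            for j in range(len(ipd[0]))]

# Synthetic IPD (hypothetical): columns are age in years and male sex (1/0)
random.seed(0)
ipd = [[random.gauss(45, 12), 1.0 if random.random() < 0.7 else 0.0]
       for _ in range(200)]
target = [48.0, 0.8]  # hypothetical aggregate means reported for the comparator trial

w = maic_weights(ipd, target)
matched = weighted_means(ipd, w)
ess = effective_sample_size(w)
print("post-match means:", [round(m, 3) for m in matched])
print("N =", len(ipd), "ESS =", round(ess, 1))
```

After weighting, the weighted means of the IPD equal the target means, and the ESS indicates how much information remains. Matching on more characteristics, or on populations that differ more strongly, generally lowers the ESS.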

From both ITC approaches, odds ratios (ORs) were estimated. From these ORs, absolute rates and numbers needed to treat (NNTs), calculated as the reciprocal of the difference in estimated absolute rates between each comparator and placebo, were estimated. In the absence of a common comparator, such as in unanchored MAIC, only absolute rates could be estimated (i.e., NNTs were not estimated).
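The conversion from an OR to an absolute rate and an NNT can be illustrated with a short sketch. It plugs in the posterior median ORs and the placebo rate reported for the TNFi-naïve population in the Results; the function names are hypothetical. Because the study derived rates and NNTs within the Bayesian model, this point-estimate arithmetic should only approximate the reported posterior medians, and it does not reproduce the credible intervals.

```python
def rate_from_odds_ratio(odds_ratio, placebo_rate):
    """Apply an odds ratio to the placebo odds and convert back to a probability."""
    placebo_odds = placebo_rate / (1.0 - placebo_rate)
    treated_odds = odds_ratio * placebo_odds
    return treated_odds / (1.0 + treated_odds)

def number_needed_to_treat(treated_rate, placebo_rate):
    """NNT is the reciprocal of the absolute rate difference versus placebo."""
    return 1.0 / (treated_rate - placebo_rate)

# Values reported for the TNFi-naive NMA (posterior medians)
PLACEBO_RATE = 0.116
REPORTED_ORS = {"UPA15": 8.50, "ETN50": 5.66, "IXE80": 5.42, "ADA40": 4.30}

results = {}
for name, reported_or in REPORTED_ORS.items():
    rate = rate_from_odds_ratio(reported_or, PLACEBO_RATE)
    nnt = number_needed_to_treat(rate, PLACEBO_RATE)
    results[name] = (rate, nnt)
    print(f"{name}: rate = {rate:.1%}, NNT = {nnt:.2f}")
```

The plug-in values land within rounding distance of the reported rates and NNTs, which is consistent with the calculation described above.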

All statistical analyses were conducted in R Statistical Software, leveraging bnma and JAGS for NMA and National Institute for Health and Care Excellence Decision Support Unit (NICE DSU)-derived code for MAIC [13,14,15,16].

Results

Study Selection

The clinical SLR identified a total of 191 records reporting on 63 unique registered RCTs. Following application of the ITC requirement that trials report ASDAS LDA between week 12 and 16 among TNFi-naïve or TNFi-IR populations, four RCTs from the SLR plus the SELECT-AXIS 2 study for upadacitinib (which completed post-SLR) were ultimately included. Thus, the final ITC evidence base consisted of five RCTs: (1) SELECT-AXIS 1 for the JAKi upadacitinib, (2) COAST-V for the IL-17Ai ixekizumab and the TNFi adalimumab, and (3) SPINE for the TNFi etanercept, all in TNFi-naïve populations; and (4) SELECT-AXIS 2 for upadacitinib and (5) COAST-W for ixekizumab, both in TNFi-IR populations (Supplementary Table 1). From these RCTs, the following four FDA-approved doses of b/tsDMARDs were included: upadacitinib 15 mg orally once daily (UPA15), ixekizumab 80 mg subcutaneously (SQ) every 4 weeks (IXE80), adalimumab 40 mg SQ every other week (ADA40), and etanercept 50 mg SQ once weekly (ETN50) [17,18,19,20].

TNFi-Naïve Population Analysis

As more than one comparator to UPA15 was identified in the TNFi-naïve population (i.e., IXE80, ADA40, and ETN50) and each was compared to placebo in its RCT, the associated ITC of ASDAS LDA at week 12 to 16 was conducted using NMA with placebo as the common comparator and network link. Similarities in the fits of the FE and RE models led to the selection and results presentation of the FE model on the basis of parsimony.

Relative to placebo, the estimated OR (95% CrI) for ASDAS LDA among the TNFi-naïve population was highest for UPA15 (8.50 [4.03, 19.36]), followed by ETN50 (5.66 [1.89, 19.93]), IXE80 (5.42 [2.55, 12.29]), and ADA40 (4.30 [2.05, 9.63]); all four therapies were statistically significantly better than placebo (Fig. S2). These ORs translated to absolute ASDAS LDA rates (95% CrI) of 52.8% (34.1%, 72.2%) for UPA15, 42.7% (19.6%, 72.7%) for ETN50, 41.6% (24.6%, 62.2%) for IXE80, 36.1% (20.9%, 56.4%) for ADA40, and 11.6% (9.8%, 13.7%) for placebo (Fig. 1). Finally, estimated NNTs (95% CrI) for one additional patient to achieve ASDAS LDA over placebo followed a similar ordering, being lowest for UPA15 (2.43 [1.66, 4.38]), followed by ETN50 (3.22 [1.64, 12.04]), IXE80 (3.34 [1.99, 7.48]), and ADA40 (4.08 [2.25, 10.46]) (Fig. 2).

Fig. 1

Estimated ASDAS LDA absolute rates at week 12–16 in TNFi-naïve populations with AS. Input indicates values utilized in the NMA; asterisk indicates significant comparison in favor of UPA15 (OR credible intervals do not cross 1). ADA40, adalimumab 40 mg subcutaneous injection every 2 weeks; AS, ankylosing spondylitis; ASDAS LDA, Ankylosing Spondylitis Disease Activity Score low disease activity; CrI, credible interval; ETN50, etanercept 50 mg subcutaneous injection once weekly; IXE80, ixekizumab 80 mg subcutaneous injection every 4 weeks; NMA, network meta-analysis; OR, odds ratio; PBO, placebo; TNFi, tumor necrosis factor inhibitor; UPA15, upadacitinib 15 mg orally once daily

Fig. 2

Estimated number needed to treat for ASDAS LDA at week 12–16 in TNFi-naïve populations with AS. Input indicates values utilized in the NMA. ADA40, adalimumab 40 mg subcutaneous injection every 2 weeks; AS, ankylosing spondylitis; ASDAS LDA, Ankylosing Spondylitis Disease Activity Score low disease activity; CrI, credible interval; ETN50, etanercept 50 mg subcutaneous injection once weekly; IXE80, ixekizumab 80 mg subcutaneous injection every 4 weeks; NMA, network meta-analysis; NNT, number needed to treat; TNFi, tumor necrosis factor inhibitor; UPA15, upadacitinib 15 mg orally once daily

TNFi-IR Population Analysis

As only one comparator to UPA15 was identified in the TNFi-IR population (i.e., IXE80), the associated ITC of ASDAS LDA at week 12 to 16 was conducted using MAIC. Additionally, as MAIC is feasible even in the absence of a common comparator, ITC of ASDAS LDA at week 52 was conducted using unanchored MAIC.

For the MAIC of ASDAS LDA at week 12 to 16, weights were applied to IPD of the UPA15 and placebo arms from the SELECT-AXIS 2 trial in order to match the relevant population characteristics of SELECT-AXIS 2 to those reported for the IXE80 and placebo arms from the COAST-W trial. The pre/post-match N/ESS for UPA15 and placebo were 173/40 and 172/36, respectively. Pre/post-match baseline characteristics indicated homogeneity on matched characteristics between SELECT-AXIS 2 and COAST-W (Table 1).

Table 1 Baseline characteristics of interest pre- and post-matching in week 12–16 MAIC

After matching, the estimated absolute ASDAS LDA rates (95% CI) for placebo and UPA15 were 7.4% (3.4%, 15.5%) and 41.3% (30.3%, 53.1%), respectively, compared to the reported ASDAS LDA rates for placebo and IXE80 in COAST-W of 4.8% (0.1%, 8.9%) and 17.5% (10.6%, 24.5%), respectively. Matched UPA15 was numerically, though not statistically, superior to IXE80 on the OR scale (95% CI) (2.07 [0.52, 8.23]) (Fig. 3).

Fig. 3

Estimated ASDAS LDA absolute rates and odds ratio comparison of matched UPA15 and IXE80 at week 12–16 in TNFi-IR populations with AS. Input indicates values utilized in the MAIC; OR comparison conducted between relative (i.e., PBO-adjusted) differences in UPA15 and IXE80; asterisk indicates significant comparison in favor of UPA15 (OR confidence intervals do not cross 1). AS, ankylosing spondylitis; ASDAS LDA, Ankylosing Spondylitis Disease Activity Score low disease activity; CI, confidence interval; IR, inadequate responder/intolerant; IXE80, ixekizumab 80 mg subcutaneous injection every 4 weeks; MAIC, matching-adjusted indirect comparison; OR, odds ratio; PBO, placebo; TNFi, tumor necrosis factor inhibitor; UPA15, upadacitinib 15 mg orally once daily

For the unanchored MAIC of ASDAS LDA at week 52, weights were applied to IPD of the UPA15-to-UPA15 arm from the SELECT-AXIS 2 trial single-arm extension to match the relevant population characteristics of SELECT-AXIS 2 to those reported for the IXE80-to-IXE80 arm from the COAST-W trial single-arm extension. The pre/post-match N/ESS for UPA15 were 168/46; pre/post-match baseline characteristics again indicated homogeneity on matched characteristics between SELECT-AXIS 2 and COAST-W (Supplementary Table 2).

Matched UPA15 had an estimated absolute ASDAS LDA rate (95% CI) of 45.0% (34.4%, 56.0%) compared to the reported ASDAS LDA rate for IXE80 at week 52 in COAST-W of 27.6% (19.6%, 37.2%). UPA15 was additionally found to be statistically superior to IXE80 on the OR scale (95% CI) (2.15 [1.03, 4.46]) (Fig. 4).
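The difference between the anchored (week 12 to 16) and unanchored (week 52) comparisons can be seen by recomputing the OR point estimates from the absolute rates reported above. This is a rough sketch using the rounded published percentages; the helper names are hypothetical, and the confidence intervals are not reproduced because they depend on the weighted patient-level data.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)

def odds_ratio(p_treatment, p_reference):
    """Odds ratio of a treatment arm versus a reference arm."""
    return odds(p_treatment) / odds(p_reference)

# Week 12-16, anchored: each therapy is first compared with its own placebo arm,
# then the two placebo-adjusted ORs are compared with each other.
or_upa_vs_pbo = odds_ratio(0.413, 0.074)   # matched SELECT-AXIS 2 arms
or_ixe_vs_pbo = odds_ratio(0.175, 0.048)   # COAST-W arms as reported
anchored_or = or_upa_vs_pbo / or_ixe_vs_pbo

# Week 52, unanchored: no placebo arm is available, so the matched UPA15 rate
# is compared directly with the reported IXE80 rate.
unanchored_or = odds_ratio(0.450, 0.276)

print(f"anchored OR (week 12-16): {anchored_or:.2f}")
print(f"unanchored OR (week 52): {unanchored_or:.2f}")
```

Both values fall close to the reported ORs of 2.07 and 2.15; a small discrepancy in the anchored estimate would be expected from rounding of the input percentages. The anchored form cancels differences shared by each trial's placebo arm, whereas the unanchored form relies entirely on the matching to remove between-trial differences.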

Fig. 4

Estimated ASDAS LDA absolute rates and odds ratio comparison of matched UPA15 and IXE80 at week 52 in TNFi-IR populations with AS. Input indicates values utilized in the MAIC; OR comparison conducted between absolute differences in UPA15 and IXE80; asterisk indicates significant comparison in favor of UPA15 (OR confidence intervals do not cross 1). AS, ankylosing spondylitis; ASDAS LDA, Ankylosing Spondylitis Disease Activity Score low disease activity; CI, confidence interval; IR, inadequate responder/intolerant; IXE80, ixekizumab 80 mg subcutaneous injection every 4 weeks; MAIC, matching-adjusted indirect comparison; OR, odds ratio; TNFi, tumor necrosis factor inhibitor; UPA15, upadacitinib 15 mg orally once daily

Cross-Tabulation of Discontinuation Due to Adverse Events and ASDAS LDA in TNFi-Naïve and TNFi-IR Populations

In the TNFi-naïve analysis, the cross-tabulation of NMA-estimated/placebo-adjusted rates of ASDAS LDA at week 12 to 16 with RCT-reported/placebo-adjusted rates of discontinuation due to AEs demonstrated that UPA15 had the highest ASDAS LDA rate and lowest discontinuation rate. Other assessed therapies had similar discontinuation rates but lower efficacy (Fig. S6). Similarly, in the TNFi-IR analysis, the cross-tabulation of MAIC-estimated/placebo-adjusted rates of ASDAS LDA at week 12 to 16 with RCT-reported/placebo-adjusted rates of discontinuation due to AEs suggested that UPA15 had a higher ASDAS LDA rate than and similar discontinuation rate to IXE80, a trend that is durable to week 52 albeit without placebo adjustment (Figs. S7 and S8).

Discussion

Of the b/tsDMARD therapies assessed in this ITC, UPA15 demonstrated the greatest clinical efficacy in AS per the ASDAS LDA outcome. UPA15 had numerically higher rates of response and lower NNTs than all assessed comparators in analysis at week 12 to 16 in both TNFi-naïve and TNFi-IR populations. UPA15 also had statistically significantly greater efficacy than IXE80 in unanchored MAIC at week 52 in the TNFi-IR population.

A 2023 ITC via NMA assessed the comparative efficacy of approved (adalimumab, certolizumab pegol, etanercept, golimumab, infliximab, ixekizumab, secukinumab, upadacitinib, and tofacitinib) and investigational (bimekizumab) therapies for AS and nr-axSpA using the binary outcomes ASAS20, ASAS40, and ASAS partial response (ASAS PR, defined as a score of less than 2 on a scale of 0 to 10 in each of the four ASAS domains) at week 12 to 16 [21]. Analyses were conducted in three b/tsDMARD exposure networks: predominantly naïve (defined as among RCTs with > 50% of participants b/tsDMARD-naïve), naïve, and exposed. In the predominantly naïve AS network, upadacitinib had the most favorable surface under the cumulative ranking curve (SUCRA) score for the more stringent ASAS PR and ASAS40 outcomes, and was statistically significantly more efficacious than placebo and, for ASAS40 only, secukinumab (no loading dose). Upadacitinib also had the most favorable SUCRA score for ASAS40 and ASAS20 in the purely b/tsDMARD-naïve and b/tsDMARD-exposed AS networks, respectively. The study also assessed two safety outcomes at 12 to 16 weeks in a combined nr-axSpA and AS population (discontinuation due to any reason and serious adverse events) and found no significant differences in safety between the b/tsDMARDs. Although different efficacy outcomes were assessed, the findings of this NMA in terms of the relative efficacy and safety of upadacitinib are generally consistent with those of the current study as well as other safety ITCs in the literature [22]. Of note, bimekizumab phase 3 RCTs were not available at the time of the SLR that informed the current study.

There remain limited head-to-head RCTs of b/tsDMARDs utilizing differing mechanisms of action in AS. A recent head-to-head study of secukinumab and adalimumab biosimilar (SDZ-ADL) in b/tsDMARD-naïve patients with AS found low spinal radiographic progression over a 2-year period for both therapies, with no statistically significant difference between them [23]. However, there is as yet no head-to-head RCT that includes different mechanisms of action and focuses on ASDAS outcomes, a notable absence considering the potential clinical importance of ASDAS outcomes as a reasonable treatment target agreed through shared decision-making between provider and patient. The analysis presented in this current study helps to bridge that gap by providing the first ITC of an ASDAS-based outcome in AS and thus may assist clinicians in treatment decision-making. Extending these ITC methods to nr-axSpA RCTs as well as other ASDAS outcomes such as ASDAS inactive disease, defined as achieving an ASDAS threshold of 1.3 or less, is also an important topic of future research.

Both NMA and MAIC are widely utilized and accepted ITC methods in the literature. Analysis was robustly conducted in accordance with best practice guidelines available in the literature; NMA model fits were thoroughly assessed and MAIC matching characteristics were compared pre- and post-match to ensure sufficient matches were achieved between SELECT-AXIS 2 and COAST-W study populations [11, 13]. Furthermore, stringent RCT inclusion criteria were applied to ensure comparable populations were assessed in the ITCs and minimize potential biases from differing study designs.

Nevertheless, limitations remain with the present analysis. The stringent RCT inclusion criteria inherently limited the breadth of data that could be included in the analysis. In addition, ASDAS LDA assessment timepoints varied across the included RCTs (between weeks 12 and 16), which may introduce bias in the analysis.

NMA relies on multiple statistical assumptions, violations of which may negatively impact the validity of NMA results. One such assumption, that direct and indirect effect estimates in the network are sufficiently similar, often referred to as the consistency assumption, could not be evaluated in this analysis owing to the lack of multiple RCT inputs for each comparator [24]. Inclusion of future RCTs that measure and report ASDAS LDA may help to address this limitation.

MAIC also relies on statistical assumptions, namely that relevant treatment effect modifiers are accounted for in the matching process. Improper balancing of patient populations may result in bias of unknown direction and amount in the analysis [13]. Unanchored MAIC must further assume that all relevant treatment effect modifiers and prognostic factors for the applicable therapies and disease of study have been accounted for in the matching process, a strong assumption that may not hold true given the limited number of characteristics matched on in this analysis [13]. While baseline characteristics utilized for matching in the presented MAIC had precedent in previous research, a binary threshold for CRP (percent of patients with a CRP value greater than 5 mg per liter, a threshold reported within COAST-W publications) was used in this analysis rather than treating CRP as a continuous variable [12]. While this choice was made to preserve the post-match ESS, it may negatively impact the quality of the match on CRP between the SELECT-AXIS 2 and COAST-W populations. Increasing the number of characteristics in the match, while potentially improving the similarity of the assessed populations, would also result in a further reduction to ESS and statistical power.

Finally, while safety was reviewed across included RCTs in the form of rates of discontinuation due to AEs and cross-tabulated with efficacy estimates from the ITC, safety outcomes were not formally assessed via ITC. Assessing the risk–benefit profile of a therapy is an important consideration in treatment decision-making. Safety outcomes have been assessed in prior related ITCs; however, such data are potentially heterogeneous, being assessed in mixed b/tsDMARD-naïve and b/tsDMARD-experienced populations and/or across multiple immune-mediated inflammatory disorders [21, 25]. This, combined with the low rates of safety events observed in RCTs, suggests that future real-world data analysis may be required to validate the safety comparisons in the current study.

Conclusions

This is the first ITC focused on the clinically relevant ASDAS LDA outcome among approved b/tsDMARDs for the treatment of active AS in both TNFi-naïve and TNFi-IR populations. Findings indicate that UPA15 may provide the greatest clinical benefit per ASDAS LDA among advanced therapies included in this analysis (adalimumab, etanercept, ixekizumab, upadacitinib). As more data and additional advanced therapies become available, this analysis warrants updating to ensure the most current estimates of comparative clinical efficacy are available to aid medical decision makers in optimal treatment selection. This analysis also calls for validation with comprehensive head-to-head RCTs and real-world data.