Key Summary Points

Lebrikizumab and dupilumab are monoclonal antibodies used to treat patients with moderate-to-severe atopic dermatitis. Both have demonstrated efficacy and acceptable safety during a 16-week induction phase in clinical trials. However, a comparative analysis of their long-term efficacy and safety, which could guide healthcare decision-making, has not yet been conducted.

This analysis aimed to compare the long-term efficacy and safety of lebrikizumab and dupilumab. The unanchored matching-adjusted indirect comparison (MAIC) method was employed, which re-weights individual patient data from lebrikizumab trials to match reported aggregate results from the SOLO-CONTINUE long-term dupilumab clinical trial.

The unanchored MAIC results suggest that lebrikizumab may provide equal or superior long-term maintenance of Investigator’s Global Assessment (IGA) scores of 0/1 (clear or almost clear) compared with dupilumab. In addition, the two treatments demonstrated equivalent maintenance of 75% reduction from baseline in the Eczema Area and Severity Index (EASI-75) and equivalent overall adverse event rates.

These findings may have significant implications for the long-term management of moderate-to-severe atopic dermatitis. They may inform the decisions of healthcare decision-makers as well as the daily practice of clinicians.

Introduction

Atopic dermatitis (AD) is a chronic, relapsing inflammatory skin disease characterized by recurrent eczematous lesions and severe itching. It is the most prevalent chronic inflammatory skin condition, affecting up to 20% of children and between 2% and 7% of adults worldwide [1]. In moderate-to-severe cases of AD, skin lesions can cover a large body surface area and are often accompanied by intense, persistent itching. Sleep disturbances, primarily due to unrelenting itching, are common and significantly diminish patients’ quality of life [2, 3]. Moreover, AD carries a substantial psychosocial burden, with patients frequently exhibiting symptoms of anxiety or depression and generally scoring lower on mental health assessments compared with the general population [4, 5]. For patients with moderate-to-severe AD, topical therapies often have limited efficacy, and conventional systemic treatments such as cyclosporine and methotrexate are not suitable for long-term use due to their extensive side effect profiles [6, 7]. Therefore, there is a clinical need for therapeutic options for moderate-to-severe AD that are safe and effective over the long term.

The pathogenesis of AD is multifaceted, involving skin-barrier dysfunction and a complex interaction of genetic, immunologic, and environmental factors. The dysregulation of type 2 helper T cells, which produce cytokines such as interleukins 4, 13, and 31, plays an essential role in the pathogenesis of AD [8, 9]. Recent evidence suggests that interleukin-13 may be the key cytokine in AD and serum levels of interleukin-13 correlate with disease severity [10,11,12,13,14,15,16,17,18,19,20,21,22,23,24]. Consequently, targeting the signalling of interleukins is an attractive approach to treatment of AD.

Dupilumab was the first-in-class monoclonal antibody approved by the European Medicines Agency (EMA) as well as by the US Food and Drug Administration (FDA) to treat moderate-to-severe AD in 2017 [25, 26]. However, given the disease’s heterogeneous nature and the individual variation in treatment response, there remains a substantial need for innovative therapeutic options. Lebrikizumab is a novel monoclonal antibody that binds with high affinity and slow off-rate to interleukin-13, thereby blocking the downstream effects of interleukin-13 with high potency [27]. Lebrikizumab has demonstrated efficacy and safety in clinical trials [28], with significant improvements in skin clearance and itch and a reduced impact of the disease on quality of life and sleep.

Clinical trials for lebrikizumab and dupilumab employ similar treatment periods: a 16-week induction phase followed by a 36-week maintenance phase [29,30,31]. As there has been no head-to-head comparison of lebrikizumab and dupilumab, comparative treatment effects rely on indirect treatment comparison. In an early network meta-analysis comparing monotherapy treatments for moderate-to-severe AD over the 16-week induction phase, lebrikizumab demonstrated similar efficacy and safety profiles to dupilumab [32,33,34]. Patients with moderate-to-severe AD often require lifelong treatment. Consequently, reliable evidence regarding the long-term comparative benefits and harms of alternative interventions is needed to make clinical decisions regarding their use. However, comparison beyond the induction phase is challenging due to the study designs of the relevant clinical trials: treatment allocation subsequent to the induction phase in trials for both dupilumab and lebrikizumab depends on the treatment response achieved at week 16. While there were placebo arms for the maintenance phase, these included patients assigned to active treatment in the induction phase. Consequently, the placebo arms are no longer suitable as common comparators beyond the induction phase, rendering anchored comparison methods (e.g., network meta-analysis) infeasible.

Where anchored comparisons cannot be performed, unanchored matching-adjusted indirect comparison (MAIC) has been recommended in the literature and by healthcare payers, including the European Network for Health Technology Assessment and the United Kingdom (UK) National Institute for Health and Care Excellence (NICE), as a viable approach to reduce the risk of bias when comparing alternative treatment options applied to similar populations [35,36,37]. The objective of a MAIC is to re-weight the individual-level patient data (IPD) from one trial so that the distribution of known or suspected prognostic factors and effect modifiers aligns with the corresponding reported aggregate data from the comparator population, thereby mitigating the risk of potential bias.

This study employs unanchored MAIC to compare the long-term (weeks 16–52) maintenance of efficacy and safety of lebrikizumab 250 mg administered every 4 weeks (Q4W) and dupilumab 300 mg administered weekly or every second week (QW/Q2W) in the treatment of adult patients with moderate-to-severe AD who had achieved treatment effect following induction.

Methods

Data Sources

This analysis included individual patient data from the ADvocate trials for lebrikizumab (ADvocate1 and ADvocate2, registered under clinical trial numbers NCT04146363 and NCT04178967, respectively) [31] and published aggregate data on baseline characteristics and outcomes for dupilumab from the SOLO-CONTINUE trial (NCT02395133) [30]. The design of the ADvocate trials and the SOLO-CONTINUE trial is illustrated in Fig. 1.

Fig. 1 ADvocate and SOLO-CONTINUE trial designs. The arms used in this study are outlined in gold

The ADvocate 1 and 2 trials are 52-week, randomized, double-blind, global studies designed to evaluate the efficacy and safety of lebrikizumab as monotherapy in adult and adolescent patients. During the 16-week induction period, patients received lebrikizumab 500 mg initially and at 2 weeks, followed by lebrikizumab 250 mg or placebo every 2 weeks. Participants who achieved a clinical response (EASI-75 or IGA 0/1 with a 2-point improvement and no rescue medication) were re-randomized to receive either lebrikizumab 250 mg every 2 weeks (Q2W), lebrikizumab 250 mg every 4 weeks (Q4W) or placebo (lebrikizumab withdrawal) for a period of 36 weeks.

The SOLO 1 and 2 trials were 16-week, randomized, double-blind studies that evaluated the efficacy and safety of dupilumab in adult patients. The SOLO-CONTINUE trial covered weeks 16–52 for patients assigned to active treatment who achieved clinical response (EASI-75 or IGA 0/1) by week 16 of the SOLO trials; these patients were re-randomized on SOLO-CONTINUE day 1 (week 16 from treatment initiation). Patients either continued their original SOLO regimen of dupilumab 300 mg weekly or every 2 weeks (QW/Q2W) or were assigned to a less frequent regimen of 300 mg every 4 weeks (Q4W) or every 8 weeks (Q8W), or to placebo, for a period of 36 weeks.

It is important to note that the published results of the SOLO-CONTINUE trial use the SOLO-CONTINUE baseline, i.e., week 16 of treatment, as baseline. While pre-treatment baseline statistics are reported for the full SOLO populations by study arm, corresponding statistics are not publicly available for the SOLO-CONTINUE trial arm populations. Consequently, respondent weighting can only be conducted for week 16 statistics. At the 16-week treatment mark, baseline statistics are presented separately for placebo, dupilumab 300 mg Q8W, Q4W, and QW/Q2W, and include age, race, sex, and duration of AD at baseline (older/younger than 26 years) [30]. They also include scores and measurements for the following:

  • Investigator’s Global Assessment score (IGA).

  • Eczema Area and Severity Index score (EASI).

  • Participants with improvement of at least 75% in EASI score (EASI-75).

  • Body surface area affected by eczema (% BSA).

  • Weekly peak pruritus numeric rating scale score.

  • SCORing Atopic Dermatitis (SCORAD).

  • Patient-Oriented Eczema Measure (POEM).

  • Dermatology Life Quality Index (DLQI).

  • Hospital Anxiety and Depression Scale (HADS).

In this study, we primarily compared the Q4W lebrikizumab study arm with the QW/Q2W dupilumab study arm. While we recognize that 300 mg Q2W is the only approved posology for dupilumab, the SOLO-CONTINUE results are reported in aggregate form for QW and Q2W, and it is not possible to disaggregate these on the basis of publicly available data. For meaningful comparisons, we identified a subset of the patients in the ADvocate trials that matched the inclusion criteria of the QW/Q2W participants in the SOLO-CONTINUE trial. The resulting cohort included adult patients with EASI-75 or IGA 0/1 at week 16 assigned to Q4W lebrikizumab, i.e., the adult subset of the primary maintenance population for the Q4W study arms, hereafter referred to as the SOLO-like ADvocate sample. The analysis was limited to adults to match the inclusion criteria of the SOLO-CONTINUE trial. Table 1 presents aggregate statistics reported for the QW/Q2W SOLO-CONTINUE study arm, as well as corresponding statistics for the matching sample from the pooled ADvocate 1 & 2 studies.

Table 1 SOLO-CONTINUE and corresponding ADvocate aggregate data

Matching-Adjusted Indirect Comparison

Given the lack of direct comparative clinical trial data, indirect comparison methods such as network meta-analyses (NMA) and matching-adjusted indirect comparisons (MAIC) allow for the evaluation of relative effectiveness of alternative treatment options. Where feasible, NMA is the recommended approach. However, NMAs and other anchored comparison methods rely crucially on the presence of suitable overlapping treatment arms between different trials, which are employed through an assumption of transitivity to allow for the indirect comparison of non-overlapping arms. In the absence of evidence allowing NMAs, MAIC is a suitable approach. Health technology assessment (HTA) bodies such as the European Network for Health Technology Assessment (EUnetHTA) [37] and the UK National Institute for Health and Care Excellence (NICE) [36] have issued guidelines on the use of indirect comparisons, reflecting their acceptance and recognizing their growing importance in health technology assessment.

The objective of a MAIC is to reduce the risk and potential impact of bias in a comparison of results from separate trials where differences in the distribution of prognostic factors or effect modifiers could influence results. We conducted an unanchored MAIC in line with the recommendations outlined in NICE Technical Support Document 18 [36]. Details of the statistical methods used can be found in the online Supplementary Material, section A. Briefly, the MAIC involved the two primary steps described below.


Step 1: Re-weighting IPD from the ADvocate Trials

Individual patients from the SOLO-like ADvocate sample were re-weighted to align the distribution of expected effect modifiers and prognostic factors with the corresponding published aggregate statistics for the population in the SOLO-CONTINUE trial. Weights were estimated by propensity score weighting, following the approach described in the NICE DSU report on population-adjusted indirect comparisons.
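To illustrate the re-weighting step, the following is a minimal sketch of the method-of-moments weight estimation commonly used for MAIC, in which each weight is the exponential of a linear combination of covariates centred on the target means. It is not the R code used in this study (see Supplementary Material, section A); the function names `solve_linear` and `maic_weights` and all patient values are hypothetical. Only means are matched here; under the same assumptions, a standard deviation could be matched by adding a squared covariate column with the corresponding target.

```python
import math


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def maic_weights(ipd_rows, target_means, max_iter=100, tol=1e-10):
    """Return weights w_i = exp(alpha . (x_i - target)) whose weighted
    covariate means equal the target (aggregate) means."""
    k = len(target_means)
    centred = [[row[j] - target_means[j] for j in range(k)] for row in ipd_rows]

    def weights(alpha):
        return [math.exp(sum(a * x for a, x in zip(alpha, row))) for row in centred]

    alpha = [0.0] * k
    for _ in range(max_iter):
        w = weights(alpha)
        total = sum(w)
        # Gradient of the convex objective sum_i exp(alpha . x_i)
        grad = [sum(wi * row[j] for wi, row in zip(w, centred)) for j in range(k)]
        if max(abs(g) for g in grad) / total < tol:
            break
        hess = [[sum(wi * row[p] * row[q] for wi, row in zip(w, centred))
                 for q in range(k)] for p in range(k)]
        step = solve_linear(hess, grad)
        # Newton step, halved until the objective does not increase
        t = 1.0
        while t > 1e-8:
            candidate = [a - t * s for a, s in zip(alpha, step)]
            if sum(weights(candidate)) <= total:
                break
            t /= 2.0
        alpha = candidate
    return weights(alpha)


# Hypothetical IPD: [age in years, male (1) or female (0)]
rows = [[25, 1], [32, 0], [41, 1], [47, 0], [53, 1], [60, 0], [36, 1], [29, 0]]
targets = [38.0, 0.55]  # hypothetical aggregate mean age and proportion male

w = maic_weights(rows, targets)
total_w = sum(w)
weighted_means = [sum(wi * r[j] for wi, r in zip(w, rows)) / total_w for j in range(2)]
print(weighted_means)
```

After weighting, the weighted mean age and proportion male of the hypothetical IPD coincide with the targets up to numerical tolerance, which is the property the matching step relies on.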


Step 2: Calculating Adjusted Outcomes

The re-weighted individual patient data were used to calculate adjusted outcomes in the pooled lebrikizumab trials. Weighted risk ratios (RRs) and weighted odds ratios (ORs) were estimated to compare efficacy and safety outcomes between lebrikizumab and dupilumab. An unweighted “naïve” estimator that directly compared the treatment groups without any adjustments for differences between trial populations was included for comparison.
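As a rough illustration of this step, the sketch below computes a weighted responder proportion from IPD and compares it with an aggregate proportion on the log scale. The variance of the weighted proportion uses a simple linearised (robust) approximation, which is an assumption of this sketch and not necessarily the estimator used in the study; the function names and all counts are hypothetical. With unit weights, the calculation reduces to the unweighted naïve comparison.

```python
import math


def weighted_proportion(outcomes, weights):
    """Weighted responder proportion and a linearised variance estimate."""
    total = sum(weights)
    p = sum(w * y for w, y in zip(weights, outcomes)) / total
    var = sum((w * (y - p)) ** 2 for w, y in zip(weights, outcomes)) / total ** 2
    return p, var


def compare_to_aggregate(outcomes, weights, agg_responders, agg_n, z=1.96):
    """Risk ratio and odds ratio (IPD arm versus aggregate arm) with
    confidence intervals built on the log scale via the delta method."""
    p1, var1 = weighted_proportion(outcomes, weights)
    p2 = agg_responders / agg_n
    var2 = p2 * (1.0 - p2) / agg_n  # binomial variance for the aggregate arm

    log_rr = math.log(p1) - math.log(p2)
    se_log_rr = math.sqrt(var1 / p1 ** 2 + var2 / p2 ** 2)

    log_or = math.log(p1 / (1.0 - p1)) - math.log(p2 / (1.0 - p2))
    se_log_or = math.sqrt(var1 / (p1 * (1.0 - p1)) ** 2
                          + var2 / (p2 * (1.0 - p2)) ** 2)

    def back_transform(est, se):
        return math.exp(est), math.exp(est - z * se), math.exp(est + z * se)

    return {"RR": back_transform(log_rr, se_log_rr),
            "OR": back_transform(log_or, se_log_or)}


# Hypothetical data: 60 of 100 IPD patients maintain response,
# versus 50 of 100 in the aggregate comparator arm.
outcomes = [1] * 60 + [0] * 40
unit_weights = [1.0] * 100  # unit weights give the naive comparison
naive = compare_to_aggregate(outcomes, unit_weights, agg_responders=50, agg_n=100)
print(naive)
```

Replacing `unit_weights` with estimated matching weights would give the corresponding adjusted estimates under the same approximations.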

Prognostic Factors and Effect Modifiers

For anchored MAIC, matching is limited to suspected or known effect modifiers, while potential differences in the distribution of prognostic factors should not be adjusted for, as this is handled by reference to the common treatment arm. Conversely, in unanchored MAIC, differences in the distribution of prognostic factors could bias comparisons and should be adjusted for. When dealing with single-arm studies, it is not possible to distinguish statistically between effect modifiers and prognostic factors, but this distinction is of limited importance when either should be controlled for.

The selection of relevant prognostic factors and effect modifiers was informed by two approaches: a targeted literature search for pertinent recommendations and analyses of the SOLO-like ADvocate sample to identify potential baseline predictors of relevant outcomes.

A recent study by Thom et al. 2022, which compared crisaborole ointment with topical calcineurin inhibitors in mild-to-moderate AD [38], provides a list of recommended variables for matching, which is also relevant for moderate-to-severe AD, on the basis of an extensive literature review. The following variables are recommended for matching: age (mean and standard deviation), proportion of males, proportion of Caucasians, percentage body surface area (BSA) affected, baseline IGA score, and proportion with prior topical corticosteroid use. The IGA score and BSA were used as indicators of AD severity. Prior exposure to topical corticosteroids was particularly relevant to the study by Thom et al., as the subject of investigation was a topical treatment.

We performed a logistic regression analysis on the subset of the patients in the ADvocate trials that matched the inclusion criteria of the participants in the SOLO-CONTINUE trial. Details of the methods used and the results of these analyses can be found in the online Supplementary Material, section B.

Outcomes of Interest

Outcome variables for this analysis were identified following the efficacy outcome definitions used in each trial. Treatment efficacy was assessed on the basis of the proportion of patients who maintained EASI-75 or IGA 0/1 between week 16 and week 52. These measures were selected on the basis of their use as primary and secondary endpoints in both the ADvocate and the SOLO-CONTINUE trials. Safety was evaluated by the proportion of patients who experienced any adverse event during the study period.

Non-responder Imputation (NRI)

We employed strict non-responder imputation throughout in the handling of both IPD from the SOLO-like ADvocate sample and the reported aggregate outcomes from SOLO-CONTINUE. For instance, where the reported maintenance for IGA 0/1 in Worm et al. [30] is limited to participants who displayed IGA 0/1 at baseline (week 16), we include the participants who did not achieve IGA 0/1 at week 16 as non-responders. The same approach was applied to the ADvocate IPD, where we additionally applied non-response to any missing week 52 data, irrespective of the result at last observation.
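The imputation rule described above can be sketched as follows. This is an illustrative example only; the function name `maintained_response` and the four patient records are hypothetical, and `None` is used to denote a missing week 52 assessment.

```python
def maintained_response(responder_wk16, responder_wk52):
    """Strict non-responder imputation for maintenance of response.

    Returns 1 only when response is present at week 16 and observed at
    week 52; non-response at week 16 or missing week 52 data counts as 0.
    """
    if not responder_wk16:
        return 0  # did not achieve the endpoint at week 16
    if responder_wk52 is None:
        return 0  # missing week 52 data, regardless of last observation
    return 1 if responder_wk52 else 0


# Hypothetical patients: (response at week 16, response at week 52)
patients = [
    (True, True),    # maintained response
    (True, False),   # lost response
    (True, None),    # missing at week 52, imputed as non-responder
    (False, True),   # not a week 16 responder for this endpoint
]
imputed = [maintained_response(wk16, wk52) for wk16, wk52 in patients]
maintenance_rate = sum(imputed) / len(imputed)
print(imputed, maintenance_rate)
```

Keeping all patients in the denominator, as here, is what makes the rule conservative: only observed maintenance at week 52 counts toward the numerator.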

Base Case and Sensitivity Analysis

In line with the recommendations by Thom et al. 2022 [38], the selected parameters for propensity score weighting in our base case analysis were age (mean and SD), sex (proportion male), white versus non-white (proportion non-white), % BSA at week 16 (mean, SD), and EASI score at week 16 (mean, SD).

In addition to the base case analysis, several sensitivity analyses were performed to assess the robustness of our findings. In sensitivity analyses 1–3, mean and SD for EASI score at week 16 were substituted by corresponding scores for IGA, DLQI, and POEM, respectively. In sensitivity analyses 4–8, only a single baseline severity variable was used rather than two, i.e., % BSA, EASI, IGA, DLQI, and POEM. In the final sensitivity analysis, only age, proportion male, and proportion white were included for matching, illustrating the impact of baseline score adjustment.

All statistical analyses were performed in R for Windows, version 4.2.2 [39].

Ethics

This study involved secondary analysis of a limited subset of data from the ADvocate 1 and 2 clinical trials, which were undertaken in accordance with the ethical principles of the Declaration of Helsinki, including informed consent from study participants. The article does not contain any studies with human participants or animals performed by any of the authors.

Results

Out of the 118 patients in the ADvocate trials receiving lebrikizumab every 4 weeks (Q4W), 17 individuals below the age of 18 years were excluded to align with the enrollment criteria of the SOLO-CONTINUE trial, which exclusively included adults, leaving a SOLO-like ADvocate sample of 101 individuals.

Table 1 presents the baseline characteristics of patients from the SOLO-CONTINUE trial and the matching ADvocate trial sample. The unweighted, target, and weighted statistics for the matched data are presented in Table 2. Prior to matching, the SOLO-like ADvocate sample did not differ greatly from the aggregate statistics of the SOLO-CONTINUE target population. For the base case analysis, the re-weighted ADvocate sample closely matched the target population in terms of age (mean and SD), sex, race, and week 16 scores for EASI and % BSA. Achieving homogeneity between the two populations reduced the effective sample size (ESS) from 101 to 75.5.
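For reference, the ESS of a weighted sample is conventionally defined as shown below, where w_i denotes the weight assigned to patient i and n the number of patients. This is the standard definition in the MAIC literature, and it is assumed here to underlie the reported figures.

```latex
\mathrm{ESS} = \frac{\left(\sum_{i=1}^{n} w_i\right)^{2}}{\sum_{i=1}^{n} w_i^{2}}
```

With equal weights the ESS equals n, and it decreases as the weights become more variable. The reported reduction from 101 to 75.5 corresponds to retaining roughly 75% of the original sample size (75.5/101 ≈ 0.75).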

Table 2 Aggregate statistics, targets, and weight information

The distribution of the propensity score weights (Fig. 2) showed the most extreme weights, of around 4, for a minority of participants and was considered acceptable. The weights were similar across a large proportion of the population, and no clear outliers were identified, indicating a high overlap between samples and minimal reliance on a small subset of patients.

Fig. 2 Patient weights histogram for unanchored matching-adjusted indirect comparison

Figure 3 provides forest plots and statistics for the risk ratios (RR) and corresponding odds ratios (OR) from the base case analysis for both naïve comparisons and unanchored MAICs. Both in naïve comparisons and MAIC, patients treated with lebrikizumab were more likely to maintain IGA 0/1 through the 36-week maintenance period (weeks 16–52) compared with those treated with dupilumab (RR 1.334; 95% CI 1.02–1.74; p = 0.035). Both treatments demonstrated comparable efficacy in terms of EASI-75 maintenance (RR 0.937; 95% CI 0.78–1.13; p = 0.490) and similar AE rates (RR 1.052; 95% CI 0.90–1.23; p = 0.526).

Fig. 3 Risk and odds ratio forest plots for MAIC and naïve comparisons. Lebri lebrikizumab, Dupi dupilumab, CI confidence interval, SE standard error, RR relative risk/risk ratio, OR odds ratio, 0/1 score of 0 or 1 on Investigator’s Global Assessment, EASI-75 75% improvement in score on the Eczema Area and Severity Index, MAIC matching-adjusted indirect comparison, AE adverse events

Our sensitivity analyses substantiated these findings (Supplementary Material section C). For EASI-75 and overall adverse events, all sensitivity analyses yielded results that were statistically indistinguishable from equivalence. The MAIC estimates for maintaining IGA 0/1 were statistically non-significant in sensitivity analyses 2 and 3, which replaced the EASI score in our base case scenario with DLQI and POEM, respectively. Similarly, non-significant results were found in sensitivity analysis 8, which included the parameters age, sex, race (white versus non-white), and % BSA. The final sensitivity analysis, in which only age, sex, and proportion white were used for matching, closely resembled the naïve comparison, indicating that the two study populations were very similar in terms of demographic makeup.

Discussion

This study aimed to compare the long-term maintenance of efficacy and safety for lebrikizumab 250 mg Q4W and dupilumab 300 mg QW/Q2W in patients who had achieved treatment effect following induction using unanchored MAIC. The findings indicate that maintenance of IGA 0/1 at week 52 is equivalent or higher for lebrikizumab than dupilumab, while maintenance of EASI-75 is equivalent, with equivalent rates of adverse events.

Although the difference for IGA 0/1 was statistically significant in the base case analysis, this varied in sensitivity analyses. Considering the limited sample sizes involved and the potential prognostic factors and effect modifiers remaining unaccounted for, this statistical significance should be interpreted with caution. While point estimates varied somewhat in the different sensitivity analyses, none indicated inferior long-term maintenance for lebrikizumab, suggesting that a statement of at least equivalence is relatively robust.

The study aligns with NICE guidelines for population-adjusted indirect comparisons, specifically the elements relating to unanchored MAIC, where anchored comparison is not feasible, and the indirect comparisons were conducted on the linear predictor scale of log risk ratios and log odds ratios.
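Because the comparisons are made on the log scale, a reported ratio and its confidence interval can be checked for internal consistency by back-calculating the standard error and a two-sided p-value. The sketch below applies this to the base case risk ratio for IGA 0/1 reported in the Results (RR 1.334; 95% CI 1.02–1.74), assuming a normal approximation on the log scale; the function name `se_and_p_from_ci` is hypothetical, and the rounding of the published limits limits the precision of the result.

```python
import math


def se_and_p_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Back-calculate the log-scale standard error, z statistic, and
    two-sided p-value from a ratio estimate and its 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(estimate) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return se, z, p


# Reported base case risk ratio for maintenance of IGA 0/1
se, z, p = se_and_p_from_ci(1.334, 1.02, 1.74)
print(round(se, 3), round(z, 2), round(p, 3))
```

Under these assumptions, the back-calculated p-value lies close to the reported value of 0.035, which is consistent with a confidence interval constructed on the log scale.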

An alternative approach would be to use simulated treatment comparison (STC), which has been reported to be potentially superior to MAIC in simulation studies of anchored analyses. However, differences between MAIC and STC tend to be minor, with MAIC being a more mature approach with which HTA agencies have experience, and relative performance for unanchored comparison remains unknown.

The study builds on results from the maintenance phase of the ADvocate 1 and 2 studies. The findings in this study for week 52 line up well with reported findings from the induction phase. In the ADvocate 1 & 2 studies, 58.8% and 52.1% of participants achieved EASI-75 over the 16-week induction phase, respectively, while 43.1% and 33.2% achieved IGA 0/1 with at least a 2-point improvement, respectively [40]. This is very similar to the reported findings for week 16 after administration of 300 mg dupilumab Q2W in the SOLO 1 and 2 trials, where 52.5% and 48.1% achieved EASI-75, respectively, and 37.2% and 36.4% achieved IGA 0/1 with at least a 2-point improvement, respectively [29]. A recently updated network meta-analysis [41] of the short-term effectiveness of systemic therapies for moderate-to-severe atopic dermatitis, including both the SOLO and ADvocate trials, found absolute response rates (ARR) for EASI-75 of 45.3% for dupilumab and 44.7% for lebrikizumab, and for IGA 0/1 of 32.4% for dupilumab and 28.0% for lebrikizumab.

Certain limitations should be acknowledged. Most importantly, the absence of reported pre-treatment baseline aggregate data for dupilumab means that we were not able to adjust for differences in study populations in terms of pre-treatment AD severity. Consequently, the analyses in this study rest on a core assumption that study participants in the SOLO-CONTINUE and maintenance primary ADvocate populations can be considered essentially similar conditional on equivalent achievement of efficacy at week 16, i.e., independent of their pre-treatment baseline. This needs to be acknowledged as a relevant assumption and may not apply to patients for whom treatment response in the induction phase falls below the per-protocol definition, who may need to consider treatment adjustments.

The clinical outcomes under consideration, IGA 0/1 and EASI-75, are dichotomizations of more continuous clinical effect measures: IGA scores and improvements in EASI scores. Such dichotomization generally reduces statistical power compared with analyses of the corresponding continuous variables. However, IGA 0/1 and EASI-75 are widely reported and used in current medical guidelines for AD in Europe, such as EuroGuiDerm [42].

Furthermore, we were not able to investigate some potential prognostic factors and/or effect modifiers, such as weight or body mass index, as relevant statistics are not publicly available for the SOLO-CONTINUE population. There is a risk of remaining differences in the distribution of prognostic factors and effect modifiers unaccounted for in our analyses. As we have no indication of differences in the distribution of these characteristics, we are unable to make justifiable assumptions regarding the direction of potential remaining bias from variables unaccounted for.

The overall sample sizes of the relevant maintenance study arms for both dupilumab and lebrikizumab are limited, and the effective sample size is further reduced due to re-weighting of the SOLO-like ADvocate sample. While this is accounted for in the statistical analysis, the achieved precision of estimates in the study does not allow for the identification of small but potentially clinically relevant differences in maintenance of efficacy or rates of adverse events.

There would be general benefit from analyses comparing the long-term efficacy of different treatments for AD, e.g., NMA, but this would require clinical data amenable to such analyses.

Conversely, the overall finding of equivalence is robust to substitution of the week 16 baseline severity variables used for adjustment, and the included populations display a high degree of similarity over available characteristics, which improves confidence that we are dealing with similar patient populations overall. Finally, the reported findings from the SOLO-CONTINUE trial pool data on 80 individuals administered dupilumab 300 mg QW and 89 individuals administered Q2W, where Q2W is the relevant posology for comparison. Considering the reported maintenance rates for Q8W, Q4W, and QW/Q2W from Worm et al. 2019 [30], there is a clear dose–response effect for both EASI-75 and IGA 0/1. While this effect could be smaller for the difference between QW and Q2W, pooling these two dosing regimens may elevate results over Q2W alone, which would bias results in favor of dupilumab, considering that Q2W is the only approved posology.

The indirect comparison method employed in this study is accepted as suitable and reliable, but cannot replace the currently unavailable evidence from studies that involve or better reflect head-to-head comparison.

Conclusions

On the basis of the SOLO-CONTINUE and the ADvocate trial data, our MAIC results suggest that lebrikizumab Q4W may display equivalent or potentially superior maintenance of efficacy measured with EASI-75 and IGA 0/1 compared with dupilumab QW/Q2W between 16 weeks and 52 weeks of treatment, with the added benefit of requiring less frequent doses. Safety in terms of overall adverse event rates up to week 52 is similar for lebrikizumab Q4W and dupilumab QW/Q2W.