Key Summary Points

Why carry out this study?

Clinicians and healthcare payers have a multitude of options for treating moderate-to-severe plaque psoriasis, and network meta-analysis (NMA), a method by which multiple interventions can be compared simultaneously in a single analysis, has been used widely to support decision making based on the best available evidence of efficacy and safety.

Despite the widespread application of NMA to synthesize randomized evidence in psoriasis, skepticism around its use and mistrust of its results persist.

To address these concerns, we performed a systematic review of published NMAs assessing biologic therapies for moderate-to-severe psoriasis, aiming to assess the methodological quality of these analyses and explore differences in their results.

What was learned from the study?

Twenty-five NMAs have been published since 2006 and most have come to broadly similar results and conclusions, considering the available data when they were performed.

The most useful NMAs for clinical decision making are those that: include all relevant trials of comparator treatments, assessed in a way that reflects their marketing authorization and expected use in clinical practice; provide a thorough assessment of heterogeneity and inconsistency; and report comparative effects and associated uncertainty in a comprehensive manner.

Digital Features

This article is published with digital features, including a summary slide, to facilitate understanding of the article. To view digital features for this article, go to https://doi.org/10.6084/m9.figshare.13129880.

Introduction

Psoriasis is a chronic, inflammatory, immune-mediated skin disorder [1]. Approximately 70–80% of patients have mild psoriasis, which is adequately managed by topical therapies, whilst the remaining 20–30% present with moderate-to-severe psoriasis [1]. Topical treatments alone are typically inadequate for these patients. If there is insufficient improvement in disease symptoms after treatment with phototherapy and/or conventional systemic therapies, patients are offered biologic systemic therapy [1].

The first class of biologic treatments licensed for moderate-to-severe plaque psoriasis was the anti-tumor necrosis factor (anti-TNF) agents, such as etanercept, infliximab and adalimumab [2]. Since then, other classes of biologics have been licensed for this indication, including the dual interleukin-12/23 (IL-12/23) inhibitor ustekinumab, followed by the IL-17 inhibitors (secukinumab, ixekizumab) and the IL-17 receptor antagonist brodalumab [3], and the IL-23 targeted treatments (guselkumab, tildrakizumab, risankizumab) [4]. Most recently, another anti-TNF treatment, certolizumab pegol, has been licensed in this population [3].

Due to the large number of available treatments for psoriasis, clinical and healthcare funding decisions often require a comparison between available treatment options. Although numerous clinical trials have been undertaken in psoriasis, there is a limited number of head-to-head trials, none of which compare all the available treatment options. In this disease area, similarities in study design and patient populations make these trials particularly suitable for comparison in a network meta-analysis (NMA). This method allows the comparison of multiple interventions in a single coherent analysis so all available interventions can be compared, using both direct and indirect evidence, thus resulting in a more precise estimate of treatment effect and the ability to rank treatments against one another, even if they have not been compared directly in a head-to-head randomized controlled trial (RCT) [5].

Statistical methods used to carry out NMAs are usually classed as either Bayesian or frequentist, which differ in some of their assumptions, methods for data synthesis, and interpretation of results [6, 7]. Although both analytical frameworks facilitate comparisons of multiple treatments, the Bayesian approach has been used most often, likely due to several advantages over the frequentist approach [5]. These include the ability to incorporate prior knowledge about the treatment effects into the model, and easy calculation of treatment ranks [8]. It is also the preferred framework of health technology assessors (HTAs) such as the National Institute for Health and Care Excellence (NICE) in the United Kingdom [9].

For both Bayesian and frequentist approaches, two model types can be used when performing an NMA: fixed-effect and random-effects models. A fixed-effect model assumes that all studies are estimating a single effect size and differences observed between study estimates are a result of chance. The results from a fixed-effect model provide an estimate of this underlying effect. The random-effects model assumes different studies are estimating different, but related effects. This is due to the differences between studies in patient and study characteristics (between-study heterogeneity). The results from a random-effects model can be interpreted as a mean of these different effects [5, 10].
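The distinction between the two model types can be illustrated with a minimal sketch for a single pairwise comparison. It is not the method of any included NMA: it assumes inverse-variance weighting for the fixed-effect model and the DerSimonian-Laird moment estimator of between-study variance for the random-effects model, and the log risk ratios and variances are hypothetical values chosen only for illustration.

```python
def pool_fixed(effects, variances):
    """Inverse-variance fixed-effect pooled estimate and its variance."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    return pooled, 1.0 / total


def pool_random(effects, variances):
    """DerSimonian-Laird random-effects estimate, its variance, and tau^2."""
    weights = [1.0 / v for v in variances]
    fixed, _ = pool_fixed(effects, variances)
    # Cochran's Q measures how far study estimates scatter around the fixed-effect mean.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum(weights) - sum(w * w for w in weights) / sum(weights)
    # Between-study variance (heterogeneity), truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1.0 / (v + tau2) for v in variances]
    total = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / total
    return pooled, 1.0 / total, tau2


# Hypothetical log risk ratios (treatment vs placebo) and their variances.
log_rr = [2.0, 2.6, 3.1]
var_log_rr = [0.04, 0.05, 0.06]

fe_estimate, fe_variance = pool_fixed(log_rr, var_log_rr)
re_estimate, re_variance, tau2 = pool_random(log_rr, var_log_rr)
```

When the estimated between-study variance is greater than zero, the random-effects estimate has a larger variance than the fixed-effect estimate, which reflects its interpretation as the mean of a distribution of effects rather than a single common effect.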

Other than the choice of an analytical framework and model type, the results of an NMA may be influenced by a number of factors, such as which trials are included, bias within these trials, inconsistency (differences between direct and indirect estimates of treatment effect), and heterogeneity (and how it is addressed) [11].

As mentioned above, numerous NMAs have been undertaken in psoriasis; however, it is unclear whether these have always used robust methods or arrived at the same conclusions. Although NMAs are frequently used to support decision making, skepticism surrounding their use, and mistrust of their results, remain [12].

To address these concerns, a systematic literature review (SLR) of the published NMAs assessing biologic treatments for moderate-to-severe plaque psoriasis was performed. The aim was to assess the methodological quality of these analyses and explore the differences in their methods and results. Potential reasons for any differences in results were also considered.

Methods

An SLR was conducted to identify all NMAs assessing biologic treatments for moderate-to-severe plaque psoriasis. NMAs that evaluated two or more currently licensed biologic treatments in patients with moderate-to-severe psoriasis were included. Analyses evaluating any efficacy or safety outcome were included. Only NMAs using aggregate level data and a Bayesian or frequentist approach were included to allow for coherent comparisons of methodology and results. Indirect treatment comparisons carried out using methods such as those described by Bucher [13], or those using matching-adjustment or simulation methods were excluded because they are more limited in scope. Full eligibility criteria can be found in the supplementary material. The review was carried out in accordance with a protocol developed prior to commencement of work. This article is based on previously conducted studies and does not contain any studies with human participants or animals performed by any of the authors.
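For context, the kind of adjusted indirect comparison excluded here can be sketched for two treatments, A and B, that have each been compared with a common comparator C. The sketch below assumes effects on the log scale, independent trials, and a normal approximation for the confidence interval; the function names and the numerical inputs are hypothetical.

```python
import math


def bucher_indirect(d_ac, se_ac, d_bc, se_bc):
    """Indirect A-vs-B log effect through common comparator C.

    The point estimate is the difference of the two direct log effects,
    and the variances add because the two sets of trials are independent.
    """
    d_ab = d_ac - d_bc
    se_ab = math.sqrt(se_ac ** 2 + se_bc ** 2)
    return d_ab, se_ab


def ratio_with_ci(log_effect, se, z=1.96):
    """Back-transform a log effect to a ratio with an approximate 95% CI."""
    return (
        math.exp(log_effect),
        math.exp(log_effect - z * se),
        math.exp(log_effect + z * se),
    )


# Hypothetical log odds ratios versus placebo (C) for treatments A and B.
d_ab, se_ab = bucher_indirect(d_ac=1.5, se_ac=0.3, d_bc=1.0, se_bc=0.4)
ratio, lower, upper = ratio_with_ci(d_ab, se_ab)
```

Because this approach handles one pair of treatments through one comparator at a time, it cannot combine direct and indirect evidence across a whole network, which is the sense in which it is more limited in scope than an NMA.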

Study Identification

Embase, MEDLINE, MEDLINE In-Process, and the Cochrane Library were searched on 13 March 2019 and updated 19 February 2020, without time restrictions. Search strategies included text and index terms for psoriasis, relevant biologic treatments and study design. Full search strategies can be found in the online supplement. Reference lists of included publications were checked to identify any additional NMAs. After removal of duplicates in EndNote (Thomson Reuters), titles and abstracts were imported into DistillerSR (Evidence Partners) and assessed for inclusion by one reviewer, with a second reviewer independently performing a 40% check. Had the check revealed a significant number of disputes, a full 100% check would have been performed. This did not turn out to be necessary. Full texts were independently assessed for inclusion by two reviewers. Discrepancies were resolved by discussion or, when necessary, by a third reviewer.

Data Extraction and Quality Assessment

Details of each NMA were extracted, including the definition of the question addressed by the NMA, the sources of funding and methods used for identification and selection of studies, data extraction, critical appraisal, and data synthesis. In addition, information was collected on the studies included in each NMA, as well as the base-case results for all treatments compared with placebo, and the ranking of treatments based on efficacy or safety.

Data extraction and subsequent quality assessment of studies were carried out by one reviewer and checked by another. Any disagreements were resolved by discussion or, if necessary, by a third reviewer. Quality assessment was carried out using the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) checklist for assessing reliability of NMAs [14]. This checklist covers five main areas: (1) the evidence base used, (2) analysis methods, (3) reporting quality and transparency, (4) interpretation of findings, and (5) conflicts of interest [14]. There are 22 questions, allowing readers to better understand the applicability and credibility of the NMA and its results. Similar topic areas are covered in other commonly used assessment checklists, such as one developed by the NICE Decision Support Unit [15].

Data Analysis

The methodological details of identified NMAs were compared in a narrative summary. Psoriasis Area and Severity Index (PASI) response was the most commonly reported efficacy outcome, allowing the broadest comparison between NMAs. For that reason, results for PASI response were the focus of the analysis (safety outcome results were also summarized). A forest plot for each biologic versus placebo was generated using RStudio [16] to present the individual NMA results, with risk ratios (RR) and odds ratios (OR) considered separately. The results of each NMA reporting a treatment effect for the same comparison were compared by visual inspection. Ranking of treatments based on PASI response and safety outcomes was also compared between NMAs, considering the presence of any uncertainty.

Results

Search Results

Electronic searches identified 3043 publications. An additional three were identified through reference checking. After removal of duplicates, a total of 2271 titles and abstracts were assessed for inclusion and 208 papers were assessed in full text; a total of 25 analyses were included. Details can be seen in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) diagram (Fig. 1).

Fig. 1 A PRISMA diagram showing the flow of the included and excluded studies

Included NMAs

Study Identification and Selection in the NMAs

Table 1 provides brief details of the inclusion criteria in each identified analysis. All analyses included adults with moderate-to-severe psoriasis; Jabbar-Lopez 2017 [17] also included children.

Table 1 Study eligibility criteria in the included NMAs

The majority of identified NMAs searched for primary studies in MEDLINE, Embase, and the Cochrane Library. However, in two NMAs only MEDLINE was searched [18, 19], and in another the search methods were not reported [20]. The search strategies in the identified NMAs were generally comprehensive, with the exception of seven analyses, for which the methods used were likely to miss relevant studies [18, 19, 21,22,23,24,25,26]. Risk of bias was assessed using the Jadad scale for randomized controlled trials [27] in six, the Cochrane Collaboration’s tool for assessing risk of bias [28] in nine, the Newcastle–Ottawa Scale [29] in one and the NICE methodology checklist for RCTs [9] in two NMAs. Six analyses did not clearly report assessment of risk of bias [20, 22, 23, 30,31,32]. Details of study identification and selection can be found in the online supplement.

A range of relevant biologic interventions were considered in the analyses, with the total number ranging from four in Geng 2018 [22] to twelve in the Cochrane Reviews by Sbidian and colleagues [33, 34]. The majority of NMAs considered licensed doses. Exceptions included Jabbar-Lopez 2017 [17] and the Cochrane Reviews [33, 34], which included any dose of treatments of interest; Geng 2018 [22], which included unlicensed doses of etanercept and infliximab; Woolacott 2006 [35], which included unlicensed doses of infliximab; and Sawyer 2018 (induction [i]) [36], Sawyer 2019 [37], and Cameron 2018 [38], which included unlicensed doses of several therapies where their inclusion added indirect evidence. Two unlicensed doses of ustekinumab were also commonly included in analyses: 45 mg and 90 mg, irrespective of patient’s body weight. Eleven of 22 NMAs evaluating ustekinumab included the licensed weight-based dose [20, 23, 24, 26, 30, 36,37,38,39,40,41].

The most frequently assessed outcomes were PASI response in 23 NMAs, followed by safety in nine NMAs. Seven analyses assessed other efficacy or quality of life outcomes [17, 20, 21, 25, 33, 34, 38]. All but one of the 25 analyses evaluated treatments at the end of the induction phase (between 10 and 24 weeks, depending on the analysis), while Sawyer 2018 (maintenance [m]) and Armstrong 2020 (m) compared treatments at one year. All NMAs included RCTs; Sawyer 2018 (m) and Armstrong 2020 (m) additionally included long-term extensions of RCTs and Wu 2020 also included non-randomized studies. Eight NMAs reported including phase 2 studies and above [17, 30, 32,33,34, 40,41,42], five NMAs restricted inclusion to just phase 3 studies and above [19, 20, 24, 38, 43] and the rest did not specify.

Source of Funding and Conflicts of Interest

Only one analysis [22] did not report on conflicts of interest or source of funding. Two NMAs declared no conflicts of interest including any sources of funding [18, 25], two declared conflicts but no source of funding [43, 44], and two reported on potential conflicts but not on sources of funding [19, 21]. The remaining eighteen provided details of conflicts of interest and the funding received. Seven NMAs [17, 23, 24, 26, 33,34,35] were funded by public grants (such as from the Institute for Clinical and Economic Review [24] or the Cochrane Collaboration [33, 34]) and eleven by drug manufacturers [20, 30,31,32, 36,37,38,39,40,41,42].

NMA Analytical Methods and Assessment of Quality using ISPOR Criteria

The methods used in each of the NMAs can be seen in Table 2. Nineteen NMAs used a Bayesian approach and six used a frequentist approach. PASI response was assessed in 23 NMAs; Messori 2015 [18] assessed safety outcomes only and Wu 2020 [44] assessed the impact of biologic therapy on body weight and body mass index. Ten NMAs evaluated PASI as an ordered categorical outcome, eleven treated PASI response as a dichotomous outcome, and one analyzed the outcome both ways [38]. One analysis compared biologic therapies on their cumulative clinical benefit, measured by the area under the curve (AUC) for PASI 75, 90, and 100 [41].

Table 2 Methods employed by each included network meta-analysis

Most analyses considered each dose of a treatment separately, or only combined different dosing schedules that resulted in the same weekly dose (for example, etanercept 25 mg twice a week and 50 mg once a week). However, Jabbar-Lopez 2017 [17] and the Cochrane Reviews [33, 34] included any evaluated dose (licensed or unlicensed, including phase 2 doses not tested in phase 3) and combined (pooled) these in the analysis.

The details of critical appraisal of the NMAs using the ISPOR checklist can be seen in Table 3. Seven analyses were judged to have performed systematic reviews that could have failed to identify and include all relevant RCTs [18, 19, 21, 22, 24,25,26], and one study did not report details on how included studies were identified and included [20]. The quality of primary studies included in the NMAs was generally reported as good, minimizing the risk of bias, with the exception of non-RCT studies included by Wu et al. [44]. There appeared to be no evidence of selective reporting of results in primary studies included in the identified NMAs.

Table 3 Assessment of the quality of included network meta-analyses using the ISPOR checklist

In all NMAs the identified studies formed a connected network. All analyses used statistical methods that preserve within-study randomization, with the exception of Armstrong 2020 (m), which used an unanchored indirect comparison.

The rationale for the choice of a fixed-effect or random-effects model was discussed in nine analyses [18, 24, 32, 34,35,36, 38, 40, 43]. Of the seventeen NMAs using a random-effects model, assumptions about heterogeneity were explored in eleven. Imbalances in effect modifiers across treatment comparisons were considered in 11 analyses; methods that aimed to reduce the impact of bias due to these imbalances were used in ten analyses [24, 30, 33, 34, 36,37,38, 40, 42, 43]. Seven analyses considered the impact of patient characteristics on treatment effects in a qualitative fashion [34, 36, 38,39,40, 42, 43].

All analyses included both direct and indirect evidence in their evidence networks. The consistency of direct and indirect evidence was discussed in only fifteen (details of methods used are provided in Table 2). Two analyses [32, 35], however, did not need to assess inconsistency as there were no closed loops in their network.
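One simple way to examine consistency within a single closed loop is to compare the direct estimate for a contrast with the indirect estimate obtained through the rest of the loop. The sketch below is only an illustration of that idea, not a description of the methods used by the included NMAs (those are listed in Table 2). It assumes the two estimates are independent and approximately normally distributed on the log scale; the function name and inputs are hypothetical.

```python
import math


def inconsistency_test(direct, se_direct, indirect, se_indirect):
    """Compare direct and indirect log effects for the same contrast.

    Returns the difference (inconsistency estimate), its z statistic,
    and a two-sided p-value from the standard normal distribution.
    """
    diff = direct - indirect
    se_diff = math.sqrt(se_direct ** 2 + se_indirect ** 2)
    z = diff / se_diff
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return diff, z, p_value


# Hypothetical direct and indirect log odds ratios for one comparison.
diff, z, p_value = inconsistency_test(
    direct=0.9, se_direct=0.25, indirect=0.5, se_indirect=0.35
)
```

A test of this kind has low power when loops contain few trials, so a non-significant result does not by itself establish that direct and indirect evidence agree.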

Reporting of NMA results was not always complete. All but five NMAs presented the results of individual primary studies, but the majority did not present the direct comparison results separately from the indirect comparison or NMA results. Fourteen analyses reported results of all pairwise comparisons from the NMA for the outcome of interest and 16 reported treatment ranking in some form. Three studies [19, 22, 43] reported only the rank order of evaluated therapies based on comparative effect sizes and one reported the mean rank, but not its associated uncertainty [20]. Three studies presented rank results as “rankogram” plots, illustrating the probability distribution for each treatment’s rank [18, 25, 39]. Nine studies [17, 21, 23, 26, 33, 34, 38, 40, 44] presented overall rankings based on the Surface Under the Cumulative RAnking curve (SUCRA), which is a numeric representation of overall rank [45]. Higher SUCRA values (bounded at 100%) indicate that a therapy is ranked best or near the best, and lower values (bounded at 0%) indicate that a therapy is ranked among the worst.
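The relationship between a rankogram and SUCRA can be shown with a short sketch. It assumes that, for one treatment, the probability of occupying each rank (rank 1 being best) is already available, for example from the posterior samples of a Bayesian NMA. SUCRA is then the mean of the cumulative rank probabilities over the first a − 1 ranks, where a is the number of treatments. The function name and the probabilities are hypothetical, and the value is returned as a fraction (multiply by 100 for a percentage).

```python
def sucra(rank_probabilities):
    """SUCRA for one treatment from its rank probabilities.

    rank_probabilities[k] is the probability that the treatment holds
    rank k + 1, where rank 1 is best. The list covers all ranks and
    should sum to 1.
    """
    n_treatments = len(rank_probabilities)
    if n_treatments < 2:
        raise ValueError("SUCRA needs at least two treatments")
    cumulative = 0.0
    total = 0.0
    # Sum the cumulative probabilities for ranks 1 .. a - 1.
    for probability in rank_probabilities[:-1]:
        cumulative += probability
        total += cumulative
    return total / (n_treatments - 1)


# Hypothetical rankogram for one treatment in a four-treatment network.
example = sucra([0.6, 0.3, 0.1, 0.0])
```

A treatment that is certain to rank first has a SUCRA of 1 (100%), and one that is certain to rank last has a SUCRA of 0. Because SUCRA collapses the rank distribution into a single number, it does not by itself convey the uncertainty shown in a rankogram.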

Comparison of Primary Studies Included in NMAs

The trials included in each NMA can be seen in the supplementary material. Overall, the studies included in each NMA were similar, considering the interventions of interest and the search dates. The studies included in each NMA largely reflected the objective of the analysis, quality of the search, and subsequent choices made at the protocol stage. Some NMAs included evidence for oral systemic therapies like apremilast, fumaric acid esters, methotrexate and cyclosporine; therefore, placebo and head-to-head evidence for these therapies were included. Many NMAs excluded phase 2 studies or imposed a minimum trial sample size. Jabbar-Lopez 2017 [17] was the only study to include studies in pediatric patients and to exclude studies with fewer than 50 participants. The 2020 Cochrane Review included studies focused on nail psoriasis [46, 47], whereas these studies did not meet the inclusion criteria for other NMAs. Similarly, two RCTs [48, 49] focusing on patients with psoriatic arthritis appear to have been included in a pair of NMAs [22, 25] but nowhere else.

Another discrepancy between NMAs was the inclusion of studies evaluating only unlicensed doses, such as phase 2 dose-ranging studies of ixekizumab [50], secukinumab [51, 52], risankizumab [53] and guselkumab [54, 55], which were included in the Cochrane Reviews [33, 34], Jabbar-Lopez 2017 [17], and three others [21, 25, 26].

Further differences can be seen for infliximab trials, many of which were excluded in the 2017 Cochrane Review [34] due to their assessment of outcomes at 10 weeks. In the 2020 update, the timepoint criterion was relaxed and studies lasting at least 8 weeks were included [33]. Jabbar-Lopez 2017 [17] stated that 10-week trials were excluded, but in fact included infliximab data from these trials.

NMA Results

PASI Response

Fourteen analyses reported results for PASI 90; 19 for PASI 75; 11 for PASI 50 and 5 for PASI 100. Comparative effects were presented as RRs, ORs, numbers needed to treat (NNT) and proportion of maximum AUC [41]. Results of NMAs for PASI 90 RR can be found in Fig. 2. The results for OR and other PASI levels are included in the supplementary material.

Fig. 2 Risk ratios versus placebo at PASI 90 across NMAs

All analyses reported the biologic treatments to be significantly more efficacious than placebo. Across all analyses using RRs, efficacy estimates were broadly similar for each intervention. The major exceptions were the 2017 and 2020 Cochrane Reviews [33, 34], which provided lower estimates of efficacy than other NMAs for each level of PASI and for each intervention. The results measured using an OR appeared to be mostly consistent. Two major differences were seen: for PASI 90 response to ustekinumab compared with placebo, where Jabbar-Lopez 2017 [17] provided a substantially lower OR than the other NMAs; and for infliximab compared with placebo, where Xu 2019a reported a very high upper limit of the 95% confidence interval [25].

For older biologics, in particular infliximab and etanercept, a trend was observed whereby efficacy estimates reported by each analysis increased over time until the analysis by Signorovitch 2015 [42], after which they began to decrease; a plot depicting this can be found in the supplementary material.

Table 4 shows the treatment ranking across the included analyses. Ranking was reported in 14 NMAs. Ten analyses reported ranking on PASI 75, six on PASI 90, and one on all levels of PASI. For the remaining analyses and where rank across multiple categories was reported, ranking was inferred based on the results for PASI 90, if available, and on PASI 75 if not. Two analyses did not report PASI response and are therefore not included in this Table [18, 44].

Table 4 Efficacy ranking of treatments at PASI 90 or PASI 75 across network meta-analyses

The treatment ranking appears consistent across analyses published at a similar time. As newer biologics became available, these were generally ranked higher than older therapies, except for infliximab, which continues to rank amongst the most efficacious treatments when short-term efficacy is assessed. In recent NMAs the most efficacious treatments at PASI 90 were the IL-17 inhibitors (brodalumab, ixekizumab, secukinumab), the IL-23 inhibitors (guselkumab and risankizumab), and infliximab. All of these treatments tend to be significantly more efficacious than adalimumab, certolizumab, etanercept, and ustekinumab. As far as differences between drugs within and across the IL-17 and IL-23 inhibitor classes are concerned, several NMAs have shown the efficacy of ixekizumab and brodalumab to be significantly greater than that of secukinumab, and similar to that of guselkumab and risankizumab.

Safety

Safety outcomes were analyzed in seven NMAs [17, 18, 21, 26, 33, 34, 38]. Five NMAs investigated the proportion experiencing any adverse event (AE), five investigated the incidence of serious AEs, and two others looked at infectious AEs and discontinuations due to AEs, respectively. Due to the small numbers of NMAs assessing safety outcomes and variation in the interventions considered, comparisons between these analyses were very limited. Full results can be seen in the supplementary materials.

Discussion

Findings in Context

The availability of multiple biologic treatments for moderate-to-severe plaque psoriasis has resulted in their efficacy and safety being compared in numerous NMAs. Although NMAs are often used to support reimbursement decisions [56], doubts regarding their credibility may have limited their application to clinical decision making [12]. This SLR is, to our knowledge, the first to review all published NMAs evaluating biologics for the treatment of psoriasis. We have included 25 NMAs that were published between 2006 and 2020.

We found that 23 NMAs provided results for different levels of PASI response, and only seven provided safety results. Safety results were often inconclusive and inconsistent across analyses and varied in terms of the outcomes considered. Although an important aspect of treatment comparisons, safety appears to be disregarded or insufficiently analyzed.

We found that, overall, the included NMAs met at least half of the criteria included in the ISPOR checklist, reflecting the moderate to good methodological quality of the NMAs. The majority of NMAs appeared to take sufficient steps to identify all relevant trials, which always formed connected networks. All used methods that preserve within-study randomization and reported the evidence network in graphical or tabular form. One of the major limitations of the identified analyses was that fewer than half of the NMAs discussed the rationale for the choice of model. In addition, just over half of the analyses addressed the impact of bias due to differences in effect modifiers, for example through sensitivity analyses and the use of different models [17, 24, 26, 30, 31, 33, 34, 36,37,38, 40, 42, 43].

Twenty-one NMAs provided details of their funding and conflicts of interest. After accounting for uncertainty in the head-to-head treatment effects and relative ranks, all analyses came to similar conclusions, regardless of funding source. Publicly funded and industry funded analyses that used similar methods reported similar results and conclusions. Any apparent differences by funding source are better explained by comparing the methodological approach and assumptions underpinning the synthesis of evidence.

The focus of our analysis was the comparisons of the treatment efficacy compared with placebo for PASI outcomes at the end of the induction phase, as these were most widely reported. This comparison did not include Sawyer 2018 (m) [40], as it was the only analysis to investigate longer-term efficacy and was therefore not comparable with other NMAs. In addition, Gomez-Garcia 2016 [23] did not report results for active treatments compared with placebo and therefore could not be considered here either.

We identified that the efficacy of older biologics (anti-TNF agents) compared with placebo appeared to decrease slightly from approximately 2015 onwards. One potential explanation for this trend could be a change in the percentage of patients in trials who are biologic experienced, due to biologic therapies becoming the standard of care. If more recent trials included a higher proportion of patients who have previously tried TNF inhibitors, the proportion of patients responding to these treatments could be lower, compared with earlier trials. It also has the potential to affect the placebo response, which would therefore support the idea that analyses must be placebo-adjusted.

When analyses published within a similar time period were considered, their results were broadly similar, emphasizing the consistency in reported treatment effects. The only major exception appeared to be the original and updated Cochrane Reviews [33, 34], which consistently provided substantially lower estimates of treatment efficacy compared with placebo than other NMAs published around the same time. These NMAs were outliers in some methodological respects, including the choice of approach (frequentist, rather than Bayesian), the handling of PASI outcomes (dichotomous), and evaluated doses.

Six analyses used a frequentist approach and eighteen analyses used a Bayesian one, with one study employing both methods and demonstrating the statistical approach to have no impact on the results [20]. Treatment effects are reported in a number of different ways, including risk ratios, odds ratios, risk differences, and numbers needed to treat, and this variety makes it difficult to compare results across studies beyond directional trends and conclusions of statistical difference.
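The difficulty of comparing results reported on different scales can be seen by computing each measure from the same two-arm data. The sketch below uses hypothetical responder counts, not data from any included trial, and assumes the placebo response is above zero and the treatment response is below 100% so that the ratios are defined.

```python
import math


def effect_measures(events_treatment, n_treatment, events_placebo, n_placebo):
    """Risk ratio, odds ratio, risk difference and NNT from two-arm counts."""
    p_t = events_treatment / n_treatment
    p_p = events_placebo / n_placebo
    risk_difference = p_t - p_p
    return {
        "risk_ratio": p_t / p_p,
        "odds_ratio": (p_t / (1.0 - p_t)) / (p_p / (1.0 - p_p)),
        "risk_difference": risk_difference,
        # NNT is the reciprocal of the risk difference (infinite if no difference).
        "nnt": 1.0 / risk_difference if risk_difference != 0 else math.inf,
    }


# Hypothetical PASI 75 responders: 70 of 100 on treatment, 5 of 100 on placebo.
measures = effect_measures(70, 100, 5, 100)
```

With these hypothetical counts the risk ratio is 14 while the odds ratio is roughly 44, because the odds ratio moves away from the risk ratio as the response in the treatment arm becomes common. Estimates on the two scales are therefore not interchangeable, which is why RRs and ORs were considered separately in this review.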

Considering model choice, we found that NMAs either analyzed every level of PASI response separately (binomial models) or jointly (multinomial models). Whilst the choice of model only appears to result in small differences between treatment efficacy and safety estimates, these can potentially lead to different analysis results when considering the relative ranking of treatments. The use of a multinomial model allows inclusion of all available information. It also has the advantage of using the evidence on the relationships between different PASI response levels to make predictions about the relative performance of a treatment at a threshold not reported in a given study or studies, thus allowing comparisons that would be otherwise impossible using a binomial model [5]. This approach results in coherence across all PASI outcomes, a feature crucial when the results of an analysis are utilized in an economic model. However, when the major purpose of the analysis is to provide the best estimate of treatment efficacy at a specific level of response, a binomial approach may be more helpful. One of the potential advantages of using a binomial approach is that it only includes evidence from studies that reported results for a particular level of PASI response and therefore avoids the additional uncertainty resulting from extrapolation from other PASI levels. It also allows for more subtle differences between the responses at different levels of PASI to be observed. However, it appears that in the case of psoriasis the choice of binomial or multinomial model did not substantially affect estimates of efficacy. This was clearly highlighted in Cameron 2018 [38], where the base-case analysis was performed using a binomial model and a supplementary analysis was carried out using a multinomial model. Based on the results reported, relatively small differences between both analyses in treatment efficacy were seen.

For their base-case analysis, 17 NMAs used a random-effects model, three used a fixed-effect model and five did not state what model was used. Placebo adjustment was reported in six NMAs [24, 30, 36,37,38, 42]. This method accounts for cross-trial differences that may otherwise be difficult to identify and incorporate in the model. It will also likely show if the analysis results are influenced by these differences, thus minimizing inaccuracies in analysis results [57]. All six NMAs found the adjusted model was a better fit for the data than the unadjusted one.

As can be seen in the online supplement, the trials included in NMAs were broadly consistent, taking into account the time of publication. However, there were some cases where the inclusion criteria of the analyses differed, potentially leading to slight differences in results.

One of the reasons for the discrepancies in the trials included in the NMAs was the timepoint of interest for the analyses. Across analyses, the timepoints at which treatments were compared varied between 8 and 24 weeks. One of the major reasons for discrepancies appears to be the inclusion or exclusion of the 10-week timepoint corresponding to the duration of the majority of infliximab trials. Only the 2017 Cochrane Review [34] did not include 10-week data. As a result, this NMA excluded some major infliximab trials [58,59,60] and based its efficacy estimate on two trials comparing infliximab with methotrexate and etanercept [61, 62]. It reported lower estimates of infliximab efficacy compared with other analyses. Although including only trials with a small range of follow-up times may reduce heterogeneity, it may not always result in a clinically relevant estimate. Because the biologics used in psoriasis have different induction periods, at the end of which a decision is made on whether to continue treatment, providing estimates relevant to that decision point (which varies from 10 to 16 weeks) may be more important. Doing so also ensures that the most relevant evidence base is included.

In some cases, different doses of treatments for psoriasis are included in clinical trials. In particular, phase 2 studies may include a range of doses later judged as subtherapeutic or intolerable and not pursued in later-phase research. The identified NMAs that included a range of treatment doses handled these differently.

Whilst some analyzed different treatment doses separately, others pooled evidence on some or all doses of a drug [17, 25, 33, 34]. Although pooling doses may provide the advantage of utilizing all available evidence, analyzing doses separately is generally more relevant to clinical decision making, because in clinical practice patients are prescribed the licensed dose of a drug. The efficacy and safety of drugs are usually dose-dependent, and therefore the effect estimates obtained by pooling data for unlicensed and licensed doses may not provide a reliable estimate of the treatment effects. In the case of biologics in psoriasis, pooling is likely to underestimate the efficacy of a treatment, as the doses which have not been marketed are frequently lower and do not offer the full therapeutic benefit. Although the evidence on safety was limited, pooling of licensed and unlicensed doses may also result in the under- or overestimation of adverse events.
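The direction of this bias can be illustrated with simple arithmetic. The following is a minimal sketch using entirely hypothetical arm-level counts (they are not taken from any trial or NMA in this review), and it uses naive pooling of responders rather than a full NMA model.

```python
# Hypothetical arm-level data, invented purely for illustration.
# "licensed" is the marketed dose; "unlicensed_low" is a lower, unmarketed dose.
licensed = {"responders": 80, "patients": 100}
unlicensed_low = {"responders": 50, "patients": 100}


def pooled_rate(*arms):
    """Naively pool arms by summing responders and patients."""
    total_responders = sum(arm["responders"] for arm in arms)
    total_patients = sum(arm["patients"] for arm in arms)
    return total_responders / total_patients


# Response rate for the licensed dose analyzed on its own
licensed_rate = pooled_rate(licensed)

# Response rate when the lower, unlicensed dose is pooled with it
pooled = pooled_rate(licensed, unlicensed_low)

print(f"Licensed dose only: {licensed_rate:.2f}")
print(f"Doses pooled:       {pooled:.2f}")
```

With these assumed counts, the pooled response rate falls below that of the licensed dose alone, which is the pattern expected when a less effective unlicensed dose contributes to the estimate.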

The impact of pooling evidence for different doses was most pronounced in the base-case analysis of the Cochrane Reviews [33, 34]. In contrast, the results of Jabbar-Lopez 2017 [17] appeared only slightly less favorable than in the other NMAs of that time period, likely due to the inclusion of few treatments with evidence for multiple dosing regimens. Nevertheless, for clinical decision making, these results may not provide the best summary of the relevant evidence base.

However, it must be noted that whilst both the Cochrane Reviews and Jabbar-Lopez 2017 pooled doses for their base-case analysis, they did also perform sensitivity analyses (available in supplementary analyses) in which doses were not pooled. It is, therefore, more appropriate to say that the choice of base-case should be more carefully considered.

Limitations

There were several limitations to this review. Firstly, we did not include HTA reports. Although these are of high importance and largely shape clinical practice in different countries, the details of these analyses are often confidential and therefore their inclusion in our analysis would likely not be possible. However, some of these have been published and were therefore included in our analysis [24, 35]. For similar reasons, we did not consider conference abstracts to be eligible in this SLR. When screening studies, the second reviewer performed only a 40% check, so some NMAs could potentially have been missed. This is unlikely, however, as the level of disagreement between the two reviewers was low and the chances of missing studies were further minimized by checking reference lists in identified publications. Also, we only included analyses in English; however, this is likely to have excluded few, if any, relevant analyses.

Only NMAs that included two or more biologics were of interest for our analysis. Whilst non-biologic systemic treatments are used in moderate-to-severe psoriasis and are often licensed in the same population, in practice they would be used before biologics. Therefore, we believe that their inclusion in our analysis would add unnecessary complexity. Similarly, we excluded simple and matched indirect treatment comparisons, where only two treatments were compared with placebo. In an area such as psoriasis, with its abundance of therapeutic options, such analyses would only provide a very narrow view of the treatment landscape.

Furthermore, we compared results from NMAs for PASI response, as these were the most commonly reported results. It is possible, however, that if we had considered outcomes such as Physician’s Global Assessment (PGA) and Dermatology Life Quality Index (DLQI), the results of our analysis could have been different. The scope for comparison of the included NMA evidence for these outcomes was limited, however, as only five analyses evaluated PGA [21, 25, 33, 34, 38] and five evaluated DLQI [17, 20, 25, 33, 34].

Conclusions

There are numerous analyses assessing the efficacy and safety of treatments for moderate-to-severe psoriasis. Despite their methodological differences and differences in funding sources, the analyses reported broadly similar results and conclusions. This consistency highlights the reliability of NMAs for use in clinical practice and reinforces the finding that newer biologics are more efficacious than older treatments. However, there were some important differences in the results of NMAs, likely resulting from the methods used and assumptions made.

When using NMAs to inform clinical practice, consideration should be given to those NMAs that include most, if not all, of the features that make a methodologically valid and robust analysis: inclusion of all relevant trials and comparator treatments, thorough assessment of heterogeneity and inconsistency, and complete reporting of comparative effects and associated uncertainty. To ensure that results are relevant and applicable, comparator treatments should be synthesized in a way that is reflective of their marketing authorization and use in clinical practice, considering characteristics such as population, dose, and duration. Finally, whilst efficacy is of high importance to patients, so are outcomes such as quality of life and safety. An analysis that captures all of these factors would be optimal for use in clinical decision making.