Key Points for Decision Makers

Dupilumab was evaluated in 14 comparisons and was mostly cost-effective, whereas upadacitinib was the only emergent treatment that was never classified as cost-effective.

One needs to be careful when comparing results of economic evaluations for atopic dermatitis, as the underlying perspectives, designs and guidelines differed and caused a great variance in results, especially for dupilumab comparisons.

1 Introduction

Atopic dermatitis (AD), or atopic eczema, is one of the most common skin diseases [1]; 4.4% of adults in the European Union (EU, including the United Kingdom [UK]) and 4.9% of adults in the USA suffer from this chronic inflammatory disease [2, 3]. Affected people experience severe itching, erythema, scaling and skin pain, and some patients report vesiculation and crusting [4, 5]. Additionally, patients suffer from stigmatization, lower self-esteem and social isolation, leading to sleep disorders and depressive or anxiety disorders [6,7,8]. Furthermore, patients with AD often have additional atopic diseases such as allergic rhinitis or asthma [7]. AD therefore reduces patients’ quality of life [9] and leads to absenteeism and productivity losses [10]. Good management can reduce the burden of disease, but most patients with AD suffer from their symptoms throughout their lives [11].

There are a variety of treatment options available for different severity levels. However, the application of these treatments is often time consuming and uncomfortable, or treatment response is limited [7, 12]. Therefore, it is clinically and societally relevant that new therapies that can meet these unmet care needs are developed [7]. In recent years, promising new drugs have become available and more therapies are in development [7]. These treatments are associated with higher effectiveness but are also more expensive, which leads to challenges in reimbursement decision making [7]. To be able to reasonably assess these emerging treatments for AD, decision makers need detailed information not only on the clinical efficacy and safety of new drugs but also on their cost-effectiveness. Even though there are studies available that assess the cost-effectiveness of novel AD therapies [12,13,14,15,16,17,18,19,20,21,22,23,24,25,26], currently no overview of the cost-effectiveness of emerging AD treatments exists.

The objective of this research is therefore to conduct a systematic literature review (SLR) of economic evaluations that assess the cost-effectiveness of emerging AD treatments for children, adolescents or adults. Emerging treatments are those that received marketing authorization from the US Food and Drug Administration (FDA) or European Medicines Agency (EMA) in 2017 or later, that are currently in the FDA or EMA marketing authorization process, or that are in phase 2 or 3 of clinical trials.

2 Methods

The recommendations of the 2020 Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) were followed during the conduct of the SLR [27]. This entailed, among other things, the publication of a protocol in PROSPERO (ID: CRD42022343993), thorough abstract and full-text screening by two independent reviewers and the quality assessment of articles designated for inclusion. Search results were managed using Covidence. With this software, duplicates were removed, and title, abstract and full-text screening was conducted. Microsoft Excel was used for data extraction and quality assessment.

2.1 Literature Search and Study Selection

The monoclonal antibody dupilumab can be considered the beginning of a new treatment paradigm in AD. Therapies that were developed before dupilumab are not of interest in this review. Dupilumab received marketing authorization in 2017 from both the FDA and the EMA [28, 29]. Hence, it was assumed that no relevant economic evaluations were published before 2017. Therefore, only abstracts and peer-reviewed scientific articles published between 2017 and September 2022 were included. The literature search was conducted in Medline (via Ovid), Embase, UK National Health Service Economic Evaluation Database (NHS EED) and EconLit. On the basis of the findings, backward and forward referencing was performed. For abstracts that met the selection criteria, authors were contacted and asked to provide more information in the form of full texts. When there was no response and the abstract did not include sufficient information, the abstract was excluded. Additionally, reports published by the National Institute for Health and Care Excellence (NICE), the Institute for Clinical and Economic Review and the Canadian Agency for Drugs and Technologies in Health (CADTH) were manually searched. Searches were limited to references available in English, German and French.

The search strategy (see Supplementary Information 1) was developed with the support of experienced researchers and used terms encompassing the population, interventions and study design, in line with the Centre for Reviews and Dissemination’s (CRD) guidance for undertaking reviews in healthcare [30]. Once the literature search was completed and duplicates were removed, the inclusion criteria were applied. These criteria follow the Population, Intervention, Comparison, Outcome, Timing, Setting/Study Design (PICOTS) framework [31] and are presented in Table 1. On the basis of these criteria, at least two independent reviewers (KH, CB, DW, IW) screened the articles for eligibility, first on the basis of title and abstract and then on the basis of full text. In case of disagreement, another reviewer (MH) was consulted.

Table 1 Inclusion criteria

2.2 Data Extraction

Data extraction was performed by one reviewer (KH) on the basis of a standardized data extraction form predefined and reviewed by the research team. Data extraction was subsequently checked by a second reviewer (IW, DW, CB). In case of disagreement, a third reviewer (MH) was involved. Extracted data were based on recommendations by Wijnen et al. [32] and were divided into three categories: (1) general study characteristics, (2) methods and outcomes of economic evaluation and (3) uncertainty analyses. General study characteristics included reference, publication type, funding, study perspective, time horizon, patient characteristics, intervention, control treatment, type of economic evaluation and analytic approach. Methods and outcomes of economic evaluation comprised study, intervention, control treatment, reference year, methods of measurement of effects, effectiveness and total costs of intervention and control treatment and corresponding discount rates, incremental cost-effectiveness ratio (ICER) and whether the intervention was cost-effective or not. Information about the uncertainty analyses performed and their outcomes was extracted in a third table.

2.3 Data Synthesis

The relevant characteristics and results of the included articles were presented in tables, accompanied by a summary to support comparison and evaluation. ICERs were converted into 2021 US dollars (USD) by applying the Organization for Economic Cooperation and Development (OECD) exchange and inflation rates [33, 34]. When the reference year was not stated, the year of publication was assumed to be the reference year. Potential research gaps were identified and recommendations for future economic evaluations were developed.
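The conversion step can be illustrated with a minimal sketch. It assumes a two-step approach (inflate the local-currency ICER to 2021 prices, then convert at the 2021 exchange rate); the review does not state the order of the two steps, and the function name and all numeric rates below are hypothetical placeholders, not OECD values.

```python
def to_2021_usd(icer_local, index_ref_year, index_2021, local_per_usd_2021):
    """Inflate a local-currency ICER to 2021 prices, then convert it to USD.

    icer_local:          ICER in local currency at reference-year prices
    index_ref_year:      price index of the local currency in the reference year
    index_2021:          price index of the local currency in 2021
    local_per_usd_2021:  units of local currency per 1 USD in 2021
    All inputs are supplied by the caller; none are taken from OECD data here.
    """
    if index_ref_year <= 0 or local_per_usd_2021 <= 0:
        raise ValueError("price index and exchange rate must be positive")
    inflated = icer_local * (index_2021 / index_ref_year)
    return inflated / local_per_usd_2021


# Hypothetical inputs: ICER of 30,000 local units, index 100 -> 110,
# and 0.80 local units per USD in 2021.
example = to_2021_usd(30_000, 100.0, 110.0, 0.80)
print(round(example, 2))
```

When a study does not state its reference year, the publication year would be used to look up `index_ref_year`, in line with the assumption described above.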

2.4 Quality Assessment

The quality of included articles was assessed by using the Consensus on Health Economic Criteria (CHEC) list [35]. This list consists of 19 items which were scored yes/no [35] by two independent reviewers (KH and CB or DW or IW). In case of disagreement, a third researcher (MH) was consulted. The percentage of items rated with yes indicates an article’s level of quality, that is, articles with a higher percentage of fulfilled items are of higher quality.
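The scoring rule can be written down in a few lines. The following is a sketch only: the function names are hypothetical, and the handling of disagreements (listing the item indices to be referred to the third researcher) is an assumption about how the two ratings might be reconciled.

```python
CHEC_ITEMS = 19  # number of items on the CHEC list


def chec_score(answers):
    """Return the percentage of CHEC items rated 'yes' (True)."""
    if len(answers) != CHEC_ITEMS:
        raise ValueError(f"expected {CHEC_ITEMS} answers, got {len(answers)}")
    return 100.0 * sum(bool(a) for a in answers) / CHEC_ITEMS


def disagreements(reviewer_1, reviewer_2):
    """Return the indices of items on which the two reviewers disagree."""
    return [i for i, (a, b) in enumerate(zip(reviewer_1, reviewer_2)) if a != b]


# An article fulfilling 13 of the 19 items
ratings = [True] * 13 + [False] * 6
print(round(chec_score(ratings), 1))
```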

3 Results

3.1 Study Selection

A total of 1630 studies were identified via databases with the applied search strategy; 297 duplicates were directly removed and 1333 studies underwent screening; 1295 studies were excluded after title and abstract screening; and 38 studies were moved to full-text screening. Finally, six studies were included for data extraction. Supplementary Information 2 contains a list of the studies excluded after full-text screening and the respective exclusion reasons. Additionally, eight health technology assessment (HTA) reports and one abstract were manually identified. The corresponding PRISMA flow chart is shown in Fig. 1.
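The selection counts can be checked with simple arithmetic. In the sketch below, the variable names are illustrative; the number of studies excluded at full-text screening is derived from the reported figures rather than stated in the text.

```python
# Counts reported in the study selection
identified = 1630
duplicates = 297
excluded_title_abstract = 1295
included_from_databases = 6
manual_hta_reports = 8
manual_abstracts = 1

screened = identified - duplicates                            # 1333
full_text = screened - excluded_title_abstract                # 38
excluded_full_text = full_text - included_from_databases      # derived, not reported
total_included = included_from_databases + manual_hta_reports + manual_abstracts

print(screened, full_text, excluded_full_text, total_included)
```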

Fig. 1 PRISMA flow chart [27]. CADTH Canadian Agency for Drugs and Technologies in Health, NHS EED UK National Health Service Economic Evaluation Database, NICE National Institute for Health and Care Excellence

3.2 Study Characteristics

Four peer-reviewed journal papers [13,14,15,16], three abstracts [17,18,19], and eight HTA reports [12, 20,21,22,23,24,25,26] were included. Details of the characteristics of the included studies are presented in Table 2. Most studies focused on adults who are moderately to severely affected by AD [12,13,14,15,16, 20,21,22,23,24,25,26]. A few studies investigated children [18, 19, 21,22,23] or patients with mild-to-moderate AD [17]. Four studies took a US [14, 16, 24, 25], four a Canadian [20,21,22,23], three a UK [12, 13, 26] and two an Italian [18, 19] perspective. There was one study each from Australia [17] and Japan [15]. The investigated therapies were diverse: seven different drugs in total served as the intervention. Dupilumab was used as the intervention in nine references [12, 14, 16, 18,19,20, 22, 24, 25]. Further intervention therapies were crisaborole [17, 21], baricitinib [25, 26], tralokinumab [25], abrocitinib [23, 25], upadacitinib [13, 25] and delgocitinib [15]. The most frequent comparator treatment was usual care (also named best supportive care or standard of care) [12, 14,15,16,17,18,19,20,21,22,23,24,25,26]. However, the definition of usual care differed between publications; it usually included emollients and sometimes also topical corticosteroids (TCS) and topical calcineurin inhibitors (TCI). Abrocitinib [23] and dupilumab [13, 25, 26] were used as control therapies as well. Some manuscripts included several comparisons, which is why the 15 references reported a total of 24 economic comparisons.

All included studies used a model-based approach to assess the cost-effectiveness of the respective interventions. In seven references, the authors constructed a hybrid model consisting of a decision tree followed by a Markov model [12, 14, 18,19,20, 22, 23]. Six analyses were based on a Markov model only [13, 16, 21, 24,25,26]. One reference used a decision tree only [17], and one did not specify what kind of simulation model was developed [15]. A comparison of the model structures identified six distinct types, each with slight variations; two manuscripts did not provide enough information to be compared in terms of the underlying model structure. One model type was used in six references and another was used in three. One model was developed on the basis of these two dominant model types. The remaining three types were each used in one reference only. Eleven studies considered a lifelong or almost lifelong time horizon [12,13,14, 16, 18,19,20, 22,23,24, 26]. In four references, the authors defined a shorter time horizon of between 16 weeks and 15 years [15, 17, 21, 25].

Table 2 General study characteristics

3.3 Outcomes of Economic Evaluations

Table 3 contains the detailed results of the economic evaluations included in this SLR. Applied discount rates for both outcomes and costs ranged between 0% and 3.5%. Not all manuscripts reported the quality-adjusted life-years (QALYs) of the respective interventions that were compared. Where total outcomes were presented in QALYs, the intervention was associated with more QALYs than the control treatment. Studies that reported total costs of interventions and control treatments showed that interventions were usually more expensive than control therapies, with one exception: in the manufacturer’s base case, crisaborole was slightly less expensive than the control treatment pimecrolimus for both children and adults [21]. CADTH’s analyses, however, concluded that crisaborole is more expensive than pimecrolimus [21].
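For readers less familiar with the metric, the ICER is the difference in total costs divided by the difference in total QALYs between intervention and control. The sketch below uses the standard textbook definition; the function name, the labels and all numbers are hypothetical and are not taken from any included study.

```python
def incremental_result(cost_int, cost_ctrl, qaly_int, qaly_ctrl):
    """Classify a comparison and return (label, ICER or None).

    'dominant'  : intervention is more effective and not more costly
    'dominated' : intervention is less effective and not less costly
    'icer'      : a cost per QALY gained is reported
    """
    d_cost = cost_int - cost_ctrl
    d_qaly = qaly_int - qaly_ctrl
    if d_qaly == 0:
        raise ValueError("no difference in QALYs; the ICER is undefined")
    if d_qaly > 0 and d_cost <= 0:
        return ("dominant", None)
    if d_qaly < 0 and d_cost >= 0:
        return ("dominated", None)
    # Note: if both differences are negative, the ratio describes savings
    # per QALY forgone and must be interpreted accordingly.
    return ("icer", d_cost / d_qaly)


# Hypothetical values: more costly and more effective
print(incremental_result(50_000, 20_000, 10.5, 10.0))
# Hypothetical values: less costly and more effective
print(incremental_result(900, 1_000, 1.2, 1.1))
```

The second call mirrors the pattern of the manufacturer's crisaborole base case described above (lower cost with a QALY gain), in which no ratio needs to be reported because the intervention dominates.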

Table 3 Methods and outcomes of economic evaluations

Overall, in 15 out of 24 (62.5%) comparisons the intervention was cost-effective compared with the respective comparator. Figure 2 provides an overview of the cost-effectiveness results of all comparisons conducted in the identified references. It shows that most comparisons of emerging treatments, that is, dupilumab [12, 14, 16, 18, 19, 24, 25], abrocitinib [25], baricitinib [25, 26], tralokinumab [25], delgocitinib [15] and crisaborole [17, 21], with standard of care led to acceptable cost-effectiveness estimates. Upadacitinib was the only novel treatment that did not achieve cost-effectiveness in any standard of care comparison [25]. When emerging therapies were compared with dupilumab, upadacitinib [13, 25], abrocitinib [25] and tralokinumab [25] were not cost-effective; baricitinib was the only exception [25]. The ICERs differed strongly between studies. As an example, the ICERs of comparisons between dupilumab and standard of care ranged from $23,265.32 [19] to $491,804.20 [20] when converted into 2021 USD, irrespective of the cost-effectiveness assessment. This diversity is illustrated in Fig. 3, which shows the costs per QALY gained for each dupilumab versus standard of care comparison. Figure 3 additionally shows that comparisons that took place in the same setting yielded similar ICERs, with the exception of Canada.

Fig. 2 Number of cost-effective and not cost-effective results per type of comparison; x-axis presents number of studies, y-axis presents type of comparisons with first part emerging treatments versus standard of care, second and third part emerging treatment versus emerging treatment. abro abrocitinib, bari baricitinib, crisa crisaborole, delgo delgocitinib, dupi dupilumab, sc standard of care, tralo tralokinumab, upa upadacitinib

Fig. 3 Incremental cost-effectiveness ratios for individual dupilumab versus standard of care comparisons; x-axis presents the costs per quality-adjusted life year gained in 2021 US $, y-axis presents names of respective comparisons and studies ordered by countries

3.4 Uncertainty Analyses

All 15 included studies provided information about uncertainty analyses; 13 studies conducted deterministic sensitivity analyses [12,13,14, 16,17,18,19,20,21,22, 24,25,26], 13 probabilistic sensitivity analyses [12, 14,15,16,17,18,19,20,21,22, 24,25,26], 9 scenario analyses [13,14,15, 20,21,22,23, 25, 26], 6 threshold or price reduction analyses [13, 20, 22,23,24,25] and 6 subgroup analyses [16, 18, 20, 22, 24, 25]. In general, the results of the uncertainty analyses supported the base case results. Subgroup analyses that, for instance, investigated the impact of disease severity concluded that higher AD severity improved the cost-effectiveness of a more effective intervention [13, 16]. Utility values [12, 13, 16, 20, 21, 24, 26] and drug acquisition costs [12, 13, 16, 20, 24] were most often named as the most impactful drivers of cost-effectiveness. Table 4 provides more details about uncertainty analyses.

Table 4 Uncertainty analyses

3.5 Quality of Studies

Supplementary Information 3 contains the quality assessment for each included reference. The overall quality of included references was good. On average, 13 out of 19 items (68.4%) were categorized as fulfilled. HTA reports and papers generally received higher scores than abstracts, because abstracts are by nature not detailed enough to allow an adequate quality assessment. Overall, some important details, especially regarding comparators and costs, were missing, and thus assessment of methodological quality was difficult. While HTA reports are generally very extensive with regard to the methods used, they often contain redacted passages that cover important information about discontinuation rates, prices or utilities. Even though methodological quality might be high, the usefulness of the analyses that these reports present is limited, as reconstruction is difficult. Papers, in contrast, describe their methodological procedure in much less detail. Published papers may contain no redacted passages, but often not all information about input data is available. Irrespective of the reason, missing data lower the quality of studies and additionally hamper the comparability of study results. Nonetheless, three included studies, that is, NICE 2021 [26], Institute for Clinical and Economic Review 2017 [24] and Heinz et al. [13], achieved very high scores in quality assessment, fulfilling 90% or more of the quality items. As a result, these three manuscripts can be regarded as the most reliable and valid of all included studies.

4 Discussion

This review summarised the results of available economic evaluations of emerging therapies for patients with AD. A total of 15 references that conducted 24 comparisons were included in this SLR. The model structures applied in these references were often similar, so that only six distinct model types were identified. Most economic evaluations compared an emerging treatment with standard of care, which includes emollients and sometimes also TCS and TCI. This was to be expected, as it is essential for a new drug to be cost-effective compared with current treatments; otherwise, decision makers would not recommend reimbursement. Nevertheless, 25% of all comparisons used another emerging treatment as comparator. One reason could be that emerging treatments not only have to be cost-effective compared with standard of care but are additionally evaluated against a range of further novel therapies. Furthermore, former emerging treatments such as dupilumab establish themselves as standard of care. Despite dupilumab being relatively new, it was already used as comparator treatment in several economic evaluations. This review demonstrated that in 79% of dupilumab comparisons, dupilumab was found to be cost-effective, either as intervention or as comparator. This review also revealed that upadacitinib is the only emerging treatment that was not cost-effective in any comparison, neither against standard of care nor against dupilumab. The results indicate that upadacitinib is more effective than standard of care and dupilumab. Nevertheless, its costs seem to be too high relative to the quality-of-life gain it yields.

It has to be taken into account that the judgement of cost-effectiveness strongly depends on country-specific willingness-to-pay (WTP) thresholds. Therefore, an ICER that indicates cost-effectiveness in one country could indicate non-cost-effectiveness in another. As an example, dupilumab versus standard of care yielded an ICER of $112,161 and was classified as not cost-effective by CADTH [22], whereas the Institute for Clinical and Economic Review concluded that abrocitinib is cost-effective compared with standard of care even though the ICER was $148,300 and thus higher than the ICER of dupilumab versus standard of care [25]. Overall, it was striking that the ICERs of the same comparisons, for example, dupilumab versus standard of care, varied greatly. This variation is probably caused by differences in the design of the economic evaluations. Those differences could, for instance, lie in the perspective, which affects which cost types are included and how they are valued, in the selection and concrete definition of standard of care, in the data used, in the patient population and in the model structure. The wide range of ICERs implies that comparing different economic evaluations is extremely difficult, and that the aim of each economic evaluation and the guidelines on which it is based have to be carefully considered.
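The dependence on the threshold can be made concrete with a short sketch. The two ICERs are those quoted above; the two thresholds are hypothetical round numbers chosen for illustration and should not be read as the thresholds actually applied by CADTH or the Institute for Clinical and Economic Review.

```python
def is_cost_effective(icer_usd, wtp_threshold_usd):
    """An intervention is classed as cost-effective if its ICER does not
    exceed the willingness-to-pay threshold per QALY gained."""
    return icer_usd <= wtp_threshold_usd


# ICERs quoted in the text
DUPILUMAB_VS_SOC = 112_161    # assessed by CADTH [22]
ABROCITINIB_VS_SOC = 148_300  # assessed by the Institute for Clinical and Economic Review [25]

# Hypothetical thresholds (assumptions for illustration only)
LOWER_THRESHOLD = 50_000
HIGHER_THRESHOLD = 150_000

for name, icer in [("dupilumab", DUPILUMAB_VS_SOC), ("abrocitinib", ABROCITINIB_VS_SOC)]:
    for threshold in (LOWER_THRESHOLD, HIGHER_THRESHOLD):
        print(name, threshold, is_cost_effective(icer, threshold))
```

Under these assumed thresholds, the lower ICER fails the stricter threshold while the higher ICER passes the more generous one, which is the pattern described in the paragraph above.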

This review had several strengths. A total of four databases were included and the search was supplemented by a manual search for references. Furthermore, data extraction and quality assessment were independently checked by a second reviewer, and thus the extracted data and quality ratings can probably be considered correct and complete. This review also had some limitations. Owing to the authors’ language skills, only studies reported in English, German and French were included. However, the likelihood that most relevant economic evaluations were identified is still high [36]. Additionally, the quality of published abstracts might be limited. Because few cost-effectiveness studies were available, those abstracts were included anyway. Moreover, this review included all types of perspectives and health systems. Meaningful comparison across these different economic evaluations is difficult, as the ICER and the assessment of cost-effectiveness strongly depend on the underlying guidelines and designs of the evaluations. Furthermore, the identified studies did not always report all relevant information, which hampered interpretations and comparisons. In addition, published economic evaluations usually present public prices and do not account for confidential net prices that might be in place. Hence, formal conclusions about whether a treatment is cost-effective at its price should be drawn with caution.

To improve comparability, it is essential that future economic evaluations are conducted using similar designs and following the same guidelines. Otherwise, it is difficult for decision makers to make reasonable decisions on the line of therapy, as the variance of results is high. Furthermore, this review shows that there are probably enough economic evaluations available that compare dupilumab with standard of care. This is, however, not the case for other emerging AD treatments. Thus, this SLR identified a research gap in economic evaluations that compare novel AD therapies with standard of care or other new treatment options. Moreover, future economic evaluations should conduct more subgroup or scenario analyses. The large number of promising novel therapy options for AD can be an advantage for patients but simultaneously makes defining a useful line of therapy more challenging. Therefore, it is important to determine which patient characteristics, and maybe even patient preferences, affect cost-effectiveness, and in what way, to increase patients’ access to their most effective therapy.

5 Conclusions

This SLR showed that there are several new treatment options available for patients with AD. Additionally, it revealed that the number of economic evaluations currently available is limited and that more evaluations of the cost-effectiveness of emerging treatments are needed. This review also underlined the difficulty of comparing the results of economic evaluations. To help decision makers define a line of therapy that most accurately represents each treatment’s efficacy in relation to its costs, it is essential to conduct economic evaluations in AD. Future research should not only conduct similarly designed economic evaluations of emerging treatments, but should also focus on performing subgroup analyses to investigate how patient characteristics and preferences impact the cost-effectiveness of different novel AD treatments. Ultimately, this will increase patients’ access to emerging treatments for AD and allow for the improvement of disease management outcomes.