Key Points for Decision Makers

Advanced therapy medicinal products have transformative potential, but the limited evidence available presents significant challenges to economic evaluation.

Evaluation of advanced therapy medicinal products is pervaded by uncertainty because robust evidence is lacking in many key aspects. However, conventional methods of health technology appraisal employed by the UK National Institute for Health and Care Excellence may still be applied with some adjustments and perhaps greater flexibility than for other technologies.

Considering the methodological uncertainty in cost-effectiveness appraisals and the potential for irrecoverable costs for health systems, decision makers may wish to consider new contractual arrangements that enable access to novel technologies whilst sharing the risks between health systems and pharmaceutical companies.

1 Introduction

Advanced therapy medicinal products (ATMPs) are medicines that replace or regenerate human cells, tissues or organs to restore or establish normal function. The European Medicines Agency (EMA) identifies three groups of ATMPs: gene therapy products, tissue engineered products and somatic cell therapy products [1]. In addition, some ATMPs may contain one or more medical devices as an integral part of the medicine, which are referred to as combined ATMPs (e.g., cells embedded in a biodegradable matrix or scaffold).

Although potential breakthroughs in this area of clinical research are eagerly anticipated, unregulated use of novel therapies with unproven benefits can be not only ineffective but also potentially harmful [2]. Furthermore, technology evaluations of ATMPs may present additional challenges compared with those of conventional pharmaceutical treatments, and straightforward application of standard appraisal methods recommended by the UK National Institute for Health and Care Excellence (NICE) might be difficult or even inappropriate [3]. Acknowledging the need for flexible and adaptable methods to ensure that innovative technologies, such as ATMPs, are “fairly, efficiently and robustly” evaluated, NICE is currently considering substantial changes to its methods of health technology evaluation [4]. The aims of this study are to review evaluations of ATMPs conducted by NICE and discuss the associated challenges and methodological issues.

2 Methods

We searched for appraisals of ATMPs, both published and in development, on the NICE database of technology appraisal guidance [5] and highly specialised technologies guidance in July 2021 [5]. We cross-referenced the results with the list of ATMPs published by the EMA [6]. For each technology, we reviewed the final appraisal or evaluation determination document and associated documents. We extracted data about several domains pertaining to clinical effectiveness and cost effectiveness, broadly following the structure recommended by the guide to the process of technology appraisal published by NICE (clinical effectiveness, cost effectiveness, uncertainty, discounting, subgroups, additional benefits, generalisability to the National Health Service [NHS], innovation, service reformulation, inequalities, patient access schemes, end of life) [7]. ATMPs for which marketing authorisation was withdrawn were not considered eligible for this study.

3 Results

To date, NICE has published guidance on 14 ATMPs (Table 1): ten gene therapy products (talimogene laherparepvec [TA410] [8], strimvelis [HST7] [9], tisagenlecleucel [TA554 and TA567] [10, 11], axicabtagene ciloleucel [TA559] [12], voretigene neparvovec [HST11] [13], autologous anti-CD19-transduced CD3+ cells [TA677] [14], betibeglogene autotemcel [ID968] [15], onasemnogene abeparvovec [HST15] [16] and OTL-200 [ID1666] [17]), one tissue engineered product (holoclar [TA467] [18]) and three somatic cell therapy products (darvadstrocel [TA556] [19] and autologous chondrocyte implantation (ACI) [TA477 and TA508] [20, 21]). The appraisal of sipuleucel-T was excluded since the guidance and associated documents were removed when the EMA withdrew marketing authorisation [22].

Table 1 Summary of characteristics of the advanced therapy medicinal products included in this review

Seven of these contained positive recommendations for use in specific indications (talimogene laherparepvec, strimvelis, voretigene neparvovec, holoclar, onasemnogene abeparvovec and the two ACI), four recommended that the ATMP should be included in the Cancer Drugs Fund (CDF) (tisagenlecleucel, axicabtagene ciloleucel and autologous anti-CD19-transduced CD3+ cells) and three did not recommend the ATMP for use in the NHS (darvadstrocel, betibeglogene autotemcel and OTL-200).

The incremental cost-effectiveness ratios (ICERs) for ATMPs recommended under the technology appraisal guidance [5] were within the £20,000–30,000 per quality-adjusted life-year (QALY) threshold usually considered by NICE, whereas the ATMPs with ICERs of £30,000–50,000 were recommended for the CDF. For the three ATMPs approved under the highly specialised technologies guidance [5], which applies a threshold of £100,000 per QALY, the ICERs for strimvelis were under the threshold irrespective of discount rate, but those for voretigene neparvovec were over the threshold when a 3.5% discount rate was applied. In the case of onasemnogene abeparvovec, using a discount rate of 1.5%, the ICER was under the weighted threshold of £186,000 per QALY.
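The comparison of an ICER against a threshold can be illustrated with a minimal sketch. The class and function names, and all cost and QALY figures, are hypothetical and are not taken from any appraisal; only the threshold values are those quoted above.

```python
from dataclasses import dataclass

# Thresholds quoted in the text (GBP per QALY).
STANDARD_UPPER_THRESHOLD = 30_000
HST_THRESHOLD = 100_000


@dataclass
class Comparison:
    """Hypothetical lifetime totals for a new technology and its comparator."""
    cost_new: float   # total cost with the new technology (GBP)
    cost_old: float   # total cost with the comparator (GBP)
    qalys_new: float  # total QALYs with the new technology
    qalys_old: float  # total QALYs with the comparator


def icer(c: Comparison) -> float:
    """Incremental cost per QALY gained (GBP per QALY)."""
    delta_qalys = c.qalys_new - c.qalys_old
    if delta_qalys <= 0:
        raise ValueError("this sketch only handles a positive QALY gain")
    return (c.cost_new - c.cost_old) / delta_qalys


def within_threshold(icer_value: float, threshold: float) -> bool:
    """True when the ICER does not exceed the given threshold."""
    return icer_value <= threshold


# Illustrative figures only.
example = Comparison(cost_new=300_000, cost_old=60_000,
                     qalys_new=14.0, qalys_old=4.0)
example_icer = icer(example)  # (300000 - 60000) / (14 - 4)
print(example_icer, within_threshold(example_icer, STANDARD_UPPER_THRESHOLD))
```

In practice, committees weigh the ICER alongside uncertainty and other considerations, so a threshold check of this kind is only one input to a recommendation.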

Technology appraisals also considered whether there were externalities or additional benefits beyond those captured by QALYs. These additional benefits for individuals included avoidance of negative impact of alternative treatments (holoclar) and improved independence, social participation and ability to achieve full potential (strimvelis, voretigene neparvovec and onasemnogene abeparvovec). Positive externalities were taken into account qualitatively in the recommendations made for highly specialised technologies, such as reduced financial impact on families and society overall due to cost savings for the NHS and non-NHS government departments (e.g., social care). Chimeric antigen receptor (CAR) T-cell therapies were deemed to have no additional benefits, whereas the appraisals of darvadstrocel, betibeglogene autotemcel and ACIs did not comment on additional effects and externalities.

4 Discussion

This review suggests that appraisals of ATMPs present specific, and arguably greater, challenges to NICE committees than do other technologies. Uncertainty pervades clinical and cost effectiveness in all appraisals because of the sparse or even absent evidence on utility values, long-term effects and costs of treatment versus comparators. These issues are partly due to the nature of the conditions, which are typically rare and severe, and partly due to the technologies, which are novel and yet to be fully understood. We explore the key issues related to the evaluation of clinical and cost effectiveness of ATMPs, and we discuss the potential implications of adoption of ATMPs for the NHS and society overall.

4.1 Clinical Effectiveness

Overall, evidence on the clinical effectiveness of ATMPs was limited and weak, but this varied substantially across technologies. The main difficulty in determining the clinical effectiveness of ATMPs was related to the limitations of the studies available, which were mostly single-armed, non-blinded studies with small sample sizes and short follow-up. The poor quality of the studies available may be because ATMPs are typically prescribed as end-of-line treatment to people with rare diseases, which limits sample size. However, this does not necessarily explain why studies were non-blinded and single-armed, which raises concerns about the pharmaceutical industry sponsoring those studies. The limited evidence available raised important issues for assessing clinical effectiveness. First, short follow-up meant that survival data were immature, so the survival estimates and overall duration of effects were highly uncertain. In this context, scenario analyses exploring the effects of different assumptions about long-term benefits might be advantageous, as shown by their use for the evaluation of onasemnogene abeparvovec [4]. Second, single-arm studies did not provide direct evidence of efficacy against comparators, and finding suitable comparator(s) with valid and reliable data available was difficult. For instance, data from previous studies often did not reflect contemporary practice because of the fast evolution of therapeutics, or they included populations with different disease stages and prognoses. Comparison with historical cohorts that studied the natural progression of disease was also naïve (e.g., OTL-200). This rendered quantification of benefits and potential harms associated with different treatment modalities challenging, which translated into uncertainty in the evaluation of clinical effectiveness.

The criteria for life-extending end-of-life treatments, which include (1) life-extending treatments for people with a short life expectancy (normally < 24 months) and (2) providing a gain in overall survival of > 3 months [5], were introduced to allow the QALYs gained by treatments that improve patient survival in terminal stages of disease to be valued more highly. Although these criteria have allowed approval of drugs that would have otherwise been denied to NHS patients, all bar one have been cancer drugs [23]. In keeping with this, of the ATMPs, only three of the CAR T-cell therapies (TA559, TA567 and TA677) met the criteria for life-extending end-of-life treatment. Perhaps acknowledging the limited scope and questionable fairness of the end-of-life criteria, NICE is considering replacing these criteria by a modifier for severity of disease [4]. The actual number of technologies that will benefit from a severity modifier depends on how it is defined and applied, but they can be expected to form a broader range of conditions and to reflect more accurately societal values than the previous end-of-life criteria.

NICE has statutory and ethical duties to support innovative technologies, so greater risks may be accepted to enable access to highly innovative technologies, which have valuable benefits for patients and society [24]. Although NICE defines which technologies qualify as innovative [5], there is much room for interpretation, and ambiguity has resulted in significant variability in implementation [25]. This raises concerns as to whether innovation, which is not an independent social value, may be jeopardising the core values of health and equity in the NHS [26]. The possibility of considering innovation as a modifier also does not garner NICE support at present, as there is no evidence that society values health benefits from innovative technology more than equivalent benefits from less innovative technology [27]. In keeping with this, no adjustment to the reference case was warranted for any of the ATMPs, despite all being considered highly innovative.

4.2 Cost Effectiveness

Overall, there was substantial uncertainty in estimates of cost effectiveness of ATMPs because of the lack of valid and reliable data to inform the economic models. This was reflected in the disagreements about model parameters between committees, evidence review groups and companies. The poor quality of the clinical studies was further compounded by a lack of data on patient-reported outcomes, particularly health-related quality of life. This meant that utility values specific to the condition of interest were often missing, and extrapolation from the literature introduced further uncertainty to the models (e.g., holoclar, strimvelis, voretigene neparvovec, betibeglogene autotemcel, onasemnogene abeparvovec, OTL-200). In some cases, economic models may have overestimated utility gains of treatment (e.g., talimogene laherparepvec, betibeglogene autotemcel). For CAR T-cell therapies, overestimation of long-term survival and cure points may have exerted a paramount effect on cost effectiveness, as those were key determinants of the economic models. In addition, QALY valuation when effects are predicted to be lifelong remains complex because of concerns about whether different social values apply to evaluations of potential cures and substantial uncertainty in long-term outcomes [28]. Nonetheless, the lifelong nature of the effects of strimvelis, voretigene neparvovec, onasemnogene abeparvovec and OTL-200 was assumed based on biological plausibility, and QALYs were weighted (i.e., granted an increased value) to reflect the added value of large QALY gains throughout life, thus enhancing their cost effectiveness. However, this is controversial as there is no evidence that society places additional value on technologies that are potentially curative, and NICE is not supportive of a specific modifier for potentially curative treatments at present [4].

NICE guidance recommends using a 3.5% discount rate for both costs and benefits, but a 1.5% discount rate is allowed when appraising treatments that “restore people who would otherwise die or have severely impaired life to full or near full health, and when this is sustained over a very long period (normally at least 30 years)” [7]. Although companies claimed that their products were eligible for the 1.5% discount rate (e.g., tisagenlecleucel, holoclar, betibeglogene autotemcel and OTL-200), committees noted that “it is rarely considered appropriate to change the discount rate” and preferred the 3.5% discount rate in most cases. However, for strimvelis and voretigene neparvovec, committees stated that both discount rates would be taken into consideration, acknowledging the uncertainty about whether those therapies would restore normal or near-normal health. For onasemnogene abeparvovec, a 1.5% discount rate was applied, despite, arguably, similar uncertainty about whether effects would be sustained in the long term and patients would achieve normal or near-normal health. This suggests that a lower threshold for considering the 1.5% discount rate may be applied in highly specialised technology appraisals than in standard technology appraisals. The common disagreement between companies and committees on discounting demonstrates the subjectivity involved in the interpretation of the fulfilment of the eligibility criteria. Furthermore, concerns have been raised that the non-reference case discount rate is rarely applied as it sets a high bar for the technologies it otherwise appears to support by requiring no significant irrecoverable costs and implying that a high degree of certainty is needed [29]. For all these reasons, NICE is considering a proposal to use a reference case discount rate of 1.5% per year for both costs and health effects [4], which is the discount rate for health values recommended by the government [30]. Lowering the reference case discount rate may increase the cost effectiveness of ATMPs, particularly those that have high upfront costs and long-term health benefits.
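The effect of the discount rate on a technology with a high upfront cost and long-term benefits can be illustrated with a simplified sketch. It assumes a single payment at time zero (so the cost is not discounted), a constant annual QALY gain accrued at the end of each year, and no other costs; the function names and all figures are hypothetical.

```python
def present_value(annual_amount: float, years: int, rate: float) -> float:
    """Present value of a constant amount accrued at the end of years 1..years."""
    return sum(annual_amount / (1 + rate) ** t for t in range(1, years + 1))


def icer_upfront(upfront_cost: float, annual_qaly_gain: float,
                 years: int, rate: float) -> float:
    """ICER when the only incremental cost is paid upfront (assumed, simplified)."""
    return upfront_cost / present_value(annual_qaly_gain, years, rate)


# Illustrative figures only: GBP 1,000,000 upfront, 0.5 QALYs gained per year
# for 40 years.
icer_reference = icer_upfront(1_000_000, 0.5, 40, 0.035)      # 3.5% rate
icer_non_reference = icer_upfront(1_000_000, 0.5, 40, 0.015)  # 1.5% rate
print(round(icer_reference), round(icer_non_reference))
```

Because the cost is incurred immediately whereas the QALYs accrue over decades, only the denominator is reduced by discounting, so the lower rate yields the lower ICER under these assumptions.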

ATMPs are, in general, very expensive, and patient access schemes have been used to lower the ICER and thus facilitate NICE approval [31]. These are agreements between the Department of Health and pharmaceutical companies that enable companies to offer discounts or rebates that reduce the cost of a drug for the NHS. Simple discount schemes have been preferred over complex schemes, such as provision of free stock, dose caps or payments by results (i.e., performance or outcome-based schemes) [32]. It is arguable, though, that the latter could be appropriate to ATMPs, when there is substantial uncertainty on long-term effects and high upfront costs [33]. National discounts are known to NICE committees, and cost-effectiveness models are based on discounted prices. However, “companies sometimes provide confidential [local] discounts to the NHS, making the real cost of cells difficult to ascertain” (TA477), in which case committees accept models based on the approximate list price for the technology. In addition, the exact details of discounts are commercial in confidence, which, albeit understandable, compromises transparency and may undermine patients’ trust in how decisions are made about rationing of healthcare resources in the NHS [34]. On the other hand, patient access schemes may be disproportionately benefitting certain technologies. CAR T-cell therapies were recommended for inclusion in the CDF, which allows their use in the NHS while further data are collected to support a robust appraisal of their cost effectiveness and a subsequent final recommendation. It is arguable whether similar funding arrangements should be potentially available to all technologies irrespective of the underlying disease (e.g., darvadstrocel could have been approved under such a scheme whilst further data were being collected by a large trial). The 2019 general election featured a promise to replace the CDF with an Innovative Medicines Fund, but this is yet to be implemented.
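How a simple discount scheme lowers the ICER can be sketched as follows. The sketch assumes that the discount applies only to the acquisition price and that other incremental costs and the QALY gain are unaffected; the function names and all figures are hypothetical, and actual discounts are commercial in confidence.

```python
from typing import Optional


def icer_after_discount(list_price: float, discount: float,
                        other_incremental_cost: float,
                        qaly_gain: float) -> float:
    """ICER once a simple proportional discount is applied to the list price."""
    net_price = list_price * (1 - discount)
    return (net_price + other_incremental_cost) / qaly_gain


def discount_to_reach(list_price: float, other_incremental_cost: float,
                      qaly_gain: float, threshold: float) -> Optional[float]:
    """Smallest discount (0 to 1) that brings the ICER down to the threshold.

    Returns None when even a free product would exceed the threshold.
    """
    max_price = threshold * qaly_gain - other_incremental_cost
    if max_price < 0:
        return None
    return max(0.0, 1 - max_price / list_price)


# Illustrative figures only: ICER of GBP 50,000 per QALY at list price.
needed = discount_to_reach(list_price=280_000, other_incremental_cost=20_000,
                           qaly_gain=6.0, threshold=30_000)
print(needed)
```

An outcome-based scheme would instead make the net price depend on observed results, which this sketch does not attempt to model.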

The four ATMPs evaluated as highly specialised technologies listed children, and specifically very young children, as the primary beneficiaries. In these cases, the higher valuation of QALYs assigned to treatments that have potentially lifelong effects (implemented via a higher cost-effectiveness threshold) captured the increase in benefit that would result from delivering a potentially transformative treatment to children while remaining in line with NICE’s view that a modifier based on age is not appropriate [4]. As ATMPs are expected to target severe genetic diseases that manifest in infancy, the valuation and measurement of QALYs in children presents substantial challenges. NICE recommends using a generic measure with good psychometric performance in the relevant age group and reporting who has completed the questionnaire [5]. Nevertheless, the difficulty of valuing and measuring utility in children introduces additional uncertainty to cost-effectiveness evaluations, and further research is warranted to refine methods for assessing health-related quality of life in children [35].
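The way a QALY weight translates into a higher threshold can be sketched as below. The weighting rule used here is an assumption for illustration: a weight of 1 for undiscounted gains of up to 10 QALYs, rising linearly to a maximum of 3 at 30 QALYs or more. The function names are hypothetical, and the sketch is not a statement of how any committee derived its threshold.

```python
BASE_HST_THRESHOLD = 100_000  # GBP per QALY, as quoted in the Results


def qaly_weight(undiscounted_qaly_gain: float) -> float:
    """Assumed weight: 1 up to 10 QALYs, linear to 3 at 30 QALYs or more."""
    return min(max(undiscounted_qaly_gain / 10.0, 1.0), 3.0)


def weighted_threshold(undiscounted_qaly_gain: float) -> float:
    """Threshold after applying the assumed QALY weight (GBP per QALY)."""
    return BASE_HST_THRESHOLD * qaly_weight(undiscounted_qaly_gain)


for gain in (5.0, 18.6, 40.0):
    print(gain, round(weighted_threshold(gain)))
```

Under this assumed rule, a weighted threshold of £186,000 per QALY would correspond to an undiscounted gain of 18.6 QALYs.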

Subgroup analysis may be especially important for ATMPs, as ICERs are typically high and may vary significantly across subgroups. NICE allows committees to recommend treatment for a selected subgroup of patients irrespective of whether the technology is found to be clinically and cost effective for the whole population, provided that the decision is ethically and methodologically sound [5]. Although ten of the 14 technologies analysed in this review considered whether treatment effects would be different in certain subgroups of patients, only six made specific subgroup recommendations (talimogene laherparepvec, holoclar, ACIs, onasemnogene abeparvovec and betibeglogene autotemcel). The remaining three evaluations considered that data were lacking to determine whether there were meaningful differences in treatment effects across subgroups (tisagenlecleucel, voretigene neparvovec) or that it was clinically difficult to identify those subgroups (tisagenlecleucel). This clearly illustrates the challenges in assessing the credibility and relevance of differences between subgroups, which have also been acknowledged by NICE [4].

The impact on inequalities may be particularly relevant for the evaluation of innovative technologies, such as ATMPs, as there may be a greater risk of creating or exacerbating inequalities [36]. In this review, all ATMPs except strimvelis, ACI and onasemnogene abeparvovec were considered to have a neutral effect on inequalities. Strimvelis could reduce inequalities in waiting time for transplant, which arise from a lack of suitable donors for certain ethnic groups in whom the condition for which it is recommended (i.e., adenosine deaminase deficiency–severe combined immunodeficiency) is more common, such as people of Irish Traveller and Somali family origin. This lack of donors leads to increased waiting times and potentially worse outcomes for those individuals. Conversely, ACI could result in inequalities as marketing authorisation excludes those with severe osteoarthritis. However, this was mitigated by allowing clinicians to assess patients’ suitability for ACI based on severity of osteoarthritis. In addition, restricting ACI in TA477 to tertiary referral centres may lead to geographical inequalities as ACI may not be widely available across the country. For onasemnogene abeparvovec, the potential for a delayed diagnosis in disadvantaged children led the committee to allow treatment to be offered to children aged > 6 months in certain circumstances, despite scant evidence on clinical effectiveness beyond 6 months of age. NICE’s commitment to addressing inequalities is illustrated by the fact that it considered adding impact on inequalities as a modifier to future health technology appraisals [4]. This is based on equality legislation [37] and evidence suggesting that the UK population prioritises seeking a fair distribution of health across society and is willing to trade off less health overall if the health is generated in disadvantaged groups, particularly for socioeconomic disadvantage [38]. However, further work is needed to explore which sources of inequality to include, whether direct and indirect effects would be considered, and how and in which circumstances this could be implemented in health technology evaluations.

Although none of the ATMPs included in this review required major service redesign and staff training, ATMPs are by nature likely to have significant service delivery effects. NICE prefers that the full additional cost of introducing a technology is included in the evaluation. However, it concedes that some of the costs may be apportioned between other technologies, particularly when “high-cost medical devices have other uses beyond the evaluated indication (either now or in the future) or for technologies that need substantial service redesign to introduce them (which may affect other technologies or people not having treatment with a technology)” [4]. This flexibility to allow non-reference case analysis may be especially suitable for ATMPs, as costs may be shared by different stakeholders and technologies. In addition, ATMPs may be associated with additional costs that would not commonly be considered in other technology evaluations, such as costs associated with travelling to specialist centres abroad. For strimvelis, these costs were not incorporated in the economic model, and it was unclear how those costs would be covered in the long term. However, whether this should be included in economic models to fully reflect the cost for the NHS is arguable, especially if, as expected, these issues become increasingly common.

4.3 Implications for the National Health Service and Society Overall

Although ATMPs offer potentially important benefits for patients, their families and society overall, their high upfront and often one-off costs can pose challenges related to affordability and implementation in the NHS [39]. Many of the mitigation measures are commercial in nature (e.g., patient access schemes) and not directly in the remit of NICE technology evaluations. However, NICE is accountable to the NHS, the government and the public it serves, and thus the wider implications of health technology evaluations of ATMPs for the NHS and society overall cannot be overlooked.

The first concern raised by the implementation of ATMPs by the NHS relates to the generalisability of research conducted elsewhere. This question, albeit shared with other technologies, is paramount for ATMPs because of the scant evidence available. Committees sought advice from clinical experts and complemented this with data from observational studies in England, whenever possible (e.g., darvadstrocel). Findings were considered generalisable to clinical practice in the NHS for axicabtagene ciloleucel, tisagenlecleucel in TA567 and ACI in TA508. However, in other cases, generalisability was questionable because outcomes for the comparator did not represent contemporary outcomes in the NHS (e.g., relapse rates were much higher for the comparator of darvadstrocel in the single trial available than rates reported in the NHS), the characteristics of the patients in the studies did not match those of typical NHS patients (tisagenlecleucel in TA554, darvadstrocel, holoclar, strimvelis) or heterogeneity in treatment effects was possible depending on individual variability (voretigene neparvovec).

Second, ATMPs are associated with a larger financial risk for the NHS than many other technologies [39]. They often have high upfront and, depending on contractual arrangements, potentially irrecoverable costs, yet the full benefits may take many years to accrue or may not be permanent. Furthermore, the high cost per patient means that annual expenditure is highly volatile, which may be challenging to accommodate in annual budgets and may increase financial risks if there is an unexpectedly high number of cases. The central question is how much risk the NHS should incur and in which circumstances, as well as whether the risk should be shared by pharmaceutical companies (e.g., by using outcome-based contracts) [40]. NICE considers that greater risks could be accepted for “conditions for which it is recognised that evidence generation is complex and difficult, such as rare diseases, innovative technologies, technologies that provide large benefits”, all of which apply to ATMPs [4]. Nonetheless, whereas voretigene neparvovec, strimvelis and onasemnogene abeparvovec were recommended despite the scant evidence on long-term benefits, betibeglogene autotemcel was not recommended, at least partially because of “the potential to commit the NHS to irrecoverable costs”. This illustrates the difficult compromise between allowing flexibility for non-reference case analysis (e.g., by accepting greater risks) and allocating resources to maximise population health [41].
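The link between a high cost per patient and budget volatility can be illustrated with a simple sketch. It assumes, purely for illustration, that the annual number of treated patients follows a Poisson distribution and that every patient incurs the same one-off cost; the function name and all figures are hypothetical.

```python
import math
from typing import Tuple


def annual_spend_summary(expected_cases: float,
                         cost_per_patient: float) -> Tuple[float, float, float]:
    """Mean, standard deviation and coefficient of variation of annual spend.

    Assumes Poisson-distributed case numbers, so the variance of the number
    of cases equals its mean.
    """
    mean = expected_cases * cost_per_patient
    sd = math.sqrt(expected_cases) * cost_per_patient
    return mean, sd, sd / mean


# Two hypothetical technologies with the same expected annual spend.
rare_costly = annual_spend_summary(expected_cases=4, cost_per_patient=1_500_000)
common_cheap = annual_spend_summary(expected_cases=400, cost_per_patient=15_000)
print(rare_costly)
print(common_cheap)
```

Under these assumptions, both technologies have the same expected spend, but the relative variability of the rare, costly one is ten times larger, which is the sense in which a few additional cases can strain an annual budget.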

Third, ATMPs may have wider consequences for patients, their families and carers, the NHS and other governmental sectors and ultimately society at large. Those consequences are often beneficial and hence associated with positive utility (e.g., reduced burden for paid and unpaid carers). However, they can be associated with disutility, particularly when ATMPs extend life expectancy but with significant disability, leading to an increased need for caregiving over a lifetime horizon. For instance, for onasemnogene abeparvovec, caregiver disutility was not included in the model because it was difficult to quantify the disutility for carers and it would increase the ICER, which was considered “counterintuitive”. In other ATMP evaluations, wider benefits contributed qualitatively to the appraisal of the evidence and the final recommendation, as they were hard to quantify or intangible (e.g., OTL-200). This is partly due to the lack of data on these benefits, as no studies had included them as outcomes. However, even if data were available, the reference case does not contemplate wider benefits that may be accrued because of treatment, particularly when these are non-health benefits (e.g., reduced need for social care) or when they fall on someone other than the person receiving treatment (e.g., parents and carers in general). NICE guidance acknowledges that “care delivered by the NHS could have other benefits that are considered socially valuable but are not directly related to health and are not easily captured in a cost per QALY analysis” [7]. Nonetheless, incorporating techniques “that consider the trade-off between health benefits and non-health benefits quantitatively” into decision making was considered unsuitable in 2013, and it did not feature in the case for change of health technology evaluation in 2020 [4]. NICE recommends that non-reference case analysis is used when there are substantial identifiable health benefits not captured by QALYs and emphasises the need to develop methods to formally incorporate qualitative evidence into decision making [4].

5 Conclusion

NICE evaluation of ATMPs revealed significant challenges, mainly related to large uncertainty about long-term and potentially curative effects, high upfront costs, discounting, innovation, benefits above and beyond QALYs and apportioning of costs. However, these challenges are not unique to ATMPs, so completely different methods may not be required. Adaptations to the conventional decision-making process, such as the use of flexible non-reference case analysis, integration of different sources of evidence, and special funding arrangements, may improve appraisal of ATMPs.