Key Points for Decision Makers

The choice of comparators relied on clinical opinion. In future, evidence of actual use in clinical practice should be employed.

The modelling of haematopoietic stem cell transplantation in patients with acute myeloid leukaemia who are eligible for intensive therapy is essential, as it is the second backbone of therapy for these patients besides conventional chemotherapy. Further research on the implications and validity of different methods to model haematopoietic stem cell transplantation is needed to guide future appraisals.

NICE recommended oral azacitidine (according to the commercial arrangement), within its marketing authorisation, as an option for maintenance treatment for acute myeloid leukaemia in adults who are in complete remission, or complete remission with incomplete blood count recovery, after induction therapy with or without consolidation treatment, and cannot have or do not want a haematopoietic stem cell transplant.

1 Introduction

Oral azacitidine, tradename ONUREG, was appraised within the National Institute for Health and Care Excellence (NICE) single technology appraisal (STA) process as Technology Appraisal (TA) 827. Health technologies must be shown to be clinically effective and to represent a cost-effective use of National Health Service (NHS) resources to be recommended by NICE. Within the STA process, the company (Celgene, a Bristol Myers Squibb company) provided NICE with a written submission and a mathematical health economic model, summarising the company’s estimates of the clinical effectiveness and cost-effectiveness of oral azacitidine for maintenance treatment of acute myeloid leukaemia (AML) after induction therapy. This company submission (CS) was reviewed by an evidence review group (ERG) independent of NICE [1]. The ERG, Kleijnen Systematic Reviews (KSR) in collaboration with Maastricht University Medical Centre, produced an ERG report [2]. After consideration of the evidence submitted by the company, comments by other stakeholders and the ERG report, the independent appraisal committee (AC) issued guidance on whether to recommend the technology by means of the Final Appraisal Determination (FAD) [3]. This paper presents a summary of the ERG report and the development of the NICE guidance. Furthermore, it highlights important methodological issues that were identified and that may help in future decision making. Full details of all appraisal documents [including the scope, CS, ERG report, consultee submissions, Appraisal Consultation Document (ACD) and FAD] are on the NICE website [4].

2 The Decision Problem

The population defined in the scope was as follows: Adults with acute myeloid leukaemia who have complete disease remission, or complete remission with incomplete blood count recovery, following induction therapy with or without consolidation treatment who are not eligible for, including those who choose not to proceed to, haematopoietic stem cell transplantation [5]. The population in the CS was the same as the population defined in the final NICE scope [1].

The intervention was oral azacitidine as maintenance treatment, in line with the scope.

The description of the comparators in the NICE scope was as follows: “Midostaurin, established clinical management without oral azacitidine (which may include a watch and wait strategy with best supportive care [BSC], low dose cytarabine or subcutaneous azacitidine)” [5]. The company used the following comparators: midostaurin for the FLT3-mutation positive subgroup only in line with NICE’s Technology appraisal guidance [TA523], and “established clinical management without oral azacitidine (which may include a “watch and wait” strategy with BSC)” [1, 6]. That is, the company did not include low-dose cytarabine or subcutaneous azacitidine as comparators. The company’s rationale for using different comparators from those described in the final NICE scope was: “Low-dose cytarabine and subcutaneous azacitidine are not used in clinical practice as maintenance treatments for AML in the population eligible for maintenance treatment with oral azacitidine (as confirmed by two UK AML treating clinicians) and are therefore not considered as comparators to oral azacitidine” [1]. The company also consulted with two UK AML clinical experts who, according to the company, “unequivocally confirmed that these treatments [low dose cytarabine and subcutaneous azacitidine] are not used in UK clinical practice for AML maintenance. The clinical experts could only provide very limited examples where these treatments could be used in situations resembling maintenance treatment, such as those patients whose disease was in partial remission, or patients who showed signs of early relapse. We believe that these situations might be miscategorised as maintenance treatment” [1].

2.1 ERG Critique of Decision Problem

The company did not adequately justify why it did not use the comparators described in the final NICE scope: low-dose cytarabine and subcutaneous azacitidine are potentially legitimate comparators, and they are mentioned in the final NICE scope. Moreover, the company did not provide objective evidence, but only expert opinion from two clinicians, upon which it concluded that low-dose cytarabine and subcutaneous azacitidine could be excluded as comparators.

3 Independent ERG Review

The ERG reviewed the clinical effectiveness and cost-effectiveness evidence of oral azacitidine for this indication. As part of the STA process, the ERG and NICE had the opportunity to ask for clarification on specific issues in the CS, in response to which the company provided additional information [7]. On the basis of this information, the ERG produced an ERG base case by modifying the health economic model submitted by the company, and assessed the impact of alternative assumptions and parameter values on the model results. Sections 3.1–3.6 summarise the evidence presented in the CS, as well as the review of the ERG.

3.1 Clinical Effectiveness Evidence Submitted by the Company

The CS and response to clarification provided sufficient details for the ERG to appraise the literature searches [1, 8]. Despite some issues with transparency and reproducibility, searches were carried out on a good range of resources. Additional searches included conference proceedings, HTA organisations and the checking of reference lists in relevant SRs identified during the searches. The strategies provided contained a good use of free text terms, appropriate subject headings and study design filters.

The clinical effectiveness evidence for oral azacitidine in the CS was based on the QUAZAR AML-001 trial (NCT01757535) [9]. The QUAZAR AML-001 is an ongoing, phase 3, randomised, double-blind, placebo-controlled trial comparing oral azacitidine (300 mg azacitidine orally once daily) plus BSC versus placebo plus BSC. This trial provides evidence for oral azacitidine in its expected position in the clinical pathway: as maintenance treatment for patients with AML who have achieved complete remission (CR) or complete remission with incomplete blood count recovery (CRi) following induction therapy, with or without consolidation chemotherapy, and were not candidates for haematopoietic stem cell transplantation (HSCT). The trial consisted of four phases: pre-randomisation phase (screening phase within 28 days prior to randomisation), randomisation and double-blind treatment phase (1:1 randomisation to study treatment until discontinuation following AML relapse), follow-up phase [follow-up up to 28 days after last dose of study treatment for adverse events (AEs) and then every month for the first year and then every 3 months until death] and an extension phase [unblinding to receive azacitidine if subject did not meet study discontinuation criteria (or not receive azacitidine if in placebo group) and followed for survival for at least another 12 months until death, withdrawal of consent, study closure, or lost to follow-up]. Data from the QUAZAR AML-001 trial were used as the main data for the economic modelling in this submission.

A summary of the efficacy results follows:

  • In the QUAZAR AML-001 trial, oral azacitidine significantly improved overall survival (OS) at both 15 July 2019 and 8 September 2020 data cut-off points when compared with placebo, meeting its primary endpoint [10]. At a median follow-up of 41.2 months (primary database lock), oral azacitidine was associated with a significantly longer OS compared with placebo, with a clinically meaningful difference in median OS of 9.9 months [median OS: 24.7 months versus 14.8 months; hazard ratio (HR) 0.69 (95% CI 0.55–0.86), p < 0.001].

  • Survival rates were higher in the oral azacitidine group than in the placebo group at 1 year after randomisation [72.8% versus 55.8%; difference 17.0 percentage points (95% CI 8.4–25.6)]. Higher relapse-free survival (RFS) rates were observed in the oral azacitidine group than in the placebo group at 6 months (67.4% versus 45.2%), 1 year (44.9% versus 27.4%) and 2 years (26.6% versus 17.4%).

  • The median time to relapse was 10.2 months in the oral azacitidine group and 4.9 months in the placebo group. By 15 July 2019, 81.1% of patients on oral azacitidine had discontinued from the study compared with 88.9% of patients on the placebo arm. Oral azacitidine was also associated with significantly fewer hospitalisation events per person-year (0.48 versus 0.64; p = 0.0068) and a lower number of days hospitalised per person-year (7.89 versus 13.36; p < 0.0001) than placebo. Overall, the results from the trial are favourable for oral azacitidine.

  • The incidences of treatment-emergent adverse events (TEAEs) were similar for the two treatment arms: 97.9% of patients in the oral azacitidine group and 96.6% of those in the placebo group experienced at least one TEAE during the study. The proportion of patients who experienced at least one TEAE considered by the study investigator to be related to study treatment was higher in the oral azacitidine group than in the placebo group (89.8% versus 51.5%). The rates of serious TEAEs (oral azacitidine: 33.5%; placebo: 25.3%), grade 3/4 TEAEs (oral azacitidine: 71.6%; placebo: 63.1%) and TEAEs leading to death (oral azacitidine: 3.8%; placebo: 1.7%) were notably higher in the oral azacitidine group than in the placebo group. The most common TEAEs were gastrointestinal (GI) events, which occurred more frequently in the oral azacitidine group (91.1%) than in the placebo group (61.8%). The most common haematologic TEAEs were neutropenia, thrombocytopenia and anaemia (which were among the most common grade 3/4 TEAEs reported with oral azacitidine).

No meta-analyses were carried out; however, the company conducted an indirect treatment comparison (ITC) comparing the efficacy of oral azacitidine as maintenance treatment with midostaurin as maintenance therapy in subjects with FLT3-ITD and/or FLT3-TKD (FLT3 mutation)-positive AML. The RATIFY trial was the only study identified in the SLR that provided an analysis of midostaurin as maintenance treatment in AML, although subjects were not randomised at the maintenance phase, but for induction [11]. The QUAZAR AML-001 trial initially consisted of 472 patients, but the individual patient data for subjects without FLT3 mutations were removed to match the inclusion/exclusion criteria of the RATIFY trial, leaving only 56 patients [11].

The company conducted a feasibility assessment of the RATIFY trial which identified significant heterogeneities in trial design (although the RATIFY trial included a maintenance therapy phase, the 205 patients who entered the maintenance phase were not re-randomised prior to the start of maintenance therapy), patient age (the inclusion criteria for QUAZAR AML-001 was ≥ 55 years compared with RATIFY, which included patients aged 18–59 years), cytogenetic risk (favourable cytogenetic risk patients were included in RATIFY but not in QUAZAR AML-001), AML mutational status, HSCT eligibility (HSCT eligibility was not a formal exclusion criterion in the RATIFY trial and 57% of patients underwent HSCT, while the QUAZAR AML-001 trial excluded patients who were eligible for HSCT at study screening and 6% of patients on the oral azacitidine arm underwent HSCT), history of consolidation therapy, and different time zero definitions of time-to-event outcomes [11].

3.2 Critique of Clinical Effectiveness Evidence and Interpretation

Most patients (65%) in the QUAZAR trial received at most one dose of consolidation therapy, including 20% who received none, whereas NICE guidance implies that at least one dose is usual in clinical practice [12]. This generated a non-representative sample for the trial that may have exaggerated the apparent benefits of oral azacitidine.

Additionally, only 35 (out of 472) patients in the QUAZAR trial were recruited from UK sites, and there were notable differences between the UK population and the populations analysed (results not reported for reasons of confidentiality) [2]. This limited the generalisability to the UK setting.

Crucially, randomisation of patients in the RATIFY trial, which was not prospectively designed to determine the independent effect of midostaurin as maintenance therapy, occurred at induction and not at the start of the maintenance phase. This renders the relevant comparison in the RATIFY trial subject to biases associated with non-randomised trials [11].

3.3 Cost-Effectiveness Evidence Submitted by the Company

Two sets of systematic literature searches were performed: the first to identify available cost-effectiveness and cost-utility studies, and the second to identify relevant health utility values. Despite some issues with the reporting of searches which impacted on their transparency and reproducibility, searches were carried out on a good range of resources, and strategies combined both free text and the appropriate subject headings for each resource. Named study design filters were also reported.

To assess the cost-effectiveness of oral azacitidine compared with relevant comparators, the company developed a partitioned survival model in Microsoft Excel, using a lifetime horizon (i.e. 30 years) and a cycle length of 28 days to align with treatment cycles for therapies considered in the model and other existing AML models. The model consisted of relapse-free survival (RFS), relapse and death health states (Fig. 1). In the RFS state, patients could be either on- or off-treatment with oral azacitidine, while patients in the watch-and-wait with BSC strategy were all considered off-treatment. The company considered that any remissions would be captured through OS. HSCT was modelled as part of subsequent treatments rather than explicitly as a separate health state. The company considered that oral azacitidine is licensed for patients who are not suitable for transplant, and therefore it would be unlikely in clinical practice that patients would go on to receive HSCT after oral azacitidine unless they had relapsed. However, 6.3% and 13.7% of the patients in QUAZAR AML-001 received HSCT in the oral azacitidine and placebo arms, respectively [10]. These patients were not censored in the time-to-event analysis that informed the health state allocation. In addition, the company considered that including HSCT as a health state in the model would require inputs that were neither captured in the QUAZAR AML-001 trial nor available from the literature for this population.

Fig. 1 Model structure
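As a minimal sketch of how a partitioned survival model of this kind allocates the cohort to health states, the Python below derives RFS, relapse and death occupancy from an OS curve and an RFS curve on a 28-day cycle grid over 30 years. The log-logistic form for both curves, the shape parameter of 1.2 and all function names are illustrative assumptions. The scale parameters are set to the reported median OS (24.7 months) and median time to relapse (10.2 months) only to make the example concrete; this is not the company's Excel model.

```python
import math

CYCLE_DAYS = 28
DAYS_PER_YEAR = 365.25
DAYS_PER_MONTH = DAYS_PER_YEAR / 12
HORIZON_YEARS = 30


def log_logistic_survival(t_months, alpha, beta):
    """S(t) = 1 / (1 + (t / alpha)^beta); alpha equals the median."""
    if t_months <= 0:
        return 1.0
    return 1.0 / (1.0 + (t_months / alpha) ** beta)


def state_occupancy(os_prob, rfs_prob):
    """Split the cohort into the three health states at one time point."""
    rfs = min(rfs_prob, os_prob)  # RFS can never exceed OS
    return {"rfs": rfs, "relapse": os_prob - rfs, "death": 1.0 - os_prob}


def run_trace(os_curve, rfs_curve):
    """State occupancy at the start of every 28-day cycle up to the horizon."""
    n_cycles = math.ceil(HORIZON_YEARS * DAYS_PER_YEAR / CYCLE_DAYS)
    trace = []
    for cycle in range(n_cycles + 1):
        t_months = cycle * CYCLE_DAYS / DAYS_PER_MONTH
        trace.append(state_occupancy(os_curve(t_months), rfs_curve(t_months)))
    return trace


# Illustrative (assumed) curves, not the fitted curves from the submission.
trace = run_trace(
    lambda t: log_logistic_survival(t, alpha=24.7, beta=1.2),
    lambda t: log_logistic_survival(t, alpha=10.2, beta=1.2),
)
```

Because the relapse state is obtained as the difference between the two curves, anything that affects OS or RFS in the trial (including HSCT received by some patients) is carried into the state occupancy implicitly.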

The population considered in the CS, as described in the decision problem above, was consistent with the NICE scope, the marketing authorisation for oral azacitidine and the population in the QUAZAR AML-001 trial. Two patient groups were considered in the company’s model:

  • The QUAZAR AML-001 intention to treat (ITT) population, compared with watch-and-wait with BSC, as informed by the placebo arm of the QUAZAR AML-001 ITT population.

  • The QUAZAR AML-001 FLT3 subpopulation, compared with midostaurin, as informed by an indirect comparison.

The company also considered a scenario analysis using the Europe population of the QUAZAR AML-001 study.

Consistent with the licence, the modelled starting dose of oral azacitidine was 300 mg once daily for the first 14 days of every 28-day treatment cycle until disease progression or unacceptable toxicity. In line with the QUAZAR AML-001 trial, the summary of product characteristics (SmPC) of oral azacitidine recommended discontinuation upon blast counts > 15% or unacceptable toxicity [10].

The comparators considered were watch-and-wait with BSC and midostaurin (only in the FLT3 subgroup). Watch-and-wait with BSC represented the standard of care (SoC) in current clinical practice because there are currently no approved or funded therapies indicated for this population for the independent maintenance treatment of AML in the UK. BSC included medications such as antibiotics, antifungals and hydroxyurea. For patients with AML with mutations in FLT3, NICE recommended the use of midostaurin (TA523). For patients in complete response, midostaurin was administered orally at 50 mg twice daily as single-agent maintenance treatment until relapse for up to 12 cycles of 28 days each.

The analysis was constructed from the perspective of the NHS and the Personal Social Services (PSS) in England and Wales. A discount rate of 3.5% per annum was applied for costs and benefits in line with the NICE reference case.
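To illustrate how a 3.5% annual discount rate can be applied on a 28-day cycle grid, the following Python sketch converts the cycle index to years and discounts each cycle's value accordingly. Discounting at the start of each cycle with a continuous time exponent is an assumption made here for simplicity; the source does not state how the company's model implemented discounting within a year, and the function names are hypothetical.

```python
def discount_factor(cycle, annual_rate=0.035, cycle_days=28, days_per_year=365.25):
    """Discount factor for a value accrued at the start of a given cycle."""
    years = cycle * cycle_days / days_per_year
    return 1.0 / (1.0 + annual_rate) ** years


def discounted_total(per_cycle_values, annual_rate=0.035):
    """Sum per-cycle costs or QALYs after discounting each cycle."""
    return sum(
        value * discount_factor(cycle, annual_rate)
        for cycle, value in enumerate(per_cycle_values)
    )
```

The same function would be applied to both the cost stream and the QALY stream, since the reference case uses one rate for costs and benefits.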

The main source of evidence on treatment effectiveness used for oral azacitidine and watch-and-wait with BSC was the QUAZAR AML-001 trial [9]. The most recent September 2020 data cut-off (median follow-up 51.7 months) was used for the estimation of OS, while the July 2019 data cut-off (median follow-up 41.2 months) was used for the estimation of RFS. To estimate OS and RFS over the 30-year time horizon, parametric survival curves were fitted to QUAZAR AML-001 trial data and used to extrapolate survival beyond the study time horizon. Parametric models were assessed with regard to (1) visual inspection of model fit, (2) information criteria (AIC and BIC), (3) degree of agreement with log-cumulative hazard and Schoenfeld residual plots, (4) the marginal survival benefit in the observed and the extrapolated period, and (5) clinical considerations based on expert engagement, external literature and other relevant treatment and indication-specific domain knowledge. The company selected joint generalised gamma and log-logistic models for the modelling of OS and RFS, respectively. The modelling of oral azacitidine and midostaurin in the FLT-3 subgroup was based on an ITC between QUAZAR AML-001 and RATIFY, and generalised gamma and 1 knot odds linear models were respectively used for OS and RFS [11].
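The information criteria used in step (2) can be sketched as follows. The log-likelihoods and parameter counts below are entirely hypothetical and are not taken from the submission; only the sample size of 472 comes from the trial description. Using the number of patients as the sample size in the BIC is also an assumption, as some analysts use the number of events instead.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical fits: name -> (log-likelihood, number of parameters).
candidates = {
    "exponential": (-1510.0, 1),
    "weibull": (-1505.0, 2),
    "log-logistic": (-1498.0, 2),
    "generalised gamma": (-1497.5, 3),
}

N_PATIENTS = 472  # trial sample size reported in the text

best_by_aic = min(candidates, key=lambda name: aic(*candidates[name]))
best_by_bic = min(candidates, key=lambda name: bic(*candidates[name], N_PATIENTS))
```

Statistical fit over the observed period is only one of the five criteria listed; a curve with the lowest AIC or BIC can still produce clinically implausible extrapolations, which is why the other criteria matter.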

Grade 3 and 4 AEs occurring in 5% or more of the patients according to the QUAZAR AML-001 trial were included for the ITT population in the model [10]. For the FLT-3 population, grade 3 and 4 AEs occurring in more than 10% of patients treated with midostaurin were included on the basis of the ITT population of the maintenance phase in the RATIFY trial. All AE disutilities were applied in the first model cycle with an assumed duration of 1 week [11].

Health-related quality of life (HRQoL) in the ITT population of the QUAZAR AML-001 trial was measured using the EuroQol-5D-3L (EQ-5D-3L [UK tariff]) on day 1 of each 28-day cycle [10]. The same utility value applied to RFS regardless of whether the patient was on- or off-treatment, as validated by expert opinion. Utility values for the RFS on- and off-treatment health states were derived by applying a linear mixed effects model with random intercepts. The company stated that the QUAZAR AML-001 trial did not capture HRQoL post relapse, and hence a study by Joshi et al. (obtaining utilities for AML using a composite time trade-off) was selected from the literature to inform the relapse utility [13]. A one-off 28-day utility decrement of 0.21 was applied to the proportion of patients receiving HSCT. Initially, no utility benefits resulting from subsequent treatments or HSCT were included in the modelling, but these were provided as a scenario analysis upon technical engagement.
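The size of the one-off HSCT decrement can be put in perspective with a short calculation. This sketch assumes the decrement of 0.21 is on the annual utility scale, lasts exactly 28 days and is left undiscounted. The HSCT proportions used in the example (6.3% and 13.7%) are the trial proportions reported above; whether the model applied exactly these values is an assumption.

```python
UTILITY_DECREMENT = 0.21   # one-off decrement reported in the text
DECREMENT_DAYS = 28        # duration reported in the text
DAYS_PER_YEAR = 365.25


def hsct_qaly_loss(proportion_hsct):
    """Average QALY loss per modelled patient from the HSCT decrement."""
    qaly_loss_per_transplant = UTILITY_DECREMENT * DECREMENT_DAYS / DAYS_PER_YEAR
    return proportion_hsct * qaly_loss_per_transplant


# Trial proportions receiving HSCT, used here for illustration only.
loss_azacitidine_arm = hsct_qaly_loss(0.063)
loss_placebo_arm = hsct_qaly_loss(0.137)
```

Under these assumptions the decrement amounts to roughly 0.016 QALYs per transplanted patient, so its direct effect on incremental QALYs is small; the larger question is the absence of any modelled long-term utility benefit after HSCT.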

The cost categories included in the model were treatment acquisition costs, medical costs (treatment administration, supportive care, monitoring and follow-up, HSCT, palliative care) and costs of managing AEs. Unit prices were based on the eMIT 2020, NHS reference prices and British National Formulary (BNF) [14,15,16]. Oral azacitidine acquisition costs (including confidential discount and a relative dose intensity of 86.9%) were calculated on the basis of treatment dose, number of administrations per cycle and number of cycles in which it was applied. For the comparator population of the FLT3 subgroup, midostaurin was assumed to be administered twice daily as a single agent until relapse. Rates of resource use for disease management were applied on the basis of the QUAZAR AML-001 trial and further guided by expert opinion. Resource use and treatment administration costs were obtained from the NHS reference costs 2019/2020 [15]. BSC costs were largely based on UK expert opinion. Dosing regimens were based on the respective SmPC, and acquisition costs were sourced from eMIT 2020 and the online BNF 2021 [14, 16, 17]. The proportion and mix of subsequent therapies were informed by the QUAZAR AML-001 trial and validated by clinical advisors. Acquisition costs were sourced from eMIT 2020 and the online BNF [14, 16]. Treatment costs for HSCT were taken from the NHS reference costs 2019/2020 [15]. End-of-life costs were sourced from Nuffield 2014 and inflated to 2019/2020 based on the HCHS inflation index [18, 19]. AE costs were informed by the NHS reference costs 2019/2020 and applied as a one-off cost in the first model cycle.
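A per-cycle acquisition cost calculation of the kind described for oral azacitidine might look like the sketch below. The price per 300 mg dose is a placeholder, because the actual price includes a confidential discount, and applying the relative dose intensity as a simple multiplier on cost is an assumption about how the model used it.

```python
DOSES_PER_CYCLE = 14              # 300 mg once daily on days 1-14 of a 28-day cycle
RELATIVE_DOSE_INTENSITY = 0.869   # reported in the text


def cost_per_cycle(price_per_dose):
    """Expected acquisition cost for one on-treatment 28-day cycle."""
    return price_per_dose * DOSES_PER_CYCLE * RELATIVE_DOSE_INTENSITY


def total_acquisition_cost(price_per_dose, on_treatment_by_cycle):
    """Sum cycle costs weighted by the proportion still on treatment."""
    return sum(cost_per_cycle(price_per_dose) * p for p in on_treatment_by_cycle)


# Hypothetical price of 100 per dose, for illustration only.
example_cycle_cost = cost_per_cycle(100.0)
```

In a full model the weights in `on_treatment_by_cycle` would come from a time-on-treatment curve and each cycle's cost would also be discounted.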

In the company’s base-case analysis, total life years (LYs) and quality-adjusted life years (QALYs) gained were larger for oral azacitidine than for watch-and-wait with BSC. Incremental QALYs were mainly driven by QALY gains in the RFS state. Total costs were also higher for oral azacitidine than for watch-and-wait with BSC. Incremental costs mainly resulted from higher drug costs and disease management costs in the RFS (on-treatment) state. In the FLT-3 subgroup, total QALYs were higher and total costs were lower for oral azacitidine, and hence midostaurin was dominated by oral azacitidine. In response to technical engagement and appraisal committee meetings, the company increased the oral azacitidine patient access scheme discount and conducted various model analyses: using the Europe subgroup of the QUAZAR AML-001 trial [10], exploring the long-term HRQoL impact of HSCT, modelling subgroup-specific patient baseline characteristics and exploring the impact of different assumptions related to AEs and relapse utility values. The company’s revised and final probabilistic ICER was £32,480 per QALY gained. Oral azacitidine remained dominant versus midostaurin in the FLT-3 subgroup.
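The two kinds of result reported here, an ICER against watch-and-wait with BSC and dominance over midostaurin, follow from the same pairwise comparison of total costs and QALYs. The sketch below shows that logic; the function name and all input values are hypothetical, since the total costs and QALYs from the submission are not reported in this summary.

```python
def compare(cost_new, qaly_new, cost_old, qaly_old):
    """Classify a pairwise comparison and return (label, ICER or None)."""
    d_cost = cost_new - cost_old
    d_qaly = qaly_new - qaly_old
    if d_cost == 0 and d_qaly == 0:
        return ("equivalent", None)
    if d_cost <= 0 and d_qaly >= 0:
        return ("dominant", None)    # no more costly and no less effective
    if d_cost >= 0 and d_qaly <= 0:
        return ("dominated", None)   # no less costly and no more effective
    # Both increments share a sign, so the ratio is well defined.
    return ("icer", d_cost / d_qaly)


# Hypothetical totals: more costly and more effective gives an ICER.
versus_bsc = compare(50000.0, 2.5, 20000.0, 1.5)
# Hypothetical totals: cheaper and more effective gives dominance.
versus_midostaurin = compare(10000.0, 2.0, 12000.0, 1.8)
```

When both increments are negative the ratio describes savings per QALY forgone and has to be interpreted against the threshold in the opposite direction, which is one reason dominance is reported as a label rather than as a negative ICER.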

3.4 Critique of Cost-Effectiveness Evidence and Interpretation

HSCT was not included as a separate health state but was implicitly included in the modelling through the survival analysis of the QUAZAR AML-001 ITT population (of which a proportion of patients received HSCT at some point). In addition, costs and a temporary disutility associated with undergoing HSCT were included in the modelling. The ERG noted that survival analyses of OS and RFS may potentially be biased when HSCT patients are included in the population as their hazard rates over time for death and relapse would be expected to differ compared with those patients not receiving HSCT. In addition, no benefit in HRQoL post HSCT was captured in the model (instead HSCT was actually penalised with the short-term disutility). Upon request, additional details were provided which illustrated that the impact of HSCT on survival analyses of OS and RFS was likely minor. A scenario analysis using a weighted average relapse utility was provided, but it was unclear whether this fully captured the long-term benefits of HSCT.

It was unclear whether the ITT population or the Europe subgroup of the QUAZAR AML-001 trial better reflected UK clinical practice [10]. The company highlighted potentially greater alignment in the diagnostic treatment pathway between the UK and the rest of Europe, which was supported by a clinical expert. A Europe subgroup analysis was therefore provided. Within this, the company did not provide a further analysis of the FLT-3 subgroup analysis, citing sample size limitations as the reason.

It was unclear whether consolidation therapy pre-treatment in the QUAZAR AML-001 trial was reflective of UK clinical practice [10]. Although the European Society for Medical Oncology’s (ESMO) clinical practice guideline on AML in adult patients (2020) recommends that people should have consolidation therapy after reaching complete remission following induction treatment, the majority of patients in the trial received only one cycle or no consolidation therapy [20]. Clinical experts described optimum clinical practice to be three or four cycles of consolidation therapy, but explained that there is variability in the number of consolidation cycles and that in clinical practice many patients only have one cycle of consolidation therapy because of delayed blood count recovery, toxicity, patient choice or clinician decision. The company provided a subgroup analysis of patients with at least one cycle of consolidation therapy, but in the end preferred to use the Europe subgroup (which did not exclude any patients based on their number of consolidation therapy cycles).

The company did not provide all the details according to the NICE Decision Support Unit Technical Support Document 14 (DSU TSD14) on curve selection for the modelling of OS and RFS in the consolidation subgroup [21]. Upon request, further details showed that additional survival analyses in the consolidation subgroup aligned with the assessment for the ITT population. Although accelerated failure time (AFT) models with a treatment covariate (i.e. joint generalised gamma and joint log-logistic for OS and RFS respectively) seemed acceptable, the ERG highlighted that using individual parametric models without a treatment covariate had a substantial impact on the ICER.

There was uncertainty regarding the assumption of no waning of the oral azacitidine treatment effect. Additional evidence by the company demonstrated that the HRs using AFT models varied over time, exhibiting a natural waning effect. Treatment waning was further explored by incorporating general population mortality (setting the HR to 1 at 150 months) and selecting individual curves for the extrapolation of OS and RFS. The ERG agreed that survival distributions can be chosen to reflect treatment waning, making additional treatment waning assumptions likely obsolete in this case.
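The natural waning that arises from AFT models can be illustrated with the log-logistic distribution. If two arms share a shape parameter and differ only in scale, the implied hazard ratio is not constant and moves towards 1 as time goes on. The sketch below uses the reported median times to relapse (10.2 and 4.9 months) as scale parameters and an assumed shape of 1.2; these are illustrative values, not the fitted parameters from the submission.

```python
def log_logistic_hazard(t, alpha, beta):
    """Hazard of a log-logistic distribution with scale alpha and shape beta (t > 0)."""
    x = (t / alpha) ** beta
    return (beta / t) * x / (1.0 + x)


def hazard_ratio(t, alpha_treatment, alpha_control, beta):
    """Treatment-versus-control hazard ratio at time t under a shared shape."""
    return log_logistic_hazard(t, alpha_treatment, beta) / log_logistic_hazard(
        t, alpha_control, beta
    )


# Illustrative parameters (assumed), time in months.
hr_early = hazard_ratio(1.0, alpha_treatment=10.2, alpha_control=4.9, beta=1.2)
hr_mid = hazard_ratio(100.0, alpha_treatment=10.2, alpha_control=4.9, beta=1.2)
hr_late = hazard_ratio(1000.0, alpha_treatment=10.2, alpha_control=4.9, beta=1.2)
```

With these inputs the hazard ratio starts well below 1 and approaches 1 at long follow-up, which is the behaviour that makes an additional explicit waning assumption largely redundant when such a distribution is selected.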

The company’s simplified assumption of modelling AEs with a maximum frequency of 1 and an average duration of one week seemed arbitrary. No per-cycle analysis of AEs was provided, but several scenarios with regard to the frequency and duration of AEs showed that the issue was not impactful in terms of cost-effectiveness.

The relapse utility estimate was based on Joshi et al., which had a small sample size (n = 23) and used the composite time trade-off [13]. Alternative sources for relapse utilities (e.g. Stein and Tremblay) were also not ideal as they were elicited in US populations, but Tremblay was preferred by the ERG as utility measurements were used in TA523 and were mapped onto the EQ-5D; hence, these were used in the ERG base case [22].

The QUAZAR AML-001 trial measured HRQoL on the first day of every 28-day treatment cycle [10]. Oral azacitidine was given on days 1–14 (or 1–7/1–21) of every treatment cycle, followed by a period of 14 days without treatment. Hence, treatment-related AEs were likely to occur during the first 14 days of every 28-day cycle and to diminish in the 14 days of rest thereafter, which likely resulted in biased utility estimates.

The ERG was concerned that no post-HSCT utility benefit was modelled, given that a proportion of patients in the QUAZAR AML-001 trial underwent HSCT [10]. By modelling only the costs and a temporary disutility, patients were penalised for receiving HSCT while in previous technology appraisals (TA523 and TA642) a curative effect of HSCT was assumed [23, 24]. As patients were expected to experience a net utility benefit after undergoing HSCT, the ERG removed the temporary disutility and performed a scenario analysis to explore the effect of applying a return to RFS utility after relapse for the proportion of patients undergoing HSCT.

It was considered unlikely that the modelled RFS utility was higher than the age-adjusted general population norm in the UK. Although the company directly estimated utilities for the population in question using the QUAZAR AML-001 trial data, capping the RFS utility value at general population levels was deemed more appropriate [9].

A full incremental analysis of oral azacitidine, midostaurin and watch-and-wait plus BSC was not performed for the FLT-3 subgroup and was also not enabled as an option in the economic model.

3.5 Additional Work Undertaken by the ERG

On the basis of all considerations highlighted in the ERG report, the ERG defined a new base case in which various adjustments were made to the company’s base case. This included, as matters of judgement, using the consolidation subgroup from the QUAZAR AML-001 trial instead of the ITT population, using the study of Tremblay et al. instead of Joshi et al. to calculate the relapse utility, and removing the temporary disutility for patients receiving HSCT [10, 13, 22]. Furthermore, the ERG conducted a scenario analysis adding a post-HSCT utility increment (applied RFS utility post HSCT for a duration of 1.67 years) which increased the ICER substantially.

3.6 Conclusions of the ERG Report and Technical Engagement

During Technical Engagement, the ERG acknowledged an apparent contradiction between the NHS website, which suggests everyone gets consolidation, and the Haematological Malignancy Research Network (HMRN), cited by the company, which suggests a proportion of patients do not receive it (actual numbers are confidential and cannot be reported here) [4]. The ERG suspects that this is at least partly because HMRN includes patients diagnosed a long time ago (from 2004), which might include a period of change in clinical practice. The more recent (2020) ESMO guidelines state: “As soon as patients achieve CR/CRi after 1 or 2 induction cycles, they should proceed to consolidation treatment [II, B]” [20]. The ERG therefore reiterates that consolidation is expected, and that the relevant population is the consolidation subgroup. The ERG also acknowledged the uncertainty surrounding the optimal number of rounds of consolidation therapy.

The potential bias due to randomisation of patients in the RATIFY trial occurring at induction and not maintenance phase appeared to be unresolvable given no further maintenance phase evidence [11].

After technical engagement, the ERG base-case ICER of oral azacitidine versus watch-and-wait plus BSC was £40,768 per QALY gained, and oral azacitidine remained dominant versus midostaurin in the FLT-3 subgroup. During Technical Engagement, the company also updated its base case using the EU subgroup, which the ERG did suggest would be more in line with what is expected to be seen in UK clinical practice. However, this was relative to the ITT population and notwithstanding the ERG’s preference for the consolidation subgroup. There was thus a large remaining uncertainty about the effectiveness and cost-effectiveness of oral azacitidine: the generalisability of the QUAZAR AML-001 [10] ITT population to the UK setting was questionable as few UK patients were included, and the appropriate number of cycles of consolidation therapy in UK clinical practice and the most appropriate curves for the modelling of OS and RFS in the consolidation subgroup were unknown. In addition, the approaches to reflect HSCT in the modelling and to incorporate HRQoL were likely biased.

3.7 Model Changes After NICE Appraisal Committee Meeting

Following the first NICE appraisal committee meeting discussion and the resulting preliminary negative recommendation for the use of oral azacitidine for maintenance treatment of AML after induction therapy, the company provided additional analyses exploring the impact of survival curve selection and treatment waning on the cost-effectiveness of oral azacitidine. The ERG aligned its base case with the committee’s preferences to use the Europe subgroup from the QUAZAR AML-001 trial [10] and to cap the RFS utility to the age-adjusted population norm in the UK. Compared with the company’s revised base-case (£32,480 per QALY gained), this resulted in a slightly larger (probabilistic) ICER of £33,830 per QALY gained. Oral azacitidine remained dominant versus midostaurin in the FLT-3 subgroup.

4 Key Methodological Issues

4.1 Choice of Comparators

The choice of comparators should reflect actual clinical practice. Clinical expert opinion can be helpful in validating objective evidence of actual use, but it should not be a substitute for this evidence.

4.2 Modelling of HSCT

Besides conventional chemotherapy, HSCT is the second backbone of therapy for patients with AML who are eligible for intensive therapy, and it offers the highest potential for long-term survival as post-remission therapy [25]. Although the long-term impact of HSCT on HRQoL is expected to be positive, HSCT is an invasive procedure and hence it can also have a temporary negative impact on the patient’s quality of life. Therefore, both positive and negative effects (in terms of survival and HRQoL) of HSCT should be considered in the modelling. When HSCT is a (subsequent) treatment option for patients in the trial informing treatment effectiveness of an economic model, standard survival analyses may no longer be appropriate as the hazard rates over time likely differ between patients who received HSCT and those who did not. Alternatively, the proportion of patients that underwent HSCT could be modelled separately, for example by introducing a separate HSCT health state and by censoring patients with HSCT from the no-HSCT group survival analysis. In the current TA, the company upon request provided an overlay of the Kaplan–Meier (KM) curves of ITT versus ITT with HSCT censored in one plot. The AIC/BIC fit for the individual distributions per treatment arm for the HSCT-censored analysis, and a plot including all distributions, were also provided. The provided evidence illustrated that the impact of HSCT on survival analyses of OS and RFS was likely minor, but the long-term impact of HSCT on HRQoL remained unaddressed. In addition, the generalisability of the trial population to the decision context should also be considered, as the proportion of patients undergoing HSCT may differ between the trial and clinical practice. Further research on the implications of methods related to the modelling of HSCT is needed to guide future appraisals.
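The comparison of an ITT Kaplan–Meier curve with an HSCT-censored curve can be sketched as follows. This is a minimal illustration in plain Python under stated assumptions: the helper names `kaplan_meier` and `censor_at_hsct` and the six-patient data set are hypothetical, and the sketch is not the analysis performed by the company or the ERG.

```python
from typing import List, Optional, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events[i] is 1 if the event (e.g. death) was observed at times[i]
    and 0 if the patient was censored at times[i].
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


def censor_at_hsct(times: List[float], events: List[int],
                   hsct_times: List[Optional[float]]) -> Tuple[List[float], List[int]]:
    """Censor each patient at transplant if HSCT preceded end of follow-up."""
    new_times, new_events = [], []
    for t, e, h in zip(times, events, hsct_times):
        if h is not None and h < t:
            new_times.append(h)
            new_events.append(0)   # follow-up stops at HSCT, no event counted
        else:
            new_times.append(t)
            new_events.append(e)
    return new_times, new_events


# Hypothetical toy data (months), for illustration only.
times = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
events = [1, 1, 0, 1, 1, 0]
hsct = [None, None, 3.0, 5.0, None, None]

itt_curve = kaplan_meier(times, events)
censored_curve = kaplan_meier(*censor_at_hsct(times, events, hsct))
print(itt_curve)
print(censored_curve)
```

Overlaying the two curves shows how far transplanted patients influence the estimate. Censoring at HSCT assumes that censoring is uninformative, which may not hold if patients selected for transplant differ in prognosis from those who are not.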

4.3 Uncertainty in the Source to Inform the Relapse Utility

It is considered good practice in economic evaluations to inform health state utility values based on HRQoL data from the relevant clinical trial(s) [26]. However, in certain circumstances (e.g. in case of a relapse event) patients in trials may no longer be capable of completing, or asked to complete, the HRQoL questionnaires. In such circumstances where HRQoL data from trials are not available, utility values may be sourced from the literature. When considering studies from the literature, these should be methodologically sound, largely in line with NICE’s reference case and suitable for the decision context. In the current STA, the relapse utility value was based on a study by Joshi et al., which had a small sample size (n = 23) and large standard error, and used a composite time trade-off method to elicit utility values (not in line with the NICE reference case) [13]. Alternative studies (Stein and Tremblay) were considered more suitable but also not ideal (elicited in US populations), and it was unclear what the appropriate relapse utility value would be [22, 27]. Sensitivity analyses can be performed to reflect the impact of alternative utility values when there is uncertainty regarding the source to inform health state utility values, which was appropriately done by the company in the current TA.

5 National Institute for Health and Care Excellence Guidance

5.1 Consideration of Clinical Effectiveness

On the basis of expert clinical opinion and evidence from the HMRN of lack of use in clinical practice, the appraisal committee (AC) concluded that low-dose cytarabine and subcutaneous azacitidine “…would not likely be used routinely as maintenance treatment in people who are in complete remission”, which implies that they need not be comparators [3].

The AC concluded that oral azacitidine improves overall survival and relapse-free survival compared with placebo and that the results from the QUAZAR EU subgroup are generalisable to clinical practice in England [3]. It also concluded that “…the number of cycles of pre-trial consolidation therapy in QUAZAR likely reflects NHS clinical practice.” This is despite the maximum number of cycles in the trial being only one, which is substantially lower than the three to four cycles that the AC referred to as “optimum best practice”, which the clinical expert stated would be received by about 20% of people. Partly because of the trial design of RATIFY whereby randomisation to midostaurin or placebo occurred at induction and not maintenance, the committee concluded that the results of the ITC comparing oral azacitidine with midostaurin were “highly uncertain and considered this in its decision making.”

5.2 Consideration of Cost-Effectiveness

The AC agreed with the ERG’s preference to remove the temporary disutility associated with HSCT, and would have preferred the company to include HSCT as a health state in the model [3]. Hence, the AC considered that there was remaining uncertainty about whether the company’s model captures the long-term benefits after HSCT. The AC considered that it did not make sense that the RFS utility was higher than the age-adjusted population norm in the UK and concluded that the RFS utility should be capped at age- and sex-matched general population levels. The AC further agreed with the ERG that data from Tremblay 2018 were most suitable to inform HRQoL after relapse, as the study of Joshi 2019 had a small sample size and used a methodology (composite time trade-off) that was not part of the NICE reference case. The AC considered that joint survival models were appropriate for estimating OS and RFS in the Europe subgroup. The company’s joint and individual modelling results were comparable and the impact of choosing between these approaches was likely minor, which provided reassurance that both approaches reflected the trial data and resulted in similar extrapolations. Although the AC initially considered that it was highly optimistic to assume no waning of the oral azacitidine treatment effect on the basis of the observed trial data, it was convinced that modelling treatment effect waning did not have a significant impact on the cost-effectiveness results on the basis of several scenario analyses performed by the company and the ERG.
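Capping a health state utility at general population levels amounts to taking, for each age, the lower of the health state utility and the population norm. The sketch below illustrates this in Python; the function name `cap_at_population_norm`, the RFS utility and the norm values are all hypothetical and are not the values used in the appraisal.

```python
from typing import Dict


def cap_at_population_norm(state_utility: float,
                           norms_by_age: Dict[int, float]) -> Dict[int, float]:
    """Return the utility applied at each age after capping at the norm."""
    return {age: min(state_utility, norm) for age, norm in norms_by_age.items()}


# Hypothetical values, for illustration only (not UK population norms).
rfs_utility = 0.80
norms = {60: 0.81, 70: 0.78, 80: 0.73}

capped = cap_at_population_norm(rfs_utility, norms)
print(capped)
```

With these hypothetical inputs the cap binds only at ages where the norm falls below the health state utility, so its effect on modelled QALYs grows as the cohort ages.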

The AC recommended oral azacitidine (according to the commercial arrangement), within its marketing authorisation, as an option for maintenance treatment for acute myeloid leukaemia in adults who are in complete remission, or complete remission with incomplete blood count recovery, after induction therapy with or without consolidation treatment, and cannot have or do not want a haematopoietic stem cell transplant [3]. The clinical trial evidence showed that in patients on oral azacitidine it took longer for their cancer to relapse, and they lived longer than patients on placebo. Oral azacitidine met NICE’s criteria to be considered a life-extending treatment at the end of life, and the most likely cost-effectiveness estimates for oral azacitidine were within what NICE normally considers an acceptable use of NHS resources for end-of-life treatments.