Key Points for Decision Makers

Nivolumab plus ipilimumab is a cost-effective treatment for advanced melanoma patients in England

In the absence of overall survival data, assuming equal post-progression survival between comparators with a similar mechanism of action, or using data from earlier data-cuts, can provide incremental cost-effectiveness ratios comparable to those calculated once overall survival data are available

1 Introduction

A key issue in modelling immunotherapies is that survival data often suggest a plateau in overall survival (OS) but clinical trials end or release data-cuts before this has been fully demonstrated, causing uncertainty in cost-effectiveness analyses [1]. This issue has become more prevalent as timelines for regulatory and reimbursement submissions become accelerated, with companies seeking approval based on fewer, less mature survival data.

Ipilimumab and nivolumab are fully human, monoclonal immunoglobulin antibodies (IgG1k and IgG4 HuMab, respectively) that act as checkpoint inhibitors of cytotoxic T lymphocyte antigen-4 (CTLA-4) and programmed death-1 (PD-1) at their distinct, yet complementary, positions within the T cell response pathway. Ipilimumab stops the immune response from being ‘switched off’, allowing the production of active T cells to continue and potentially increasing the number of activated T cells surrounding the tumour [2]. Nivolumab stops the inactivation of T cells at the tumour site, allowing more active T cells to infiltrate and destroy the tumour [3].

Used together, nivolumab plus ipilimumab (hereafter, the Regimen) potentiates immune-mediated tumour destruction, stimulating the patient’s immune system to attack cancer cells and destroy the tumour through intrinsic processes. A challenge in assessing immunotherapy benefit is that patients may experience an increase in tumour size before a response is seen, known as pseudo-progression [4].

A phase I trial showed an unprecedented 68% OS at 3 years in unresectable melanoma patients treated with the Regimen [5]. The primary source of information for the effectiveness of the Regimen and ipilimumab monotherapy is the CheckMate 067 study, an international, randomised controlled trial (RCT) that demonstrated superior OS and progression-free survival (PFS) for the Regimen compared with ipilimumab. Latest CheckMate 067 data show 3-year OS rates of 57% for the Regimen versus 31% for ipilimumab monotherapy [6].

The Regimen is associated with a predictable safety profile, with immune-related adverse events (AEs) that are acute and generally reversible and in line with well-established safety algorithms in the majority of patients. Recent evidence shows that many patients who discontinued due to AEs have better response rates and PFS than those who did not, with the presence of AEs potentially indicating a good physiological response to the treatment [7].

The Regimen was assessed for the treatment of adults with advanced (unresectable or metastatic) melanoma by the National Institute for Health and Care Excellence (NICE) in June 2016. At the time of the manufacturer’s submission to NICE (January 2016), OS data were not available; therefore, a common post-progression treatment effect was assumed for all immunotherapies compared. The relevant standard of care was ipilimumab, BRAF inhibitor (dabrafenib or vemurafenib) monotherapy or pembrolizumab (approved in August 2015), which was added as a comparator due to its availability and subsequent increased use during the appraisal process [8, 9]. Nivolumab monotherapy became available for use subsequently and was not a comparator for the Regimen. These treatments demonstrated significant clinical benefit over traditional chemotherapy but continue to have limitations, in that many patients do not achieve durable response and long-term survival. Ipilimumab, BRAF inhibitors and pembrolizumab were included in the model for the NICE appraisal of the Regimen [9], but for the purposes of this article, only the comparison with ipilimumab is shown.

The analysis presented here compares the cost effectiveness of the Regimen with ipilimumab, in the CheckMate 067 trial population in England, using two different models: one model each with 18-month (OS unavailable) and 36-month (OS available) CheckMate 067 data-cuts (Fig. 1).

Fig. 1

Available clinical trial data. 18-month overall survival data were not available at the time of modelling for the National Institute for Health and Care Excellence submission. Ipi ipilimumab, KM Kaplan–Meier, OS overall survival, PFS progression-free survival

2 Method

2.1 Model Structure

A partitioned survival model and a Markov state-transition model were developed in Microsoft Excel® (Microsoft Corp., Redmond, WA, USA) to estimate the lifetime costs and benefits of the Regimen versus ipilimumab, from the UK healthcare system perspective.

Both models were based on a three-health state structure, similar to previous models submitted to health technology assessment agencies for ipilimumab and nivolumab monotherapy [10], and both used the most recent efficacy and safety data available at the time of building.

Health states were defined by three different measures (Fig. 2):

  • Progression status for modelling survival and quality of life (three states): progression-free, progressed and dead.

  • Time since treatment initiation and time to death for modelling resource use (six states): first, second, third and fourth year after treatment initiation, fifth and subsequent years after treatment initiation, 12 weeks before death (palliative care) and death.

  • Treatment status for modelling drug cost and AEs (two states): on treatment and off treatment.

Fig. 2

Model structure

The two models varied in their basic structure. With CheckMate 067 OS data, an area under the curve (AUC) approach (partitioned survival) could be used, but without the OS data, a state-transition model was used. In the state-transition model, survival is not read directly from fitted, time-dependent survival curves; instead, patients have a probability of moving from one state to another in each cycle, and the proportion of patients in each state determines OS and PFS. Although these two modelling approaches require different assumptions, research has shown that they should be functionally equivalent [11].
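The difference between the two structures can be illustrated with a minimal sketch. It is not the Excel implementation used in the analysis; the function names are hypothetical, and the Markov trace assumes constant per-cycle transition probabilities purely for brevity, whereas the model described here derived time-varying probabilities from fitted curves.

```python
def partitioned_state_occupancy(pfs, os_):
    """Partitioned survival (AUC) approach: read state occupancy
    directly from the PFS and OS survival proportions at each cycle."""
    states = []
    for s_pfs, s_os in zip(pfs, os_):
        s_pfs = min(s_pfs, s_os)  # PFS cannot exceed OS
        progression_free = s_pfs
        progressed = s_os - s_pfs
        dead = 1.0 - s_os
        states.append((progression_free, progressed, dead))
    return states


def markov_trace(p_progress, p_die_pf, p_die_pd, n_cycles):
    """State-transition approach: move the cohort between states using
    per-cycle transition probabilities (held constant in this sketch)."""
    pf, pd, dead = 1.0, 0.0, 0.0
    trace = [(pf, pd, dead)]
    for _ in range(n_cycles):
        new_pf = pf * (1.0 - p_progress - p_die_pf)
        new_pd = pd * (1.0 - p_die_pd) + pf * p_progress
        new_dead = dead + pf * p_die_pf + pd * p_die_pd
        pf, pd, dead = new_pf, new_pd, new_dead
        trace.append((pf, pd, dead))
    return trace
```

In the first function OS and PFS are inputs; in the second they are outputs, recovered as `pf + pd` and `pf` at each cycle.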

Data for the comparative effectiveness of the Regimen and ipilimumab were taken from the CheckMate 067 and MDX010-20 (phase III RCT of ipilimumab versus gp100 vaccine) clinical trials. Trial data were used for utilities and safety; resource use estimates were obtained from the literature and a validation meeting with UK clinicians. Clinicians were selected as experts in the treatment of advanced melanoma who had experience of immunotherapies from clinical studies and real-world use, with consensus reached through advisory board discussion and questioning.

2.2 Modelling Survival for the Regimen and Ipilimumab Without CheckMate 067 Overall Survival (OS) Data

At the time of submission to NICE, when OS data were not available, CheckMate 067 trial data were used to estimate pre-progression survival (PrePS) and time to progression (TTP). PFS was estimated from TTP and PrePS, up to the longest follow-up available at the time from CheckMate 067 (1.5 years).

In the absence of head-to-head OS data, patient-level data from MDX010-20 and CheckMate 066 (for ipilimumab and nivolumab monotherapy, respectively) were used to estimate post-progression survival (PPS) for all immunotherapies.

Transition probabilities were derived from these curves and used to estimate the proportion of patients in the progression-free, progressed and dead states over time.
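One common way to derive a per-cycle transition probability from a survival curve is to take the conditional probability of the event within the cycle, 1 − S(t + 1)/S(t). The sketch below assumes that convention and a curve already evaluated at cycle boundaries; the source does not state the exact formula used, and the function name is hypothetical.

```python
def transition_probabilities(survival):
    """Convert survival proportions at successive cycle boundaries
    into per-cycle event probabilities: 1 - S(t+1) / S(t)."""
    probs = []
    for s_now, s_next in zip(survival, survival[1:]):
        if s_now <= 0.0:
            # Nobody is left at risk; treat the event as certain.
            probs.append(1.0)
        else:
            probs.append(1.0 - s_next / s_now)
    return probs
```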

For the Regimen and ipilimumab, patient-level trial data were used to fit parametric curves adjusted for the covariates selected based upon the Korn meta-analysis, which analysed factors affecting prognosis within advanced melanoma treated with palliative chemotherapy [12]. This list was validated at the UK clinician advisory board:

  • Treatment (only included for TTP and PrePS)

  • Trial (MDX010-20 or CheckMate 066, only included for PPS)

  • Baseline Eastern Cooperative Oncology Group score

  • Lactate dehydrogenase

  • Metastasis stage

  • History of brain metastases

  • Age group

  • Sex

  • Subsequent ipilimumab (only included for PPS).

2.2.1 Time to Progression

Because the first tumour assessments in CheckMate 067 were performed at Week 12 (a trial protocol effect), Kaplan–Meier (KM) data were used for the first 84 days of TTP. This cut-off was chosen based on inspection of the KM curve and the timing of patients’ first clinic visits within the included clinical trial.

In accordance with NICE guidance [13], which considers the visual fit of the parametric curves compared with the KM curves, the clinical plausibility of the extrapolation, and comparison of the Akaike information criterion (AIC) and Bayesian information criterion (BIC), the log-normal curve was chosen from the six parametric curves fitted to TTP and used for extrapolation over the entire model time horizon.
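For reference, the two information criteria can be computed from a fitted curve's maximised log-likelihood as AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, where k is the number of parameters and n the sample size. The sketch below uses hypothetical function names and takes the log-likelihoods as inputs; it ranks candidate curves on the statistical criterion only, and does not replace the visual fit and clinical plausibility checks described above.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: lower is better.
    n_obs is the sample size used in the fit (an assumption here;
    some survival analyses use the number of events instead)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def rank_by_aic(fits):
    """fits maps a curve name to (log_likelihood, n_params).
    Returns curve names ordered from best (lowest AIC) to worst."""
    return sorted(fits, key=lambda name: aic(*fits[name]))
```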

2.2.2 Pre-Progression Survival

For PrePS, none of the six curves provided a good visual fit to the data. Therefore, the KM data were used directly to model PrePS, up to the latest follow-up (1.5 years). It was assumed that between 1.5 and 3 years, if a patient had not progressed, their survival was equivalent to that of the general population. This is because in the data-cut with longest follow-up of 1.5 years, no deaths were observed on the Regimen arm after 271 days or on the ipilimumab arm after 342 days.

2.2.3 Post-Progression Survival

A log-logistic curve was chosen for the base-case PPS curve, based on the NICE Decision Support Unit guidance [13]. It was conservatively assumed that PPS is the same for all immunotherapies, including ipilimumab and the Regimen, after controlling for patient characteristics.

2.2.4 Overall Survival

OS between Year 1.5 and Year 3 was based on PrePS (as in Sect. 2.2.2) and PPS from MDX010-20 and CheckMate 066. From Year 3 onwards, long-term OS for the Regimen and ipilimumab was based on pooled ipilimumab data from Schadendorf et al. [14]. The pooled analysis showed a plateau in the OS curve beginning around Year 3 and continuing to the end of the 10-year follow-up [14]. This was also assumed to be applicable to long-term OS for the Regimen due to the similarity of the mechanism of action. The Gompertz curve was selected for the base case, following NICE guidance [13]. Life tables for England were used as a minimum threshold for cycle mortality in the model [15].
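Using life tables as a minimum threshold means that, in any cycle, the modelled probability of death is never allowed to fall below that of the general population. A minimal sketch of that floor, with hypothetical function names and assuming both inputs are per-cycle probabilities aligned by cycle:

```python
def apply_mortality_floor(model_probs, background_probs):
    """Per cycle, use the larger of the modelled and the
    general-population (life-table) probability of death."""
    return [max(m, b) for m, b in zip(model_probs, background_probs)]


def survival_from_probs(death_probs):
    """Rebuild the survival curve implied by per-cycle death probabilities."""
    survival = [1.0]
    for p in death_probs:
        survival.append(survival[-1] * (1.0 - p))
    return survival
```

The floor matters mainly in the plateau phase of the extrapolated curve, where the fitted hazard can approach zero.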

2.3 Modelling Survival for the Regimen and Ipilimumab with CheckMate 067 OS Data

When OS data became available, patient-level data from CheckMate 067 were used to fit parametric curves for PFS (from a 36-month data-cut) and OS for the Regimen and ipilimumab. Due to the protocol effect mentioned previously, KM data were used for the PFS curves before 91 days, a cut-off chosen based on inspection of the KM curve and the timing of patients’ first clinic visit in the trial. The cut-off was originally 84 days, but the updated data-cut showed more events between 84 and 91 days, which made 91 days the more appropriate choice. After 91 days, Gompertz curves were fitted to the PFS data; dependent curves (with treatment as a covariate) were used because the proportional hazards assumption held according to the log-cumulative hazard plots and the proportional hazards test.

For OS, considering the visual fit of the parametric curves compared with the KM curves, the clinical plausibility of the extrapolation, and comparison of the AIC and BIC, a Gompertz curve was used for the Regimen and a log-normal curve for ipilimumab; the curves were fitted independently for each treatment because the proportional hazards assumption did not hold. These curves were used for the first 2 years, and data from Schadendorf et al. [14] were used for the long term (with a Gompertz curve fit, as in the model without OS data). Transition probabilities were derived from these curves and used to estimate the proportion of patients in the progression-free, progressed and dead states over time.
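A Gompertz curve with hazard h(t) = rate × exp(shape × t) has survival S(t) = exp(−(rate/shape)(exp(shape × t) − 1)); with a negative shape parameter the curve flattens towards exp(rate/shape), which is why it can represent a survival plateau. The sketch below shows this function and one possible way of joining KM data to a parametric tail at a cut-off. The joining rule (rescaling the tail to the KM value at the cut-off) and all names are assumptions for illustration, as the source does not describe how the two pieces were combined.

```python
import math


def gompertz_survival(t, shape, rate):
    """Gompertz survival with hazard rate * exp(shape * t)."""
    if shape == 0:
        return math.exp(-rate * t)  # reduces to an exponential curve
    return math.exp(-(rate / shape) * (math.exp(shape * t) - 1.0))


def piecewise_survival(t, cutoff, km_lookup, shape, rate):
    """Use KM data up to the cut-off, then a Gompertz tail rescaled
    so that the curve is continuous at the cut-off.
    km_lookup is a hypothetical callable returning the KM estimate at t."""
    if t <= cutoff:
        return km_lookup(t)
    conditional = (gompertz_survival(t, shape, rate)
                   / gompertz_survival(cutoff, shape, rate))
    return km_lookup(cutoff) * conditional
```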

KM data and fitted curves for PFS are presented in Figs. 3 and 5, and those for OS are presented in Figs. 4 and 6. Curve parameters are detailed with key model inputs in Table 1. The differences between the curves of the models with and without OS data are due to the different curve fits, methodologies and data-cuts taken from CheckMate 067.

Fig. 3

Short-term progression-free survival curve fits and Kaplan–Meier curves with and without CheckMate 067 overall survival. 36-month data-cut used for the Regimen and ipilimumab. Ipi ipilimumab, KM Kaplan–Meier, OS overall survival, w/OS model with CheckMate 067 overall survival data, w/o OS model without CheckMate 067 overall survival data

Fig. 4

Short-term overall survival curve fits with and without CheckMate 067 overall survival. 36-month data-cut used for the Regimen and ipilimumab. Ipi ipilimumab, KM Kaplan–Meier, OS overall survival, w/OS model with CheckMate 067 overall survival data, w/o OS model without CheckMate 067 overall survival data

Fig. 5

Long-term progression-free survival curves with and without CheckMate 067 overall survival. Ipi ipilimumab, OS overall survival, w/OS model with CheckMate 067 overall survival data, w/o OS model without CheckMate 067 overall survival data

Fig. 6

Long-term overall survival curves with and without CheckMate 067 overall survival. Ipi ipilimumab, OS overall survival, w/OS model with CheckMate 067 overall survival data, w/o OS model without CheckMate 067 overall survival data

Table 1 Summary of key inputs

2.4 Time on Treatment

For both models, time on treatment was taken from patient-level data from CheckMate 067, and a log-logistic curve was used for extrapolation, selected as per NICE guidance using the process detailed previously.

A maximum treatment duration of 2 years was assumed in the models for nivolumab. Clinical consensus from UK clinicians is that stopping nivolumab monotherapy treatment at 2 years is an acceptable recommendation for the small minority of patients who may still be receiving treatment at this time. This treatment continuation rule was tested in a range of scenario analyses, including the percentage of ‘on treatment’ patients discontinuing treatment at 2 years, and the maximum treatment duration [Electronic Supplementary Material (ESM) Table S1].

As data from the CheckMate 067 trial indicate no loss of response upon early discontinuation of therapy for many patients [7], it is assumed that treatment effect is maintained when patients discontinue PD-1 inhibitors. In CheckMate 067, only 24.3% of patients were still on nivolumab within the Regimen arm at 18 months, and of those that discontinued due to AEs, 76.2% of patients continued to exhibit response to the Regimen [16].

Ipilimumab patients receive four doses, as specified in the licensed indication, with 100, 95, 85 and 70% of patients receiving Dose 1, 2, 3 and 4, respectively (taken from CheckMate 067).
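As a worked example of the arithmetic, the dose uptake figures above imply an average of about 3.5 ipilimumab doses per patient (1.00 + 0.95 + 0.85 + 0.70). The sketch below uses hypothetical function names, and `cost_per_dose` is a placeholder argument rather than a value from the model.

```python
def expected_doses(uptake):
    """Average number of doses per patient, given the proportion
    of patients receiving each scheduled dose."""
    return sum(uptake)


def expected_drug_cost(uptake, cost_per_dose):
    """Average drug cost per patient; cost_per_dose is a hypothetical input."""
    return expected_doses(uptake) * cost_per_dose


# Proportions receiving Dose 1 to Dose 4, as reported from CheckMate 067
ipilimumab_uptake = [1.00, 0.95, 0.85, 0.70]
print(round(expected_doses(ipilimumab_uptake), 2))
```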

Drug costs were calculated using the method of moments [17]. Assuming a log-normal distribution for body weight, the proportion of patients requiring each possible number of vials was calculated based on the distribution derived from the individual patient weights. This is an accurate method of accounting for wastage, assuming that no vial sharing occurs.
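The calculation can be sketched as follows, assuming a single vial size, a weight-based dose, and log-normal parameters matched to the sample mean and standard deviation of body weight. The function names and the `max_vials` cap are illustrative assumptions; no trial values are used.

```python
import math


def lognormal_params(mean, sd):
    """Method of moments: log-scale mu and sigma that reproduce
    the given mean and standard deviation of body weight."""
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    return math.log(mean) - sigma2 / 2.0, math.sqrt(sigma2)


def lognormal_cdf(x, mu, sigma):
    """P(weight <= x) under the fitted log-normal distribution."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def vial_distribution(mean_kg, sd_kg, dose_mg_per_kg, vial_mg, max_vials=20):
    """Proportion of patients needing exactly n whole vials per dose
    (no vial sharing). Any remaining tail is assigned to max_vials."""
    mu, sigma = lognormal_params(mean_kg, sd_kg)
    proportions = {}
    previous = 0.0
    for n in range(1, max_vials + 1):
        heaviest_weight = n * vial_mg / dose_mg_per_kg  # covered by n vials
        cumulative = (lognormal_cdf(heaviest_weight, mu, sigma)
                      if n < max_vials else 1.0)
        proportions[n] = cumulative - previous
        previous = cumulative
    return proportions


def expected_vials(proportions):
    """Average number of vials per dose, including wastage."""
    return sum(n * p for n, p in proportions.items())
```

Wastage then follows as the gap between `expected_vials(...) * vial_mg` and the mean dose actually required (`mean_kg * dose_mg_per_kg`).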

For nivolumab, vial sizes of 40 and 100 mg are available, resulting in 10% vial wastage. However, only a vial size of 50 mg was available for pembrolizumab, resulting in 16% vial wastage.

2.5 Health-Related Quality of Life

EQ-5D-3L data collected in CheckMate 067 were used in the base case, with utilities calculated using the UK tariff. Utilities for the progression-free health state were treatment dependent, as AE disutilities were not captured elsewhere, whereas utilities for the progressed health state were treatment independent.

The key element we aimed to characterise is how progression status impacted utility over time, controlling for baseline utility and treatment, as this is one of the key drivers of the economic model. Progression status was defined using a simple time-varying indicator of progression and a more granular definition, taking into account response status at 6 months combined with the time-varying indicator.

Results showed that including the combined response status/progression status variables (in place of a simpler post-progression variable) did not further improve the model. There was also a large reduction in the number of visits/patients available for that analysis due to censoring of patients with non-evaluable response status at 6 months. The final model selected includes progression status, baseline utility value and treatment arm; age and sex were not significant in the statistical models. In this final model, all included parameters are significant.

The treatment arms for the Regimen and ipilimumab had utility decrements calculated from the statistical model fitted from the CheckMate 067 data.

2.6 Resource Use and Drug Costs

The frequency of drug administration was taken from CheckMate 067 for ipilimumab (3 mg/kg every 3 weeks for 12 weeks) and nivolumab (1 mg/kg every 3 weeks for the first 12 weeks, then 3 mg/kg every 2 weeks) [2, 3]. Health-state costs were split by time since treatment initiation (Year 1, 2, 3, 4 and 5+) and a palliative care period covering the last 12 weeks before death. These costs included outpatient, inpatient, terminal care, home care, laboratory tests, radiological examinations and pain control costs. Standard UK cost sources were used to cost AEs, resource use and administration. The cost year used was 2015, and the latest available costs from the UK National Health Service (NHS) reference costs and Personal Social Services Research Unit (PSSRU) were used [18, 19].

The proportion of patients on each subsequent treatment was taken from CheckMate 067.

2.7 Adverse Events

Drug-related AEs were captured within the model and classified as either any-grade endocrine disorder, Grade 2 or higher diarrhoea, or Grade 3 or higher other drug-related AEs, based on clinical expert opinion as to which AEs are likely to have a large impact on outcomes. There was no restriction on the minimum percentage of patients experiencing an AE, and patient-level AE data from CheckMate 067 were used for the Regimen and ipilimumab arms. CheckMate 067 trial data used in AE calculations included the proportion of patients experiencing AEs and the number of hospitalisation days. Total AE costs are provided in Table 1. These are applied as a total one-off cost at the start of the model.

2.8 Sensitivity Analysis

Deterministic and probabilistic sensitivity analyses were included in the model. One-way sensitivity analysis involved calculating the model result after varying each parameter individually to its lower and upper bounds, and probabilistic sensitivity analysis involved running the model 1000 times, each time drawing a random value for every parameter from its defined distribution. The distributions used in the model for these analyses are included in Table 1.

Scenario analyses were also performed, testing the assumptions around specific parameters such as survival distributions and dosing calculations.
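The two procedures can be sketched generically as below. Here `model` stands for any function that maps a dictionary of parameter values to a result, and the samplers and bounds are supplied by the caller; all names are hypothetical, and the actual distributions are those listed in Table 1.

```python
import random


def one_way_sensitivity(model, base_params, bounds):
    """Vary one parameter at a time to its lower and upper bound,
    holding all others at their base-case values."""
    results = {}
    for name, (low, high) in bounds.items():
        results[name] = (model({**base_params, name: low}),
                         model({**base_params, name: high}))
    return results


def probabilistic_sensitivity(model, samplers, n_runs=1000, seed=0):
    """Run the model n_runs times, drawing every parameter from its
    distribution on each run. samplers maps a parameter name to a
    function that takes a random.Random instance and returns a draw."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_runs):
        params = {name: draw(rng) for name, draw in samplers.items()}
        results.append(model(params))
    return results
```

Fixing the seed makes a probabilistic run reproducible, which helps when comparing scenarios.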

3 Results

At list price, both models produced an incremental cost-effectiveness ratio (ICER) of approximately £4500 for the Regimen versus ipilimumab monotherapy (Table 2).

Table 2 Base-case incremental cost-effectiveness results

Life-years (LYs) and quality-adjusted life-years (QALYs) gained were greater in the model with CheckMate 067 OS data than in the model without OS data (Table 2). This was because the longer PFS and OS follow-up data showed that both the Regimen and ipilimumab had greater effectiveness than that predicted in the model using the earlier data-cut without CheckMate 067 OS data (the first signs of plateau can be seen within the longer data-cut). Although the QALYs and LYs were greater in the model with OS data, the difference (increment) between treatments was still comparable to that in the model without OS data.
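For readers less familiar with the summary measures, the ICER is the incremental cost divided by the incremental effect (QALYs in the NICE framework), and the net monetary benefit used later in this section is the willingness-to-pay threshold multiplied by the incremental QALYs, minus the incremental cost. A minimal sketch with hypothetical function names:

```python
def icer(cost_new, cost_old, qaly_new, qaly_old):
    """Incremental cost-effectiveness ratio: extra cost per extra QALY."""
    d_cost = cost_new - cost_old
    d_qaly = qaly_new - qaly_old
    if d_qaly == 0:
        raise ValueError("ICER is undefined when incremental QALYs are zero")
    return d_cost / d_qaly


def net_monetary_benefit(d_cost, d_qaly, threshold):
    """Incremental net monetary benefit at a willingness-to-pay threshold.
    A positive value means the new treatment is cost effective."""
    return threshold * d_qaly - d_cost
```

Because the ICER depends only on the increments, two models can agree on it even when their absolute LY and QALY totals differ, as observed here.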

Overall costs for each treatment between models were very similar. Subsequent treatment costs were slightly higher in the model without OS as more patients were moving into the progressed state.

One-way sensitivity analysis showed that the key drivers of both models were the parameters used to inform the OS curves (ESM Figures S1 and S2).

Probabilistic sensitivity analysis results were consistent with deterministic results, showing that the results were robust to the modelled uncertainty (ESM Figures S3 and S4). Conducting probabilistic sensitivity analysis for only the parameters relating to OS demonstrated that these parameters contributed 96.3% of the overall parameter uncertainty of the net monetary benefit in the model with CheckMate 067 OS [20]. For the model without CheckMate 067 OS, OS inputs (parameters for PPS curves and for long-term OS curves) contributed 53.7% of the overall uncertainty around the net monetary benefit. In this model, survival parameters contribute much less uncertainty because, with PPS assumed equal, the post-progression difference in OS between the arms does not vary.

Cost-effectiveness acceptability curves showed the Regimen to be the most likely to be cost effective at willingness-to-pay thresholds of £30,000 and £50,000 (ESM Figures S5 and S6).
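For a two-treatment comparison, a point on a cost-effectiveness acceptability curve is simply the share of probabilistic runs in which the incremental net monetary benefit is positive at a given threshold. The sketch below assumes the probabilistic output is available as (incremental cost, incremental QALY) pairs; the function name is hypothetical.

```python
def acceptability_curve(increments, thresholds):
    """increments: list of (incremental_cost, incremental_qaly) pairs,
    one per probabilistic run. Returns, for each threshold, the
    proportion of runs in which the new treatment is cost effective."""
    curve = {}
    for wtp in thresholds:
        n_cost_effective = sum(
            1 for d_cost, d_qaly in increments if wtp * d_qaly - d_cost > 0
        )
        curve[wtp] = n_cost_effective / len(increments)
    return curve
```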

Scenario analysis showed that the results were robust to changes in the treatment continuation rule, with all scenarios showing cost effectiveness for the Regimen versus ipilimumab in the model with CheckMate 067 OS data. Only in the absence of the treatment continuation rule (i.e. patients could continue treatment for up to 40 years) was the Regimen not cost effective versus ipilimumab in the model without CheckMate 067 OS at a £30,000 willingness-to-pay threshold (although cost effectiveness was observed at a £50,000 threshold). For all other scenarios, including model fits, time horizon and utilities, the Regimen remained cost effective across both models (ESM Tables S1 and S2).

4 Discussion

Our work shows that, in this case, assuming equal PPS across immunotherapies provides a relatively accurate estimate for ICER calculations in the absence of OS data (net benefit assuming a willingness-to-pay threshold of £30,000 was within 12% of the model with OS data), although absolute estimates were far from the clinical trial projections.

From our analyses, using two different modelling methods, we have identified that the Regimen is cost effective versus ipilimumab in the UK. The cost-effectiveness argument is driven by increased OS compared with current care, with relatively short treatment duration, and a reduction in the requirement for subsequent treatment. Using the same models described earlier, the Regimen was also shown to be cost effective against ipilimumab, pembrolizumab and BRAF inhibitors as part of technology appraisals to NICE and the Scottish Medicines Consortium, leading to reimbursement in their corresponding healthcare systems [9, 21].

When OS data were unavailable, we made the assumption of equal PPS as the different treatments work by activating the immune system. However, this does have its limitations as the treatments will not have identical mechanisms of action or treatment effects. This may indicate that this technique should only be used when all treatments have reasonably similar mechanisms of action. Other possibilities to address the lack of mature data would include response-based modelling, using a patient’s response status to predict their subsequent survival and resource use [22].

The mechanism behind immunotherapies is now well understood, but the long-term effects are still being investigated. When modelling long-term survival, using historic data from the same disease area to establish surrogate relationships is a well-established technique, but this is difficult with immuno-oncology therapies due to a lack of data. Being able to model long-term survival with the Schadendorf et al. [14] data allowed us to replicate the expected long-term survival of the Regimen; as ipilimumab is part of the therapy, assuming a long-term survival profile similar to that of ipilimumab patients is reasonable. Consequently, this model reflected the expectation that a substantial number of patients will achieve considerable long-term survival. For models in other disease areas, if no OS data are available but historic data are, assuming equal PPS for a treatment with a similar mechanism of action is a viable option. In this study, we assumed that the transition from progressed to death was equal for both arms (data from MDX010-20 and CheckMate 066) and informed long-term OS for each arm using the same data source (Schadendorf et al. [14]). However, when comparators have different mechanisms of action, and no supporting long-term data, estimating long-term survival and cost effectiveness would be considerably more challenging.

The key model assumptions and structures have been validated by clinical and health economics experts, and model results accurately reflect those seen in the literature and in clinical trials over the last 18 months (Table 3). Longer-term model results differ slightly: for the Regimen, the model with OS data underestimates 3-year OS compared with that seen in the literature (57% versus 68%, respectively) [5]. For ipilimumab, the model result overestimates OS when compared with the Schadendorf et al. [14] data. This difference may be due to a greater benefit seen in the short term in the CheckMate 067 data than in the Schadendorf et al. [14] data (because of the greater availability of effective subsequent treatments), but the long-term benefit from the Schadendorf et al. [14] data is still applied to the model arm after 2 years.

Table 3 Model validation with clinical trial results

In our case study, we found that assuming equal PPS underestimated total LYs and QALYs but provided reasonable estimates of total costs, incremental costs, LYs and QALYs. Although we cannot determine whether the differences in model results are due to different modelling approaches or the different data used in each model, we observe that most of the efficacy results from the model with OS more closely match the trial results (Table 3). There is little difference in costs between the two models, which may be down to the majority of costs being accrued in the first 2 years, where there is more overlap in the data used in the models.

Modelling the cost effectiveness of treatments presents considerable challenges and limitations. The extended duration of benefit provided by immunotherapies leads to uncertainty in appropriate treatment duration; this greatly influences the overall cost. Often, treatments show a large benefit for PFS but lack mature OS data. This lack of OS data is set to increasingly become an issue in reimbursement, with immunotherapies providing a step-change in the treatment and survival of melanoma (and in other diseases), with an uncertain duration of benefit. These challenges are also likely to become apparent when valuing the benefit of newer gene therapies. Although it may lead to conservative estimates of OS, our analysis shows that for comparators with similar mechanisms of action, assuming equal PPS or using data from earlier data-cuts can go some way to overcoming this limitation, providing comparable ICERs in the absence of OS data.