Background

Mantle cell lymphoma (MCL) is a rare, incurable and aggressive sub-type of non-Hodgkin’s lymphoma (NHL), accounting for approximately 6 % of all NHL cases [1]. The incidence of MCL in the UK is 0.9 per 100,000 [1]. The general pattern of disease progression in MCL is one of relapse and remission, with each relapse becoming more difficult to treat, and the depth and durability of any subsequent remissions achieved invariably inferior to those achieved with first-line treatment [2–6].

In patients first presenting with aggressive disease requiring treatment, the initial treatment decision is whether patients are suitable for high-intensity induction therapy, to be followed by haematopoietic stem cell transplantation (HSCT). There are no strict criteria against which patients are assessed; rather, haematologists will assess eligibility on a patient-by-patient basis, taking into account factors such as patient age, performance status, disease prognosis, disease severity, co-morbidities and clinical risk [2, 5–10].

For patients who are not eligible for high-intensity induction therapy, that is, those for whom HSCT is unsuitable, there had been no licensed induction therapy regimens prior to bortezomib. Rituximab, cyclophosphamide, doxorubicin, vincristine and prednisone (R-CHOP) became the preferred first-line induction therapy in UK clinics because the large-scale European MCL Elderly trial [11] demonstrated a survival benefit for R-CHOP when compared with rituximab in combination with fludarabine and cyclophosphamide (R-FC). Alternative rituximab-based chemotherapy induction regimens are also administered in the first-line setting, but usually only for the frailest of patients considered unsuitable for R-CHOP therapy; while alternatives are considered to be associated with lower toxicity, the evidence base supporting their use is considerably weaker [12]. Median progression-free survival (PFS) associated with chemotherapy is less than 2 years, and median overall survival (OS) is less than 5 years [10, 13–19].

Bortezomib is the first product to be licensed for the treatment of patients with previously untreated MCL for whom HSCT is unsuitable. Bortezomib is administered in combination with the rituximab, cyclophosphamide, doxorubicin, prednisone backbone familiar to clinicians as part of the R-CHOP regimen. A randomised, open-label, multicentre Phase III study (LYM-3002) comparing bortezomib, rituximab, cyclophosphamide, doxorubicin and prednisolone (VR-CAP) to R-CHOP showed a significant improvement in PFS (24.7 versus 14.4 months; hazard ratio [HR] = 0.63, p < 0.001) based on the primary assessment of PFS by the independent review committee (IRC) [20]. Duration of overall response for VR-CAP was more than double that of R-CHOP (median of 36.5 versus 15.1 months), resulting in an increase in the treatment-free interval (TFI) of approximately 20 months versus R-CHOP (median of 40.6 versus 20.5 months; HR = 0.50, p < 0.001) [20].

There have been no previous technology appraisals by the National Institute for Health and Care Excellence (NICE) within MCL; other therapies that are frequently used, such as bendamustine and temsirolimus, did not go through the UK health technology assessment (HTA) process due to lack of marketing authorisation approval and manufacturer non-submission, respectively. To gain NICE recommendation for VR-CAP, the cost effectiveness of VR-CAP had to be assessed over the long term and beyond the duration of clinical trial follow-up. As median survival for VR-CAP had not been reached in the LYM-3002 trial, it was challenging to provide realistic and robust estimates of long-term OS. This challenge is common in UK HTAs and will become more pronounced as regulatory and HTA bodies come under pressure to provide earlier access to promising drugs.

The objective of this study was to assess, in a UK setting, the cost effectiveness of VR-CAP compared with R-CHOP, which is currently seen as standard first-line treatment for patients with MCL.

Methods

Model structure

The cost-effectiveness model was developed as a Markov model with five health states, representing pre- and post-progression from first- and second-line treatment, as well as death, as presented in Fig. 1. A hypothetical cohort of patients entered the model at the start of first-line treatment for MCL, and their progression through the disease, including second-line treatment, was followed until death. The model used a cycle length of 1 week, at the end of which patients could move between health states; this cycle length was selected to give sufficient granularity to capture short-term changes in progression status. A lifetime horizon of 20 years was used in line with UK guidance; ≥94 % of patients were modelled to have died within this time horizon [21]. The model used the perspective of the UK National Health Service, and a discount rate of 3.5 % per year for costs and health outcomes as per UK guidance [21].
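The cohort mechanics described above can be illustrated with a minimal sketch. It assumes constant weekly transition probabilities, whereas the published model derives time-dependent transitions from fitted survival curves; the state labels and every number in the matrix are hypothetical placeholders, not model inputs.

```python
# Five health states, mirroring the structure described in the text.
# The labels are illustrative shorthand, not the authors' notation.
STATES = ["PF_1L", "PD_1L", "PF_2L", "PD_2L", "Dead"]

# Hypothetical constant weekly transition probabilities (each row sums to 1).
P = [
    [0.985, 0.010, 0.000, 0.000, 0.005],
    [0.000, 0.940, 0.050, 0.000, 0.010],
    [0.000, 0.000, 0.920, 0.070, 0.010],
    [0.000, 0.000, 0.000, 0.985, 0.015],
    [0.000, 0.000, 0.000, 0.000, 1.000],
]


def run_cohort(matrix, n_cycles, start=(1.0, 0.0, 0.0, 0.0, 0.0)):
    """Return the state occupancy of the cohort at the start of each cycle."""
    trace = [list(start)]
    n = len(matrix)
    for _ in range(n_cycles):
        prev = trace[-1]
        trace.append([sum(prev[i] * matrix[i][j] for i in range(n)) for j in range(n)])
    return trace


def discount_factor(cycle, weeks_per_year=52, annual_rate=0.035):
    """Discount factor for a weekly cycle at a 3.5 % annual rate."""
    return 1.0 / (1.0 + annual_rate) ** (cycle / weeks_per_year)


# 20-year horizon with 1-week cycles.
trace = run_cohort(P, n_cycles=20 * 52)

# Discounted life years from start-of-cycle occupancy (no half-cycle correction).
discounted_life_years = sum(
    (1.0 - row[-1]) * discount_factor(t) / 52 for t, row in enumerate(trace[:-1])
)
```

Replacing the constant matrix with cycle-specific probabilities derived from the survival functions would bring the sketch closer to the approach described in the paper.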

Fig. 1

Model diagram. PFS, progression-free survival; PPS, post-progression survival; PrePS, pre-progression survival; TFI, treatment-free interval. 1. Modelled using survival function fitted to PFS Kaplan–Meier data; 2. Modelled using survival function fitted to TFI Kaplan–Meier data; 3. Modelled using average duration of second-line treatment; 4. Modelled using survival function fitted to PrePS Kaplan–Meier curve plus general population background mortality data; 5. Modelled using survival function fitted to PPS Kaplan–Meier curve

Population

The population included in the model was the intention-to-treat population from the LYM-3002 trial, the only trial investigating the comparative effectiveness of VR-CAP and R-CHOP in MCL (this was confirmed in a systematic literature review). A scenario analysis was performed that included only patients clinically ineligible for HSCT, as LYM-3002 also included patients who were ineligible for non-clinical reasons (e.g. HSCT was not available or was refused by the patient). Baseline patient characteristics for both populations are presented in Table 1.

Table 1 Baseline characteristics of all patients versus non-HSCT eligible patients only in the LYM-3002 trial

Transitions between health states

Transitions between health states in the model were based on LYM-3002 data. In addition to PFS by IRC, which was the primary outcome, PFS was also assessed by the investigator and in an alternative IRC assessment. In the primary IRC assessment, patients were classified as progressed when the disease seemed to have worsened based on the International Workshop Response Criteria, on one computerised tomography scan. In the alternative IRC assessment, this could be revised depending on whether a lesion was assessed as resolved or persisting at subsequent time points by the IRC. The alternative IRC assessment of PFS was considered to more closely reflect clinical practice, where more than one scan would be used to assess progression [22]. Scenario analyses were performed to test the impact of the different assessment methods on the model outcomes. To extrapolate beyond the duration of the clinical trial, six different survival functions (exponential, gamma, Gompertz, log-logistic, log-normal and Weibull) were fitted to these PFS trial data, following NICE Decision Support Unit guidance [23]. The choice between survival models was based upon statistical goodness of fit measured using the Akaike information criterion and the Bayesian information criterion (Table 2), visual fit to the trial Kaplan–Meier data, and the validity of the projected survival estimates as assessed by practising haematologists. The log-logistic model was seen as the most reflective of outcomes observed in clinical practice, and this was therefore used in the model base case (Fig. 2).
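As an illustration of how a fitted curve feeds a weekly-cycle model, the sketch below evaluates a log-logistic survival function, converts it into a per-cycle progression probability, and computes the two information criteria from a log-likelihood. The scale/shape parametrisation shown is one common convention and is assumed here; it may differ from the one reported in Table 2, and all function names are illustrative.

```python
import math


def loglogistic_survival(t, alpha, beta):
    """S(t) = 1 / (1 + (t / alpha)^beta); alpha is the median, beta the shape."""
    return 1.0 / (1.0 + (t / alpha) ** beta)


def cycle_progression_prob(t, alpha, beta, cycle=1.0):
    """Probability of an event in (t, t + cycle], conditional on being event-free at t."""
    return 1.0 - loglogistic_survival(t + cycle, alpha, beta) / loglogistic_survival(
        t, alpha, beta
    )


def aic(log_lik, k):
    """Akaike information criterion for a model with k parameters."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * log_lik
```

Lower AIC and BIC values indicate better statistical fit, but, as the text notes, the final choice also rested on visual fit and clinical plausibility.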

Table 2 Goodness of fit and model parameters for the PFS, PrePS and PPS curves
Fig. 2

Log-logistic PFS curves used in model base case. KM, Kaplan–Meier; PFS, progression-free survival; R-CHOP, rituximab with cyclophosphamide, doxorubicin, vincristine and prednisolone; VR-CAP, bortezomib with rituximab, cyclophosphamide, doxorubicin and prednisolone

Because of the immaturity of OS data, survival functions were stratified by progression status at the end of the trial (pre-progression survival [PrePS] and post-progression survival [PPS]). For non-progressed patients this was also stratified by trial arm. PPS was assumed equal across model arms. This was justified by the observation that PPS was similar for the VR-CAP and R-CHOP arms in the LYM-3002 trial [24], and by the expectation that different prior treatments would not affect PPS [12]. Finally, two studies identified in a literature review of surrogate endpoints in MCL also indicated that PFS may be an appropriate surrogate for OS [25, 26].

Because the long-term projections of PrePS based on extrapolation were quite high, presumably due to the relative immaturity of data, it was decided that non-disease-specific mortality, based on age and gender, should be added to these curves to better capture long-term survival. This was included and based upon UK life tables [27]. For PrePS and PPS, the exponential curves were judged as most reflective of outcomes observed in UK clinical practice (Fig. 3) [12].
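One way to add background mortality to a disease-specific exponential curve is sketched below. It assumes the two causes of death act independently within each weekly cycle, which is an assumption of this sketch and is not stated in the paper; the function names are illustrative, and the annual probability would in practice be read from an age- and gender-specific life table.

```python
import math


def weekly_prob_from_annual(q_annual):
    """Convert an annual life-table death probability into a weekly probability."""
    return 1.0 - (1.0 - q_annual) ** (1.0 / 52.0)


def combined_weekly_death_prob(lam_weekly, q_annual):
    """Weekly death probability from disease (exponential hazard) plus background causes.

    Assumes the two risks are independent within the cycle.
    """
    p_disease = 1.0 - math.exp(-lam_weekly)
    p_background = weekly_prob_from_annual(q_annual)
    return 1.0 - (1.0 - p_disease) * (1.0 - p_background)
```

Because the background probability rises with age, the combined curve bends downwards over time even when the disease-specific hazard is constant.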

Fig. 3

Exponential disease-specific OS curves used in model base case. KM, Kaplan–Meier; OS, overall survival; PrePS, pre-progression survival; PPS, post-progression survival; R-CHOP, rituximab with cyclophosphamide, doxorubicin, vincristine and prednisolone; VR-CAP, bortezomib with rituximab, cyclophosphamide, doxorubicin and prednisolone

Second-line treatment starts after a treatment-free interval modelled using exponential survival functions (Fig. 4). The distribution of patients over different treatments as well as average duration of treatment (used as a proxy for PFS from second-line treatment; 90 days for both arms) were based on LYM-3002.

Fig. 4

Exponential TFI curves used in model base case. KM, Kaplan–Meier; R-CHOP, rituximab with cyclophosphamide, doxorubicin, vincristine and prednisone; TFI, treatment-free interval; VR-CAP, bortezomib with rituximab, cyclophosphamide, doxorubicin and prednisone

Adverse events

All adverse events (AEs) that occurred at Grade 3 or higher in at least 5 % of either treatment group, as well as Grade 2 peripheral sensory neuropathy and Grade 3 or higher alopecia and sepsis, were included in the model, with rates as reported within the LYM-3002 trial. These were selected based on the expectation of an important impact on costs, utility or both. The annual rate for each AE was calculated from the number of events in the LYM-3002 trial and the total patient-years on treatment. This annual rate was then used to calculate the weekly probability of each AE.

In the model, red blood cell and platelet transfusions were administered to patients to treat AEs and to avoid having to decrease chemotherapy doses. Again, the weekly probability of requiring a transfusion was based on annual rates of administration in LYM-3002 [24].
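The rate-to-probability conversion used for AEs and transfusions can be sketched as follows. The paper does not state the exact formula, so the standard constant-rate (exponential) conversion is assumed here; the function names are illustrative.

```python
import math


def annual_rate(n_events, patient_years):
    """Events per patient-year on treatment."""
    return n_events / patient_years


def weekly_probability(rate_per_year, weeks_per_year=52):
    """Probability of at least one event in a week, assuming a constant rate."""
    return 1.0 - math.exp(-rate_per_year / weeks_per_year)


# Hypothetical example: 30 events observed over 120 patient-years.
p_week = weekly_probability(annual_rate(30, 120.0))
```

Dividing the annual rate by 52 directly would give almost the same value for rare events, but the exponential form keeps the probability below 1 for any rate.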

Medical resource use and costs

All costs were based on 2013/2014 UK prices. Patient-level data from the LYM-3002 trial were used to model the number of patients receiving first-line treatment per treatment cycle. Dose reductions were also applied as they were observed in the trial. Most of the drug doses included in the analysis were based on patient weight or body surface area. To calculate the number of vials required per administration, a distribution was fitted to the patient characteristics observed in the trial. This was then used to calculate the average cost per dose for all patients [28]. Administration costs were applied for all intravenous administrations; for oral drugs one administration visit was assumed at the start of treatment. The use of tests, scans and medical visits was based on advice of UK haematologists and was assumed to vary by treatment status and progression status (Table 3) [24]. Standard UK unit costs were used for treatment, administration, concomitant medication, medical resource use, adverse events and terminal care [29–33]. Treatment, administration and end-of-life costs are summarised in Table 4.
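The vial calculation can be illustrated with a short sketch. It assumes body surface area (BSA) is normally distributed and that unused drug in an opened vial is wasted; the paper does not specify which distribution was fitted, and the dose, vial size, BSA parameters and price below are hypothetical.

```python
from statistics import NormalDist


def expected_vials(dose_per_m2, vial_mg, bsa_mean, bsa_sd, max_vials=10):
    """Expected whole vials per administration over a normal BSA distribution.

    A patient needs n vials when (n - 1) * vial_mg < dose_per_m2 * BSA <= n * vial_mg.
    Probability mass beyond max_vials is ignored (negligible for realistic inputs).
    """
    bsa = NormalDist(bsa_mean, bsa_sd)
    expected = 0.0
    for n in range(1, max_vials + 1):
        lower = (n - 1) * vial_mg / dose_per_m2
        upper = n * vial_mg / dose_per_m2
        expected += n * (bsa.cdf(upper) - bsa.cdf(lower))
    return expected


def cost_per_dose(dose_per_m2, vial_mg, price_per_vial, bsa_mean, bsa_sd):
    """Average drug cost per administration, including wastage."""
    return expected_vials(dose_per_m2, vial_mg, bsa_mean, bsa_sd) * price_per_vial


# Hypothetical drug: 375 mg/m2 from 500 mg vials, BSA ~ Normal(1.8, 0.2) m2.
vials = expected_vials(375.0, 500.0, 1.8, 0.2)
```

Costing whole vials rather than milligrams captures wastage, which matters most when the typical dose falls just above a vial boundary.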

Table 3 Medical resource use for disease management by health state (Source of costs: NHS reference costs 2013–2014 [29])
Table 4 Cost inputs used in the model

Quality of life

Utility scores ranging from 0 to 1, with 0 representing death and 1 representing perfect health, defined the quality of life of patients. In the LYM-3002 trial, utility was measured using the EQ-5D at each cycle of treatment and at the end-of-treatment visit, which was performed 30 days after the last dose was administered. These data were used for the progression-free and progressed from first-line treatment health states. Patients who were progression-free from second-line treatment were assumed to have the same utility as patients progression-free from first-line treatment (Table 5). The economic literature was searched to identify utility values for the progressed from second-line treatment health state; values from aggressive NHL were selected as there were no utilities published specifically for MCL [34]. Decreases in utility for patients experiencing adverse events were also modelled using weekly probabilities of AEs and average durations of AEs from LYM-3002 trial data.
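A simple way to turn AE probabilities and durations into a utility decrement is sketched below. It assumes each AE contributes its disutility for its average duration, weighted by the weekly probability of occurrence; the paper does not give the exact formula, and the input values in the example are hypothetical.

```python
def weekly_ae_qaly_loss(aes):
    """Expected QALY loss accrued per week on treatment.

    aes: iterable of (weekly_probability, disutility, duration_days) tuples.
    """
    return sum(p * disutility * (days / 365.25) for p, disutility, days in aes)


# Hypothetical inputs: two AEs with different probabilities and durations.
loss = weekly_ae_qaly_loss([(0.010, 0.10, 14.0), (0.002, 0.20, 30.0)])
```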

Table 5 Utilities applied to the model

Outcomes

The outcome used in this cost-effectiveness analysis was the cost per quality-adjusted life year (QALY). QALYs were calculated by multiplying the time a patient spent in a specific health state by the utility value associated with that health state. Average lifetime QALYs per patient were calculated as well as average lifetime costs. These were used to calculate the incremental cost-effectiveness ratio (ICER).
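The outcome calculation can be sketched as follows, assuming a cohort trace (state occupancy per cycle) is already available. Discounting is omitted for brevity, and the function names and example numbers are illustrative only.

```python
def qalys(trace, utilities, cycle_years=1 / 52):
    """Undiscounted QALYs: occupancy x utility x cycle length, summed over cycles."""
    return sum(
        sum(occ * u for occ, u in zip(row, utilities)) * cycle_years for row in trace
    )


def icer(cost_new, qaly_new, cost_old, qaly_old):
    """Incremental cost per QALY gained for the new treatment versus the old one."""
    delta_qaly = qaly_new - qaly_old
    if delta_qaly == 0:
        raise ValueError("ICER is undefined when incremental QALYs are zero")
    return (cost_new - cost_old) / delta_qaly


# Hypothetical example: 20,000 extra cost for 1 extra QALY.
example_icer = icer(30000.0, 3.0, 10000.0, 2.0)
```

The ICER is then compared with the willingness-to-pay threshold to judge cost effectiveness.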

Sensitivity analysis

A series of one-way sensitivity analyses was performed in which one parameter at a time was set to the lower and then the upper limit of its 95 % confidence interval, holding all other parameters constant. This was done to evaluate the sensitivity of the model to individual model inputs. Additionally, a probabilistic sensitivity analysis (PSA) was performed in which all parameters were sampled simultaneously from their distributions. This was iterated 1,000 times, so that the uncertainty around the point estimate of the model outcome could be tested. Through empirical testing it was found that 1,000 iterations were sufficient to capture the uncertainty around the base case ICER.
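The PSA loop can be sketched as below. The toy model, its three parameters and their distributions are hypothetical stand-ins for the full model; only the structure (sample all parameters, rerun, repeat 1,000 times, count iterations below the threshold) reflects the description above.

```python
import random


def sample_params(rng):
    """Draw one hypothetical parameter set; distributions are placeholders."""
    return {
        "utility_pf": rng.betavariate(80, 20),             # mean 0.8
        "extra_pf_years": rng.gammavariate(16, 0.0625),    # mean 1.0 year
        "extra_drug_cost": rng.gammavariate(100, 150.0),   # mean 15,000
    }


def toy_model(params):
    """Hypothetical stand-in for the model: (incremental cost, incremental QALYs)."""
    d_qaly = params["extra_pf_years"] * params["utility_pf"]
    d_cost = params["extra_drug_cost"]
    return d_cost, d_qaly


def run_psa(n_iter=1000, threshold=30000.0, seed=1):
    """Return all iterations and the share that are cost-effective at the threshold."""
    rng = random.Random(seed)
    results = [toy_model(sample_params(rng)) for _ in range(n_iter)]
    # Cost-effective when net monetary benefit (threshold * dQALY - dCost) is positive.
    n_ce = sum(1 for d_cost, d_qaly in results if threshold * d_qaly - d_cost > 0)
    return results, n_ce / n_iter


results, prob_cost_effective = run_psa()
```

Plotting the pairs in `results` gives a cost-effectiveness plane, and using net monetary benefit avoids the ambiguity of negative ICERs when counting cost-effective iterations.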

Scenario analyses were also performed testing the assumptions around PFS, OS and utilities, by changing assumptions and using alternative data sources.

Validation

Because of the uncertainty in the extrapolation of OS data due to immaturity of the data, a comparison of model outcomes to long-term observational studies from inside and outside the UK was made; this showed that outcomes of the model were comparable with contemporaneous long-term datasets (Fig. 5). In comparison to available observational datasets, the survival in the LYM-3002 trial closely followed that reported by Abrahamsson et al. but was greater than that of the Surveillance, Epidemiology, and End Results Program (SEER) [35, 36]. Abrahamsson et al. was a recent publication (2014) that reported the OS of a European population (Swedish) and used a similar treatment to the LYM-3002 trial (rituximab-based chemotherapy). In contrast, data from SEER are older than data from the LYM-3002 trial (2004–2007 versus 2008–2011); the study was conducted in the US and included all MCL treatments (i.e. was likely to include treatments that were less efficacious than R-CHOP).

Fig. 5

Modelled OS compared to observational datasets. MCL, mantle cell lymphoma; OS, overall survival; SEER, Surveillance, Epidemiology, and End Results Program; R-CHOP, rituximab with cyclophosphamide, doxorubicin, vincristine and prednisone; VR-CAP, bortezomib with rituximab, cyclophosphamide, doxorubicin and prednisone

Results

As presented in Table 6, VR-CAP is associated with higher costs and greater efficacy compared to R-CHOP. The base case results demonstrate that VR-CAP is a cost-effective treatment at the conventional UK willingness-to-pay threshold of £20,000–£30,000 per QALY [21], with an ICER of £20,043 per QALY. The PSA indicated that there was a probability of 88.9 % that the ICER lies below the threshold of £30,000 per QALY. Figure 6 indicates that most uncertainty in the model comes from uncertainty in efficacy.

Table 6 Discounted base case model outcomes
Fig. 6

Cost-effectiveness plane from 1,000 PSA iterations. PSA, probabilistic sensitivity analysis; QALY, quality-adjusted life year; WTP, willingness to pay

Table 6 shows that VR-CAP patients have a longer PFS, whereas R-CHOP patients spend more time in the ‘progressed from second-line treatment’ health state than VR-CAP patients. This is due to the difference in PFS, while PPS is assumed to be equal between arms, generating a smaller difference in OS than PFS. The treatment cost accounts for the majority of the overall costs (Table 6), and therefore uncertainty around resource use and cost sources other than drug costs will have only a minor impact on model outcomes.

One-way sensitivity analysis showed that uncertainty in the parameters used within the model for PFS projections had the biggest impact on model outcomes together with the utility value applied to the ‘progressed from second-line treatment’ health state (Fig. 7).

Fig. 7

Tornado diagram displaying the ICER sensitivity to the ten most influential model inputs. ICER, incremental cost-effectiveness ratio; IV, intravenous; OS, overall survival; PFS, progression-free survival; R-CHOP, rituximab with cyclophosphamide, doxorubicin, vincristine and prednisone; VR-CAP, bortezomib with rituximab, cyclophosphamide, doxorubicin and prednisone

As can be seen from Table 7, the ICER is relatively insensitive to the scenario analyses performed. Using different survival functions for PFS had the largest impact on model outcomes, and alternative sources for utility data for patients progressed from second-line treatment had the largest impact on the ICER. Using different trial assessments of PFS had only a limited impact on outcomes.

Table 7 Results of scenario analyses

Discussion

The base case ICER of £20,043 indicates that VR-CAP is a cost-effective treatment option for patients with previously untreated MCL, using the standard UK threshold of £20,000–30,000 per QALY.

In the analysis, PFS is used as a surrogate for OS. This approach assumes that there is no survival benefit after a patient’s disease has progressed following treatment. When OS data were used directly to model cost effectiveness, the ICER increased slightly to £21,357. In this scenario it is assumed that there is a continued benefit of VR-CAP over R-CHOP after disease progression. The observation that OS, as modelled in the base case, shows a good reflection of the LYM-3002 data supports the use of PFS as a surrogate in the base case. A targeted literature review of NICE appraisals for cancer drugs from 2010 onwards identified two recent examples where PFS was used as a surrogate for OS either directly or indirectly (by assuming the same post-progression survival [PPS]) [37, 38]. In both cases, this methodology came under substantial scrutiny. Additionally, three submissions were identified where the same PPS was applied for all treatment arms [39–41].

There are some differences between the LYM-3002 trial population and MCL patients in the UK. As is often the case in clinical trials, the mean age of participants in LYM-3002 (64 years) was relatively low compared with the median age at presentation of 73.5 years in UK clinical practice [42]. Additionally, only 30 % of patients enrolled in LYM-3002 came from the European Union or North America, with no patients included from the UK. However, efficacy results showed consistency between geographic regions both in the size of benefit with VR-CAP and the absolute PFS for R-CHOP. It is therefore unlikely that the geographic spread of countries included in the trial and the lack of UK patients had any relevant impact upon the results.

The status of the OS data is the main uncertainty in assessing the cost effectiveness of treatment. Despite the conclusion that modelled OS was reasonably comparable to long-term datasets, OS data for VR-CAP are immature. Once the final analysis of OS for LYM-3002 is available, the model could be re-assessed to confirm robustness of the current analysis.

The model does not take into account rituximab maintenance (R-maintenance) treatment for patients who respond to induction therapy, which has been adopted in clinical practice in recent years based on the findings of the European MCL Elderly trial [11]. At the time of initiation of LYM-3002, R-maintenance was not commonly adopted and thus was not included in the trial design. There is a belief that R-maintenance therapy results in similar benefit after any CHOP-like induction regimen, and therefore we would expect to be able to give R-maintenance after VR-CAP induction with a similar extension to median survival times as observed with R-maintenance after R-CHOP induction [43]. As the European MCL Elderly trial was not designed to assess the clinical efficacy of induction therapy with versus without maintenance therapy, it could not be used to model R-maintenance.

When the model was submitted to NICE, the evidence review group agreed that immature data may bias the extrapolation of survival data, and had some concerns about the methods used to overcome this. It was argued that if data are too immature to model OS for all patients, it would be questionable whether sufficient data are available to separately estimate long-term survival for patients with and without progression. However, the uncertainty was reduced for patients who had progressed as a smaller proportion of patients at risk were still alive at the time of evaluation. Furthermore, the data for the two treatment arms were pooled, thereby increasing the total sample size. The uncertainty of survival for patients who had not progressed may be increased by using this method, but this was accounted for by including general population mortality for patients who had not yet progressed. In doing so, it was assumed that all deaths in the PrePS curves (prior to adjustment for background mortality) in the trial were deaths from MCL. This was a reasonable assumption as the number of deaths reported in the LYM-3002 trial that were not due to progression or toxicity was very low. Of the 69 deaths in total in the VR-CAP group, there were only eight deaths that were not due to progression or AEs. In the R-CHOP group, there were a total of 87 deaths, of which 14 were not due to progression or AEs [22].

A submission for HTA was also made to the Scottish Medicines Consortium (SMC), who also noted that there are limitations arising from the maturity of the survival data, but found it unlikely that the approach taken would cause substantial bias in favour of VR-CAP. The SMC noted that this was supported by the literature providing evidence of an association between PFS and OS in MCL. In addition, it was noted that the modest impact on the ICER from uncertainty associated with varying survival inputs meant that the ICER for VR-CAP was robust [44].

In 2015 both NICE and the SMC accepted the overall approach taken in the cost-effectiveness model as a basis for their conclusion that VR-CAP represents a cost-effective treatment option in the UK for patients with previously untreated MCL for whom HSCT is unsuitable [22, 44]. VR-CAP is now recommended for use within the National Health Service.

Conclusion

The current model shows that VR-CAP is a cost-effective treatment option in the UK for patients with previously untreated MCL for whom haematopoietic stem cell transplantation is unsuitable. Both NICE and the SMC have recommended the use of VR-CAP in these patients.

Abbreviations

AE, adverse event; ECOG, Eastern Cooperative Oncology Group; HMRN, Haematological Malignancy Research Network; HSCT, haematopoietic stem cell transplantation; HTA, health technology assessment; ICER, incremental cost-effectiveness ratio; IRC, independent review committee; IV, intravenous; MCL, mantle cell lymphoma; NHL, non-Hodgkin’s lymphoma; NHS, National Health Service; NICE, National Institute for Health and Care Excellence; OS, overall survival; PFS, progression-free survival; PPS, post-progression survival; PrePS, pre-progression survival; PSA, probabilistic sensitivity analysis; QALY, quality-adjusted life year; R, rituximab; R-CHOP, rituximab, cyclophosphamide, doxorubicin, vincristine and prednisolone; R-FC, rituximab, fludarabine and cyclophosphamide; SEER, Surveillance, Epidemiology, and End Results Program; SMC, Scottish Medicines Consortium; TFI, treatment-free interval; UK, United Kingdom; VR-CAP, bortezomib, rituximab, cyclophosphamide, doxorubicin and prednisolone