
1 Introduction

Head and neck carcinomas are the sixth most common cancer in the world, accounting for approximately 550,000 new cases and around 300,000 deaths each year [1,2,3]. Patients with squamous cell carcinoma of the head and neck (SCCHN) who progress after platinum-based therapy have a poor prognosis, with a median overall survival (OS) of approximately 4–6 months [4]. Tumor growth and surgical treatment cause facial disfigurement, impairing patients’ ability to speak, swallow, and breathe [5, 6]. Consequently, SCCHN has a major impact on patients’ emotional, social, and physical functioning [7]. Furthermore, treatment of SCCHN has historically involved platinum-based chemoradiotherapy, which is associated with significant toxicities [8]. With multiple available treatment options in this clinical setting, it is crucial to evaluate not only the impact on survival but also on the quality of survival, in order to assist clinical and payer decision making.

CheckMate 141 (CHECKpoint pathway and nivoluMAb clinical Trial Evaluation) was a randomized (2:1), open-label, phase III trial in patients with recurrent or metastatic (R/M) platinum-refractory SCCHN that had progressed within 6 months of the last dose of a platinum-containing therapy in the adjuvant, primary (i.e., with radiation), recurrent or metastatic setting. Between June 2014 and August 2015, 361 patients in 15 countries across North America, Asia, Europe, and South America were randomized to either nivolumab or investigator’s choice (IC), which consisted of cetuximab, methotrexate or docetaxel. After progression, treatment was stopped for most patients and they transitioned to survival follow-up (a small subset of patients receiving nivolumab and meeting protocol-defined criteria were permitted to receive nivolumab after progression). Patient-reported outcomes (PROs) were assessed using the European Organisation for Research and Treatment of Cancer (EORTC) Quality of Life Questionnaire-Core 30 (EORTC QLQ-C30), EORTC-Head and Neck Cancer-Specific Module (EORTC QLQ-H&N35), and three-level EuroQol five-dimensional (EQ-5D-3L) questionnaires. This trial demonstrated that median OS (95% confidence interval [CI]) was significantly improved in patients receiving nivolumab (7.5 [5.5–9.1] months) compared with those receiving IC (5.1 [4.0–6.0] months), and 1-year survival (95% CI) was doubled with nivolumab (36.0% [28.5–43.4]) compared with IC (16.6% [8.6–26.8]) [9]. After 2 years, sustained OS benefit was observed with nivolumab, with a 32% reduction in risk of death compared with IC. Grade 3 or 4 treatment-related adverse events occurred in 13% of patients treated with nivolumab compared with 35% of those treated with IC.
Nivolumab also showed a significant short-term quality of life (QoL) benefit compared with IC, with stabilized symptoms and functioning from baseline to weeks 9 and 15 in the nivolumab arm, whereas patients receiving IC experienced clinically meaningful deterioration [10]. Nivolumab also significantly delayed time to first deterioration relative to IC therapy for several subscales of the EORTC QLQ-C30 and EORTC QLQ-H&N35 [10]. Exploratory post hoc analysis of CheckMate 141 data demonstrated that patients receiving nivolumab had a 29% lower rate of hospitalization than patients receiving IC (p < 0.05); other healthcare resource utilization events were comparable between treatment arms [11].

The PRO analyses for CheckMate 141 were limited to short-term time points (up to week 15), during which sufficient numbers of patients provided completed questionnaires. Q-TWiST (quality-adjusted time without symptoms or toxicity) analysis integrates the quality and quantity of survival by using toxicity, progression, and survival data without over-reliance on the limited PRO data. We conducted a Q-TWiST analysis of CheckMate 141 data to compare the quality and quantity of survival between nivolumab and standard of care (IC) in patients with R/M SCCHN who had previously received platinum therapy.

2 Methods

The analysis was performed on the all-randomized population, using the treatment arm as randomized (the intent-to-treat population; nivolumab, n = 240; IC, n = 121). The Q-TWiST analysis partitions survival duration into three clinically relevant health states: (1) the period experiencing toxicity (TOX), (2) the period before progression without toxicity (TWiST), and (3) the period after disease progression (relapse; REL). These periods are assigned preference utilities [12].

Time in TOX was defined as the total number of days spent with grade 3 or 4 adverse events from the randomization date to progression or censoring; multiple events on the same day were counted only once. Time without symptoms or toxicity (TWiST) was defined as the period from randomization to the date of disease progression (assessed using RECIST 1.1 criteria) minus the number of days with grade 3 or 4 adverse events. The relapse period (REL) was defined as the period after the date of progression until death or censoring. The Q-TWiST was calculated as the weighted sum of the time spent in each health state [13, 14]:

$$\text{Q-TWiST} = U_{\text{TOX}} \times \text{TOX} + U_{\text{TWiST}} \times \text{TWiST} + U_{\text{REL}} \times \text{REL},$$

where U denotes the assigned utility for each respective health state.
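The health-state definitions and the weighted sum above can be illustrated for a single patient. The following Python sketch is not the trial's analysis code: the `Patient` structure, the day-count convention (toxicity days counted strictly before the progression day), the example patient, and the utility values are all assumptions for illustration. The published analysis works with restricted mean durations per treatment arm rather than per-patient sums.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # All fields are hypothetical illustration inputs, in days from randomization.
    progression_day: int  # day of progression (or censoring)
    last_day: int         # day of death (or censoring)
    ae_intervals: list    # (start_day, end_day) of each grade 3/4 AE, inclusive


def toxicity_days(p: Patient) -> int:
    """Count distinct days with a grade 3/4 AE before progression.

    Using a set means overlapping events on the same day count once.
    Assumption: days on or after the progression day are excluded.
    """
    days = set()
    for start, end in p.ae_intervals:
        days.update(range(start, min(end + 1, p.progression_day)))
    return len(days)


def partition(p: Patient):
    """Split follow-up into (TOX, TWiST, REL) durations in days."""
    tox = toxicity_days(p)
    twist = p.progression_day - tox
    rel = p.last_day - p.progression_day
    return tox, twist, rel


def q_twist(tox, twist, rel, u_tox, u_twist, u_rel):
    """Weighted sum of time in each health state."""
    return u_tox * tox + u_twist * twist + u_rel * rel


# Hypothetical patient: two overlapping AEs (days 5-9 and 8-12),
# progression on day 90, death on day 200.
example = Patient(progression_day=90, last_day=200, ae_intervals=[(5, 9), (8, 12)])
tox, twist, rel = partition(example)
print(tox, twist, rel, q_twist(tox, twist, rel, u_tox=0.5, u_twist=1.0, u_rel=0.5))
```

The two overlapping adverse events cover days 5 to 12, so the shared days 8 and 9 contribute once to TOX, which mirrors the rule that multiple events on the same day are counted only once.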

The utilities were initially estimated from patient-level EQ-5D-3L scores, based on UK population values [15] (from patients who completed at least one EQ-5D-3L assessment: nivolumab, n = 214; IC, n = 99). The EQ-5D-3L was collected at week 1 (before dosing), during treatment at week 9, and then every 6 weeks until disease progression. Further assessments were then collected at approximately 35 and 115 days following the last dose of treatment and at subsequent survival follow-up visits every 3 months. The mean EQ-5D-3L index score was calculated for each health state by treatment group using all patients with at least one EQ-5D-3L index score during that health state. For patients with multiple values during either the TOX or the REL period, the average of all recorded values was used. During the TWiST period, an average index score was used per patient if there were multiple assessments. A sensitivity analysis was also conducted using the worst scores for multiple assessments per patient within any of the health states.
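As a rough illustration of the utility estimation described above, the sketch below first summarizes each patient's EQ-5D-3L index scores within one health state (average for the main analysis, worst score for the sensitivity analysis) and then averages across patients. The function name, the data layout, and the scores are hypothetical; mapping questionnaire responses to index values with the UK value set is assumed to have been done beforehand.

```python
from statistics import mean


def state_utility(scores_by_patient, summarize=mean):
    """Mean utility for one health state in one treatment arm.

    scores_by_patient: dict mapping a patient id to the list of EQ-5D-3L
    index scores recorded while the patient was in this health state.
    Patients with no assessment in the state do not contribute.
    summarize: how multiple scores per patient are collapsed
    (mean for the main analysis, min for the worst-score sensitivity analysis).
    """
    per_patient = [summarize(s) for s in scores_by_patient.values() if s]
    return mean(per_patient)


# Hypothetical index scores during one health state.
scores = {"p1": [0.7, 0.5], "p2": [0.6], "p3": []}

average_case = state_utility(scores)               # per-patient averages: 0.6, 0.6
worst_case = state_utility(scores, summarize=min)  # per-patient minima: 0.5, 0.6
print(average_case, worst_case)
```

Collapsing to one value per patient before averaging keeps patients with many assessments from dominating the state-level estimate.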

A threshold utility analysis was performed to assess the impact of different health state utility assumptions on between-treatment Q-TWiST comparisons. The utility weights for TOX and REL were varied between 0 and 1 in increments of 0.1, with all combinations being considered. The health state ‘TWiST’ was initially assigned a utility value of 1, representing full health. In addition, the threshold utility analysis was repeated using a TWiST utility value of 0.805, based on the UK population norm [16].

Time spent in each health state is presented using restricted means up to the median follow-up time of patients across both treatment groups [17]. A bootstrap sample of 500 was used to estimate the mean area under the curve as the time spent in each health state [18]. Q-TWiST was compared between treatment groups with p values estimated based on the normal approximation method, with mean treatment differences estimated from the Kaplan–Meier analysis and variance estimated using bootstrap methodology [18]. Partitioned survival plots are displayed for each treatment group. The relative gain in Q-TWiST for nivolumab compared with IC was calculated as the difference in Q-TWiST divided by the median OS in the IC arm (the control group). Revicki et al. [19] defined the minimally important difference in relative gain in Q-TWiST as 10%, with 15% defined as clearly clinically important. Post hoc subgroup analyses were conducted to assess Q-TWiST in patients stratified by human papillomavirus (HPV) status; programmed death ligand 1 (PD-L1) status (negative: < 1%; positive: ≥ 1%); and by whether they had received prior cetuximab.
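The statistical steps above (restricted mean as area under the Kaplan–Meier curve, bootstrap variance, normal-approximation p value) can be sketched generically as follows. This is a minimal illustration under stated assumptions, not the trial's code: the function names, the toy data, the truncation time `tau`, and the fixed random seed are all hypothetical, and only a single arm's restricted mean is bootstrapped here.

```python
import math
import random


def km_restricted_mean(times, events, tau):
    """Area under the Kaplan-Meier curve from 0 to tau.

    times: follow-up time per patient; events: 1 = event observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv, last_t, area = 1.0, 0.0, 0.0
    i = 0
    while i < len(data) and data[i][0] <= tau:
        t = data[i][0]
        deaths = leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        area += surv * (t - last_t)           # rectangle up to time t
        if deaths:
            surv *= 1 - deaths / n_at_risk    # Kaplan-Meier step
        n_at_risk -= leaving
        last_t = t
    return area + surv * (tau - last_t)       # final rectangle up to tau


def bootstrap_se(stat, patients, n_boot=500, seed=0):
    """Standard error of stat(patients), resampling patients with replacement."""
    rng = random.Random(seed)
    reps = [stat(rng.choices(patients, k=len(patients))) for _ in range(n_boot)]
    m = sum(reps) / n_boot
    return math.sqrt(sum((r - m) ** 2 for r in reps) / (n_boot - 1))


def normal_p_value(diff, se):
    """Two-sided p value for diff / se under a normal approximation."""
    return math.erfc(abs(diff / se) / math.sqrt(2))


# Hypothetical (time in months, event indicator) pairs for one arm.
patients = [(2, 1), (4, 0), (6, 1), (8, 1), (10, 0), (12, 1)]
tau = 10  # hypothetical truncation time


def rmst(sample):
    t, e = zip(*sample)
    return km_restricted_mean(t, e, tau)


se = bootstrap_se(rmst, patients)
print(rmst(patients), se)
```

Under this scheme, applying `km_restricted_mean` to progression-free and overall survival gives the inputs for the partition: REL is the OS restricted mean minus the PFS restricted mean, and TWiST is the PFS restricted mean minus mean time in TOX.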

3 Results

3.1 Duration of Time Spent in Each Health State

The median follow-up time for patients across both treatment arms was 16 months. The (restricted) mean time in each health state is presented in Table 1. The mean duration of time spent in TWiST and REL was statistically significantly longer in the nivolumab arm than in the IC arm (difference of 1.04 months and 0.72 months, respectively). Time in TOX was statistically significantly longer in the IC arm than in the nivolumab arm (0.37 months vs 0.30 months) (Table 1). Figure 1a shows the partitioned survival curve for patients in the nivolumab treatment arm, and Fig. 1b shows the partitioned survival curve for patients in the IC arm.

Table 1 Restricted mean time in health states by treatment arm for the bootstrap sample
Fig. 1 Partitioned survival curve for (a) the nivolumab treatment arm and (b) the investigator’s choice treatment arm. REL relapse, TOX toxicity, TWiST time without symptoms of disease progression or toxicity

3.2 Estimation of Utilities from EQ-5D-3L

Table 2 shows the estimated weights for each health state using the EQ-5D-3L values collected in the trial. The estimated utilities were higher in the IC arm for periods of TOX and TWiST compared with nivolumab; however, estimates for the REL period were higher in the nivolumab arm. The average utility associated with TOX was estimated at 0.532 for nivolumab compared with 0.623 for IC. The number of patients available for the estimation of utility during the TOX period was low (n = 44 across the treatment arms). The average utility score associated with TWiST was 0.638 for nivolumab compared with 0.671 for IC. In our sensitivity analysis, using the worst score instead of the average for TOX only slightly lowered the estimated utilities. This sensitivity analysis had a bigger impact on the estimated utilities for the REL period.

Table 2 Weightings for time spent in health states based on EQ-5D-3L utilities

3.3 Comparison of Q-TWiST Between Nivolumab and IC

Table 3 shows the results of the Q-TWiST analyses using the EQ-5D-3L index scores from the trial as the utility weightings. There was a statistically significant 1.08-month gain (95% CI 1.02–1.13; p < 0.001) in Q-TWiST in favor of the nivolumab arm. This represents a 21.2% relative improvement at 16 months, exceeding the 10% threshold for clinical relevance. Similar results were observed when using average utility values per patient instead of worst case for REL and TOX periods, with a 1.23-month gain (95% CI 1.17–1.29; p < 0.001) and a 24.1% relative improvement at 16 months. In the threshold analysis with TWiST set to 1, the gain in Q-TWiST remained clinically and statistically significant and ranged from 0.96 to 1.76 months in favor of nivolumab, representing an 18.8–34.5% relative improvement. In the threshold analysis with TWiST set to the UK population norm of 0.805, the gain in Q-TWiST remained clinically and statistically significant and ranged from 0.76 to 1.56 months in favor of nivolumab, representing a 14.9–30.6% relative improvement. Figure 2a, b show the threshold analysis using UK norms for utility of TWiST with absolute difference and relative gain in Q-TWiST months, respectively.
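The threshold ranges can be checked approximately from the rounded between-arm differences reported in Section 3.1. The sketch below assumes that the threshold analysis applies the same utility to both arms, so that the Q-TWiST difference is a weighted sum of the differences in time per health state; the function and constant names are illustrative. Because the inputs are rounded to two decimals, the outputs are expected to match the reported ranges only to within rounding.

```python
# Rounded differences in restricted mean time (nivolumab minus IC), in months,
# taken from Section 3.1.
D_TOX = 0.30 - 0.37
D_TWIST = 1.04
D_REL = 0.72
IC_MEDIAN_OS = 5.1  # months, denominator for the relative gain


def gain(u_tox, u_rel, u_twist):
    """Q-TWiST difference when both arms share the same utilities (assumption)."""
    return u_tox * D_TOX + u_twist * D_TWIST + u_rel * D_REL


def gain_range(u_twist):
    """Smallest and largest gain over the 11 x 11 grid of TOX and REL utilities."""
    steps = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0
    gains = [gain(ut, ur, u_twist) for ut in steps for ur in steps]
    return min(gains), max(gains)


def relative_gain(diff_months):
    """Relative gain in percent of median OS in the IC arm."""
    return 100 * diff_months / IC_MEDIAN_OS


for u_twist in (1.0, 0.805):
    lo, hi = gain_range(u_twist)
    print(f"U_TWiST={u_twist}: {lo:.2f} to {hi:.2f} months "
          f"({relative_gain(lo):.1f}% to {relative_gain(hi):.1f}%)")

print(f"Main analysis: {relative_gain(1.08):.1f}%")
```

With these rounded inputs, the smallest gain occurs at a TOX utility of 1 and a REL utility of 0 (about 0.97 months with TWiST set to 1 and about 0.77 months with TWiST set to 0.805), and the largest at a TOX utility of 0 and a REL utility of 1 (about 1.76 and 1.56 months), close to the reported 0.96–1.76 and 0.76–1.56 months.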

Table 3 Q-TWiST analysis using weights estimated from EQ-5D-3L
Fig. 2 Threshold analysis using UK norms for utility of TWiST: (a) absolute difference in Q-TWiST (months) with nivolumab compared with IC; (b) relative gain in Q-TWiST with nivolumab compared with IC. Utility values during TOX and REL vary from 0 to 1; the utility value for TWiST is held at 0.805 based on the UK population norm [16] weighted to the CheckMate 141 randomized population. Shading represents absolute and relative gain category for a given utility value of TOX and REL. Relative gain in Q-TWiST: difference between nivolumab and IC divided by median OS in IC arm (5.1 months). ^a At a TOX utility value of 1 and a REL utility value of 0, relative gain was 14.9%. IC investigator’s choice, Q-TWiST quality-adjusted time without symptoms of disease progression or toxicity, REL relapse, TOX toxicity, TWiST time without symptoms of disease progression or toxicity

3.4 Subgroup Analysis

The results across all post hoc subgroups are summarized in Table 4. Differences in Q-TWiST between treatment arms within subgroups defined according to HPV status (nivolumab: HPV-positive [n = 57], 5.41 vs HPV-negative [n = 48], 4.90; IC: HPV-positive [n = 23], 4.06 vs HPV-negative [n = 32], 3.85) and prior cetuximab use (nivolumab: prior cetuximab [n = 133], 4.55 vs no prior cetuximab [n = 81], 5.13; IC: prior cetuximab [n = 60], 3.98 vs no prior cetuximab [n = 39], 3.53) were all statistically significant and met the criteria for clinical relevance, with longer Q-TWiST for the nivolumab arm. The PD-L1 negative subgroup demonstrated a statistically significant, but not clinically important, gain in Q-TWiST for the nivolumab arm compared with the IC arm (5.9%). The remaining PD-L1 subgroups showed gains similar to the overall population.

Table 4 Q-TWiST analysis using weights estimated from EQ-5D-3L, by subgroup

4 Discussion

To the authors’ knowledge, this is the first Q-TWiST analysis reported in patients receiving drug treatment for platinum-refractory R/M SCCHN. Q-TWiST analysis uses clinical and patient-reported data obtained during a clinical trial to estimate the balance of quality and quantity of survival. Unlike quality-adjusted life-years (QALYs), which typically extrapolate trial outcomes over the likely lifetime of a patient, the Q-TWiST approach combines quantity and quality of life into a single measure by partitioning survival time into a series of health states. It applies particularly well to cancer studies because it reflects the outcomes of treatment choices by considering quantity (survival time) as well as QoL (which can be impaired by toxicity) [19]. Accordingly, the Q-TWiST approach has been used in multiple oncology settings, including immuno-oncology [20,21,22,23,24].

Our findings demonstrate clear and substantial clinical gains in quality-adjusted survival with nivolumab therapy compared with IC in patients with platinum-refractory R/M SCCHN (CheckMate 141 study) [9, 10]. The relative gain in Q-TWiST with nivolumab compared with IC was approximately 21% in the analysis using trial-based utilities, which exceeds both the recommended clinically important difference threshold of 10% and the 15% threshold for clearly clinically important gains [19]. A recent systematic review assessing Q-TWiST in oncology trials reported that only approximately 23% of trials observed relative Q-TWiST gains of > 15% (i.e., clearly clinically relevant gains) [25]. These gains remained robust to sensitivity analyses that considered all possible weightings assigned to the TOX and REL health states, and analyses considering TWiST weightings based on a standard utility weight of 1 and an alternative lower weighting based on population reference values.

Results of this Q-TWiST analysis are broadly consistent with previously published observations of EORTC QLQ-C30 and QLQ-H&N35 PROs at baseline, week 9, and week 15 from CheckMate 141. The previous analysis demonstrated short-term QoL benefits with nivolumab associated with superior role and social functioning, fatigue, dyspnea, appetite loss (with EORTC QLQ-C30), and pain and sensory problems (with QLQ-H&N35) compared with IC [10]. The collection and analysis of PRO data in clinical trials for populations with advanced disease is challenging. High attrition due to progression or death is likely, which limits any possible analyses assessing QoL benefit in the longer term. This analysis supplements the short-term QoL findings by showing robust evidence of a quality-adjusted survival benefit for nivolumab. Additionally, our findings are supported by previous estimates of QALYs using CheckMate 141 data from Swiss [26] and US [27, 28] perspectives, which reported estimated mean QALYs of 0.63–0.89 with nivolumab compared with 0.29–0.55 with IC.

4.1 Limitations

This analysis incorporated all randomized patients regardless of the diminishing PRO data over time due to attrition from progression and death. Only patients who had completed at least one EQ-5D-3L assessment in that particular health state could contribute to the estimation of utility weightings; estimates employed in the Q-TWiST analysis are therefore based on small sample sizes with consequent uncertainty. The sample size for estimation of utility during TOX was particularly small (n = 44). A threshold analysis assessed this uncertainty, exploring all possible combinations of utility weightings during TOX and REL. A further sensitivity analysis assigned a utility of < 1 (i.e., less than perfect health) to TWiST. All threshold analyses were supportive of the main conclusions regardless of utility weights chosen.

5 Conclusions

This Q-TWiST analysis of the CheckMate 141 data suggested that patients receiving nivolumab for platinum-refractory R/M SCCHN had statistically significant and clinically relevant gains in quality-adjusted survival compared with patients randomized to IC. These gains remained robust to all possible combinations of utility weights assigned to time in TOX and REL. Compared with the recommended clinically relevant threshold of 10% of OS, the gains in this study were relatively large. The improved quality of survival with nivolumab vs standard of care should be considered in treatment decision making for R/M SCCHN patients.