Background

Cost-effectiveness analysis is used in most high-income countries for pricing and reimbursement of new health interventions [1]. In such analysis, effectiveness is generally measured by Quality-Adjusted Life Years (QALYs) where the expected number of years to be lived in different health states is weighted by community preferences for each health state [1, 2]. However, these health state utility (HSU) estimates are typically among the most important but also uncertain drivers of cost-effectiveness results – a paradoxical situation that seems detrimental to fair pricing and reimbursement decisions across competing new health interventions.

There are multiple sources of variability in HSU estimates, although a general adherence to the same guidelines would purposely limit variability to patient surveys [1,2,3]. Indeed, if the same preference-based, generic health-related quality-of-life (HRQoL) instrument was administered in all patient surveys, then all HRQoL profiles of the patients could be similarly converted into HSU estimates with use of country-specific social value sets [4]. However, the variability of HSU estimates may still remain considerable due to the scarcity, small sample size, and lack of representativeness of patient surveys as recently illustrated in the context of relapsed/metastatic head and neck cancer [5,6,7].

In a systematic review of HSU estimates in head and neck cancer [8], Meregaglia and Cairns found that only 12 patient surveys collected preference-based, generic HRQoL data. Most (9/12) patient surveys relied on the same EQ-5D-3L instrument [9], but none provided HSU estimates by cancer stage due to small sample sizes [8]. In addition, EQ-5D-3L data are increasingly collected alongside clinical trials [3]. However, the resulting HSU estimates lack representativeness because of the exclusion criteria applied to the patient population, such as older age or the presence of comorbidities [10,11,12]. Moreover, none of the patient surveys were conducted in France [8] and few French patients were recruited in international clinical trials (e.g., fewer than 20 patients in [12]). By default, a cost-effectiveness analysis conducted in the French healthcare context must therefore assume that patient surveys from other countries are representative of French patients [13].

In this study, we explored a route other than patient surveys to estimate consistent HSU at the country level. More specifically, the French National Hospital Discharge database allows identification of all patients treated for a severe condition such as cancer, as well as health states typically used in a cost-effectiveness analysis, such as cancer stage at initial treatment and relapse during follow-up. In addition, six Activities of Daily Living (ADLs) are systematically collected in patients admitted to post-acute care. Taking head and neck cancer as a case study, we developed a multi-step process to estimate HSU. Steps I and II organize patient data from the French National Hospital Discharge database, including the selection of incident patients and the definition of five core health states over two periods (initial treatment and follow-up). Step III estimates utility daily from all records of ADLs in post-acute care using Item Response Theory [14]. Step IV estimates HSU by patient and month of follow-up in the whole patient population after controlling for survivorship and selection in post-acute care.

Methods

Data source

The data source was the French National Hospital Discharge (PMSI) database for the years 2008 to 2013. The database contains all public and private hospital claims for acute and post-acute care. The standardized discharge summary includes: patient demographics (gender, age, postal code of residency); primary and associated discharge diagnosis codes according to the WHO International Classification of Diseases, tenth revision (ICD-10); medical procedures performed; length of stay; and discharge mode (including in-hospital death). In addition, six ADLs are systematically scored at admission to post-acute care and then every week until hospital discharge (Table 1). For research purposes, all hospital discharge data of a patient could be traced over 2008–2013 using a unique anonymous identifier [15, 16].

Table 1 Activities of Daily Living (ADL) recorded in post-acute care among head and neck cancer patients (n = 144,012)

Step I: selection of incident patients

We included all adults residing in metropolitan France and discharged with a primary or associated discharge diagnosis code of head and neck squamous-cell carcinoma (ICD-10: C00-C06; C09-C14; C30.0; C31; C32) in the years 2008 to 2012. We selected incident cases in 2010–2012 after excluding all prevalent cases in 2008–2009 [17, 18]. In addition, we excluded all incident cases recorded with a personal history of cancer to minimize a possible misclassification of a relapse. The coding dictionary of all variables used in this study is provided in Additional file 1: Table S1.
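The selection rule above can be sketched as follows. This is a minimal illustration, not the authors' SAS program: the input layout (pairs of patient identifier and discharge year for cancer-coded stays), the function name `select_incident`, and the `history_ids` argument are all hypothetical.

```python
def select_incident(stays, history_ids=(), inclusion=(2010, 2012)):
    """Return ids of patients whose first cancer-coded stay is in the inclusion window.

    stays: iterable of (patient_id, discharge_year) for stays coded with
           head and neck cancer (hypothetical layout).
    history_ids: ids recorded with a personal history of cancer (excluded).
    """
    first_year = {}
    for pid, year in stays:
        # Keep the earliest year in which each patient appears.
        first_year[pid] = min(year, first_year.get(pid, year))
    excluded = set(history_ids)
    # Patients first seen in 2008-2009 are prevalent cases and drop out here
    # because their first year falls outside the inclusion window.
    return sorted(
        pid for pid, year in first_year.items()
        if inclusion[0] <= year <= inclusion[1] and pid not in excluded
    )
```

Under these assumptions, a patient seen in both 2008 and 2011 is treated as prevalent and excluded, while a patient first seen in 2010 is retained.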

Step II: health state definition over two periods

Most patients with head and neck cancer are diagnosed at locally advanced stage [19] and receive combined-modality treatments over a few months to decrease the high risk of relapse in the short-term [20]. In patient surveys, EQ-5D-3L was mostly (8/9) assessed after initial treatment in relapse-free patients [8]. In accordance with the usual design of patient surveys, ADLs are recorded in post-acute care in the French National Hospital Discharge database, although we aimed at expanding utility assessment to several health states including a relapse state [5,6,7,8].

We used a cross-sectional approach to define five health states over two periods: three cancer stages at initial treatment (early, locally advanced or metastatic stage) [20]; a relapse state and otherwise a relapse-free state in the follow-up. The initial treatment phase was defined by the first 6 months after diagnosis to encompass various lengths of combined-modality treatments [21] and related post-acute care. Cancer stage was identified at initial treatment from medical information that is consistently recorded at hospital [22]: a metastatic stage was identified by any record of distant metastasis; in absence of distant metastasis, a locally advanced stage was identified by any diagnosis indicating locoregional extension (e.g., lymph nodes) or any initial treatment eliminating an early stage (e.g., chemotherapy) [20]; and an early stage was considered by default in other patients.
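The hierarchical staging rule described above can be written as a short decision function. This is a sketch under the assumption that three boolean flags have already been derived from the discharge records; the flag names are illustrative and do not come from the coding dictionary.

```python
def cancer_stage(distant_metastasis, locoregional_extension, stage_excluding_treatment):
    """Classify cancer stage at initial treatment from three hypothetical flags.

    distant_metastasis: any record of distant metastasis.
    locoregional_extension: any diagnosis indicating locoregional extension.
    stage_excluding_treatment: any initial treatment ruling out an early
        stage (e.g., chemotherapy).
    """
    if distant_metastasis:
        return "metastatic"
    if locoregional_extension or stage_excluding_treatment:
        return "locally advanced"
    # Early stage is the default when no other criterion is met.
    return "early"
```

The order of the tests matters: distant metastasis takes precedence over any locoregional criterion, and early stage is assigned only by default.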

Patients identified at the metastatic stage at initial treatment had poor prognosis and were followed in the same health state until end of follow-up. Other patients identified at early or locally advanced stage became at risk of relapse after 6 months. Relapse was identified by the first record of a local relapse (i.e., primary discharge diagnosis identical to the original cancer site) or a new event indicative of extension (i.e., distant metastasis, locoregional extension, or chemotherapy). Relapsing patients had poor prognosis and were followed in the same health state until end of follow-up. Other patients were considered relapse-free in the follow-up, starting from 6 months after diagnosis to end of follow-up.
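As an illustration of how a health state could be assigned over the two periods, the sketch below assumes that months are counted from diagnosis starting at 1 and that the month of first relapse (if any) is already known. Both conventions, and the function name, are assumptions made for this example only.

```python
def health_state(initial_stage, month, relapse_month=None):
    """Return the health state of a patient in a given month since diagnosis.

    initial_stage: "early", "locally advanced" or "metastatic".
    month: month since diagnosis (1 = first month), an assumed convention.
    relapse_month: month of the first relapse record, or None.
    """
    if initial_stage == "metastatic":
        # Metastatic patients stay in the same state until end of follow-up.
        return "metastatic"
    if month <= 6:
        # Initial treatment phase: the state is the stage at initial treatment.
        return initial_stage
    if relapse_month is not None and 6 < relapse_month <= month:
        # Patients are at risk of relapse only after 6 months; once relapsed,
        # they remain in the relapse state.
        return "relapse"
    return "relapse-free"
```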

Overall mortality was assessed from in-hospital death records as well as deaths outside hospital with right-censoring for all patients at July 1, 2013 (Additional file 1: Methods). The Kaplan-Meier method was used to test the association of health state with survival over a maximum follow-up of 12 months. The Fine and Gray method was used to test the association of health state with post-acute care admission, where deaths without post-acute care were considered as competing events [23].
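For readers unfamiliar with the Kaplan-Meier method, the following sketch computes the product-limit survival estimate from follow-up times and event indicators. It only illustrates the estimator; it does not reproduce the comparison between health states or the Fine and Gray competing-risk model, and the input format is hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up durations (e.g., months).
    events: 1 if death was observed at that time, 0 if right-censored.
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        # Deaths and censored patients both leave the risk set.
        at_risk -= removed
    return curve
```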

Step III: utility estimation over time in post-acute care

Six ADLs are systematically scored at admission in post-acute care and then every week until hospital discharge: 4 self-care tasks (dressing/bathing; functional mobility; self-feeding; continence); social interaction; and communication (Table 1). Each ADL is scored on the same 4-level scale (0 = total dependence, 1 = partial dependence, 2 = supervision, or 3 = independence).

All records of ADLs in post-acute care were analyzed with Item Response Theory [14]. We estimated a two-parameter graded response model [24], in which ordinal scores on ADLs are assumed to be a logistic function of a latent health state scale (i.e., the probability of a higher score on each ADL increases as the latent health state increases). The model is specified as follows:

$$ {P}_{ijk}\left({X}_j\ge k|{\theta}_i,{\alpha}_j\right)=\frac{e^{\alpha_j\left({\theta}_i-{\beta}_{jk}\right)}}{1+{e}^{\alpha_j\left({\theta}_i-{\beta}_{jk}\right)}} $$

where P_ijk is the cumulative probability that patient i receives a score of k or above on ADL j (j = 1, 2, 3, 4, 5, 6), with scores taking the values k = 0, 1, 2, 3; θ_i represents the latent health state value of patient i; α_j is the slope parameter of ADL j and indicates the ability of ADL j to discriminate patients on the latent health state scale; and β_jk is the threshold parameter of ADL j for score k or above relative to lower scores, i.e., the value on the latent health state scale at which a patient has a 50% chance of scoring k or above. Since a score of 0 or above is certain, three threshold parameters (k = 1, 2, 3) are estimated per ADL, one fewer than the number of score levels.
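To make the graded response model concrete, the sketch below evaluates the cumulative probability defined above and derives the probability of each of the four scores as the difference between adjacent cumulative probabilities. The parameter values used in the usage line are invented for illustration and are not the estimates reported in this study.

```python
import math

def cumulative_prob(theta, alpha, beta):
    """P(X_j >= k | theta) for slope alpha and threshold beta (logistic form)."""
    return 1.0 / (1.0 + math.exp(-alpha * (theta - beta)))

def category_probs(theta, alpha, thresholds):
    """Probability of each score 0..K for one ADL.

    thresholds: increasing threshold parameters, one per score k >= 1.
    """
    # P(X >= 0) = 1 and P(X >= K + 1) = 0 bracket the cumulative curve.
    cum = [1.0] + [cumulative_prob(theta, alpha, b) for b in thresholds] + [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]

# Hypothetical parameters for one ADL scored on four levels.
probs = category_probs(theta=0.0, alpha=2.0, thresholds=[-1.5, -0.5, 0.5])
```

When θ equals a threshold, the corresponding cumulative probability is exactly 0.5, which matches the interpretation of the threshold parameters given above.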

We assessed the unidimensionality of the latent health state scale, i.e., the assumption that all ADLs measure a single construct of health state, by examining the eigenvalues of the polychoric correlation matrix [14]. Assuming a perfect correlation between the latent health state scale and the French EQ-5D-3L social value set, we computed an ADL-related utility scale calibrated on the worst (−0.53) and best (1.00) anchors of the French EQ-5D-3L social value set [25]:

$$ {\hat{U}}_{EQ-5D}^{IRT}=\left[\frac{\left({\hat{U}}_{RAW}^{IRT}-\min {\hat{U}}_{RAW}^{IRT}\right)}{\left(\max {\hat{U}}_{RAW}^{IRT}-\min {\hat{U}}_{RAW}^{IRT}\right)}\times \left(1+0.53\right)-0.53\right] $$
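The calibration is a min-max rescaling of the raw latent scores onto the interval between the two anchors. A minimal sketch, assuming the raw scores are available as a plain list (the function name and inputs are illustrative):

```python
def calibrate(raw_scores, worst=-0.53, best=1.00):
    """Rescale raw latent scores linearly onto [worst, best].

    worst and best default to the anchors of the French EQ-5D-3L value set
    quoted in the text; raw_scores must contain at least two distinct values.
    """
    lo, hi = min(raw_scores), max(raw_scores)
    return [(x - lo) / (hi - lo) * (best - worst) + worst for x in raw_scores]
```

For example, raw scores of -2, 0 and 2 map to approximately -0.53, 0.235 and 1.00: the lowest observed score receives the worst anchor, the highest receives the best anchor, and the midpoint falls halfway between them.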

Finally, because patients may have repeated assessments (i.e., weekly assessments during the same hospital stay and/or multiple hospital stays in post-acute care), ADL-related utility was linearly interpolated on a daily basis between all assessments, from the first to the last record of the patient in post-acute care.
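The daily interpolation can be sketched as follows, assuming each patient's assessments are given as (day, utility) pairs sorted by day, with days expressed as integers and no duplicate days. This data layout is an assumption made for the example.

```python
def interpolate_daily(assessments):
    """Linearly interpolate utility for every day between assessments.

    assessments: list of (day, utility) pairs sorted by strictly increasing
                 integer day (hypothetical layout).
    Returns a dict mapping each day from first to last record to a utility.
    """
    daily = {}
    for (d0, u0), (d1, u1) in zip(assessments, assessments[1:]):
        for day in range(d0, d1):
            daily[day] = u0 + (u1 - u0) * (day - d0) / (d1 - d0)
    # The last assessment closes the series (also covers a single record).
    last_day, last_u = assessments[-1]
    daily[last_day] = last_u
    return daily
```

With two weekly assessments, this yields eight daily values (days 0 to 7 inclusive); with a single assessment, it yields only that day's value.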

Step IV: HSU estimation by month of follow-up in the whole patient population

We controlled for a possible survivorship effect on utility by estimating HSU by patient and month of follow-up in each health state. We expanded on the previous cross-sectional approach (Step II) to define 48 subpopulations consisting of all patients alive at the beginning of each month of follow-up in a given health state (from 1 to 6 months in early or locally advanced stage at initial treatment; and from 1 to 12 months in the three other health states). In each subpopulation, we identified all patients recorded in post-acute care, and HSU was computed per patient as the average of daily ADL-related utility estimates in the month. In the best-case scenario with complete daily estimates (n = 30 in the month), HSU represented the area-under-the-curve utility estimate of the patient. In the worst-case scenario with a single daily estimate (n = 1 in the month), we assumed that ADL-related utility of the patient was uniform over the month.
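The monthly averaging can be illustrated with the sketch below. It assumes that days are counted from diagnosis starting at 0 and that a month is a fixed window of 30 days; both are simplifying assumptions for the example, as is the function name.

```python
def monthly_hsu(daily, month, days_per_month=30):
    """Average the available daily utilities of one patient in a given month.

    daily: dict mapping day since diagnosis (0-based, assumed) to utility.
    month: month of follow-up (1 = first month).
    Returns None when the patient has no daily estimate in the month.
    """
    start = (month - 1) * days_per_month
    values = [u for day, u in daily.items() if start <= day < start + days_per_month]
    return sum(values) / len(values) if values else None
```

A patient with a single daily estimate in the month receives that value as the monthly HSU, which corresponds to the uniform-utility assumption stated above. The count of 48 subpopulations follows from 2 health states × 6 months plus 3 health states × 12 months.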

Then, we estimated HSU for the whole subpopulation using a two-step selection model [26]. In the first step, the selection equation is a binary probit regression estimating the probability that a patient is recorded in post-acute care in the month:

$$ P\left(\mathrm{post}-\mathrm{acute}\ \mathrm{care}=1\right)=\Phi \left(\beta {\mathrm{X}}_i\right) $$

where i represents patients, X represents a vector of covariates, and Φ is the cumulative distribution function of the standard normal distribution. Since our general aim was to improve inference rather than efficiency [27], we used a large set of covariates including time-independent covariates (demographics; tobacco smoking, alcohol use; year at diagnosis, primary head and neck cancer site, second synchronous head and neck cancer) and time-dependent covariates recorded before or during the given month (admission to a public teaching hospital, comprehensive cancer care center, private clinic; second primary cancer other than head and neck cancer [28, 29], each comorbidity of the Charlson comorbidity index other than cancer [30, 31], depression; palliative care) (Additional file 1: Table S1).

In the second step, the outcome equation is a standard OLS regression estimating HSU in post-acute care while controlling for selection bias:

$$ {\mathrm{HSU}}_i={\gamma \mathrm{Y}}_i+\lambda {\mathrm{IMR}}_i $$

where i represents patients, Y represents a vector of covariates, and IMR (for inverse Mills ratio) is the correction factor for selection bias calculated from the probit model at βX_i in the selection equation. Selection bias was assessed by testing the null hypothesis that the coefficient λ of the IMR equals 0. We used the set of covariates of the selection equation, although some covariates that were presumably less related to HSU were removed from the outcome equation (region of residency, risk factors, previous admission to several types of hospital) [32]. Since the set of covariates of the selection equation was defined in all patients, we used the outcome equation to impute HSU in all patients unrecorded in post-acute care in the month.
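The correction term of the two-step model can be illustrated in a few lines. The sketch below computes the inverse Mills ratio for a selected patient from the probit linear predictor, using the standard definition φ(z)/Φ(z); it is a Python illustration only (the analyses in this study were run in SAS) and the linear predictor values are hypothetical.

```python
import math

def normal_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def inverse_mills_ratio(z):
    """IMR for a selected patient, where z is the probit linear predictor."""
    return normal_pdf(z) / normal_cdf(z)

# Hypothetical linear predictors for three patients recorded in post-acute care.
imr = [inverse_mills_ratio(z) for z in (-1.0, 0.0, 1.0)]
```

The ratio decreases as the predicted probability of selection increases, so patients who were unlikely to be admitted to post-acute care carry the largest correction in the outcome equation.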

All statistical analyses were performed with SAS 9.4 including PROC IRT for estimating the two-parameter graded response model.

Results

Step I: selection of incident patients

Of the 27.3 million adults discharged from all French hospitals in 2008–2012, 134,324 (0.49%) had a diagnosis of head and neck cancer (Additional file 1: Table S2). Of them, 53,258 (40.4%) were considered incident cases in 2010–2012.

Step II: health state definition over two periods

Five health states were defined over two periods: initial treatment and follow-up. Health states were significantly associated with survival (Fig. 1). Patients with distant metastasis at initial treatment or relapsing in the follow-up had the worst prognosis. Patients initially treated at early stage had better prognosis as compared to patients treated at locally advanced stage. Patients in a relapse-free state had the best prognosis.

Fig. 1 Survival according to health state in head and neck cancer

Health states of poor prognosis were significantly associated with higher admission rates in post-acute care (Fig. 2). At initial treatment, patients with distant metastasis were 3.5 times more likely to be admitted in post-acute care as compared to patients at early stage (HR = 3.54, 95% CI 3.31–3.80). In the follow-up, relapsing patients were 3.6 times more likely to be admitted in post-acute care as compared to patients in a relapse-free state (HR = 3.62, 95% CI 3.42–3.82).

Fig. 2 Admission in post-acute care according to health state in head and neck cancer

Step III: utility estimation over time in post-acute care

Six ADLs were assessed at 144,012 points in time in post-acute care (Table 1). The two-parameter graded response model fitted all ADL records very well. The unidimensionality of the latent health state scale was supported by the examination of eigenvalues: the first eigenvalue (4.0) explained 66.8% of the variance; the second eigenvalue was below 1.0; and the ratio of the first to the second eigenvalue (4.6) was above 3 (Additional file 1: Table S3). In addition, all slope parameters were above 1, indicating that all ADLs were informative regarding the latent health state scale (Additional file 1: Table S4). The assessment of dressing/bathing was the most informative on the latent health state (slope = 5.50; range between threshold parameters = 4.95) (Fig. 3). The assessment of self-feeding was the least informative on the latent health state (slope = 1.12; range between threshold parameters = 2.09). Most (14/18) threshold parameters were below 0, indicating that ADLs were generally more informative on poor health states.

Fig. 3 Characteristic curves of 6 Activities of Daily Living (ADL) recorded in post-acute care (n = 144,006). The trait on the horizontal axis is an arbitrarily scaled representation of the latent health state scale. As the value of the latent health state increases, the probability of a higher score on each ADL increases. The relative concentration of the curves reflects the relatively high discriminative ability of an ADL. On the contrary, the relative spread of the curves reflects the relatively low discriminative ability of an ADL

Following calibration of the latent health state scale on the French EQ-5D-3L social value set, the ADL-related utility had a mean (std) of 0.44 (0.40) and a median (IQR) of 0.47 (0.18–0.76). ADL-related utility estimates were completed on a daily basis with use of linear interpolation between all assessments of the patient in post-acute care. The final dataset included 1,032,301 daily estimates of ADL-related utility in post-acute care with a mean (std) of 0.44 (0.38) and a median (IQR) of 0.47 (0.18–0.74).

Step IV: HSU estimation by month of follow-up in the whole patient population

Daily estimates of ADL-related utility in post-acute care were averaged into HSU estimates by health state, patient and month of follow-up. Patients initially treated at early stage had surprisingly lower HSU estimates than patients at locally advanced stage and a selection bias in post-acute care was suspected (Fig. 4).

Fig. 4 Health state utility, by health state and month of follow-up of head and neck cancer patients in post-acute care

Considering all patients alive at the beginning of the month in a given health state, two-step selection models were carried out by health state and month of follow-up (parameter estimates are provided at first and last month of follow-up for the 5 health states in Additional file 1: Tables S5–S14). Overall, HSU estimates significantly increased for each health state and month of follow-up after controlling for selection in post-acute care (Fig. 5). A selection bias was primarily found in patients initially treated at early stage (p < 0.05 for 4 out of 6 months of follow-up) or locally advanced stage (p < 0.05 for 6 out of 6 months of follow-up) (Additional file 1: Table S15), although HSU estimates remained lower in patients initially treated at early stage as compared to locally advanced stage. Patients initially treated with distant metastasis had the worst HSU estimates at all months of follow-up. Patients in a relapse-free state had the best HSU estimates after 8 months of follow-up, with an increasing trend from 8 to 12 months of follow-up (max HSU of 0.61 at 12 months of follow-up).

Fig. 5 Health state utility, by health state and month of follow-up of all head and neck cancer patients

HSU summary statistics were computed over the whole follow-up period (Table 2). As compared to the health state “distant metastasis at initial treatment” (mean HSU = 0.45), other health states were associated with a better mean HSU, although numerical differences between them were small, with means around 0.54. This was primarily explained by the negative effects on HSU of an older age in the health state “early stage at initial treatment” (38.4% of patients were aged ≥70 years) and of comorbidities (> 50%) in other health states.

Table 2 Summary statistics of health state utility (HSU) in head and neck cancer

Discussion

Although many Health Technology Assessment bodies (such as the French HAS [13]) have deemed QALYs the principal measure of effectiveness, still only a limited number of studies report QALYs based on actual assessments of preference-based, generic HRQoL among a representative sample of patients. The assessment of new immunotherapy for relapsed/metastatic head and neck cancer provides a pressing example [5,6,7] since few patient surveys were conducted and none provided HSU estimates by cancer stage due to small sample sizes [8].

In this study, we explored a route other than patient surveys to estimate consistent HSU at the country level. On the one hand, all incident patients diagnosed with head and neck cancer in France were identified from the French National Hospital Discharge database. Five health states could be reliably defined over time for the whole patient population and, as expected, we found that relapsed/metastatic patients had poor prognosis. On the other hand, ADLs rather than the recommended EQ-5D-3L instrument are recorded, and we had to develop a multi-step process to transform ADL records in post-acute care into consistent HSU estimates representative of the whole patient population.

One of the main study results is that head and neck cancer was generally associated with poor HSU estimates in a real-life setting since mean HSU ranged from 0.45 for “distant metastasis at initial treatment” to around 0.54 for other health states (early or locally advanced stage at initial treatment; relapse state and otherwise relapse-free state in the follow-up) with “minimally important differences” (< 0.06) between health states [33]. In comparison, EQ-5D-3L utility estimates were much higher in most (8/9) surveys conducted in relapse-free patients (median (IQR) sample size of 79 (28–112) patients), with a median (IQR) utility of 0.80 (0.78–0.84) for patients aged 63 years on average [8]. EQ-5D-3L utility estimates were also higher in one longitudinal study of 81 patients diagnosed at early/locally advanced stage and aged ≥65 years (median (IQR) utility of 0.66 (0.55–0.76) at diagnosis and 0.64 (0.00–0.74) at 12 months of follow-up) [34]. EQ-5D-3L utility estimates were also higher in patients selected in clinical trials, with a mean (std) utility of 0.79 (0.18) in 715 patients initially treated at locally advanced stage [11] and 0.68 (0.28) in 120 relapsed/metastatic patients [12]. While attention has been drawn to the expected variability of EQ-5D-3L utility estimates with community preferences of the country [4, 8], our study results suggest that the lack of representativeness of patient surveys should be of primary concern since the usual recruitment of younger patients with fewer comorbidities may lead to overly optimistic HSU estimates.

Another main study result is that HSU estimates significantly improved over time in patients in a relapse-free state (from 8 to 12 months of follow-up), in agreement with HRQoL improvements found over longer periods of time in cancer survivors [35, 36]. In comparison, the time to assessment of EQ-5D-3L varied dramatically within and between surveys conducted in relapse-free patients (i.e., from months to years after diagnosis) [8]. On the one hand, a longer time to assessment in cross-sectional patient surveys may also explain our lower HSU estimate since our follow-up was limited to 1 year and accounted for utility at each month of follow-up in the relapse-free state. On the other hand, our study results suggest that time to assessment should be better accounted for or even standardized to achieve comparable HSU estimates between patient surveys. In contrast, we found that HSU estimates did not improve over time in health states other than the relapse-free state. Similarly, no significant changes in EQ-5D-3L utility were found over time in older patients diagnosed at early/locally advanced stage [34], trial patients initially treated at locally advanced stage [11], or trial patients treated at relapsed/metastatic stage [12]. Altogether, this suggests that EQ-5D-3L social value sets exhibit a poor responsiveness to change during treatment in head and neck cancer [37].

The strengths of this nationwide study also define its limitations. Indeed, this study is a secondary analysis of the French National Hospital Discharge database and therefore all measurements relied on administrative records with possible misclassification. Regarding health state definition, TNM cancer staging is not recorded in the standardized discharge summary and we constructed a composite variable to identify three cancer stages at initial treatment. Overall, 37,508 (70.4%) of 53,258 patients were identified at a late stage at initial treatment (Fig. 1), in agreement with previous reports of cancer registries [19]. However, we could no longer estimate HSU related to the treatment modalities since this information was already used to construct the composite variable of cancer stage.

Regarding utility estimation, ADL scores contribute, together with discharge diagnoses, rehabilitation procedures, and age, to the hospital billing system in post-acute care. Accordingly, the completion rate of ADLs was extremely high (> 99%), although a recording bias towards more severe scores is possible and could lead to lower HSU estimates. In the absence of mapping studies of ADLs into EQ-5D-3L social value sets [38,39,40], a latent health state scale was estimated from all records of ADLs using Item Response Theory and then calibrated on the worst (−0.53) and best (1.00) anchors of the French EQ-5D-3L social value set [25]. Such an approach was supported by the conceptual overlap between ADLs and the EQ-5D-3L instrument regarding dimensions and their ordinal scoring, as well as by the unidimensionality of the latent health state scale underlying ADLs. However, the calibration implies a perfect correlation of the latent health state scale with the French EQ-5D-3L social value set, and the distribution of ADL-related utility should be cross-validated with a mapping study conducted in post-acute care. In the following steps, we made full use of the repeated assessments of ADLs by patient (linear interpolation of ADL-related utility on a daily basis and then average by month of follow-up), which resulted in smoothed and generally unimodal distributions of utility in the 48 subpopulations. In particular, we found limited evidence of a ceiling effect in post-acute care (utility of 1.00 for 8.6% of 40,812 patients selected in all 48 subpopulations; at maximum, 13.8% of 290 relapse-free patients at 12 months of follow-up) [37].

Conclusions

HSU estimates in head and neck cancer were primarily driven by age at diagnosis, comorbidities, and time to assessment of cancer survivors. This feasibility study highlights the potential of estimating HSU within and across severe conditions in a systematic way at the national level. While the multi-step process to estimate HSU was developed with use of the French National Hospital Discharge database, it may generalize to other Hospital Discharge databases including a systematic assessment of ADLs for billing purposes.