Key Points for Decision Makers

This study provides the first value set (i.e. set of utility weights) for the EORTC QLU-C10D, a new preference-based multi-attribute utility instrument derived from the widely used cancer-specific quality-of-life questionnaire, EORTC QLQ-C30.

Cost-utility analysis (CUA) represents a major part of the reimbursement process in many countries. The availability of the EORTC QLU-C10D will facilitate CUA for cancer interventions, as it can be applied to data collected with the EORTC QLQ-C30, prospectively and retrospectively.

Sizeable utility decrements associated with cancer-sensitive dimensions, notably nausea, bowel problems and appetite, may make the QLU-C10D more sensitive than generic measures in CUA. Future research is required to assess this in datasets containing both the QLQ-C30 and a generic utility instrument.

1 Introduction

Economic evaluation is central to the evaluation of new therapies and technologies in many countries. Cost-utility analysis (CUA) is a form of economic evaluation that quantifies health outcomes on a standardised metric, typically the quality-adjusted life-year (QALY). The quality adjustment is provided by a ‘value set’; that is, a set of utility weights for a range of possible health states within any given health state classification system. Value sets can be derived for classification systems originally developed as multi-attribute utility instruments (MAUI), or by adaptation of existing health-related quality of life (HRQoL) profile measures [1, 2]. Such a measure, the EORTC QLQ-C30, is a widely used core questionnaire in the modular HRQoL suite of the European Organisation for Research and Treatment of Cancer (EORTC) [3].

The Multi-Attribute Utility in Cancer (MAUCa) Consortium aims to facilitate the use of HRQoL data in CUA in cancer settings by providing a series of country-specific value sets for the QLQ-C30. To this end, we have developed a health state classification system containing 13 of the QLQ-C30’s 30 items, combined into ten dimensions (Table 1) [4] and a valuation method based on a discrete choice experiment (DCE) [5]. These are key components of the QLU-C10D, a new cancer-specific MAUI. The aim of this current paper is to apply the valuation method in an Australian general population sample to produce the first country-specific utility weights for the QLU-C10D.

Table 1 The QLU-C10D health state classification system, how it maps to the 13 component items from the QLQ-C30, and the duration attribute included in the discrete choice experiment (DCE) valuation survey

2 Methods

2.1 The QLU-C10D

Table 1 shows the QLU-C10D health state classification system, and explains how the ten dimensions, each with four levels, map to 13 of the 30 items in the QLQ-C30. The derivation of this health state classification system is described elsewhere [4].

2.2 The Valuation Task: Discrete Choice Experiment (DCE) Presentation

The valuation task was based on methods developed for the Australian valuations for the EQ-5D(3L) and SF-6D instruments [6, 7]. The task involved choosing between two QLU-C10D health states, each with a specified duration (life years), described as ‘Situation A’ and ‘Situation B’ (Fig. 1). Because the QLU-C10D includes more dimensions than the EQ-5D(3L) or SF-6D, we first established the feasibility of the task, and pilot tested the DCE task wording, layout and presentation formats [5]. Choice sets were presented in a format preferred by participants in the QLU-C10D valuation methods experiment [5]; that is, dimensions that differed between situations A and B were highlighted in yellow. For the Physical Functioning dimension, the descriptors for levels 2 and 3 are quite complex (Table 1). To facilitate respondent understanding, we presented the two component items, ‘long walk’ and ‘short walk’, as two separate attributes in the survey (Fig. 1). Note that the Physical Functioning dimension was treated as one four-level dimension in the DCE design (online resource 1, see electronic supplementary material [ESM]) and data analysis.

Fig. 1 An example choice set from the discrete choice experiment valuation task

2.3 Health States Valued: DCE Design

The QLU-C10D health state classification system has over a million possible health states (\(4^{10}\) = 1,048,576). We employed a designed experiment to select 960 choice sets that would maximise statistical efficiency in estimating the utility model parameters. Health states were operationalised as 12 attributes in the DCE: one for duration, two to represent physical functioning (long and short walk), and one for each of the remaining nine QLU-C10D dimensions. Because 12 attributes is a relatively large number for respondents to consider simultaneously, we simplified the cognitive task by constraining the number of HRQoL dimensions that differed between health states in any given choice set to four, as done in the QLU-C10D valuation methods experiment [5], using the same experimental design. Briefly, we began with a balanced incomplete block design (BIBD) to define which four of the ten QLU-C10D dimensions differed within choice sets [8]. This BIBD was then duplicated. To determine the levels of these differing dimensions, a generator-based approach was employed, designed to allow estimation of main effects and all two-factor interactions involving duration [9]. The levels of the six dimensions that were constant between options were then developed using an orthogonal main effects plan. This follows the approach outlined by Demirkale et al. [10]. The final design comprised 1920 health states in 960 choice sets (online resource 1, see ESM).
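The counts quoted in this paragraph can be checked with simple arithmetic. The following is a minimal Python sketch; the variable names are illustrative and are not part of the study materials.

```python
# Size of the QLU-C10D state space: ten dimensions with four levels each.
N_DIMENSIONS = 10
N_LEVELS = 4
n_health_states = N_LEVELS ** N_DIMENSIONS

# Design reported in the text: 960 choice sets, two health states per set.
N_CHOICE_SETS = 960
n_states_in_design = N_CHOICE_SETS * 2

# DCE attributes: duration, two walking attributes (long and short walk),
# and one attribute for each of the remaining nine dimensions.
n_attributes = 1 + 2 + (N_DIMENSIONS - 1)

print(n_health_states, n_states_in_design, n_attributes)
```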

There were two levels of randomisation in the DCE component of the survey: (i) each respondent was randomly allocated 16 of the 960 choice sets without replacement; (ii) which option was seen as Situation A or B was randomised within each choice set to mitigate any ordering bias. The dimensions were always presented in the same order, as previous work showed that dimension order does not systematically bias utility weights for the QLU-C10D [11].
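The two levels of randomisation can be sketched as follows. This is an illustration only, not the SurveyEngine implementation: the function name, the seed, and the assumption that each option has an equal chance of appearing as Situation A are hypothetical.

```python
import random

N_CHOICE_SETS = 960
SETS_PER_RESPONDENT = 16


def allocate_choice_sets(rng):
    """Return 16 (choice_set_id, swapped) pairs for one respondent."""
    # (i) Draw 16 distinct choice sets, i.e. sampling without replacement.
    chosen = rng.sample(range(1, N_CHOICE_SETS + 1), SETS_PER_RESPONDENT)
    # (ii) Independently decide, per choice set, whether the two options
    # swap places (assumed here to happen with probability 0.5).
    return [(set_id, rng.random() < 0.5) for set_id in chosen]


# The seed is arbitrary and only makes the illustration reproducible.
allocation = allocate_choice_sets(random.Random(2013))
```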

2.4 Survey Content

All survey content was developed by the MAUCa Consortium. In addition to the DCE, the survey contained other components (Fig. 2). The self-reported health questions included the general health question of the SF-36 [12] and the Kessler-10 (mental health) questionnaire [13]. Sociodemographic questions were worded such that they could be mapped directly to normative data to enable assessment of our sample’s representativeness of the Australian general population (Table 2).

Fig. 2 Respondent flow and sample size for each component of the survey. DCE discrete choice experiment

Table 2 Self-reported health and sociodemographic characteristics of the sample compared with those of the Australian general population

2.5 Survey Implementation and Sample Recruitment

The content was implemented as an online survey by SurveyEngine [14], a company that specialises in choice experiments. SurveyEngine and its panel providers comply with the International Code on Market, Opinion and Social Research and Data Analytics [15]. SurveyEngine managed recruitment (via an Australian online panel provided by Toluna), administration of the survey and data collection. The target population was the Australian adult general population (aged ≥ 18 years). Participants were panel members aged 18 years or older who opted in to the survey. There were no exclusion criteria. Quota sampling by age and sex was used to achieve population representativeness on those variables.

2.6 Statistical Analysis

2.6.1 Sample Representativeness

Chi-square tests were used to assess our sample’s representativeness of the Australian population for age and sex (population data available from the Australian Bureau of Statistics as at March 2013 [16]); self-reported general health, Aboriginal and Torres Strait Islander (ATSI) status, highest level of education, and country of birth (population data available from the Household, Income and Labour Dynamics in Australia Surveys [HILDA], Wave 10 [17]); and self-reported mental health (Kessler-10 Australian norms from the 2007 Australian National Health Survey [18]).

2.6.2 Utility Estimation

The DCE data were analysed in the statistical software package STATA-13 [19] using a functional form used previously to estimate utilities from DCE data consistent with standard QALY model restrictions [5, 6, 7, 20, 21]. The QALY model requires that all health states have zero utility at death (i.e. ‘the zero condition’) [22, 23]. A functional form that satisfied this requirement included the QLU-C10D dimension levels interacted with the duration variable (‘TIME’) (Eqs. 1, 2). Thus, as TIME tended to zero, the systematic component of the utility function tended to zero. Another requirement of the QALY model is constant proportional time trade-off; therefore, the relationship between utility and TIME (life years) was constrained to be linear.

A useful feature of this functional form was that the impact of moving away from level 1 (no problems) in each dimension was characterised through the two-factor interaction term with duration (note that the experimental design allowed for all these interactions). This enabled a utility algorithm in which the effect of each level of each dimension could be included as a decrement away from full health (which had a value of 1).

We analysed the data in two ways, reflecting different approaches to modelling heterogeneity (Eqs. 1, 2). The primary analysis was underpinned by Eq. 1, in which the utility of option j in choice set s for survey respondent i was assumed to be

$$\begin{aligned} U_{isj} &= \alpha \,{\text{TIME}}_{isj} + \beta X_{isj}^{'} \,{\text{TIME}}_{isj} + \varepsilon_{isj} \\ i &= 1, \ldots , I{\text{ respondents}};\; j = {\text{situations }}A, B;\; s = 1, \ldots , 960{\text{ choice sets}} \end{aligned}$$

where α was the utility associated with a life year, \(X_{isj}^{'}\) was a vector of dummy variables representing the levels of the QLU-C10D health state presented in option j, and β was the corresponding vector of utility weights associated with each level in each dimension within \(X_{isj}^{'}\), for each life year. The error term \(\varepsilon_{isj}\) was assumed to have a Gumbel distribution.
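With Gumbel-distributed errors, the probability of choosing one option over the other depends only on the difference in systematic utility and takes the logistic form. The sketch below illustrates Eq. 1 under that assumption; the coefficient values and the single `pain` dimension are hypothetical placeholders, not estimates from this study.

```python
import math


def systematic_utility(alpha, beta, levels, time):
    """Systematic part of Eq. 1: (alpha + sum of level coefficients) * TIME.

    `beta` maps dimension -> {level: coefficient}; level 1 is the reference
    category and contributes zero.
    """
    level_effect = sum(beta[dim].get(level, 0.0) for dim, level in levels.items())
    return (alpha + level_effect) * time


def prob_choose_a(v_a, v_b):
    """Logit probability that Situation A is chosen over Situation B."""
    return 1.0 / (1.0 + math.exp(-(v_a - v_b)))


# Hypothetical coefficients for illustration only.
ALPHA = 0.3
BETA = {"pain": {2: -0.02, 3: -0.05, 4: -0.10}}

v_a = systematic_utility(ALPHA, BETA, {"pain": 1}, time=10)
v_b = systematic_utility(ALPHA, BETA, {"pain": 1}, time=5)
p_a = prob_choose_a(v_a, v_b)
```

Because every term is multiplied by TIME, the systematic utility is zero when TIME is zero, which is the zero condition described above.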

In the primary analysis, DCE responses were estimated as a conditional logit model. To adjust the standard errors to allow for intra-individual correlation (as each respondent was asked to consider 16 DCE choice sets), we used a clustered sandwich estimator implemented by STATA’s vce(cluster) option. To estimate utility decrements for each movement away from level 1 (no problems) in each of the ten QLU-C10D dimensions, we divided each of the β terms by α. To estimate confidence intervals around these ratios, we used STATA’s wtp command [23], using the delta method.
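The ratio and its delta-method standard error can be written out explicitly. This is a generic first-order delta-method sketch in Python, not a reproduction of the STATA routine used in the study; the function names and all numeric inputs are hypothetical.

```python
import math


def utility_ratio(beta, alpha):
    """Level coefficient rescaled by the coefficient on a life year."""
    return beta / alpha


def delta_method_se(beta, alpha, var_beta, var_alpha, cov_ab):
    """First-order delta-method standard error of beta / alpha."""
    # Gradient of beta/alpha with respect to (beta, alpha).
    g_beta = 1.0 / alpha
    g_alpha = -beta / alpha ** 2
    variance = (
        g_beta ** 2 * var_beta
        + g_alpha ** 2 * var_alpha
        + 2.0 * g_beta * g_alpha * cov_ab
    )
    return math.sqrt(variance)


def ci95(estimate, se):
    """Normal-approximation 95% confidence interval."""
    return (estimate - 1.96 * se, estimate + 1.96 * se)


# Hypothetical estimates, variances and covariance for illustration only.
ratio = utility_ratio(beta=-0.05, alpha=0.25)
se = delta_method_se(beta=-0.05, alpha=0.25, var_beta=0.0001,
                     var_alpha=0.0004, cov_ab=0.00001)
lower, upper = ci95(ratio, se)
```

The variances and covariance would come from the clustered variance-covariance matrix of the fitted model.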

Model 1 included every move away from the best level (level 1, no problems) in each dimension within \(X_{isj}^{'}\). Thus, \(X_{isj}^{'}\) contained 30 terms (i.e. 10 dimensions × [4−1] levels within each). If non-monotonicity was observed among levels within a dimension in Model 1 estimates, the non-monotonic levels were combined in Model 2. This restriction has been standardly imposed in previous studies [24,25,26,27,28,29].
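The monotonicity check described here can be sketched as a comparison of adjacent levels within a dimension. The function below only flags violations; in the study the affected levels were then constrained to share one coefficient and the model was re-estimated. The function name and coefficient values are hypothetical.

```python
def non_monotonic_pairs(coefs):
    """Flag adjacent level pairs where the worse level is valued higher.

    `coefs` holds the coefficients for levels 2, 3 and 4 of one dimension;
    level 1 (no problems) is the reference category with coefficient 0.
    Returns a list of (better_level, worse_level) pairs that violate
    monotonicity. Equal coefficients are not treated as violations.
    """
    by_level = [0.0] + list(coefs)  # index k holds level k + 1
    return [
        (k + 1, k + 2)
        for k in range(len(by_level) - 1)
        if by_level[k + 1] > by_level[k]
    ]


# Hypothetical coefficients in which level 4 is valued slightly above level 3.
violations = non_monotonic_pairs([-0.02, -0.05, -0.047])
```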

In a secondary approach, we employed a mixed logit [30]. In Model 3, it was assumed that coefficients were drawn from a distribution, allowing for preference heterogeneity among individuals.

$$U_{isj} = \left( {\alpha +\gamma_{i} } \right){\text{TIME}}_{isj} + (\beta + \eta_{i} )X_{isj}^{'} {\text{TIME}}_{isj} + \varepsilon_{isj}$$

Thus, α and the vector of βs now represent population mean preferences, while \(\gamma_{i}\) and \(\eta_{i}\) are individual deviations around those mean preferences. These deviations were assumed to be distributed multivariate normal (0, Σ). We used the mixlogit STATA command [31] to estimate α, the vector of βs and the standard deviations of γ and the vector of ηs, with one adjustment. The standard command limits the number of parameters drawn from a distribution to 20. To allow all 31 coefficients (including duration) to be drawn from a distribution, we used pseudo-random draws (personal communication, Arne Risa Hole, Department of Economics, University of Sheffield, 15 June 2015).

To compare models in terms of model fit, Akaike information criterion (AIC) and Bayesian information criterion (BIC) estimates are presented.
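Both criteria follow standard definitions based on the maximised log-likelihood, the number of estimated parameters and the number of observations; lower values indicate better fit after penalising complexity. A minimal Python sketch, in which the log-likelihoods and observation count are hypothetical and only the parameter counts follow the models described above:

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical log-likelihoods and sample size, for illustration only.
# Model 1 has 31 parameters (30 level terms plus duration); Model 2 pools
# two pairs of levels and so has 29.
model_1 = {"ll": -1000.0, "k": 31}
model_2 = {"ll": -1000.5, "k": 29}
N_OBS = 5000

aic_1, aic_2 = aic(model_1["ll"], model_1["k"]), aic(model_2["ll"], model_2["k"])
bic_1, bic_2 = bic(model_1["ll"], model_1["k"], N_OBS), bic(model_2["ll"], model_2["k"], N_OBS)
```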

3 Results

3.1 Sample Characteristics and Representativeness

Figure 2 shows the recruitment flow, response rates and sample sizes for each component of the survey, and Table 2 shows the sample characteristics relative to population norms. While the sample differed statistically from the general population in all measured characteristics except age and sex, differences were generally small (≤ 3% in any one category). Notable exceptions were education (university education over-represented by 18%) and mental health (Kessler-10 sample mean was in the ‘medium risk’ range of anxiety or depressive disorder while the general population mean was in ‘low or no risk’ range).

3.2 Utility Estimates

Additional years of life were preferred, and all movements away from ‘no problems’ in each dimension were valued negatively (Model 1, Table 3), except level 2 of Social Functioning (value zero). Moving to worse levels in each dimension was associated with an absolutely larger coefficient, with only two exceptions. The worst two levels of the Sleep and Appetite dimensions were not monotonically ordered, but both violations of monotonicity were small (0.003 and 0.007, respectively).

Table 3 Conditional logit: Model 1 (unconstrained) and Model 2 (monotonicity imposed)

Model 2 constrained levels 3 and 4 to share a single coefficient within each of the Sleep and Appetite dimensions, with very little loss of model fit (Table 3). The utility decrements for each level of each dimension from Model 2 are reported in Table 4 with corresponding 95% confidence intervals, and graphed in Fig. 3. The largest utility decrements were associated with physical, role, social and emotional functioning and pain. Sizeable decrements were associated with nausea, bowel problems and appetite, while smaller decrements were associated with problems with sleep and fatigue.

Table 4 Utility decrements used in the QLU-C10D utility algorithm

Fig. 3 Australian Utility Algorithm (derived from Model 2 conditional logit, monotonicity imposed). PF physical functioning, RF role functioning, SF social functioning, EF emotional functioning, PA pain, FA fatigue, TS sleep, AP appetite, NA nausea, BO bowel problems

In the mixed logit results (Model 3, online resource 2, see ESM), the means of the distributions for each of the coefficients were generally monotonic, with four exceptions. Three dimensions had small positive estimates for level 2, but none were statistically different from zero: Social Functioning (p = 0.56), Emotional Functioning (p = 0.93) and Fatigue (p = 0.81). For Appetite, levels 3 and 4 were non-monotonic (as in Model 1).

Figure 4 compares the utility decrements from Models 1 and 3, showing a strong relationship between corresponding estimates from the conditional logit and mixed logit models. The coefficients from Model 1 were absolutely larger than those from Model 3, meaning the spread of the resultant utility algorithm was slightly larger.

Fig. 4 Scatter plot of utility decrements generated by conditional logit and mixed logit. Dotted line represents line of best fit, solid line represents line of equality

3.3 QLU-C10D Utility Calculation

The utility decrements from Model 2 (Table 4) provide the weights, \(w_{dl}\), for calculating QLU-C10D scores from QLQ-C30 responses (Eq. 3). Note that first, QLQ-C30 items must be converted to QLU-C10D levels, as shown in Table 1. A utility score of 1 is assigned to patients whose QLQ-C30 scores indicate they are at level 1 of all ten dimensions of the QLU-C10D. For all other health states, the utility score is 1 minus each utility decrement (\(w_{dl}\)) for each level down from no problems in each of the ten QLU-C10D dimensions. Thus, the QLU-C10D utility score for patient p, determined by their QLU-C10D level l for each dimension d, is

$${\text{QLU}}{-}{\text{C10D}}_{p} = 1 - \mathop \sum \limits_{d = 1}^{10} w_{dl} | {\text{QLU}}{-}{\text{C10D}}_{dlp}$$

For example, a health state with quite a lot of problems with Role Functioning, a little problem with Emotional Functioning, and a little Nausea, but no problems in any other dimensions, would be valued at 1 − the decrements for Role Functioning level 3, Emotional Functioning level 2 and Nausea level 2 = 1 − (0.09 + 0.02 + 0.047) = 0.843. By convention, this health state would be described as 1312111121. The best possible health state (1111111111) has a value of 1, and the worst possible state (4444444444) has a value of −0.096.
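The worked example can be reproduced with a short scoring function. This Python sketch is illustrative and is not the STATA or SPSS syntax supplied in the ESM. The dimension order is an assumption taken from the order of the Fig. 3 legend, which matches the ten-digit code in the example, and only the three decrements quoted in the example are filled in; the remaining weights must be taken from Table 4.

```python
# Assumed dimension order (as listed in the Fig. 3 legend).
DIMENSIONS = ("PF", "RF", "SF", "EF", "PA", "FA", "TS", "AP", "NA", "BO")

# Only the decrements quoted in the worked example are included here.
# All other (dimension, level) weights must be added from Table 4.
DECREMENTS = {("RF", 3): 0.09, ("EF", 2): 0.02, ("NA", 2): 0.047}


def qlu_c10d_utility(state, decrements=DECREMENTS):
    """Score a ten-digit health state code, e.g. '1312111121' (Eq. 3)."""
    if len(state) != len(DIMENSIONS) or any(ch not in "1234" for ch in state):
        raise ValueError("state must be ten digits, each between 1 and 4")
    total = 0.0
    for dim, ch in zip(DIMENSIONS, state):
        level = int(ch)
        if level > 1:
            # Raises KeyError if the weight has not been filled in.
            total += decrements[(dim, level)]
    return 1.0 - total


example = qlu_c10d_utility("1312111121")
full_health = qlu_c10d_utility("1111111111")
```

With these three weights, `example` evaluates to approximately 0.843, matching the calculation in the text.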

Appendix 3 in online resource 3 (see ESM) provides detailed instructions on calculating utility weights for all the QLU-C10D health states, and provides STATA and SPSS syntax code to implement this.

When asked about the difficulty of this survey compared with other surveys they had done, 28% of respondents reported the DCE questions to be ‘about the same’ level of difficulty and 39% felt it was ‘harder’. Most (76%) felt the presentation of the health states was clear or very clear. While 39% felt it was difficult or very difficult to choose between pairs of health states, 33% felt it was easy or very easy. Detailed participant feedback on the DCE task will be published separately.

4 Discussion

This paper reports the first value set for the QLU-C10D, a MAUI derived from the EORTC QLQ-C30. This approach has two important advantages. First, it allows direct quantification of utility for use in economic evaluation from responses to the QLQ-C30, a widely used cancer-specific HRQoL questionnaire. Second, it captures dimensions related to cancer symptoms that are not included in generic instruments (particularly appetite, nausea and bowel problems). The main drivers of utility were the generic dimensions, with the largest utility decrements for physical, role, social and emotional functioning and pain, mirroring generic MAUIs. However, sizeable decrements were associated with cancer-sensitive dimensions, particularly nausea, bowel problems and appetite. Utility decrements for problems with sleep and fatigue were smaller, perhaps because minor problems with sleep and fatigue are relatively common and therefore considered less important by survey respondents. It is possible that the size of relative utility weights would differ for cancer patients with experience of extreme levels of fatigue and sleep disturbance. The MAUCa Consortium is exploring this question in a related study currently underway, sampling Austrian patients and the general population. Even though the utility decrements for the cancer-sensitive dimensions were smaller than those for the more generic dimensions, their inclusion provides a more relevant measure of utility for cancer interventions.

Physical functioning had larger utility decrements than other dimensions, for each level. In the QLU-C10D, physical functioning is represented by walking, making it somewhat comparable to the EQ-5D Mobility and HUI3 Ambulation dimensions. In the HUI3 multi-attribute utility function, Ambulation does not have the largest utility decrements for any level [32]. Results from EQ-5D valuation studies are mixed. For example, in Australian valuations of the EQ-5D(3L) using DCE, Mobility had the largest utility decrement of all dimensions at level 3 but not level 2 [33], while in Australian EQ-5D(3L) valuations using time trade-off (TTO), Anxiety/Depression and Pain/Discomfort had the largest utility decrements at level 3, and at level 2 Mobility had the second lowest utility decrement [34]. Why might ability to walk have the largest impact on utility in the current study? First, physical functioning appeared as the first dimension in the choice set, as in the QLQ-C30 parent questionnaire. However, we have previously investigated and dismissed order effects after randomised testing in DCEs for both QLU-C10D [11] and EQ-5D-5L [35]. Second, due to the complexity of the level descriptors, physical functioning appeared in the DCE as two attributes, even though it was a single four-level dimension in the DCE experimental design and analysis. A useful comparator here is the EORTC-8D, where Physical Functioning was presented as a single five-level dimension (representing the same QLQ-C30 long/short walk items as in the QLU-C10D) [27]. It had the third largest utility decrement for the worst level (exceeded by Social and Emotional Functioning) and the second largest for level 2 (exceeded by Pain). Although country confounds this comparison, the two-attribute presentation cannot be ruled out as a driver of the effect; this will be resolved soon, as UK valuations of the QLU-C10D are underway. Finally, the QLU-C10D Physical Functioning dimension covers a large range of mobility with four levels reflecting the combined range of the two QLQ-C30 walking items (see Table 1), while other dimensions are based on the range in only one item. It may therefore be appropriate that the utility decrements are correspondingly large.

In similar studies, initial models have contained some inconsistent orderings of utility decrements within dimensions, particularly for dimensions with small utility impacts [24,25,26,27,28,29]. Consistent with previous studies, we imposed constraints to remove non-monotonicities. This did not reduce model fit markedly, and avoids perverse results in QALY calculations.

Anchors at one (full health) and zero (death) are imposed by the QALY model, but there is no natural anchor for the pits state (worst possible health state). The QLU-C10D pits state value is −0.096. This is considerably lower than the 0.29 pits state value for the QLU-C10D’s precursor, EORTC-8D [27], which has eight of the ten QLU-C10D dimensions (four with exactly the same items and levels as the QLU-C10D, four that differ slightly), but lacks Sleep and Appetite. In our study, the worst levels of Sleep and Appetite had a combined utility decrement of 0.09, so the difference in content explains some of the difference in pits state values. Since the EORTC-8D was valued with TTO in the United Kingdom (UK) general population, valuation method and country likely explain much of the remainder, as both instruments share a simple additive utility function. The values of the pits state in the original UK and Australian EQ-5D(3L) TTO studies were −0.594 [36] and −0.217 [34], respectively, and in the Australian EQ-5D(3L) DCE study it was −0.516 [33]. Variations in the value of health states, including the pits state, are driven by several factors [37], including country-specific cultural differences in attitudes to trading between mortality and morbidity, different health state classification system content, valuation method and utility functional form. Arguably, a lower pits state value means a greater range in a value set, which may lead to greater differences between interventions in CUA. A related issue is sensitivity to mild impairments. Values for health states with level 2 across all dimensions were 0.464 for the QLU-C10D (Australian DCE) and 0.715 for the EORTC-8D (UK TTO). Assessing the sensitivity of the QLU-C10D to differences in mild and extreme QOL impacts, and comparing this with other candidate MAUIs, are important issues for future research.

The DCE method has emerged as an alternative to TTO and standard gamble (SG) methods for valuation of health outcomes in the past decade, and has now been used in a number of studies [6, 20, 21, 33, 38, 39]. The discrete choice method is attractive for several reasons: it is embedded in a strong theoretical measurement framework; it utilises well established statistically robust experimental design and modelling methods; it is based on a relatively simple judgmental task; it is feasible with online recruitment and data collection. The use of DCEs to value health states is maturing, but still presents some challenges. While the judgmental task is simple relative to TTO and SG, thus allowing survey respondents to consider a larger number of attributes, the 12 attributes in the current study is a relatively large number. This study confirms the QLU-C10D valuation methods experiment in finding this is feasible for respondents [5]. We reduced cognitive challenge firstly by allowing only four dimensions to differ in each choice set and secondly by presenting choice sets in the format preferred by participants in the methods experiment, using yellow highlighting to identify differences between situations A and B [5]. Allowing only some dimensions to differ across choice sets has the additional advantage of requiring respondents who employ heuristics such as considering a single attribute to trade off between other attributes. We designed an experiment with 960 choice sets that would maximise statistical efficiency in estimating utility parameters. This meant the survey included some health states that might seem rather unlikely to respondents, such as severe vomiting yet no problems with social function. However, we note that in the patient-reported QLQ-C30 data used to derive the QLU-C10D health state classification system, at least one patient reported each pairwise combination of levels [4].

We used two modelling approaches, conditional logit and mixed logit, which yielded similar mean utility decrements. We have chosen conditional logit (Model 2) as the basis for calculating utilities for CUA for the following reasons: (i) for economic evaluation, we are generally most interested in the mean response, so preference heterogeneity is a secondary concern; (ii) to our knowledge, there remains uncertainty about the appropriate distributional assumptions for the mixed logit.

This study has several strengths and some limitations. It provides a preference-based measure for calculating utilities for the QLQ-C30, which is theoretically and empirically stronger than using mappings of the EORTC QLQ-C30 to other preference-based utility measures [40]. The development of the health state classification system was psychometrically thorough [4]. The valuation survey sample was large, with quota sampling achieving population representativeness for age and sex. The extent to which non-representativeness on the other measured sociodemographic variables is a limitation is as yet unknown, and will be explored in future research by pooling valuation data across the MAUCa Consortium. We established the feasibility of our DCE method [5], and have noted its strengths and limitations above. We used modelling approaches appropriate to our data structure and analysis purpose. Our choice of a monotonic main-effects model for calculating utility is readily accessible for a range of end users, clinically interpretable and consistent with the EORTC quality-of-life conceptual model.

The appropriateness of disease-specific utility weights for CUA is debated by health economists [41]. Conventionally, generic MAUIs such as the EQ-5D are used, primarily to enable comparability across health conditions and interventions. However, the capacity of generic instruments to capture clinically relevant differences in cancer is also debated [42,43,44]. Arguably, the QLU-C10D should provide a more cancer-sensitive measure of utility than provided by generic MAUIs, although this is yet to be tested empirically. Further, data on generic utility measures may not always be available. The QLU-C10D enables utility values to be retrospectively generated from the wealth of existing QLQ-C30 data, thus facilitating economic evaluation from existing studies. It is anticipated that the QLU-C10D will have good psychometric properties, and future research will examine this, as well as assessing its performance relative to generic MAUIs.

The QLU-C10D has been developed by the MAUCa Consortium in collaboration with the EORTC QOL Group. A key strength of the Consortium’s approach is the use of identical valuation methods across countries, creating a unique opportunity to explore predictors of health outcome values, including country, age, sex, education and health status of valuation survey respondents; this will be done in future analyses.

The QLU-C10D is endorsed by the EORTC QOL Group, and supersedes the EORTC-8D. Notably, the development of the health state classification system of the EORTC-8D was based on data from 655 multiple myeloma patients, while that of the QLU-C10D was informed by a much larger (n = 2616) and more diverse sample: 13 countries, 15 primary cancer types, localised/regional (n = 1037) and recurrent/metastatic stages (n = 1579) [4]. The EORTC QOL Group now has stewardship of the QLU-C10D, being responsible for all aspects of its management, developing and maintaining information regarding administration, scoring and interpretation, and housing relevant materials on the EORTC QOL website. This will make the QLU-C10D widely available for use prospectively and retrospectively, and thereby facilitate the incorporation of quality of life into healthcare decision-making for cancer care.

5 Conclusions

CUA represents a major part of the reimbursement process in many countries. In Australia, the government guidelines for preparing submissions to the Pharmaceutical Benefits Advisory Committee (PBAC) favour direct estimation of utilities over mapping, do not mandate a particular MAUI but prefer Australian-based preference weights and encourage the use of patient-reported outcomes/MAUIs that capture all important disease- or condition-specific factors [45; pages 37, 77]. Based on the experience of RV and RN serving on PBAC and its subcommittees, submissions for cancer interventions frequently present QLQ-C30 data. Therefore, the value set presented here will aid Australian resource allocation decisions. Further, the methods presented in this paper provide a template for further international valuations of the QLU-C10D. A number of these are underway, using exactly the same DCE design, presentation format and analysis, including Austria, Canada, France, Germany, Poland, the UK and the US, enabling assessment of international comparability of preferences for cancer-specific health states.

Data Availability Statement

The dataset generated during the current study will not be publicly available until all planned analyses are complete (see Sect. 4). For updates, please contact the EORTC Quality of Life Group Health Technology Committee.