The cost of cancer therapies is constantly rising. Understanding the financial implications of these therapies on the healthcare system is important, but it is also important to understand the impact these therapies have on patients in order to assess their value. As such, there is a need for sensitive instruments that capture the different dimensions of health-related quality of life (HRQL) to reflect both the therapeutic benefit and the treatment burden experienced by patients with cancer. Ideally, responses to these instruments would provide a utility to inform cost-utility analysis (CUA), the economic evaluation approach preferred by many health technology assessment organizations [1,2,3]. Conventionally, due to their ease of administration, “off the shelf” generic multi-attribute utility instruments (MAUIs) (e.g., the EQ-5D [4, 5] or the Health Utilities Index (HUI) [6, 7]) are used to generate utilities, which are then suitable for estimating quality-adjusted life years (QALYs). Utilities are measured on a scale where 1 represents full health, 0 represents states deemed to be as bad as being dead, and negative values represent states deemed to be worse than being dead. However, by their nature, the dimensions of generic instruments cover ubiquitous aspects of health and may be limited in their ability to capture specific impacts of particular conditions and treatments.

The Functional Assessment of Cancer Therapies-General (FACT-G) [8], a widely used cancer-specific patient-reported outcome (PRO) measure, is used often to generate HRQL endpoints in cancer clinical trials. It is a stand-alone HRQL profile measure, and is also included in the many other condition and treatment-specific questionnaires in the FACIT measurement suite [9]. The FACT-G enables patients to self-report the impact of their cancer and treatments, but the scores it yields cannot be used in CUA because they do not provide any information about the strength of individuals’ preferences for specific HRQL dimensions of the instrument or about the trade-off between HRQL and survival. Responses on the FACT-G serve as a description rather than a valuation of health states; as such, their outputs are not health state utilities.

One approach that enables FACT-G responses to be mapped onto a utility scale is to regress utility scores from generic measures on the FACT-G total score or its dimension subscores using datasets containing both measures. However, a review of cancer mapping studies indicated that these regressions generally show poor goodness-of-fit between the observed generic utilities and the mapped cancer-specific utilities [10]. Another approach is to derive a MAUI to generate utilities from data collected from the FACT-G. Disadvantages of this approach include the inability to compare results across programs designed for different conditions and the potential neglect of co-morbidities; however, a cancer-specific MAUI ensures that the resulting utilities better reflect the impact of cancer when incorporated into a CUA. Further, having a cancer-specific MAUI based on the FACT-G would make better use of studies where the FACT-G has generated HRQL endpoints but generic MAUIs have not been included.

The Multi-Attribute Utility in Cancer (MAUCa) Consortium has derived a MAUI to generate utilities from data collected from the QLQ-C30 to produce the European Organization of Research and Treatment in Cancer (EORTC) QLU-C10D [11]. Since its development, a number of country-specific utility weights for the QLU-C10D have followed, including Canada [12,13,14,15,16,17,18,19]. Recently, the MAUCa Consortium derived the health state classification system from items of the FACT-G and the Australian value set for the new FACT-8D has now been produced [20]. However, there is a need to inform cancer priority setting and resource allocation decisions in other jurisdictions; as such, country-specific utility weights for the FACT-8D are needed. The aim of this current paper is to produce the FACT-8D value set for the Canadian general population as the Canadian Agency for Drugs and Technologies in Health (CADTH) recommends that the preferences of the Canadian general population should be the reference case [1].


FACT-8D health state classification system

The derivation of the FACT-8D health state classification system is described in detail elsewhere [20]. In brief, using existing FACT-G datasets: (1) confirmatory factor analysis verified the measurement model; (2) Rasch and psychometric analyses guided item selection; and (3) patient opinions informed item selection. The FACT-8D consists of eight dimensions, which map directly to nine items of the FACT-G (Table 1). The five levels of the FACT-G also described the FACT-8D dimensions: not at all, a little bit, somewhat, quite a bit, and very much. The name, FACT-8D, has been endorsed by [21]: ‘FACT’ indicates the origin of the instrument; and ‘8D’ indicates its eight dimensions.

Table 1 The FACT-8D health state classification system and mapping to FACT-G items

The FACT-G consists of a collection of positively and negatively phrased items. For example, the Social/Family and Functional wellbeing dimensions consist of positively phrased items such as “I feel close to my friends” and “I am able to work (include work at home)”, respectively, whereas the Physical and Emotional wellbeing dimensions consist of negatively phrased items such as “I have a lack of energy” and “I feel sad”, respectively. Initial work revealed inconsistency in the responses to the interim valuation task [20]: a large number of non-monotonic dimension levels and positive utilities. Members of the MAUCa Consortium speculated that the directional change in the phrasing of the FACT-8D dimensions contributed to this inconsistency. As a result, all FACT-8D dimensions that were positively phrased were revised, such that all dimensions were phrased in the negative direction. For example, “I am able to work (include work at home)”, “I am sleeping well”, and “I get support from my family and/or friends” became “problems doing work (include work at home)”, “problems sleeping”, and “problems with support from my family and/or friends”, respectively.

Discrete choice experiment

The FACT-8D health state classification system was valued in the Canadian setting using the discrete choice experiment (DCE) methodology developed for the Australian valuation. Details of the experimental design are reported elsewhere [20]. Briefly, the experimental design underpinning the DCE contained nine attributes: the eight FACT-8D dimensions and duration of survival. Because the descriptive system contains a large number of dimensions, the experimental design was specified such that only five attributes differed in each choice set (e.g., four HRQL dimensions and duration), and these were highlighted in yellow to reduce the cognitive complexity of the task [22]. The DCE consisted of 16 choice sets, in which respondents had to choose between two health states (Situation A or Situation B), each with a specified duration of life years (four levels: 1, 2, 5, 10 years); which option was shown as Situation A and which as Situation B was randomized within each choice set. The order of the dimensions was kept the same for each respondent, as dimension order has been shown not to systematically bias utility weights [23].
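To make the structure of a single choice set concrete, the following sketch draws one illustrative pair of health states in which exactly four HRQL dimensions and duration differ. It uses simple random draws and is not the experimental design reported in [20]; the function name `draw_choice_set` and the short dimension labels are hypothetical.

```python
import random

# Short labels for the eight FACT-8D dimensions (labels are illustrative).
DIMENSIONS = ["pain", "fatigue", "nausea", "sleep",
              "work", "support", "sadness", "worry"]
LEVELS = [1, 2, 3, 4, 5]      # 1 = "not at all" ... 5 = "very much"
DURATIONS = [1, 2, 5, 10]     # life years, as in the DCE

def draw_choice_set(rng):
    """Return (situation_a, situation_b, differing) for one illustrative pair."""
    # Situation A: a random level on every dimension plus a duration.
    a = {dim: rng.choice(LEVELS) for dim in DIMENSIONS}
    a["duration"] = rng.choice(DURATIONS)
    # Situation B starts as a copy, then five attributes are changed.
    b = dict(a)
    differing = rng.sample(DIMENSIONS, 4)
    for dim in differing:
        b[dim] = rng.choice([lvl for lvl in LEVELS if lvl != a[dim]])
    b["duration"] = rng.choice([t for t in DURATIONS if t != a["duration"]])
    differing.append("duration")
    return a, b, differing

situation_a, situation_b, differing = draw_choice_set(random.Random(0))
```

In the survey, the attributes listed in `differing` correspond to those that would be highlighted for the respondent.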

Data collection

Sampling and survey administration for all country-specific valuations in the MAUCa Consortium were undertaken by SurveyEngine, a company that specializes in choice experiments [24]. For this study, respondents over 18 years of age from the Canadian general population were recruited from an online panel. Quota sampling ensured age, sex, and province/territory of residence aligned with the Canadian Census [25].

Respondents completed the following survey components: welcome/disclosure; sex and age (for screening and quota sampling); self-reported health (SF-36 general health question [26] and FACT-G general population version [FACT-GP] [27]); the DCE; respondent perception of the difficulty and clarity of the DCE choice task and strategies used; sociodemographic variables; and self-reported mental health (Kessler-10) [28]. The study protocol was approved by the Research Ethics Board at BC Cancer and the University of British Columbia (H15-03293).

Data analysis

Study sample

We used descriptive statistics to characterize the sample in terms of sociodemographic variables, perceived difficulty, and clarity of the DCE task and choice strategies. Chi-squared tests assessed the representativeness of the study sample in comparison to the Canadian general population [25].

Utility estimation

For variables that were non-representative by ≥ 2% for one or more response levels, weights were derived using iterative proportional fitting, or raking, to improve the agreement between the survey sample and the population (ipfweight command in Stata). The sample weights were applied in the utility estimation.
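The idea behind raking can be sketched in a few lines: each respondent's weight is rescaled in turn so that the weighted share of every category matches its population share, and the cycle is repeated until the adjustments become negligible. The sketch below is a generic illustration, not the Stata routine used in the study; the toy sample, the category labels, and the target shares are all hypothetical.

```python
def weighted_shares(records, weights, var):
    """Weighted share of each category of `var` in the sample."""
    total = sum(weights)
    shares = {}
    for record, weight in zip(records, weights):
        shares[record[var]] = shares.get(record[var], 0.0) + weight / total
    return shares

def rake(records, targets, max_iter=200, tol=1e-10):
    """Iterative proportional fitting on the margins given in `targets`."""
    weights = [1.0] * len(records)
    for _ in range(max_iter):
        largest_adjustment = 0.0
        for var, target_shares in targets.items():
            current = weighted_shares(records, weights, var)
            for i, record in enumerate(records):
                # Scale each weight by (population share / sample share).
                factor = target_shares[record[var]] / current[record[var]]
                weights[i] *= factor
                largest_adjustment = max(largest_adjustment, abs(factor - 1.0))
        if largest_adjustment < tol:
            break
    return weights

# Toy sample of 10 respondents (hypothetical data).
sample = (
    [{"language": "english", "education": "college_or_higher"}] * 6
    + [{"language": "english", "education": "below_college"}] * 2
    + [{"language": "other", "education": "college_or_higher"}] * 1
    + [{"language": "other", "education": "below_college"}] * 1
)
# Hypothetical population shares, not Census figures.
targets = {
    "language": {"english": 0.7, "other": 0.3},
    "education": {"college_or_higher": 0.5, "below_college": 0.5},
}
raked = rake(sample, targets)
```

After raking, `weighted_shares(sample, raked, "language")` and the corresponding call for `"education"` should both be close to the target shares.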

As with the Canadian QLU-C10D valuation study [12], the DCE responses were analyzed using a functional form in which the FACT-8D dimension levels interacted with the duration variable [29]:

$$U_{isj} = \alpha TIME_{isj} + \beta X_{isj}^{{\prime }} TIME_{isj} + \varepsilon_{isj} ,$$

where \(\alpha\) was the utility associated with a life year in full health, \(X_{isj}^{{\prime }}\) was a set of dummy variables relating to the levels of the FACT-8D health state presented in option j, and \(\varepsilon_{isj}\) was a random error term distributed independently and identically normal. This approach has previously been used to estimate utilities from DCE data to ensure consistency with standard QALY model restrictions: (1) all health states have zero utility at dead state; and (2) the proportion of remaining life years that an individual is willing to give up for an improvement in health status does not depend on the absolute number of remaining life years (i.e., constant proportional time trade-off).
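One way to see how this specification relates to the QALY scale is to factor out duration and divide by the coefficient on a life year in full health. The rescaling by \(\alpha\) shown here is the usual step for this family of models and is offered as an assumption about how the coefficients translate into decrements, not as a description of the analysis code:

$$U_{isj} = TIME_{isj} \left( \alpha + \beta X_{isj}^{\prime} \right) + \varepsilon_{isj} , \qquad \frac{\alpha + \beta X_{isj}^{\prime}}{\alpha} = 1 + \frac{\beta X_{isj}^{\prime}}{\alpha} .$$

Under this reading, a state at level 1 on every dimension (all dummy variables equal to zero) has a value of 1 per life year, and each dimension level with coefficient \(\beta_{dl}\) contributes a decrement of \(-\beta_{dl}/\alpha\). Setting \(TIME_{isj} = 0\) gives a systematic utility of zero for every health state, which corresponds to restriction (1).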

The modelling approach followed that used in the valuation studies conducted in Australia and the United States [14]. The DCE responses were analyzed using a conditional logit model (Model 1). A clustered sandwich estimator using the vce(cluster) option in Stata adjusted the standard errors to allow for intra-individual correlation, as each respondent considered the 16 DCE choice sets. The impact of moving away from level 1 of each dimension was investigated through two-factor interaction terms using the continuous duration term and each dimension level (e.g., the effect of moving from level 1 to level 2 in the pain dimension was determined by using a pain level 2*duration interaction term).
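As an illustration of how a model of this kind turns a pair of health states into a choice probability, the sketch below computes the systematic utility of each option from duration and level-by-duration interaction terms and then applies the binary logit formula. All coefficient values are hypothetical and chosen only for readability; they are not estimates from Table 4, and the function names are illustrative.

```python
import math

def systematic_utility(alpha, betas, state, duration):
    """alpha*TIME plus the sum of (beta_dl * TIME) over non-baseline levels."""
    # Levels without an entry in `betas` are treated as zero in this toy example.
    interaction = sum(
        betas.get((dim, level), 0.0)
        for dim, level in state.items()
        if level > 1
    )
    return (alpha + interaction) * duration

def prob_choose_a(alpha, betas, state_a, dur_a, state_b, dur_b):
    """Binary logit probability that Situation A is chosen over Situation B."""
    v_a = systematic_utility(alpha, betas, state_a, dur_a)
    v_b = systematic_utility(alpha, betas, state_b, dur_b)
    return 1.0 / (1.0 + math.exp(v_b - v_a))

# Hypothetical coefficients: value of a life year and two pain-level interactions.
alpha = 0.3
betas = {("pain", 3): -0.05, ("pain", 5): -0.2}

no_pain = {"pain": 1, "nausea": 1}
severe_pain = {"pain": 5, "nausea": 1}

# Five years without pain versus ten years with the most severe pain level.
p = prob_choose_a(alpha, betas, no_pain, 5, severe_pain, 10)
```

With these made-up numbers the shorter, pain-free option is chosen more often than not, which is the kind of trade-off between HRQL and survival that the interaction terms are meant to capture.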

After conducting the unweighted utility estimation (Model 1), derived sampling weights were then incorporated into the model (Model 2). A scatterplot compared the distribution of unweighted versus weighted coefficient estimates. If non-monotonic ordering was present between dimension levels in the conditional logit model, we conducted another conditional logit model after collapsing non-monotonic dimension levels (Model 3). A log-likelihood ratio test assessed the model fit between Model 2 and Model 3.


Sample characteristics and representativeness

A total of 3794 individuals consented to participate in the study. Of these, 1672 were ineligible, either because they used devices with small screens (e.g., cellphones, tablets) (n = 573) or because the quotas for their characteristics had already been filled (n = 1099). Of the remaining 2122 eligible individuals, n = 540 were removed because they did not complete at least one choice task. As a result, 1582 were included in the analysis set: n = 1501 completed all 16 choice tasks and n = 81 completed at least one, but not all, of the choice tasks.

The sample differed statistically from the general population in all measured characteristics except for age, sex, and province/territory of residence (Table 2). Compared to the general population, the sample consisted of statistically more participants whose primary language is English, completed college education or higher, and reported poorer health based on the General Health Question; these variables were subjected to raking.

Table 2 Characteristics of the study population

Respondents’ perceptions of the DCE valuation task are presented in Table 3. In general, the respondents found the presentation of the health states to be clear but found it challenging to choose between the pairs of health states on each screen. A range of choice strategies was observed amongst the respondents, with the majority considering most or all of the aspects presented to them.

Table 3 Respondents’ perceptions of the valuation task and their choice strategies

Utility estimates

The conditional logit models revealed that respondents preferred additional life years (Table 4). Movements away from “no problems” (level 1) in each of the HRQL dimensions were generally valued negatively; the exceptions were the positive coefficients for level 2 of the pain dimension and levels 2 and 3 of the problems sleeping dimension. Incremental moves to the next worse dimension level were generally associated with a coefficient of larger absolute value; however, there were a few exceptions. Inconsistencies were observed between levels 2 and 3, such that level 3 was preferred to the less severe level 2, for the following dimensions: problems sleeping, problems with support from family and/or friends, and worry that my health condition will get worse. Problems sleeping revealed an additional inconsistency for the two worst levels. Coefficients for level 2 of the pain and fatigue dimensions were negligible and were therefore combined with the level indicating no problems (level 1).

Table 4 Conditional logit results: Model 1 (unconstrained), Model 2 (raked and unconstrained), and Model 3 (monotonicity imposed)

The raked weights were applied to estimate Model 2. The trends of utility decrements across dimension levels were generally similar between the raked and unweighted models; however, the magnitudes of the decrements were larger for the raked model. More inconsistencies were observed for the sleep and sadness dimensions in the raked model than in the unweighted model.

Non-monotonic dimension levels in Model 2 were constrained (Model 3). The log-likelihood ratio test indicated that the unconstrained model did not provide a better fit than the constrained model (χ2 = 8.1, p = 0.3), further supporting our strategy to constrain for monotonicity within dimensions. As per the Australian FACT-8D valuation study [20], the estimates from the parsimonious Model 3 (constrained) defined the Canadian value set for the FACT-8D (Table 4). Model 3 revealed that the pain, nausea, and problems with working dimensions had the greatest effect on the individual’s utility function; these were followed by sadness, fatigue, and worry that the health condition will get worse (Fig. 1).

Fig. 1 Canadian FACT-8D utility decrements by dimension and level (derived from Model 3: raked conditional logit, monotonicity imposed)

While the majority of respondents considered most or all of the attributes presented, attribute non-attendance may be a problem (Table 3). We therefore fitted a post hoc latent class conditional logit model using the enhanced lclogit2 and lclogitml2 commands. The four-class model appears to be the optimal model, as raising or lowering the number of classes slightly worsens the Bayesian information criterion (BIC). The estimates of the latent class model with four latent classes are available as Additional file 1.
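The class-selection logic can be illustrated with the standard BIC formula, \(k \ln(n) - 2 \ln L\), where a lower value indicates a better trade-off between fit and complexity. The log-likelihoods and parameter counts below are invented for illustration and are not results from this study; using the 1582 respondents as the sample size is likewise an assumption of this sketch.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k*ln(n) - 2*lnL (lower is better)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

# number of classes -> (log-likelihood, number of parameters); HYPOTHETICAL values
candidates = {
    3: (-15200.0, 101),
    4: (-15000.0, 135),
    5: (-14950.0, 169),
}
n_respondents = 1582  # assumed sample size for the penalty term

scores = {
    n_classes: bic(ll, k, n_respondents)
    for n_classes, (ll, k) in candidates.items()
}
best_n_classes = min(scores, key=scores.get)
```

In this made-up example the four-class model has the lowest BIC: moving to five classes improves the log-likelihood, but not by enough to offset the penalty for the extra parameters.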

FACT-8D utility calculation

As per the conditions set by the FACIT Group, the FACT-8D is not a standalone instrument. The Canadian value set is applied to the nine items of the completed FACT-G required to obtain FACT-8D utility scores (Table 5). A utility index of 1 is assigned to individuals whose FACT-G responses indicate they are at level 1 of all eight dimensions of the FACT-8D (11111111); an index of −0.65 corresponds to the worst possible state defined by the FACT-8D. For all other health states, the utility score for individual i is calculated as follows:

$$\text{FACT-8D}_{i} = 1 - \sum_{d = 1}^{8} w_{dl} \mid \text{FACT-8D}_{dli} ,$$

where w is the utility weight (decrement) for each level l of dimension d of the FACT-8D, applied at the level that individual i reports on that dimension. Stata and SPSS code for the Canadian FACT-8D value set is available as Additional file 1.
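The scoring rule can be sketched as a lookup and a sum: start from 1 and subtract the decrement attached to the reported level of each dimension. The decrements in this sketch are placeholders chosen for readability and are not the Canadian value set in Table 5; the dimension labels and function name are also hypothetical. The input is assumed to be the FACT-8D level (1 to 5) on each dimension, that is, after the FACT-G items have been mapped to dimensions as described in Table 1.

```python
DIMENSIONS = ("pain", "fatigue", "nausea", "sleep",
              "work", "support", "sadness", "worry")

# PLACEHOLDER decrements for illustration only (not the published value set).
# Level 1 ("no problems") always carries a decrement of zero.
PLACEHOLDER_DECREMENTS = {
    dim: {1: 0.0, 2: 0.02, 3: 0.05, 4: 0.10, 5: 0.15}
    for dim in DIMENSIONS
}

def fact8d_utility(levels, decrements):
    """Utility = 1 minus the sum of decrements for the reported levels."""
    missing = set(decrements) - set(levels)
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    return 1.0 - sum(decrements[dim][levels[dim]] for dim in decrements)

full_health = {dim: 1 for dim in DIMENSIONS}   # state 11111111
worst_state = {dim: 5 for dim in DIMENSIONS}   # state 55555555

u_full = fact8d_utility(full_health, PLACEHOLDER_DECREMENTS)
u_worst = fact8d_utility(worst_state, PLACEHOLDER_DECREMENTS)
```

With real decrements in place of the placeholders, the same function would return 1 for state 11111111 and the value set's worst-state index for state 55555555.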

Table 5 FACT-8D descriptive system: how the dimensions and levels map to the 9 component FACT-G items, and associated Canadian utility decrements


Building on the work previously reported in the FACT-8D health state classification system development and Australian valuation study [20], we determined the Canadian value set for the FACT-8D, a cancer-specific algorithm for calculating utilities. The Canadian FACT-8D value set provides another decision-making tool to inform CUA in Canada, especially in situations where only the FACT-G was administered to assess patients’ HRQL. By using select item responses of the FACT-G, FACT-8D utilities can be generated which, in turn, can inform CUA.

The Canadian valuation results revealed that the main contributors to the general population respondents’ utility were pain, nausea, problems with working, and problems with support from family and/or friends; these findings were similar to those reported in the Australian valuation study [20]. Also captured in the FACT-8D classification system were dimensions reflecting other symptoms and impacts of cancer and its treatment (e.g., fatigue, sleep problems, and worry about future health). With the exception of pain and nausea, the utility decrements for the dimensions considered more cancer-sensitive were generally smaller. However, we do not anticipate that this observation will reduce the impact of cancer-sensitive dimensions when the FACT-8D utilities inform CUA, as other factors will affect the prevalence of the cancer-sensitive dimensions and the difference in symptom prevalence between trial arms or other comparator groups. The extent to which the inclusion of cancer-sensitive dimensions provides a more relevant and sensitive cancer-specific utility measure will depend upon the clinical context.

The Canadian general population’s valuation of the health states defined by the FACT-8D was generally monotonic within each dimension, such that poorer HRQL levels had larger utility decrements. However, when comparing the Canadian valuation results with those of Australia, both countries’ value sets revealed some slightly inconsistent orderings of utility decrements across the levels of most dimensions. Only nausea and one other dimension demonstrated no inconsistency: fatigue for Canada and pain for Australia. Sleep was problematic in both countries’ valuations. To overcome the inconsistency for this dimension, the five levels were collapsed down to two levels to capture minor and severe sleep issues. Collapsing of levels to reflect minor and severe issues was also observed for the sadness and worry dimensions. For the remaining dimensions describing pain, problems with work, problems with support with family and friends, inconsistencies were observed in the less severe levels.

The worst possible state (i.e., PITS state) defined by the Canadian FACT-8D algorithm is −0.65, which is similar in magnitude to the Australian PITS state (−0.54). However, when compared to other MAUIs, the Canadian FACT-8D PITS state is substantially lower than that of the EQ-5D-5L (−0.15) [30] and the cancer-specific utility instrument, QLU-C10D (−0.15) [12]. While the difference between the PITS values of the FACT-8D and the EQ-5D-5L may be a result of the different valuation methods used (e.g., the FACT-8D used a DCE whereas the EQ-5D-5L used time trade-off (TTO)), comparing the PITS values for the FACT-8D and the QLU-C10D offers a more informative comparison. Assessing the common dimensions between the two cancer-specific algorithms reveals that the most severe levels of pain, fatigue, and nausea demonstrate greater disutilities for the FACT-8D. This may be a result of the extra response level in the FACT-8D, but this will need to be explored in the future.

The conditional logit was selected over the mixed logit because economic evaluation is mostly concerned with the mean response; preference heterogeneity is a secondary concern. The monotonic main-effects model chosen for calculating utility is readily accessible to a range of end users, clinically interpretable, and consistent with the FACIT quality of life conceptual model.

The FACT-8D health state classification was valued using a DCE. This approach has a strong theoretical measurement framework, well-established and statistically robust experimental design and modelling methods, and demonstrated feasibility with online recruitment and data collection [29, 31, 32]. In this study, respondents appraised choice sets containing nine attributes. The relatively large number of attributes raised concerns regarding the cognitive burden on respondents; to address this, we designed the task so that only five attributes (e.g., four HRQL dimensions and duration) differed in each choice set. Previous work revealed that respondents preferred the yellow highlighting format [22]. Concerns that respondents would employ heuristics, such as considering a single attribute rather than trading off between attributes, were alleviated because the majority of respondents reported considering most or all of the attributes presented to them.

There are limitations associated with this study. While the valuation survey sample consisted of a large number of respondents, with quota sampling achieving population representativeness for age, sex, and province/territory of residence, the study sample tended to have higher education and report poorer health. To overcome this limitation, we raked the study sample on the non-representative characteristics.

The influence of sociodemographic variables on the utility estimates will be assessed in future analysis of pooled data from international valuations of the FACT-8D. While it may seem that having general population respondents value these health states is a limitation due to their inexperience with cancer, it is important to adopt an extra-welfarist approach as a publicly funded healthcare system exists in Canada. As such, the preferences from general population respondents should maximize societal health. The health state descriptions make no mention of “cancer”, alleviating any potential stereotypes respondents may have with regard to a cancer diagnosis, although previous research has revealed that disease labels do not affect health state valuations [33]. Further, due to a logistical oversight during survey implementation, individuals on devices with small screen sizes (e.g., cellphones, tablets) were initially allowed into the survey but then encountered a system-initiated timeout that prevented them from proceeding to the DCE component; as such, the reported rate of participant involvement is not accurate because of these device-related exclusions.

We used the FACT-8D valuation methods developed by King et al. [20]. This involved reversing the direction of the phrasing for the three positively worded FACT-G dimensions (i.e., work, sleep, and support), a solution that solved initial problems in the patterns of utility decrements in the developmental work conducted in Australia [20]. While reframing the positively worded dimensions as problem statements in the DCE reduced the cognitive burden placed on the respondents when completing the choice tasks, this posed the question of how to map the corresponding positively framed FACT-G responses to utility decrements in the FACT-8D utility scoring algorithm. The members of the MAUCa Consortium propose mapping to the original FACT-G item wording by reverse scoring when calculating utility decrements, as shown in Table 1 and the scoring instructions in Additional file 1. We acknowledge that when used as self-reported health items, negatively and positively framed versions of the work, sleep, and support items would not necessarily yield mirror image results. However, this solution is pragmatic in allowing utility values to be generated from existing FACT-G datasets, with the reframing used solely for the purpose of making the DCE valuation task more feasible for participants. Better solutions may be found in future research, particularly if similar problems arise in other valuation DCEs with positively and negatively framed items. The measurement properties of the FACT-8D (using the Australian value set) have been tested against the EQ-5D-5L (scored using the UK 3L crosswalk and the 5L England value set) [34]. The FACT-8D demonstrated good convergent validity and responsiveness, but the EQ-5D-5L showed better known-groups validity.

We did not test potential interactions between pairs of FACT-8D dimensions, although we acknowledge that they may exist. The influence of potential interactions could be explored in the future both quantitatively and qualitatively. We opted for a more parsimonious approach, as testing all possible interactions would require an unfeasibly large sample size due to the many additional coefficients that would need to be estimated from the DCE data. The model presented in the article excluded complex interactions and is therefore clinically interpretable and comprehensible to end-users.

In our work, we anchored the utilities on the standard QALY scale using a common approach [29]. The resulting Canadian FACT-8D value set assumes the zero condition (i.e., a health state of zero duration is equivalent to the dead state). However, previous work has revealed that different DCE-based approaches to anchor utility scores can have varying impact on the generated utilities [35]. This may be a limitation when value sets determined by DCEs are used to guide resource allocation decisions. While it is possible to anchor utilities by including dead as a health state within the DCE, this is problematic within a random utility theory framework as some respondents may never acknowledge a health state to be less preferred than immediate death [36].

Results from the study add to the continuing debate amongst health economists regarding the use of generic versus cancer-specific utility instruments in informing resource allocation decisions. The FACT-8D contains a large number of dimensions specific to cancer. While the FACT-G dimensions are more sensitive in capturing cancer patients’ HRQL, the resulting metrics are difficult to compare with CUA results in other therapeutic areas. Future work is needed on assessing the acceptability of cancer-specific utility instruments in informing resource allocation decisions.


The largest impacts on utility included three generic dimensions (i.e., pain, support, and work) and nausea, a symptom caused by cancer (e.g., brain tumours, gastrointestinal tumours, malignant bowel obstruction) and by common treatments (e.g., chemotherapy, radiotherapy, opioid analgesics). Our findings demonstrate that cancer-specific utilities can be determined using responses to the FACT-G (as well as many FACIT measures that embed the FACT-G items); this, in turn, facilitates CUA for cancer interventions from a Canadian perspective. The widespread use of the FACT-G to measure quality of life outcomes of cancer patients will enable utilities to be estimated not only prospectively but also from a large number of retrospective studies. While the results reveal that the Canadian value set for the FACT-8D is similar to the Australian value set [20], CADTH recommends that preferences of the Canadian general population should be the reference case to guide societal decisions [1]. The Canadian FACT-8D value set affords cancer-specific utility weights that may be more sensitive to differences resulting from cancer care than a generic MAUI, and may therefore be more informative in guiding cancer priority setting and resource allocation decisions in Canada. We intend to conduct head-to-head comparisons of the FACT-8D versus generic MAUIs to assess its performance. The availability of more QALY estimates and CUAs will enable decision makers to be more informed when allocating resources in Canada’s publicly funded health care system.