Background

Fundamental to measuring Quality-Adjusted Life Years (QALYs), as used for cost-utility analysis (CUA), are value sets for health states representing health-related quality of life (HRQoL). A value set comprises HRQoL values for all possible health states representable by the health state classification system concerned – of which well-known examples include the EQ-5D [1], HUI [2] and SF-6D [3]. Most systems’ values range between unity for ‘perfect health’ and zero for ‘dead’, with negative values for states considered worse than dead.

Value sets are created by asking interview or survey participants to value a subset of the system’s health states; regression models are then used to interpolate values for the full set of states. Standard practice is for participants to be drawn from the general population, in their dual roles as tax-payers and potential patients [4–6]. Thus value sets are derived mostly from valuations of hypothetical health states that participants are not personally familiar with and are required to imagine.

A near-universal finding internationally is that patients – i.e. people with direct personal experience of illness or injury – assign different health state valuations than the general population [7, 8]. A literature review by De Wit et al. [9] and a meta-analysis by Peeters et al. [10] concluded that patients’ valuations of their own (actual) health states tend to be higher than hypothetical valuations of descriptions of the same states by the general population. In contrast, however, a meta-analysis by Dolders et al. [11], despite finding some evidence for such relationships, found no statistically significant differences overall.

Most studies in the above-mentioned literature pertain to European and North American countries. To our knowledge, no studies comparing patient and general population valuations include data for New Zealand. And yet the EQ-5D system and Devlin et al.’s value set [12], which is based on valuations from a 1999 survey of the New Zealand general population, are widely used for CUA in New Zealand, in particular by the Pharmaceutical Management Agency (PHARMAC) for health technology assessments [6]. It would be useful to know, therefore, whether the above-mentioned international finding that patient valuations tend to exceed general population valuations also applies to New Zealand.

In addition, most previous studies have sampled patients with illnesses such as cancer or stroke rather than injuries (as we do in this paper). It is possible that valuations may differ between people experiencing an illness (often accompanied by an ongoing decline in health prior to clinical diagnosis) and those experiencing an acute-onset change in health status, as caused by injury.

This paper compares Devlin et al.’s general population valuations from the 1999 survey with injured New Zealanders’ valuations of their own health (also represented on the EQ-5D). The latter valuations were collected between 2007 and 2009 in the Prospective Outcomes of Injury Study [13, 14]. We also investigate which of the EQ-5D’s dimensions are most strongly associated with the injured population valuations and compare the results with equivalent findings from the 1999 general population study.

Methods

Measurement of health state preferences

In both the injured population and general population studies (explained in turn below), health states were represented using the EQ-5D system and participants’ health state preferences were elicited using the visual analogue scale (VAS) method.

Developed by the EuroQol Group, the EQ-5D system represents HRQoL in terms of five dimensions – mobility, self-care, usual activities, pain/discomfort and anxiety/depression – where each dimension has three possible levels of experience: (1) no problems, (2) moderate problems and (3) extreme problems [1]. The three levels combine across the five dimensions to define 243 health states, each representable by a five-digit combination relating to the relevant level within each dimension listed in the order above (e.g. 11111 represents no problems on any dimension). The EQ-5D has been widely used in studies of injury and illness [15, 16].
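The state space described above can be enumerated directly. The following is a minimal sketch; the names `DIMENSIONS`, `all_states` and `describe` are illustrative and are not part of any EuroQol software.

```python
from itertools import product

# The five EQ-5D dimensions, in the order used for the five-digit state codes.
DIMENSIONS = ("mobility", "self-care", "usual activities",
              "pain/discomfort", "anxiety/depression")
LEVEL_LABELS = {1: "no problems", 2: "moderate problems", 3: "extreme problems"}


def all_states():
    """Return every EQ-5D state as a five-digit string such as '11111'."""
    return ["".join(str(level) for level in combo)
            for combo in product(LEVEL_LABELS, repeat=len(DIMENSIONS))]


def describe(state):
    """Map a five-digit state code to a label for each dimension."""
    return {dim: LEVEL_LABELS[int(digit)] for dim, digit in zip(DIMENSIONS, state)}


states = all_states()
print(len(states))                    # 243 (three levels on five dimensions)
print(states[0], states[-1])          # 11111 33333
print(describe("21111")["mobility"])  # moderate problems
```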

Participants in both studies were asked to indicate on a VAS depicted as a vertical line marked from 0 (“worst imaginable health state”) to 100 (“best imaginable health state”) where they considered the relevant states were for them. Despite lacking the theoretical underpinnings of the standard gamble or time trade-off methods, the VAS has been empirically verified as a valid method for eliciting health state valuations [17, 18].

Injured population study

Injured New Zealanders’ valuations of their own health on the EQ-5D were obtained from the Prospective Outcomes of Injury Study (POIS), a prospective cohort study of injured people aged between 18 and 64 years [13, 14]. POIS participants were recruited from entitlement claims registered between December 2007 and June 2009 with the Accident Compensation Corporation, New Zealand’s government-controlled, no-fault injury compensation insurance scheme. All injury types and a wide range of severities were eligible for recruitment to POIS, except for people whose injuries resulted from self-harm or sexual assault.

POIS participants (n = 2856) completed a first interview 3.2 months after injury, with follow-up interviews 4.6 and 12.3 months (medians) after injury. At each interview, participants were asked to represent their own health on the EQ-5D’s five dimensions and then to rate it on a VAS.

General population study

The general population valuations of EQ-5D health states come from a self-completed postal survey administered to 3000 adult New Zealanders randomly selected from the electoral roll in 1999 [12] (also see [19] and [20]). Three versions of the survey questionnaire were administered, where each version comprised 13 or 14 different health states to be valued on two VAS, with some states repeated across two or more of the versions. Overall, valuations for 23 different health states were collected. For each state, when arriving at their valuation participants were instructed to imagine that it lasts for one year and to disregard what may happen afterwards.

Although 1360 responses were received (a 50% response rate among those who received the survey), 441 were unusable because too few health states were valued (which undermines the reliability of the valuations), ‘dead’ was valued as better than full health, or all states were valued the same (considered an implausible representation of a participant’s preferences). In addition, to ensure that only data with the strongest claim to validity were analysed, and following the approach recommended by Devlin et al. [12], we excluded a further 523 responses that had two or more logical inconsistencies in their valuations. A ‘logical inconsistency’ occurs when a health state that is less severe on a particular dimension than another state, and no more severe on the other dimensions, is scored lower (e.g. state 21111 scored below 22111 would be logically inconsistent). The remaining 396 responses are the most internally consistent data available; one inconsistency was permitted in order to retain a reasonable sample size (only 189 responses had no inconsistencies).
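The logical-inconsistency rule can be illustrated with a short sketch. It assumes that inconsistencies are counted pairwise across the states a respondent valued and that ties are not counted; the exact counting procedure is not spelled out here, so `dominates` and `count_inconsistencies` are hypothetical helpers and the example respondent is invented.

```python
from itertools import combinations


def dominates(better, worse):
    """True if `better` is no more severe than `worse` on every dimension
    and strictly less severe on at least one."""
    pairs = [(int(a), int(b)) for a, b in zip(better, worse)]
    return all(a <= b for a, b in pairs) and any(a < b for a, b in pairs)


def count_inconsistencies(valuations):
    """Count pairs of states in which the logically better state
    received a strictly lower VAS score."""
    count = 0
    for s1, s2 in combinations(valuations, 2):
        if dominates(s1, s2) and valuations[s1] < valuations[s2]:
            count += 1
        elif dominates(s2, s1) and valuations[s2] < valuations[s1]:
            count += 1
    return count


# Invented respondent: 21111 is scored below 22111, which is one inconsistency.
response = {"11111": 95, "21111": 60, "22111": 70, "22222": 40}
n = count_inconsistencies(response)
print(n)       # 1
print(n <= 1)  # True: retained under the "at most one inconsistency" rule
```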

In order to calculate health utility values, Devlin et al. rescale each respondent’s valuations to lie on a scale from 0 (representing death) to 1 (representing full health). As hypothetical valuations for death and full health were not collected for POIS participants, we are unable to do a similar rescaling for these data. We therefore use the raw valuations (without rescaling) from the general population study to allow for direct comparisons between the general population and injury population.
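For illustration, a rescaling of this kind could be written as below. This sketch assumes a simple linear transformation anchored on each respondent's own valuations of 'dead' and full health; the precise formula used by Devlin et al. is not reproduced in this paper, and the function name `rescale` and the example numbers are hypothetical.

```python
def rescale(raw, dead, full):
    """Linearly rescale a raw 0-100 VAS score so that the respondent's own
    valuation of 'dead' maps to 0 and of full health maps to 1."""
    if full <= dead:
        # Such responses were treated as unusable in the general population study.
        raise ValueError("full health must be valued above 'dead'")
    return (raw - dead) / (full - dead)


# Invented respondent who scored 'dead' at 10 and full health at 90.
print(rescale(50, dead=10, full=90))  # 0.5
print(rescale(5, dead=10, full=90))   # -0.0625 (a state valued as worse than dead)
```

Because the injured population study did not collect the `dead` and `full` anchors, no such transformation can be applied to the POIS valuations.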

Analyses

Our choice of the appropriate samples from the injured population and general population studies to include in the present analysis was determined primarily by the availability of matching health states for which valuation data were collected – to allow comparison between samples – rather than matching the samples in terms of their socio-demographic characteristics. International evidence has shown that the usual socio-demographic and background variables have no, or at most weak, impact on individuals’ health state valuations [11, 21, 22].

Thus, of the 23 health states for which valuations were collected in the general population study, our protocol was to include only those states which at least two people (in practice, four) from the injured population study reported as their own health state. For each of these health states, an independent samples t-test was used to determine whether the mean valuations from the two studies are statistically significantly different. Robust standard errors were used for each of these tests to allow for the possibility of clustering at the individual level (as each individual could value the same health state at multiple time periods), and critical values were adjusted, using the Holm-Bonferroni correction, to account for the large number of comparisons performed.
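The Holm-Bonferroni step-down procedure mentioned above can be sketched as follows. This is a generic illustration rather than the code used for the analysis: the function name `holm_bonferroni` is hypothetical, the p-values are invented, and the robust standard errors are not modelled.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected.

    The p-values are visited from smallest to largest; the k-th smallest
    (k = 0, 1, ...) is compared with alpha / (m - k), and testing stops
    at the first p-value that fails its threshold."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # all remaining (larger) p-values are also not rejected
    return reject


# Invented p-values for four comparisons.
p = [0.001, 0.04, 0.03, 0.20]
print(holm_bonferroni(p, alpha=0.10))  # [True, True, True, False]
print(holm_bonferroni(p, alpha=0.05))  # [True, False, False, False]
```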

In addition, to investigate which of the five EQ-5D dimensions are most strongly associated with the injured population valuations, we estimated equations (1) and (2) below. These two equations are analogous to the equations used to derive Devlin et al.’s two value sets [12], both of which were selected after extensive testing of a wide range of linear and non-linear specifications, following the modelling approach used by Dolan [23].

$$100 - VAS_{it} = \beta_0 + \beta_1 MO_{it} + \beta_2 SC_{it} + \beta_3 UA_{it} + \beta_4 PD_{it} + \beta_5 AD_{it} + c_i + u_{it} \tag{1}$$

$$100 - VAS_{it} = \beta_0 + \beta_1 MO_{it} + \beta_2 SC_{it} + \beta_3 UA_{it} + \beta_4 PD_{it} + \beta_5 AD_{it} + \beta_6 N3_{it} + c_i + u_{it}, \tag{2}$$

where subscript i indexes individuals and t indexes time periods (interviews).

Both equations were estimated with individual fixed effects. Equation (1) is a linear main effects model in which the effects of a change from level 1 to level 2 and from level 2 to level 3 (see Table 1 below) are constrained to be identical. Equation (2) is an extension of equation (1), in which an additional detriment to health is introduced if any of the EQ-5D dimensions is at level 3 (extreme problems). The dummy variables used to represent each of these effects are described in Table 1.

Table 1 Dummy variables used to model health state valuations

As explained in Devlin et al. [12], because the EQ-5D system values health states other than full health as negative deviations from 11111 = 1, VAS_it (as explained earlier, rendered on a 0-100 VAS) can be represented as 100 minus the combination of dummy variables and their coefficients (to be estimated). This allowed the dependent variable in equations (1) and (2) to be transformed to 100 - VAS_it (so that higher values correspond to ‘worse’ health states).

Consistent with the explanations above, coefficients β1-β5 are interpretable as the decrements to a health state’s valuation of a move from level 1 to 2 or from level 2 to 3 on the respective EQ-5D dimensions, all else being equal. Similarly, coefficient β6 in equation (2) is interpretable as the extra decrement from any dimension being at level 3 (i.e. additional to the effects captured by β1-β5).
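To make this interpretation concrete, the sketch below codes a state into the regressors of equations (1) and (2) and computes a predicted VAS score. It assumes each dimension variable equals the level minus one (0, 1 or 2), which matches the constraint that each one-level step has the same effect; the actual coding is defined in Table 1. The coefficients are invented for illustration, and the individual effect c_i and error term are set to zero.

```python
DIMS = ("MO", "SC", "UA", "PD", "AD")


def code_state(state):
    """Code a five-digit EQ-5D state into the regressors of equations (1)-(2).

    Assumption: each dimension variable is (level - 1); N3 is 1 if any
    dimension is at level 3 and 0 otherwise."""
    levels = [int(digit) for digit in state]
    row = {dim: level - 1 for dim, level in zip(DIMS, levels)}
    row["N3"] = 1 if 3 in levels else 0
    return row


def predicted_vas(state, beta, include_n3=True):
    """Predicted VAS = 100 minus the fitted decrement (c_i and u_it set to 0)."""
    row = code_state(state)
    decrement = beta["const"] + sum(beta[dim] * row[dim] for dim in DIMS)
    if include_n3:
        decrement += beta["N3"] * row["N3"]
    return 100 - decrement


# Invented coefficients, NOT the estimates reported in Table 4.
beta = {"const": 5, "MO": 4, "SC": 6, "UA": 5, "PD": 3, "AD": 6, "N3": 2}

print(code_state("21311"))                               # MO=1, UA=2, N3=1, others 0
print(predicted_vas("21311", beta))                      # 79  (equation 2)
print(predicted_vas("21311", beta, include_n3=False))    # 81  (equation 1)
```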

Ethical approval

This study was approved by the New Zealand Health and Disability Multi-Region Ethics Committee (MEC/07/07/093).

Results

The socio-demographic characteristics of participants in the injured population and general population studies are reported in Table 2. The first column for the injured population study refers to the smaller sample whose valuations were included in the direct comparison of health state valuations (i.e. people who, in at least one of the three interviews, reported a health state that was also included in the general population study), whereas the second column refers to those whose valuations were included in the regression analysis (i.e. those who completed the EQ-5D and VAS at least once in the three interviews).

Table 2 Socio-demographic characteristics of participants in the two studies

Of the 2856 participants recruited to the injured population study, 2823 valued their own health on the EQ-5D at the first interview (3-months), 1457 at the second interview (5-months), and 2247 at the third interview (12-months). These three interviews resulted in a total of 6527 valuations, covering 115 of the 243 possible EQ-5D states. Of these 115 states, 18 matched the 23 states included in the general population study, of which 13 were valued by at least two participants, and so they could be included in the present study. Table 3 reports the mean valuations for these 13 states from the two studies.

Table 3 Mean health state valuations from the injured population and general population studies

As can be seen in Table 3, the mean health state valuations of the injured population are higher than the hypothetical valuations of the general population for all 13 health states considered except 11111. This difference, which tends to be larger the ‘worse’ the state, is statistically significant at the 10% level for all states except 11211, 12111, 22233 and 22323.

Table 4 presents the estimates of equations (1) and (2) for the injured population study and also, for comparison purposes, the general population study. For both equations, though the five dimensions all have statistically significant effects on participants’ valuations (of the expected positive sign), their relative magnitudes in the respective studies are very different.

Table 4 Estimates of equations (1) and (2) for the two studies

Consistent with our earlier finding (Table 3) that the injured population own health valuations (means) are mostly higher than the general population hypothetical valuations, almost all of the estimated coefficients for both equations are larger for the general population. For injured people, the most important dimension in terms of negative effects on valuations is self-care (SC), followed by anxiety/depression (AD), usual activities (UA), mobility (MO) and pain/discomfort (PD). For the general population, AD is also an important dimension (most important); however, in contrast to the injured population, SC is the least or second-least important dimension (equation 1 or 2), and PD is the second most important dimension (both equations). The additional decrement to health valuations due to extreme problems on any dimension (N3) is much larger in the general population than in the injured population.

Discussion

Our finding that injured New Zealanders’ valuations of their own health, represented using the EQ-5D system, are mostly higher than general population valuations of the same (hypothetical) states – with the notable exception of state 11111 (discussed below) – is consistent with the previous international research mentioned in the Background section.

As discussed by Ubel et al. [24], possible reasons for this phenomenon include: (1) the two groups are, in effect, valuing different health states albeit they are represented identically on the EQ-5D; (2) patients adapt to their poor health states (and so value them relatively high); (3) the ‘focussing illusion’ [25], whereby members of the general population over-emphasise aspects of HRQoL most affected by illness; and (4) a shift of reference points as people assess health states with reference to their current health. See Ubel et al. for discussion of these and other explanations.

The one exception to the finding that injured population valuations exceed general population valuations is the valuation for state 11111: it is lower for the injured population than for the general population. Consistent with reason (1) above, perhaps this is because for injured people even when, as represented on the EQ-5D, they are at ‘full health’, to them there are other aspects of their health (not captured by the EQ-5D) that are sub-optimal.

Although the differences between the injured population and general population valuations – in the range 3.8 to 27.1 (Table 3) – are statistically significant, how practically significant are they? Minimally important differences for the EQ-5D (0-1 scale) have been estimated to be in the range of 0.04 to 0.08 [26–28] – equivalent to 4.0 to 8.0 on a 0-100 scale – which suggests that the above-mentioned differences are also practically significant.

The potential implications of our results for CUA can be illustrated via the example of a CUA of laparoscopic-assisted colectomy (LAC) for colon cancer relative to open colectomy (OC) undertaken by Hayes and Hansen [29]. Applying Devlin et al.’s value set [12] – as explained earlier, derived from the general population study included in the present study – the authors reported two sets of results corresponding to data from two randomised controlled trials (RCTs): mean QALY gains of 0.018 and 0.049 per patient (depending on the RCT) and costs per QALY of $70,389 and $25,857 respectively.

These results were based on a simple model in which the authors of the CUA assumed that patients were in state 22222 when discharged from hospital and 11111 when they recovered from both LAC and OC (LAC has a quicker recovery time). State 22222 is worth 0.464 in Devlin et al.’s value set. For the demonstration here, it is sufficient to increase 22222’s value by 0.130, the difference between the injured population and general population means for 22222 (see Table 3) after normalising them to the conventional 0-1 scale (see endnote a). The effect of 22222 being worth 0.594 instead of 0.464 is to lower the QALY gains from 0.018 and 0.049 (as above) to 0.013 and 0.037 and to raise the costs per QALY from $70,389 and $25,857 to $94,661 and $34,422 respectively. These revised cost-per-QALY estimates are approximately one-third larger than the original estimates. Although cost-effectiveness acceptability thresholds for New Zealand are not reported, such relatively large increases could, conceivably, result in LAC going from being cost-effective (if general population valuations were used) to being not cost-effective (if patient valuations were used). Note, though, that the calculations above are intended only to illustrate the possible effects of differences between the injured population and general population valuations, not to serve as a bona fide CUA.
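The direction and rough size of this effect can be checked with a back-of-the-envelope sketch. It assumes that the QALY gain is proportional to 1 minus the value of state 22222 (the gain arising from time spent in 11111 rather than 22222) and that incremental costs are unchanged. That proportionality is our simplification of the model described above, so the printed figures approximate the reported ones rather than reproduce them exactly. The function name `rescale_cua` is hypothetical.

```python
def rescale_cua(qaly_gain, cost_per_qaly, v_old, v_new):
    """Rescale a QALY gain and cost per QALY when the value of the worse
    state changes from v_old to v_new.

    Assumption: the gain is proportional to (1 - value of state 22222)
    and the incremental cost stays the same."""
    factor = (1 - v_new) / (1 - v_old)
    new_gain = qaly_gain * factor
    incremental_cost = qaly_gain * cost_per_qaly  # implied cost difference
    return new_gain, incremental_cost / new_gain


# Reported results for the two RCTs: (QALY gain, cost per QALY).
for gain, cost in [(0.018, 70389), (0.049, 25857)]:
    new_gain, new_cost = rescale_cua(gain, cost, v_old=0.464, v_new=0.594)
    print(f"{new_gain:.3f} QALYs, ${new_cost:,.0f} per QALY, "
          f"{new_cost / cost - 1:.0%} increase")
```

Under this assumption the cost per QALY rises by the factor (1 - 0.464)/(1 - 0.594), or roughly one-third, in line with the increase described above.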

Strengths and weaknesses of the study

The main strength of this study is that, to the best of our knowledge, it is the first comparison of injured people’s valuations of their own health, represented using the EQ-5D system, vis-à-vis general population valuations of the same EQ-5D states for New Zealand.

This study has several potential weaknesses. As discussed in the Methods section, it was not possible to ensure that the samples employed from the general and injured populations were similar in terms of their socio-demographic characteristics; participants in the injured population study were younger, more likely to be male, employed and of Māori, Pacific Islands, Asian or ‘other’ ethnicity (not European or Pakeha). However, as also discussed earlier, international evidence suggests that this should not have a significant effect on health state valuations [11, 21, 22]. Another potential weakness is that the two sets of valuations are not from the same data-gathering exercise. Instead, they were run independently, separated in time by almost a decade (i.e. 1999 versus 2007-09). New Zealanders’ health state preferences could, potentially, have changed over this time period; however, the literature provides no indication as to whether this is likely to be a serious problem. Also, the general population study used a self-completed postal questionnaire whereas the injured population study, as part of the Prospective Outcomes of Injury Study, was mostly interviewer-administered. These differences between data collection methods have the potential to cause information bias; again, we cannot determine the nature or possible extent of this potential bias.

Conclusions

Consistent with the international literature, we find that injured people’s VAS valuations of their own health, represented using the EQ-5D system, are mostly higher than the general population’s hypothetical valuations of the same EQ-5D states for New Zealand. These differences are practically significant in the sense that they are larger than minimally important differences for the EQ-5D from the literature, and they appear capable of significantly affecting CUA results; hence they could potentially influence health technology assessments.

Our results are further evidence that it can make a difference whose health state valuations are included when value sets are being created. However, this does not mean, necessarily, that patients’ (or injured people’s) valuations should be used instead of the general population’s. That is a separate question, which ultimately depends on the decision-making context and the purpose of the analysis [8]. The consensus seems to be that for decisions at the individual-patient level, such as treatment options, the patient’s own valuations should be used. For decisions at the aggregate level, such as health technology assessments, value sets representing the general population should be used.

Endnote

a. Note that Devlin et al.’s value of 0.464 is not directly comparable with the general population value in Table 3 because Devlin et al.’s value set comes from estimating equation (2) above rather than it being a mean value, and also because, as is conventional when estimating values, the underlying survey valuation data were ‘rescaled’ relative to each participant’s hypothetical valuations of ‘dead’ and ‘perfect health’ (which were not collected in the injured population study).