5.1 Introduction

One of the most common questions that the EuroQol Group is asked by users of the EQ-5D-5L is: ‘Which value set should I use?’. There is no simple answer to this, as it depends on the user’s objectives in using the instrument, the decisions that it informs, and the context in which the information will be used. Selecting an EQ-5D-5L value set will also be affected by the availability of value sets and their acceptability to users. Which value set to use is straightforward under two conditions: (a) an EQ-5D-5L value set, based on the EQ-VT protocol described in Chap. 2, is available for the country to which the data to be analysed refer; and (b) that value set is acceptable to those who will make decisions based on it.

However, in many countries a local EQ-VT-generated EQ-5D-5L value set is not available; and even if there is one there is no guarantee that local decision-makers will accept it. In these circumstances, alternatives include using another country’s value set that was generated using the EQ-VT protocol; using a value set generated by an alternative valuation method; and mapping from the EQ-5D-5L to the EQ-5D-3L, where a local value set exists for the latter.

This chapter guides potential users through these and other issues that arise when choosing an EQ-5D-5L value set. Sections 5.2 and 5.3 present an overview of the principal considerations relevant to users, providing an easy access guide. Section 5.4 discusses some more technical and theoretical issues.

The first and most important question for any user of an EQ-5D-5L value set is: ‘What is the purpose of representing EQ-5D-5L profile data as a single number?’. Broadly, two main categories of use can be identified. The first important category is when the EQ-5D-5L is used for summarising health-related quality of life (HRQoL) to estimate Quality Adjusted Life Years (QALYs) and the changes in QALYs that result from health care use. This kind of evidence is often part of health technology assessment (HTA). Section 5.2 discusses relevant considerations about choosing a value set for QALY estimation.

The second important category of use is when value sets are employed as a way of summarising and statistically analysing EQ-5D-5L profile data without the aim of estimating QALYs. Section 5.3 summarises the considerations relevant to choosing which value set to use in these ‘non-QALY’ applications.

5.2 Which Value Set Should Be Used to Estimate QALYs? – An Overview

The use of EQ-5D-5L values to estimate QALYs imposes requirements on the characteristics of those values. This specific use of values is of such importance that these requirements are largely built into the methods for eliciting and modelling them. Unfortunately, there is no consensus about the theoretical properties that the values used to estimate QALYs should have, as reflected in ongoing debates about which valuation methods best meet those properties. However, some principles are widely adopted, and requirements that meet these, detailed in Box 5.1, underlie all of the value sets produced using the EQ-VT protocol (see Chap. 4). Value sets produced using other valuation protocols may not meet them. For example, value sets that rely exclusively on Discrete Choice Experiments (DCE) without a duration attribute or any other means of anchoring the DCE responses do not meet these requirements, largely ruling them out for QALY estimation.

Box 5.1: What Properties do EQ-5D-5L Values Need to Have to Be Suitable for Use in Estimating QALYs for Economic Evaluation?

For use in economic evaluation, QALYs must have some basic properties, for example that they can be used as an unambiguous measure of the value of every health care intervention (Morris et al. 2012). How this translates into requirements for the health state values that form the ‘Q’ element of QALYs is less clear and subject to debates over both economic and psychometric theory and practice. Possibly the only universally agreed property for these values derives from the definition of a QALY; full health maintained over one year will generate one QALY, implying that the value attached to full health should be equal to 1. Current practice underlying the value sets described in this book is therefore open to debate but does meet the basic requirements for measuring QALYs. It assumes that, at a minimum, values should be:

  • measured on a scale anchored at 1 = full health and 0 = dead. States considered worse than dead are assigned a value < 0.

  • obtained using stated preference methods from patients or a general population, rather than using external judgements by, for example, health care experts.

  • obtained by forcing respondents to make explicit choices between mutually exclusive options that describe health states.

These requirements contributed to the EuroQol Group’s decision to use time trade-off (TTO) and Discrete Choice Experiments (DCE) in the EQ-VT protocol for EQ-5D-5L valuation studies (see Chap. 2). Section 5.4.2 briefly discusses these issues further, with suggested further reading.
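The anchoring at 1 = full health and 0 = dead is what allows values to be combined with time. As a minimal sketch, assuming health is constant within each period and using invented values and durations (the function name `qalys` is ours, not part of any EuroQol tool), QALYs can be computed as the sum of value multiplied by duration:

```python
def qalys(periods):
    """Sum value x duration (in years) over consecutive health-state periods.

    `periods` is a list of (value, years) pairs. Values are on the scale
    anchored at 1 = full health and 0 = dead, and may be negative for
    states considered worse than dead.
    """
    return sum(value * years for value, years in periods)


# Full health maintained over one year generates one QALY.
full_health_year = qalys([(1.0, 1.0)])

# HYPOTHETICAL patient: 2 years in a state valued 0.5, followed by
# 1 year in a state valued -0.25 (worse than dead).
patient = qalys([(0.5, 2.0), (-0.25, 1.0)])
```

In this illustration the year spent in a state valued below 0 subtracts from the total, which is why the treatment of states worse than dead in a value set can matter for QALY estimates.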

Values are sometimes referred to as ‘utilities’, but the value sets described in this book do not claim to measure utility according to any of its conventional technical definitions (see Drummond et al. 2015, Chapter 5, Section 5.4.2). For example, they may not conform to the axioms underlying von Neumann-Morgenstern measurable utility under conditions of uncertainty based on expected utility theory (EUT). The Standard Gamble (SG) method aims to elicit such utilities but is not widely used because of concerns about the validity of EUT and the ability of respondents to judge probabilities. Other value set properties required for estimating QALYs, such as constant proportionality and additive independence, are assumed to be satisfied, as is the case with all HRQoL instruments accompanied by values.

Figure 5.1 presents a summary of the main considerations in choosing an EQ-5D-5L value set when the main aim is QALY estimation. First, users should assess whether the QALY analysis is for use in HTA or other purposes, and who will be informed by it. HTA bodies and other decision-makers using QALY evidence may have specific recommendations about their preferred value set, which in most cases would be the first choice for the base case. If not, the choice to be made depends on factors such as the local availability of value sets, the relevance of available non-local value sets and, in either case, their empirical characteristics and their theoretical properties. These issues are discussed in more detail in the following sections.

Fig. 5.1 Choosing which EQ-5D-5L value set to use in estimating QALYs

5.2.1 End Users’ Requirements and Recommendations

‘End users’ refers to whoever the analysis of EQ-5D-5L data is intended to inform. This could be national or local government bodies, HTA organisations, local health care budget holders, health care providers and insurers, health care professionals, patients or the general public. In practice, it is likely that the only end users who will specify a preferred or accepted value set are HTA organisations. Hence, when EQ-5D-5L data are analysed to generate estimates of QALYs for cost-effectiveness analysis, we recommend first checking whether the relevant HTA body or other stakeholder has published a ‘methods guide’ or other guidance stating its requirements for value set selection.

Kennedy-Martin et al. (2020) provide a summary of stated requirements of health care decision-making bodies internationally regarding the valuation of health states. For example, the National Institute for Health and Care Excellence (NICE) in the UK (NICE 2013; currently being updated), Zorginstituut in the Netherlands (Zorginstituut Nederland 2016) and Haute Autorité de Santé in France (HAS 2020) each provide HTA methods guides on how EQ-5D-5L data should be valued for submissions to them. The Pharmaceutical Benefits Advisory Committee (PBAC) in Australia, in contrast to European agencies, is less prescriptive about which HRQoL instrument to use, and which value set to employ in conjunction with it (PBAC 2016). In most cases, HTA authorities’ methods guides state that a value set based on the stated preferences of that country’s general public is recommended. There are exceptions: for example, Sweden’s Dental and Pharmaceutical Benefits Agency (TLV) indicates that the values used in submissions to them should reflect Swedish patients’ experienced values, i.e. ‘appraisals of persons in the health condition in question’ (TLV 2003, 2017), rather than stated preferences of the Swedish general public.

There may be cases in which there is no end user guidance about value sets, or the guidance provided is too broad to assist in choosing between alternative value sets. This is a particular problem when QALY estimates are derived from multi-country trial data, or are used as evidence in multiple HTA submissions, or both. The choice of value set may be made even more difficult if the end user is a global organisation making recommendations that affect multiple countries. In these instances, the choice of value set is left to the user. In Sects. 5.2.2 to 5.2.4 we describe the criteria that users should consider in such cases.

5.2.2 Relevance to the Population to Whom the Analysis Refers

To our knowledge, most HTA methods guides recommend that QALY estimates should ideally be based on values obtained locally, that is from the area over which that HTA body has jurisdiction. This ensures that resource allocation decisions reflect that country’s preferences about the relative importance of different health problems. There are more national EQ-5D value sets available than for any other generic measure of HRQoL. The availability of EQ-5D-5L value sets will continue to expand, as further countries undertake valuation studies to support the development and expansion of HTA worldwide. However, there will inevitably remain countries where no local value sets are available.

For a country that does not have an EQ-5D-5L value set but does have an EQ-5D-3L value set, mapping between the two descriptive systems provides one means of valuing the EQ-5D-5L – see Box 5.2 below. Mapping methods have also been used to estimate a link between the EQ-5D and other condition-specific measures of HRQoL, but these will not be discussed here as they do not produce a value set for the EQ-5D-5L. The use of mapping methods may meet HTA requirements; for example, current NICE guidance recommends mapping EQ-5D-5L to the EQ-5D-3L (NICE 2019) thereby allowing use of values from the York MVH ‘A1 Tariff’ EQ-5D-3L value set (MVH Group 1995).

Analysts are therefore recommended to consult relevant local HTA methods guides before choosing whether to use a mapping method, and which one to use. Box 5.2 provides further details on mapping.

Box 5.2: Mapping Between 3L and 5L to Create Value Sets

The most notable example of the application of mapping methods to create value sets for the EQ-5D-5L that may be used when no valuation studies are available is van Hout et al. (2011). In this response mapping study, data from 3691 patients in six European countries who completed both the EQ-5D-3L and EQ-5D-5L were analysed using four different statistical methods. The chosen method was the ‘indirect non-parametric method’, which assumed independence of each EQ-5D dimension and removed inconsistent responses such as choosing level 1 on the 3L and level 5 on the 5L. This generates transition probabilities: the probability that a person would have recorded a particular response to the EQ-5D-3L given the response they gave to the EQ-5D-5L. The resulting 243 × 3125 table of transition probabilities can be applied to any EQ-5D-3L value set to generate a 5L ‘crosswalk’ value set.
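The way transition probabilities turn a 3L value set into 5L crosswalk values can be sketched as a probability-weighted average over the 243 EQ-5D-3L states. The sketch below is illustrative only: the transition probabilities and the additive 3L value set are invented for the example (they are not the van Hout et al. (2011) estimates or any published value set), and the same probabilities are assumed for every dimension purely for brevity.

```python
from itertools import product

# HYPOTHETICAL P(3L level | 5L level), assumed identical for all five
# dimensions. Each row sums to 1. Not the published crosswalk estimates.
TRANSITION = {
    1: {1: 1.0, 2: 0.0, 3: 0.0},
    2: {1: 0.3, 2: 0.7, 3: 0.0},
    3: {1: 0.0, 2: 1.0, 3: 0.0},
    4: {1: 0.0, 2: 0.6, 3: 0.4},
    5: {1: 0.0, 2: 0.0, 3: 1.0},
}

# HYPOTHETICAL toy 3L value set: one decrement per level, same for
# every dimension. A real value set would differ by dimension.
DECREMENT_3L = {1: 0.0, 2: 0.1, 3: 0.3}


def value_3l(state_3l):
    """Value of a 3L state such as (1, 2, 1, 3, 1) under the toy value set."""
    return 1.0 - sum(DECREMENT_3L[level] for level in state_3l)


def crosswalk_value(state_5l):
    """Expected 3L value given a 5L state.

    Under the independence assumption, the probability of a whole 3L
    state is the product of the per-dimension transition probabilities.
    """
    total = 0.0
    for state_3l in product((1, 2, 3), repeat=5):  # all 243 3L states
        prob = 1.0
        for level_5l, level_3l in zip(state_5l, state_3l):
            prob *= TRANSITION[level_5l][level_3l]
        total += prob * value_3l(state_3l)
    return total


mild = crosswalk_value((2, 1, 1, 1, 1))
```

Each call computes one 5L state's row of the full table on demand. Looping over whole 3L states, rather than averaging decrements dimension by dimension, keeps the sketch valid for 3L value sets that include interaction terms.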

At the time when the van Hout et al. (2011) mapping was developed, EQ-5D-5L value set studies had not yet been initiated, which made it impossible to develop a bi-directional crosswalk. More recently, following users’ demand and due to the availability of EQ-5D-5L value sets, the same data used in the original van Hout et al. (2011) crosswalk were employed for mapping the EQ-5D-3L to the EQ-5D-5L, using indirect non-parametric and ordinal logistic regression methods (van Hout and Shaw 2021).

An alternative response mapping approach for deriving EQ-5D-5L or EQ-5D-3L values has been proposed by Hernández-Alava and Pudney (2017), but it currently remains less used. This mapping was re-estimated on multiple samples, with the most recent estimation being based on a large dataset of English responders (Hernández-Alava et al. 2020). Its statistical performance is similar to that of the van Hout crosswalk for mapping the EQ-5D-3L to the EQ-5D-5L (Hernández-Alava et al. 2020). The van Hout and Shaw (2021) mapping, using ordinal logistic regression including regressors coding for other EQ-5D-3L dimensions, shows slightly better performance than that of Hernández-Alava and Pudney (2017) for mapping the EQ-5D-3L to the EQ-5D-5L. It is notable that the current iteration of the Hernández-Alava and Pudney (2017) crosswalk only allows mapping to UK/English value sets, while the models developed in van Hout and Shaw (2021) are freely accessible in R, and are easily adapted to other value sets.

As there is currently no consensus about which of these approaches should be used, users are encouraged to check the latest recommendations from the scientific advisers in the EuroQol office and the relevant HTA body. The analysis tools section of the EuroQol website reports generic and country-specific algorithms for both the van Hout et al. (2011) EQ-5D-5L to EQ-5D-3L and the van Hout and Shaw (2021) EQ-5D-3L to EQ-5D-5L crosswalks, as well as syntax for the value sets for some countries.

These are available at: https://euroqol.org/support/analysis-tools/

If there are no local value sets for either the EQ-5D-5L or EQ-5D-3L, an obvious suggestion is to use a value set from a country that has a similar population, considering socio-demographic, cultural and linguistic characteristics that might be expected to influence health preferences (evidence about how such characteristics influence values is presented in Chap. 6). That is straightforward if there is only one such country, and their value set satisfies the other criteria detailed below. Where there is more than one value set which may be considered relevant and acceptable, the choice of value set should be subject to sensitivity analysis.

A special case is where a study is undertaken in more than one distinct population, as may be the case with, for example, a multi-country or multi-region clinical trial. While it has been proposed to use a single value set to represent the preferences for a region or continent when available (e.g., Greiner et al. 2003; Łaszewska et al. 2020), this solution is currently not widely applied. The possibility of developing regional value sets for EQ-5D-5L is explored in Chap. 6. If the results of the clinical study are to be used in different HTA jurisdictions, each of which makes recommendations about the use of value sets, these should be followed, which might result in more than one value set being applied to the same data.

There are advantages to having a single value set that could be used in cases where there is no local alternative, or the values are required to cover more than one locality – for example, in enabling comparison of results in such cases. The EQ-5D value sets for the UK and the USA have sometimes been used for this purpose. However, there is no scientific rationale for choosing any value set as a default option.

5.2.3 Empirical Characteristics of the Value Sets

For most analysts, it is likely that the above considerations will suffice to choose a value set. However, there may remain cases where a choice between value sets must be made. In such cases, it is helpful to examine the quality of the study that generated the value set. This includes the quality of the valuation data and modelling choices made by the study authors and how the particular properties and characteristics of the value sets compare. Analysts who do not feel able to make judgements using the criteria discussed below are encouraged to contact the EuroQol office, whose scientific officers are well placed to advise.

A checklist for assessing value sets, such as the one provided by Xie et al. (2015) (Checklist for REporting VAluaTion studiEs – CREATE), provides a structured way of approaching the assessment of study quality – see Box 5.3. However, this checklist focuses on the quality of the reporting of the studies and does not directly address considerations of the quality of collected data upon which models are based (other than where these lead to exclusions). Obvious questions to ask about the quality of the data collected in the value set study include: Was the sample size appropriate and was a reasonable response rate achieved? Is the sample representative of the general public? Is there any cause for concern about data quality, for example, were there high rates of missing or implausible valuations? Were there interviewer effects? Were the interviews conducted in a manner that was compliant with the protocol? These issues are addressed in Chap. 2 and are reported for each of the value sets summarised in Chap. 4.

Box 5.3: The CREATE Checklist (Reproduced from Xie et al. 2015)

  • Descriptive systems

    1. The attributes of the instrument are described
    2. The number of levels in each attribute of the instrument is described

  • Health states valued

    3. The approach to selecting health states to be valued directly is explained
    4. The number of health states valued per respondent is stated
    5. Method(s) of assigning the health states to respondents is stated

  • Sampling

    6. Sample size/power calculations are stated and rationalised
    7. Target population is described
    8. Sampling method is stated and rationalised
    9. Recruitment strategies are described
    10. Response rate is reported

  • Preference data collection

    11. Mode of data collection is stated
    12. Preference elicitation technique(s) are described

  • Study sample

    13. Reasons for excluding any respondents or observations are provided
    14. Characteristics of respondents included in the analysis are described

  • Modelling

    15. The dependent variable for each model is stated
    16. Independent variables for each model are explained
    17. Model specifications are provided
    18. Model estimators are described
    19. Goodness of fit statistics for each model are reported

  • Scoring algorithm

    20. Criteria for selecting the preferred model are stated
    21. The scoring algorithm is presented

With respect to the modelling methods used to produce value sets from the valuation data, quality may be judged both by the statistical methods used and also by conformity of the value set to properties that are essential or desirable for the way that they will be used. What criteria were used in selecting the specific model used to produce the value set?

In the case of the value sets reported in Chap. 4, many of these issues relating to data quality, though not subsequent modelling of the data, are dealt with by the rigorous quality control (QC) process applied to EQ-VT-generated data from wave 2 onwards (see Chap. 2). Users of the resulting value sets can therefore have greater confidence in their use. The value sets reported in Chap. 4 follow the EQ-VT protocol and study designs set out in Chaps. 2 and 3. They have also been published in peer-reviewed journals, and therefore meet the scientific standards of those journals. However, the EuroQol Group does not currently have a formal process for endorsing value sets, an issue which is discussed in Chap. 7. Furthermore, the QC processes used in the first wave of studies were not standardised and did not always satisfy the requirements of users; an example is the concerns expressed by NICE about the first EQ-5D-5L value set for England (see Hernández-Alava et al. 2020 and van Hout et al. 2020). This issue was addressed via strengthened QC in subsequent waves, as detailed in Chap. 2.

There are also ‘non-standard’ EQ-5D-5L value sets available that do not follow the EQ-VT protocol and were undertaken independently of the EuroQol Group, for example, Craig and Rand (2018) for the USA and Sullivan et al. (2020) for New Zealand. Other ‘non-EQ-VT’ value sets may be produced in future. Researchers have employed different methods, using different protocols, and analysed their data using different econometric procedures, and the resulting value sets will reflect this. The EuroQol Group encourages the use of its EQ-VT protocol in studies aiming to produce national value sets for the EQ-5D-5L, to enhance consistency and comparability. The EuroQol Group does not aim to prevent or discourage improvement or innovation in methods for valuing the EQ-5D family of instruments, indeed it actively supports methodological studies.

Users should be aware of and familiarise themselves with the characteristics of the EQ-5D-5L value sets they choose, whether generated by the EQ-VT protocol or not. Are there important differences in preferences between dimensions? Are there any interaction effects in the values that apply when there are particular combinations of health problems? These characteristics of the value sets combine with the properties of the patients’ EQ-5D-5L profile data to which they are applied with important implications for QALY estimates (Parkin et al. 2016).

In general, users should be aware of the characteristics of value sets, such as the overall range of values, how these are distributed and whether there are interaction terms, as these will all exert an influence on their use in statistical analysis (Parkin et al. 2010). For example, if the health condition under consideration involves very severe states, the way in which values for states considered ‘worse than dead’ have been calculated, rescaled or bounded in the value set will be of particular relevance. If the health states are experienced for long durations, it will be relevant to examine how this relates to the duration of states described in the valuation exercise given the possible effect of “maximum endurable time” on valuations (Sutherland et al. 1982) and the assumption of “constant proportionality” (Dolan and Stalmeier 2003). If the treatment under consideration involves marginal improvements from very good health states to full health, the way in which the constant term has been handled in modelling will affect the estimated change in QALYs.
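One way to become familiar with these characteristics is to enumerate all 3125 EQ-5D-5L states under a candidate value set and summarise the result. The following is a minimal sketch only: it assumes a purely additive model with invented decrements (no constant term, no interaction terms), so the numbers do not correspond to any published value set.

```python
from itertools import product

# HYPOTHETICAL decrements by level (index 0 = level 1) for the five
# dimensions, in the order: mobility, self-care, usual activities,
# pain/discomfort, anxiety/depression. Not a published value set.
DECREMENTS = (
    (0.0, 0.05, 0.08, 0.20, 0.30),
    (0.0, 0.04, 0.07, 0.15, 0.25),
    (0.0, 0.04, 0.06, 0.15, 0.20),
    (0.0, 0.06, 0.10, 0.25, 0.35),
    (0.0, 0.06, 0.10, 0.25, 0.30),
)


def value(profile):
    """Value of a profile such as (1, 2, 1, 3, 1) under the toy additive model."""
    return 1.0 - sum(dim[level - 1] for dim, level in zip(DECREMENTS, profile))


# Values for every one of the 5^5 = 3125 possible profiles.
all_values = [value(p) for p in product(range(1, 6), repeat=5)]

summary = {
    "n_states": len(all_values),
    "best": max(all_values),
    "worst": min(all_values),
    # States valued below 0, i.e. considered worse than dead.
    "n_worse_than_dead": sum(v < 0 for v in all_values),
    # Gap between full health and the next-best state, relevant when a
    # treatment yields marginal improvements from very good health.
    "gap_below_full_health": 1.0 - max(v for v in all_values if v < 1.0),
}
```

Comparing such summaries across candidate value sets gives a first indication of where their differences are likely to affect an analysis, for example at the severe end of the scale or close to full health.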

5.2.4 Transparency and Uncertainty

The most important decision about which value set to use is for the ‘base case’ for analysis, but it is also recommended that where possible and appropriate analysts also undertake sensitivity analysis using alternative value sets.

The choice of a base case value set should be carefully considered before undertaking analyses, as well as which sensitivity analyses are required given the decision context. For a prospective study, it is important that both the choice of base case and alternative value sets and the rationale for choosing them are clearly set out in the project protocol and statistical analysis plan, and that these are adhered to.

It may be that, considering the factors discussed in the previous sections, there is no value set which is unequivocally ‘the best’. In such cases, the analyst’s choice of base case value set should be carefully justified; it is essential that analysts are transparent about the reasons for their choice of base case value set. Usual good practice for such decisions is to choose the value set that is likely to generate the most conservative set of results for the base case. For example, if used in a trial of a new treatment over an established alternative, the principle should be to choose the value set that will generate the results least favourable to the new treatment. It would clearly be unethical and contrary to principles of good scientific practice to choose a value set on the basis that it will generate results most favourable to the analyst’s preferred outcome for the study.

In cases where there remain doubts about which value set to use, analysing and reporting the sensitivity of results and conclusions to alternative value sets will increase the value of the information generated. If results are not substantially affected by the choice of value set, this increases confidence in the findings. Where results and conclusions are contingent on which value set is used, it is very important to convey this information to those who will use this evidence in health care decisions. However, it is important that this recommendation is not interpreted as meaning that users should simply undertake their analyses using different value sets.

In these cases, the EQ-5D-5L values used in an economic appraisal are appropriately considered as part of the uncertainty around the variables that form the economic appraisal model. The analyst should treat the values in an economic appraisal as uncertain parameters and subject them to sensitivity analysis, as with other non-stochastic uncertain variables such as the discount rate. Currently this is not common practice, but it is readily done and would improve confidence in results.
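A sensitivity analysis of this kind might be organised as in the sketch below, which re-computes the difference in mean values between two trial arms under a pre-specified base case and one alternative value set. Everything in it is hypothetical: the value set names, the additive decrements (applied identically to every dimension for brevity) and the patient profiles are invented for illustration.

```python
from statistics import mean

# HYPOTHETICAL value sets: decrement by level (index 0 = level 1),
# applied identically to every dimension for brevity.
VALUE_SETS = {
    "value_set_A": [0.0, 0.05, 0.10, 0.20, 0.30],
    "value_set_B": [0.0, 0.02, 0.05, 0.25, 0.40],
}

# The base case is fixed in advance, e.g. in the statistical analysis plan.
BASE_CASE = "value_set_A"


def value(profile, decrements):
    """Value of an EQ-5D-5L profile under a toy additive value set."""
    return 1.0 - sum(decrements[level - 1] for level in profile)


def mean_difference(treated, control, decrements):
    """Difference in mean values between the two arms."""
    return (mean(value(p, decrements) for p in treated)
            - mean(value(p, decrements) for p in control))


# HYPOTHETICAL EQ-5D-5L profiles for two small trial arms.
treated = [(1, 1, 2, 2, 1), (2, 1, 1, 3, 2)]
control = [(2, 2, 3, 3, 2), (3, 2, 2, 4, 3)]

results = {name: mean_difference(treated, control, dec)
           for name, dec in VALUE_SETS.items()}

# How far each alternative moves the estimate away from the base case.
change_vs_base = {name: diff - results[BASE_CASE]
                  for name, diff in results.items() if name != BASE_CASE}
```

Reporting the change relative to the base case, rather than a list of unrelated results, keeps the analysis tied to the choice set out in advance.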

5.3 Which Value Set to Use in ‘Non-QALY’ Applications – An Overview

Cost-effectiveness analysis is an obvious application for which a single number summary of EQ-5D-5L profile data is essential, but there are other contexts in which this may be useful. Examples of these kinds of applications include:

  (a) Population health studies:

    • Describing population norms. For example, Szende et al. (2014) published EQ-5D-3L data for 24 countries.

    • Comparing population health between different regions, countries or other populations; or over time. For example, the Annual Health Survey for England (NatCen 2021) periodically includes the EQ-5D-3L, including the EQ VAS.

    • Setting a baseline for measuring the impact of a population health care intervention. For example, Lubetkin et al. (2020) use the EQ-5D-5L to examine the effect on the New York population of the 2020 lockdown during the COVID-19 pandemic.

    • Measuring the impact of events that affect population health. For example, Andrade et al. (2021) estimated the impact on the local population’s health of a technological disaster in a region of Brazil using the EQ-5D-3L.

    • Measuring inequalities in population health (Franks et al. 2006; Lubetkin et al. 2005).

  (b) Patient condition studies:

    • Describing the severity of illness amongst patients. For example, van Wilder et al. (2019) published EQ-5D-3L values for many chronic conditions, disaggregated by patient characteristics.

    • Waiting list management. For example, Derrett et al. (2003) applied EQ-5D-3L valuations to patients’ EQ-5D-3L profiles as a means of creating a ranking of patients on elective surgery waiting lists in terms of the severity of their condition and their suggested priority for treatment.

    • Summarising the performance of hospitals in achieving improved health outcomes for patients as a result of surgery. For example, the National Health Service (NHS) in England publishes hospital-specific data from its Patient Reported Outcome Measures (PROMS) programme using EQ-5D values from the UK population as a whole, rather than from patients who use the hospital, reflecting the fact that the NHS is a national service (Appleby et al. 2015).

Many of the considerations for choosing which value set to use in QALY estimation are also relevant in the context of ‘non-QALY’ applications, in particular the applicability of the value set to the population to whom the analysis refers (Sect. 5.2.2) and the value sets’ empirical characteristics (Sect. 5.2.3).

A further essential consideration in this context is that the values used should be appropriate to the proposed application and context. As values are not neutral, they should reflect the views of those populations and groups that count in judging importance given the decision context in which they are applied.

Figure 5.2 provides an overview of the considerations concerning whether a value set is appropriate to use in applications where the principal aim is not to estimate QALYs, and which value set should be chosen in such applications.

Fig. 5.2 Overview of considerations for using EQ-5D-5L value sets in ‘non-QALY’ applications

As indicated at the start of this chapter, the first and most important question for any user of any value set is: ‘What is the purpose of representing EQ-5D-5L profile data as a single number?’. Value sets are often used to provide a convenient means of summarising EQ-5D data as a ‘single number’ for the purposes of statistical analysis (Devlin et al. 2020).

There are important advantages in being able to summarise and represent an EQ-5D-5L profile by a single number – for example, it simplifies statistical analysis. However, it is important to note that there is no “neutral” set of values that can be used for this purpose. Any value set for the EQ-5D-5L explicitly or implicitly compares each level of each dimension with every other and attaches relative importance to them. No set of values is “objective”: they all embody judgements about both what is meant by importance and the appropriate source of information for assessing it. It is therefore not possible to offer generalised guidance about which value set to use if the sole purpose is summarising profiles for descriptive or inferential statistical analysis. However, users should be aware that using a value set can introduce an exogenous source of variance that may bias statistical inference. For example, using one value set rather than another may make a difference to conclusions about whether there are statistically significant differences between EQ-5D-5L responses between arms of a clinical trial, two groups of patients, or two regions (Parkin et al. 2010; Wilke et al. 2010). Of course, where the purpose of analysis is to reflect a society’s view about the relative importance of different kinds of health problems, this may be considered a desirable feature.

Users should consider the wider purpose for which the summary will be used. If there is no one purpose, rather just a desire to provide information, then it may not be necessary to apply a value set to the data, but rather to report the EQ-5D-5L profiles themselves in some detail. This may also be preferable because EQ-5D values provide less detailed information than a profile. A range of methods for analysing and reporting profile data are provided in Devlin et al. (2020).

Further, in some cases where a single number is required to represent health, it may be more appropriate to focus on the EQ VAS data provided directly by the relevant patients or populations themselves, rather than using profile-based values. Whether the EQ VAS or value set-weighted profiles are most relevant will depend on the nature of the analysis, its purpose, and whether it is patients’ or society’s perspective that is most important.

An alternative to applying EQ-5D-5L value sets of the kind reported in this book, or to focussing analysis just on EQ-5D-5L profiles or EQ VAS data provided by patients, is to apply a different means of aggregating profile data. One approach which has been explored is to develop a scoring algorithm based on predicted EQ VAS. Using a sample of patients’ or population data, the responses to the EQ-5D profile are used to predict the EQ VAS via regression analysis (Hardman et al. 2002; Whynes and The TOMBOLA Group 2008; Feng et al. 2014; Burstrom et al. 2014; Gutacker et al. 2020). These provide, for any given EQ-5D profile, the average EQ VAS on a 0–100 scale (representing worst to best health imaginable). As such a scale is not anchored at dead = 0, it is not suitable for estimating QALYs – but does represent an average view of how good or bad health states are. Where the relationship between the profile and EQ VAS is based on patient data, such value sets are also claimed to represent patients’ experience. This use of VAS data is examined further in Sect. 5.4.1.
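The idea of a predicted-EQ VAS scoring algorithm can be illustrated with a small least-squares regression. The sketch below is deliberately simplified and is not the specification used in any of the cited studies: it assumes one linear coefficient per dimension (level minus 1) rather than a separate term for each level, fits the model on synthetic, noise-free data generated from invented coefficients, and uses a hand-written solver so that it needs no external libraries.

```python
from itertools import product


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def design_row(profile):
    """Intercept plus (level - 1) per dimension: a simplifying assumption."""
    return [1.0] + [float(level - 1) for level in profile]


# SYNTHETIC data: EQ VAS generated without noise from invented coefficients.
TRUE_BETA = [90.0, -3.0, -2.0, -4.0, -5.0, -6.0]
profiles = list(product((1, 3, 5), repeat=5))
vas = [sum(b * x for b, x in zip(TRUE_BETA, design_row(p))) for p in profiles]

beta = fit_ols([design_row(p) for p in profiles], vas)


def predicted_vas(profile):
    """Predicted EQ VAS (0-100 scale) for any EQ-5D-5L profile."""
    return sum(b * x for b, x in zip(beta, design_row(profile)))
```

Because the synthetic responses contain no noise, the fitted coefficients reproduce the invented ones; with real respondent data they would be estimates with sampling error, and the predictions would remain on the 0–100 EQ VAS scale rather than a scale anchored at dead = 0.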

In contrast to the application of EQ-5D-5L data in QALY estimation, where the requirements of economic evaluation provide a broad theoretical foundation to guide the choice of value sets (see Box 5.1), the analysis of EQ-5D-5L data in other applications may lack an obvious theoretical foundation to guide how data are appropriately analysed or reported. For users concerned with choosing value sets with particular theoretical properties, Sect. 5.4.2 provides a brief discussion of the issues. Where the end user of analysis is known, and where the kinds of decisions that the analysis will inform are clear, the choice of approach should be guided by any requirements of the end user or, where none are provided, by considering what is most relevant to the decisions at stake.

Note that in many of these ‘non-QALY’ applications of EQ-5D data, analysis of EQ-5D-5L profiles, EQ VAS and EQ-5D values may all be relevant to decision makers, as each provides different and complementary information. Where this is the case, the use of value sets to summarise EQ-5D-5L profile data should be accompanied by analyses of EQ-5D-5L profile and EQ VAS data. An example of this is the use of the EQ-5D-3L in studies of the general population in different countries, including those designed to generate population norms. The key EuroQol Group publication on this (Szende et al. 2014) includes values based on value sets, but also reports comparative EQ VAS and dimension and level data for 24 countries using the EQ-5D-3L.

Finally, where there is a clear rationale for using value sets to weight EQ-5D-5L data for statistical analysis (for example, where society’s rather than patients’ preferences are considered paramount), the advice provided in Sect. 5.2 will be equally relevant. For example, the basis for choosing which value set is used should be clearly stated, ideally in advance of analysis, and sensitivity analysis undertaken to determine whether the characteristics of that value set exert an important effect on results and conclusions.
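A sensitivity analysis of the kind recommended above might be sketched as follows. The sketch assumes that each value set can be expressed as additive decrements from full health (1.0) by dimension and level, which is a simplification. Both value sets and both groups of profiles are entirely hypothetical and invented for illustration; they do not correspond to any published value set or data set.

```python
DIMENSIONS = ["MO", "SC", "UA", "PD", "AD"]

# HYPOTHETICAL decrements for levels 1-5 of each dimension (not published values).
HYPOTHETICAL_VALUE_SETS = {
    "set_A": {d: [0.0, 0.05, 0.10, 0.20, 0.30] for d in DIMENSIONS},
    "set_B": {
        "MO": [0.0, 0.03, 0.06, 0.15, 0.25],
        "SC": [0.0, 0.03, 0.06, 0.15, 0.20],
        "UA": [0.0, 0.03, 0.06, 0.12, 0.18],
        "PD": [0.0, 0.06, 0.12, 0.30, 0.40],
        "AD": [0.0, 0.06, 0.12, 0.28, 0.38],
    },
}

def value(profile, decrements):
    """Value of a five-digit profile: 1 minus the sum of level decrements."""
    total = 1.0
    for dim, digit in zip(DIMENSIONS, profile):
        total -= decrements[dim][int(digit) - 1]
    return total

def mean_value(profiles, decrements):
    return sum(value(p, decrements) for p in profiles) / len(profiles)

# Invented profiles for two groups being compared.
group_1 = ["11121", "21122", "11232", "31133"]
group_2 = ["11111", "11121", "21121", "11222"]

results = {}
for name, decrements in HYPOTHETICAL_VALUE_SETS.items():
    diff = mean_value(group_2, decrements) - mean_value(group_1, decrements)
    results[name] = diff
    print(f"{name}: mean difference (group 2 - group 1) = {diff:.3f}")
```

If the size or direction of the difference changes materially between value sets, that is a signal that the characteristics of the chosen value set are influencing the conclusions and should be reported.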

5.4 Choosing Value Sets – Some Further Considerations

This section complements the overview provided in Sects. 5.2 and 5.3 with a more detailed discussion of two issues: relevance to the decision-making context, and theoretical properties of value sets.

5.4.1 Relevance to the Decision-Making Context

We have already noted that, as a general principle, users should choose a value set which is relevant to the decision-making context. A first assessment of relevance relates to the country in which values were obtained, as described earlier. Yet, other more nuanced facets may need to be considered to deem a value set relevant, including whose values are relevant in the context of interest and what is the appropriate source of such values.

The question of whose values are relevant has been widely debated and there are different possible answers to that (Dolan et al. 2003). Most of the evidence and considerations presented in this chapter relate to “social” value sets (such as those reported in Chap. 4), which are meant to represent the average values of the general public. In essence, these “social” valuations for EQ-5D-5L are generated from members of the general public being asked to consider states that may be hypothetical to them, and to value them from the perspective of imagining being in those states.

There are normative arguments advanced for using social valuations in economic evaluation. Broadly speaking, the purpose of any economic evaluation is to assess the value for money of alternative uses of scarce health care resources. Where the context of these decisions is the public sector, it is generally argued that the valuation of health states used in the assessment of ‘benefit’ should reflect, as closely as possible, the preferences of the relevant general public. This is both because, in publicly-funded health care systems, it is the general public who are funding health care, e.g. via taxes; and because the general public are potential users of the health care system and can provide valuations ‘behind a veil of ignorance’.

An alternative could be to create a “patient-based value set” consisting of values elicited from patients, using either the same stated preference methods used for the general population or revealed preferences based on self-reported EQ VAS values. Patient-based value sets are preferred in some countries, such as Germany and Sweden (Rowen et al. 2017). Proponents of this choice argue that “patient-based value sets” reflect the preferences of those who are actually experiencing the states, and for this reason are better informed. Differences between patients’ and the general public’s valuation of states are common and have been extensively observed. For example, members of the general public often give a lower value to health states than those who experience them, as they cannot predict what their experience in that state would be or how they would adapt to it (Brazier et al. 2005). Ogorevc et al. (2019) report significant differences between patients’ and general public values, but these varied by dimension, with patients considering mobility and self-care problems as less problematic, but pain/discomfort and anxiety/depression more problematic. While it may be desirable to include an assessment of patients’ values as an adjunct to the main analyses in most studies, there are theoretical concerns about using these values in the context of, for example, economic evaluation. In particular, the fact that values for health states may be modified by adaptation could be an argument against their use for decision making based on ex ante judgements about the value of health care interventions. Moreover, it may be difficult to include patients in valuation studies given their impaired health, and unethical to perform an intrusive valuation interview with them. These considerations and practical limitations have led most HTA bodies (with the notable exception of Sweden’s TLV, as noted earlier) and end users to specify that it is general public values which are required, and this is reflected in the protocol for valuation of EQ-5D-5L. For this reason, this chapter assumes that a representative sample of the general public is preferred.

Nevertheless, it may be that, pragmatically, the only available source of values is from the patients whose health states are being analysed, or that in some applications these are regarded by the relevant decision-makers as being the most appropriate. There have also been some debates about whether or not it is appropriate to use the values from sub-groups of the population rather than the population as a whole – for example, the values of women or older people for conditions which only affect them (Sculpher and Gafni 2001, 2002; Robinson and Parkin 2002). Similarly, there are debates about whether the values of children and adolescents, who are generally excluded from sampling but are also members of the general public, are relevant to include in social values (Hill et al. 2020). There is currently no consensus on these issues.

A second relevant issue is the point in time at which value sets were generated. Just as there are important differences in health state values between countries (as is evident in the value sets reported in Chap. 4, and compared in Chap. 6), it is possible there may be differences in the average values within a country, over time. This would arise if preferences regarding health are not stable, as is normally assumed in economics, but change over time (Bridges 2003), perhaps because of changing experience of and expectations about health. Further, the composition of the general public changes through time, as a result of ageing, changes in immigration and emigration, and sociodemographic shifts, and such changes may also affect the average preferences of society that value sets reflect. We currently have very little evidence on these matters for EQ-5D-5L valuations, because of the relative recency of these value sets, or for other HRQoL instruments, because differences in methods used limit the comparability of valuation data through time. However, as a general rule, a more recent value set is preferable to an older one, provided they are equally relevant in other ways, and are otherwise comparable on the empirical and theoretical grounds discussed below. The question of what the appropriate ‘shelf-life’ of a value set is, is considered further in Chap. 7.

5.4.2 The Theoretical Properties of Values and Value Sets

As well as the TTO and DCE methods used in the EQ-VT, there are other methods for valuing health states including Standard Gamble (discussed in Box 5.1), Magnitude Estimation, Paired Comparisons (PC), Rating scales, Visual Analogue Scales (VAS), the Better than Dead approach (van Hoorn et al. 2014), Number Equivalence (also known as Person Trade-Off) and Personal Utility Functions (PUF) (Devlin et al. 2019). And, while the EQ-VT uses a specific type of TTO, composite TTO (cTTO) (see Chap. 2), there are other forms of TTO (such as lead time TTO and lag time TTO); similarly, there are still other types of DCE (such as DCE with duration; and best worst scaling). These other methods are not currently widely used for valuing the EQ-5D-5L and in many cases have only been used in smaller experimental studies, rather than the large-scale representative sample studies appropriate to the construction of value sets for practical use. However, they have been used to estimate value sets for other instruments – for example VAS for the EQ-5D-3L and PC to estimate disability weights for the World Bank / World Health Organisation Disability Adjusted Life Years project. It is possible that future non-standard value sets may be generated that have different properties to those generated by the TTO and the DCE, which may be an important factor in the choice of value sets.

Unfortunately, the theoretical and empirical case for favouring one method of health state valuation over another is far from clear-cut. In the context of QALY estimation, for example, it has been argued that the QALY is no more than a convenient device to combine length and quality of life into a single metric (Parkin and Devlin 2006) and does not need to conform to theoretical concepts such as ‘utility’ or measurable ‘utility’. The theoretical foundations of QALYs therefore do not require that quality of life be valued using a particular measurement method. However, the current dominant practice of using TTO and DCE methods, following the rationale provided in Box 5.1, has the merit of imposing consistency between the resulting value sets and giving a relatively clear interpretation to them.

The recommendation is therefore to exercise caution when considering using value sets resulting from non-standard valuation methods and to examine closely the rationale used by their developers.

5.5 Concluding Remarks

There is no simple answer to the question of which value set to use: the answer depends on the specific nature of the research application, the sort of decisions it informs, and the context in which the evidence from your research will be used.

In some cases, which value set to use will be determined by the stated requirements of those using the evidence to inform decision-making. Where this is not the case, we encourage potential users of EQ-5D-5L value sets carefully to consider each of the practical and theoretical issues discussed in this chapter. We strongly recommend that users clearly justify their choice of value sets in a transparent manner. Where there remains uncertainty over which value set to use, we recommend that researchers should report the sensitivity of their results and conclusions to the use of alternative value sets. In applications where QALY estimation is not a goal, there may not be a clear rationale for using a value set as the focus of analysis, and users are encouraged to make full use of the EQ-5D-5L profile and EQ VAS data provided by respondents.