FormalPara Key Points for Decision-Makers

Two main types of catalogues are supplied for both the UK and the USA: 1. unadjusted sample mean utilities for 199 chronic conditions and health risks and 2. corresponding regression models for the USA and UK that can be used to make utility predictions in other populations.

Results indicate that health-related quality of life (HRQoL) varies across disease groups but is lowest for renal disease, mental and behavioural disorders, some cancers, blood diseases, digestive system diseases, and nervous system diseases.

Health risks and lifestyle factors, including loneliness, perceived stress, and high BMIs, are associated significantly with low HRQoL after controlling for chronic diseases.

1 Introduction

Economic evaluations are routinely used in appraisals of health technologies, such as those used by the National Institute for Health and Care Excellence (NICE) in England or the Institute for Clinical and Economic Review (ICER) in the USA [1, 2]. These economic evaluations often measure benefits through the quality-adjusted life-year (QALY). The QALY combines quality and length of life considerations in a single measure [3, 4], and is usually based on generic patient-reported outcome measures such as EQ-5D. The EQ-5D is considered the most widely used generic health utility measure in the world [4], and it is the preferred health benefit measure of NICE for its technology appraisals [1]. It is applied in numerous research and clinical practice settings, such as cost-utility analysis (CUA), clinical trials, patient surveillance or monitoring, and population health measurement [5,6,7,8].

The EQ-5D instrument describes health using five dimensions: mobility, self-care, usual activities, pain/discomfort and anxiety/depression. There are currently two instrument versions: the original EQ-5D version, now named EQ-5D-3L [9], and the newer EQ5D-5L version [10]. Both include the same five health dimensions but differ in the number of levels of severity. In EQ-5D-3L, each dimension has three possible levels (no problems, some problems, extreme problems), and it can describe 243 unique health states [9, 11,12,13]. EQ-5D-5L incorporates two additional levels in each dimension to improve the sensitivity of the original three-level version. Separate studies have estimated country-specific utility value sets for the health states described by both instruments on the basis of general population values [11,12,13,14,15]. Utility values are anchored at an upper value of one equivalent to perfect health and zero for states considered equivalent to being ‘dead’. Negative values representing health states considered worse than dead are also possible [16]. NICE currently recommends using the three-level (3L) version in the UK [17]. However, mapping algorithms (crosswalks) exist to convert EQ-5D-3L to EQ-5D-5L values [18,19,20] which will allow for the translation of the EQ-5D-3L values in this catalogue to EQ-5D-5L utilities once the new UK EQ-5D-5L value set is released and additional countries, such as the USA, are included.

The importance of ensuring comparability across economic evaluation studies of technologies and over time has been stressed by health economists and organisations such as the Panel on Cost-Effectiveness in Health and Medicine (PCEHM) and NICE [17, 21,22,23,24]. The PCEHM recommended ‘the collection of a national catalogue of preference weights that could be used by CEA researchers without the burden of primary data collection’ given the large variability in the literature of utility values associated with different medical conditions [24, 25]. In response, Sullivan et al. (2006, 2011) published two separate EQ-5D-3L catalogues for the USA and the UK [24, 26,27,28], both based on regression estimation using representative US samples of the Medical Expenditure Panel Survey (MEPS). They included 157 mostly self-reported chronic conditions classified according to the International Classification of Diseases, Ninth Revision (ICD-9). The US catalogue was based on pooled 2000–2002 MEPS data and used the US utility value set [12]. The UK catalogue included an additional year of the MEPS dataset (2000–2003) and used the UK EQ-5D-3L value set [11]. The statistical models included gender, age, income, ethnicity, education, the number of comorbidities and the primarily self-reported chronic conditions but did not incorporate health risks. A number of less comprehensive catalogues are also available. For example, Finland (2006) [29] and Korean (2009) catalogues cover fewer than 30 conditions [30]. In 2019, a systematic review of 207 studies collated EQ-5D-3L estimates among fifteen ICD-10 disease groups [31]. However, the included studies used value sets for different countries (some failed to report the specific value set used) and even different versions of EQ-5D. These issues, coupled with heterogeneity in the quality of the studies, limit the practical usefulness of the estimates found in the review. Recently, a Danish study published a catalogue of Danish EQ-5D-3L preference scores for 199 chronic conditions based on newer, improved methodology, including the use of ICD-10-based conditions and regression models appropriate for EQ-5D [32, 33].

The current study is based on and broadly replicates the most recent Danish study, but applies the UK and US EQ-5D-3L value sets to convert the responses to the EQ-5D-3L instrument into utility scores to enable the use of these data in the UK and the USA. The objectives are: (1) to create population-level off-the-shelf catalogues of UK and US EQ-5D-3L preference-based scores for 199 ICD-10 defined chronic conditions to improve on existing UK and US catalogues and (2) to provide two separate statistical models to allow researchers to predict utilities in their own population of interest for any combination of chronic disease. This first catalogue supplies unadjusted, national sample population estimates of UK and US norm-based mean utilities for the 199 chronic conditions, socioeconomic covariates, and health risks. A second catalogue is also provided as a reference, and it presents UK and US regression-based utility estimates and marginal effects for the same 199 chronic diseases, socioeconomic covariates and health risks, adjusting for the sample composition allowing for direct comparisons across chronic diseases for the same representative individual. To construct this second catalogue, two alternative EQ-5D-3L model specifications are estimated, one with a core set of covariates and the second one including additional health risks and socioeconomic controls. To supplement the second catalogue, we also provide a Stata do file and a user guide in the online supplementary materials to enable health professionals to estimate EQ-5D-3L in their specific settings and populations. In combination with previous publications and details on the prevalence [34], multimorbidity associations [35], socioeconomic differences [36], and health risks disparities [37] of the same 199 chronic conditions, these estimates can provide valuable information for future resource allocation in the UK and the USA.

In common with the recently published Danish catalogue [32], there are several important differences with earlier studies. First, most previous studies rely partly on self-reported health conditions, which raises questions about accuracy [38,39,40,41]. Second, we use the newer ICD-10 classification system and include 199 ICD-10-based chronic conditions. Third, behavioural risk factors such as smoking, body mass index (BMI), stress and social networks have been suggested as potentially necessary controls [42]. These are missing from existing catalogues but are found important in explaining EQ-5D-3L in the Danish catalogue. Fourth, acknowledging the biases caused by model misspecification and the now well-documented idiosyncrasies of the distribution of EQ-5D, we use the adjusted limited dependent variable mixture model (ALDVMM) to model EQ-5D [43,44,45,46,47,48]. Using traditional linear models, preceding catalogues have been mainly developed to provide population-based utility estimates of individual chronic conditions and cannot be directly transferred to the populations of interest in economic evaluations where decrements in utility from developing a new chronic condition in these less healthy populations would not be expected to mirror the decrements observed in a healthier population. This has led to research into possible ways of combining utilities for individual chronic diseases estimated on separate disease-specific samples (multiplicative, additive and minimum estimators are a few common choices) [49,50,51,52]. However, the ALDVMM, a nonlinear model reflecting the characteristics of EQ-5D utility data, allows utility decrements to be dependent on the combinations of comorbidities and other covariates in the model, enabling researchers to estimate realistic, real-world multimorbid health states of their specific interest.

2 Methods

2.1 Data

Following Sullivan et al. [28], we use the responses from a large representative sample from a single country (Denmark) and derive the associated utility values using each country’s value sets. This is in line with patient reporting from multinational clinical trials where the same value set is used across data from all countries and mapping studies of EQ-5D, which often apply utility value sets that differ from the country’s data collection when, for example, country-specific population data are not accessible for researchers. The dataset combines three national health survey samples with seven national registers containing patient-level information on diagnoses, healthcare activity and socio-demographics using a civil registration number that uniquely identifies Danish citizens [33].

A brief summary of the data is presented below. Full details of the methodology, samples, variables, weighting and data handling, and content of all registers can be found in these references [32, 33, 53,54,55]. Details on variable names can also be found in the online Supplementary Material 1 (Appendix 1). Responses to the five items of the EQ-5D-3L questionnaire were included in three of the Danish National Public Health Survey (NPHS) samples [53, 56,57,58], and 55,616 unique responses were obtained using the last two consecutive waves (2010 and 2013), which incorporated the EQ5D-3L questionnaire [53]. Details on dataset and respondent characteristics can be found in Table 1 of an earlier publication [32] and in the online Supplementary Material 1 (Appendix 2) in the current study. The survey data based on randomly assigned nationals comprised self-reported data on HRQoL, health behaviours (e.g. lifestyle factors), BMI, education, stress, loneliness, etc. Data from the surveys were combined with patient-level register data on diagnoses, healthcare activity, and socio-demographics from seven national registers based on the unique civil registration number [34, 54, 55, 57, 59, 60]. The registers contained information from somatic [61] and psychiatric hospital contacts [62], primary healthcare [63], prescribed medicines [64], gender, age, and ethnicity,Footnote 1 and residence place [65]. The aim was to derive each respondent’s ICD-10-based doctor-reported diagnosis, medicine and other relevant variables from identifying chronic conditions. A medical review team identified and grouped the chronic conditions from ICD-10 codes, and definitions, algorithms and methodology identifying the chronic conditions were clinically validated separately [54, 55]. EQ-5D-3L responses were converted to utility values using Dolan’s (1997), and Shaw et al. (2005) value sets for EQ-5D-3L for the UK and USA, respectively [11, 12].

2.2 Statistical analysis

We first compute sample means, medians and interquartile ranges for each country and the corresponding standard errors of EQ-5D-3L for the 199 chronic conditions and other socio-economic and health risk variables. These statistics are thus not adjusted for the potentially different sample composition of the chronic conditions groups but provide estimates of the population values instead.

Subsequently, we conduct regression analyses controlling for several variables, as detailed below. Utilities originating from measures such as the EQ-5D have specific characteristics that challenge statistical regression modelling. The values are bounded between the minimum value, which differs by country, and the value of one (or full health). The data distribution is often multimodal and highly skewed, with a large group of observations around one, followed by a gap to the first possible utility value below full health [44, 45]. The minimum values in the UK and US EQ-5D-3L utility scores are –0.594 and –0.102, and the highest utility scores below one are 0.883 and 0.86, respectively. Figure 1 shows histograms of the distributions of EQ-5D-3L using the UK and US value sets. There are a large number of individuals in full health. In both cases, the distributions present two rough additional modes and skewness.

Fig. 1
figure 1

Histograms of EQ-5D-3L using the UK and US value sets

The ALDVMM was developed to model the UK EQ-5D-3L value set utility scores. It has been shown to offer several advantages and perform better than standard models, which do not take into account the characteristics of utility data when modelling not only EQ-5D-3L, but other generic preference-based measures such as EQ-5D-5L, SF-6D and HUI3 as well [43,44,45,46,47,48, 66]. The ALDVMM is a very flexible semi-parametric model which uses mixtures of Tobit-like components and can accommodate all the idiosyncrasies of the EQ-5D-3L distribution, such as the upper and lower boundaries, the mass of observations at full health, the gap between full health and the next feasible value in the value set, and can approximate the skewness and multimodality often present in this type of data.

Estimating mixture models is challenging, but even more so for extensive models such as the current study. They require a comprehensive search procedure to identify the global maximum because the likelihood is not globally concave. At the same time, the number of components in the mixture is unknown and requires selection as well. This involves estimating models with an increasingly higher number of components and using graphical methods and standard fit statistics to select the optimal model [66]. We used the community-contributed Stata ALDVMM command available for free download and published in the Stata Journal [44, 45].

Two separate ALDVMM regression models were estimated as follows for each of the UK and US EQ-5D-3L models:

  • Base model: EQ-5D-3L as a function of the 199 chronic conditions, sex, age, samples (to adjust for regional differences) and the number of comorbidities. This model is helpful for economic evaluations in which the extra controls included in the full model below are not available or required.

  • Full model: includes additional controls to adjust for common health risks and socioeconomic variables. The added variables are: family equalised income (£ and $), education (no education or training, students/training, short, middle bachelor equivalent, or higher education such as a master’s degree or higher, ethnicity origin (Danish, western or non-western), partnership status (partner or not), home living children (yes/no), social network (often feeling lonely or not), perceived stress (Cohen’s Perceived Stress Scale, 20% most stressed or not), BMI groups (< 18.5; 18.5–25; > 25 < 30; ≥ 30 < 35; ≥ 35), daily smoking (yes/no), alcohol intake grater national recommendations (yes/no), nationally recommended exercise and fruit intake (yes/no), SF-12 self-reported general health [67] (‘excellent’; ‘very good’; ‘good’; ‘fair’; ‘poor’; ‘missing’), and self-reported long-term illness or disability (‘none’; ‘yes’). See the online Supplementary Material 1 (Appendix 1) for model variable notations and details on model specifications.

Quadratic terms for family-equalised income and age were included in all estimated models where the levels of those variables were included.Footnote 2 Non-response and population weights standardising by age, sex and education were used when calculating EQ-5D-3L sample statistics. The estimated ALDVMMs did not use weights but incorporated controls for variables comparable to those used in the weighting procedure.

After model estimation, marginal effects (ME) for the chronic conditions and other variables included in the model were also calculated. The structure of linear regression models implies that the expected decrease in EQ-5D-3L due to developing a chronic condition (the ME) is the same for every individual regardless of, for example, how many other chronic conditions the individual has. This ME is simply the coefficient of the chronic condition in the linear regression model. However, in nonlinear models, such as the ALDVMM, the ME is a function of that specific health condition and all the other covariates incorporated in the model, such as the presence of other chronic conditions, age, sex, etc. In this case, the decrease in EQ-5D due to a new chronic condition will be different for individuals with no previous chronic conditions and those with other chronic conditions already present. A user-written program was developed to handle the links between the 199 chronic conditions variables and the variable accounting for the number of comorbidities.

Preliminary data management of registers and surveys was conducted in SAS9.4 in a secure server at Statistics Denmark research server facilities following legal regulations. Data analysis was carried out in STATA15 using the same secure research servers. Online supplemental materials are provided, including documentation and Stata programs to facilitate the use of the estimated models by analysts to calculate predictions. The parameter estimates and the (co)variance matrices of the models are also made available to allow the use of probabilistic sensitivity analysis within economic evaluations [68].

3 Results

3.1 Sample-based estimates of the EQ-5D-3L

Table 1 provides a one-page overview of the UK and US EQ-5D-3L sample mean scores for the overall sample and sex and age intervals for 20 chronic disease groups, overweight and commonly prescribed medicines. The overall sample means of EQ-5D-3L utility scores are 0.829 and 0.869 for the UK and USA, respectively. These differences reflect the differences in the EQ-5D-3L countries’ value sets. Among the seven highest prevalent disease groups (E, G, H, I, J; M, F), chronic diseases within the respiratory system [J; scores 0.776 (UK); 0.831 (USA)], endocrine disorders [E; scores 0.742 (UK); 0.807 (USA)], diseases in the circulatory system [I; scores 0.741(UK); 0.807 (USA)] and diseases of the eye and adnexa [H; scores 0.736 (UK); 0.803 (USA)] had the highest HRQoL sample mean scores. Furthermore, highly prevalent diseases of the musculoskeletal system [M; scores 0.705 (UK); 0.782 (USA)], diseases of the nervous system [G; scores 0.697 (UK); 0.775 (USA)] and mental conditions [F; scores 0.651 (UK); 0.742 (USA)] had the lowest EQ-5D-3L sample mean scores. Finally, less prevalent disease groups such as genitourinary conditions [N; scores 0.625 (UK); 0.725 (USA)], benign neoplasm and diseases of the blood [group D; scores 0.687 (UK); 0.768 (USA)], and diseases in the digestive system [group K; scores 0.692 (UK); 0.773 (USA)] showed even lower HRQoL scores. In all cases, utilities based on the US value set were higher than those based on the UK value set. Although the differences in utilities between the UK and the USA were not constant, the ranking according to utilities was preserved across these broad disease groups. The sample of men appeared to have higher mean scores than the sample of women in most cases, and utilities tended to decrease with age in both the UK and US samples.

Table 1 Overview of EQ-5D-3L UK and US sample utility scores, mean age and mean number of chronic conditions (NCC) across overall disease groups, prescribed medicine, gender and age

Table 2 presents a more disaggregated level of information and includes the sample mean EQ-5D-3L utility estimates and percentiles, mean age and number of chronic conditions (NCC) of all 199 chronic conditions and socio-economic and health risk variables for the UK and USA, respectively. The ten diseases with the lowest mean EQ-5D-3L scores are systemic sclerosis [M34; scores 0.362; 0.533 (USA)], fibromyalgia [M797; scores 0.369; 0.558 (USA)], unspecified rheumatism [M790; scores 0.390; 0.575 (USA)], dementia [F00, G30 etc.; scores 0.415 (UK); 0.572 (USA)], systemic atrophies [G10–G14, G30–G32; scores 0.475 (UK); 0.618 (USA)], post-traumatic stress disorder [F431; scores 0.482 (UK); 0.627 (USA)], cerebral palsy [G80–G83; scores 0.512 (UK); 0.646 (USA)], other inflammatory spondylopathies [M46; scores 0.522 (UK); 0.661 (USA)], dorsalgia [M54; scores 0.528 (UK); 0.662 (USA)], and spondylosis [M47; scores 0.538 (UK); 0.666 (USA)]. Furthermore, Table 2 presents that mean EQ-5D-3L scores are lower for groups with a higher number of comorbidities [0.908 versus 0.558 (UK); 0.926 versus 0.680 (USA) for 7+ conditions], lower in older age groups [from 0.891 to 0.694 (UK); 0.914 to 0.773 (USA)], lower for the lower educational groups excluding students [ranging from 0.75 to 0.898 (UK); 0.814 to 0.919 (USA)], average HRQoL is lower for women than men [0.809 versus 0.849 (UK); 0.854 versus 0.883 (USA)] and non-western immigrants [0.832 versus 0.758 (UK); 0.871 versus 0.819 (USA)]. Larger differences are found within health risk and lifestyle factors such as loneliness [0.630 versus 0.841 (UK); 0.728 versus 0.877 (USA)], high perceived stress [0.638 versus 0.881 (UK); 0.732 versus 0.906 (USA)], non-exercise [0.690 versus 0.858 (UK); 0.771 versus 0.889 (US)], BMI (normal weight 0.855 (UK); 0.887 (USA) versus BMI 35+ 0.707 (UK); 0.783 (USA)], daily smokers [0.774 versus 0.841 (UK); 0.830 versus 0.878 (USA)], but less so for those reporting excessive alcohol intake [0.813 versus 0.835 (UK); 0.857 versus 0.873 (USA)] and fruit intake below recommendations [0.829 versus 0.847 (UK); 0.869 versus 0.882 (USA)]. These averages do not consider potential differences in the composition of the groups; therefore, part of the differences between groups might be explained by the presence/absence of other comorbidities, health risks, etc. Below we present estimates that adjust for these potential differences.

Table 2 Sample EQ-5D-3L mean scores, percentiles, n, mean number chronic conditions and percentiles for the 199+ chronic conditions, socioeconomic variables, and health risks for the UK and US

3.2 Model-based adjusted estimates of the base and full model

Even though the underlying EQ-5D-3L response data are the same for both models and the distributions of the UK and USA, EQ-5D-3L (Fig. 1) show some similarities in terms of skewness and the number of modes, and the use of different value sets translates into some differences in the overall distribution, making it important to carry out the estimation procedure separately for the UK and the USA. Extensive searches were carried out for both countries’ ALDVMMs. A three-component model was chosen for the UK on the basis of sample fit measures, information criteria (AIC and BIC) and graphical methods. A two-component model was selected using the same methodological procedure for the USA. Searches for a third component did not yield any model that improved fit over the two-component models. Table 3 presents measures of fit and information criteria for the selected models for the UK and USA, respectively. Including the additional covariates in the model substantially improves the measures of fit. However, even the basic model, including all 199 chronic conditions and controlling for age and sex, shows a relatively good fit to the data (see Figs. 2 and 3 for plots of the cumulative distribution of EQ-5D-3L implied by the models versus the data). The online Supplementary Materials 2 and 3 provides the full set of parameter estimates of the selected models as Stata and Excel file downloads.

Table 3 Summary measures of fit for the UK and US base and full ALDVMMs
Fig. 2
figure 2

Cumulative percentage plots of UK models

Fig. 3
figure 3

Cumulative percentage plots of US models

The parameters of nonlinear models cannot be interpreted directly in the same way as the parameters of linear models. In nonlinear models, the effect of the conditioning variable depends on the value of the other variables in the model. For example, in our case, the disutility of a chronic condition will generally be different for a person who has no other chronic conditions and a person who already has some other chronic conditions. To interpret and compare the models and conditions meaningfully, we calculate the MEs for two representative 50-year-old individuals, male and female, with no chronic conditions. We set the rest of the variables to the mean sample values for continuous variables and the mode for discrete variables. The MEs for age are calculated by varying the age of the same individuals while maintaining the rest of the characteristics. Tables 4 and 5 present the predictions and marginal effects for the base and the full UK models. Tables 6 and 7 present the same information for the US models.

Table 4 ALDVMM predictions and marginal effects of representative 50-year-olds by sex: UK base model—chronic conditions and age
Table 5 ALDVMM predictions and marginal effects of representative 50-year-olds by sex: UK full model – chronic conditions, socio-economic, lifestyle and risk variables
Table 6 ALDVMM predictions and marginal effects of representative 50-year-olds by sex: US base model—chronic conditions and age
Table 7 ALDVMM predictions and marginal effects of representative 50-year-olds by sex: US full model—chronic conditions, socio-economic, lifestyle and risk variables

Chronic conditions from groups M, G and F have some of the largest estimated disutilities in the base UK model. Some examples of large disutilities are (ICD-10 code and ME for male/female in brackets): fibromyalgia (M797; ME − 0.1999/− 0.1900), sclerosis (G35; − 0.1379/− 0.1273), rheumatism, unspecified (M790; ME − 0.1336/− 0.1236), dorsalgia (M54; ME − 0.1041/− 0.0954), cerebral palsy (G80–G83; ME − 0.1034/− 0.0945), post-traumatic stress disorder (F431 − 0.0919/− 0.0837), HIV (B20–24; ME − 0.0833/− 0.0772), other intervertebral disc disorders (M51; ME − 0.0829/− 0.0754), diseases of myoneural junction and muscle (G70–G73; ME − 0.081/− 0.0742), dementia (F00–2, G30–32; ME −0.0806/−0.0744), spinal osteochondrosis (M42; ME − 0.0804/− 0.0733) and depression (F32, F33, F34.1, F06.32; ME − 0.0787/− 0.0716). When controlling for additional variables in the full model, the disutilities of the chronic conditions tend to decrease in size, probably due to the inclusion of the SF-12 question, which will be correlated to some extent with the presence of chronic conditions. One exception is fibromyalgia, where the disutility appears to be higher than in the base model (− 0.1999/− 0.1900); however, the sample has very few individuals with this chronic condition, and it is atypical to have fibromyalgia on its own; the median number of chronic conditions in these individuals is eight.

The disutilities of chronic conditions estimated using the US base model are generally smaller than in the UK model, reflecting the differences in the US/UK value sets. The largest disutilities correspond to chronic viral hepatitis (B18; ME − 0.1024/− 0.1040), sclerosis (G35; ME − 0.0842/− 0.0816), post-traumatic stress disorder (F431; ME − 0.0739/− 0.0717), dorsalgia (M54; ME − 0.0716/− 0.0695) and rheumatism (M790; ME − 0.0671/− 0.0650). Chronic diseases of group M show relatively higher marginal disutilities in the US model.

Both base models include age as a covariate for adjustment, and the full UK and US models also include other socioeconomic, lifestyle and health risk variables. For all four models, an increase in age is associated with a decrease in utility. The decrease per year is relatively small after controlling for chronic conditions and other covariates, and the rate of decrease increases with age. For example, in the base UK model, males with no chronic conditions at age 20 are estimated to have an average utility of 0.9121 (0.9270 for females); at age 80, the average utility declines to 0.8719 (0.8920 for females), with a difference of 0.0402 (0.0350 for females). In the US base model, the corresponding average disutility for that group is 0.0246 (0.0231 for females). This decrease is much smaller at younger ages, with an increase of 1 year expected to decrease utility for males by 0.0003 at age 20 but by a much larger 0.0037 at age 80.

We found relatively large decreases in marginal utilities within the single SF-12 general health question, stress, self-reported long-standing illness, high BMI and loneliness. The largest single disutilities were found for the SF-12 question of poor general health (UK − 0.2369/− 0.2260 for males/females; USA − 0.1650/− 0.1606 for male/females) followed by perceived stress (UK − 0.0595/− 0.0535 for males/females; USA − 0.0708/− 0.0692 for males/females), self-reported long-standing illness or injury (UK around − 0.073 for both sexes; USA around − 0.020 both sexes), BMI > 35 (UK − 0.0129/− 0.0113 for males/females; USA − 0.0296/− 0.0283 for males/females) and a sense of feeling loneliness often (UK − 0.0152/− 0.0136 for males/females; USA − 0.0231/− 0.0220 for males/females). Finally, higher education showed a relatively high positive difference relative to the baseline of no education (UK 0.0252/0.0244 for males/females) but less for the US models (0.0165/0.0155 for males/females).

Finally, estimates of socioeconomic disparities provide potential health improvements and have become increasingly important to public health assessments, interventions, researchers, healthcare decision-makers, and the industry [36, 70,71,72]. Hence, Appendices 3 and 4 in the online Supplementary Material 1 present easily available, off-the-shelf health inequality UK and US EQ-5D-3L average sample scores based on subgroups of educational, income, socioeconomic, age, gender, BMI groups across the 199+ chronic conditions and covariates.

3.3 Using the two types of catalogues

The sections above provide two separate catalogues that may be of interest. The first one provides population-based estimates of utility; the second one provides utility estimates adjusted for covariates for a representative individual so that comparisons across chronic conditions adjusted for other covariates can be made for UK and US EQ-5D-3L norms. Catalogue one, the population estimates, are useful when interest lies in EQ-5D-3L in the general population. The estimated regressions used to construct catalogue two may be used to calculate utility estimates in samples other than the general population sample used here, for example, when using it to appraise interventions with different age and sex distributions or lower baseline health than the general population. For this purpose, a technical guide on how to use the model estimates to predict EQ-5D-3L scores has been provided in the online Supplementary Material 2, along with Stata data and programming files to do so.Footnote 3

4 Discussion

EQ-5D is one of the most widely used measures of benefit for CEA. However, EQ-5D is not always collected in trials, and when it is, different populations, methodologies and value sets make comparisons across trials difficult. Utility catalogues of chronic diseases which use a standardised methodology can help to ensure meaningful comparisons can be made. This paper presents newer, standardised catalogues of mean-based EQ-5D-3L preference scores for 199 doctor-reported chronic conditions in the UK and the USA. Two versions of the catalogues are included, one based on a model controlling for chronic conditions, age and sex, and a second, more comprehensive one, which controls for additional socioeconomic, lifestyle and health risk variables. The study provides evidence of the association between different chronic conditions, health risks and lifestyle factors with HRQoL and their differences. This may be useful for benchmarking, and comparisons, including identifying what factors, such as conditions and/or health risks, are associated with the lowest HRQoL, of interest for policy- and decision-makers, health professionals, and researchers. The catalogue may also be of use to inform evidence in Public Health interventions designed to reduce the prevalence of some chronic conditions.

This study has several methodological strengths. First, it uses registry records to determine chronic conditions instead of self-reported data. Second, it includes 199 chronic conditions on the basis of the newer ICD-10 classification. Third, we use statistical models developed specifically to model the characteristics of generic preference-based measures such as EQ-5D-3L.

The study also has some limitations. Despite the large number of chronic conditions and other controls included in the models, heterogeneity within the chronic conditions can be large. Being able to control for other chronic conditions goes some way in alleviating this problem, as the estimates can be tailored to different populations, but when using the estimates in economic evaluations, analysts would need to make additional assumptions regarding the impact of new treatments in the long run.

We use a Danish data sample to estimate HRQoL models for the UK and USA by applying the respective value sets to the EQ-5D-3L instrument data. As earlier pointed out by Sullivan and colleagues (2011), this is not an ideal approach [28] but a practical one. Despite efforts to produce EQ-5D-3L versions that are comparable across countries, there still remains the possibility that the dimension descriptors are interpreted slightly differently in different countries, leading to differences in both the distribution of responses to the individual dimensions and the value sets. Furthermore, as confirmed in the current study, issues such as country differences in health risks, for example, BMI, may impact HRQoL. Differences in BMI across the USA and Europe are well documented. For instance, WHO estimated the mean BMI in Denmark to be 25.3 in 2016, while both the UK and US estimates are larger, 27.1 and 28.9 for the UK and the USA, respectively [73]. Other research also finds similar patterns in BMI, with the UK slightly closer to the Danish population than the USA [74].

Another factor that may contribute to differences might be educational attainment. Educational levels in the USA are the highest and closest to the Danish population, although some have also argued that the three countries have similar high educational population levels compared with other countries [75]. We find that the EQ-5D-3L educational sample means of the current study (Table 2) are similar to the older MEPS used for the previous catalogues: no degree 0.814 versus 0.83 and master’s degree 0.919 versus 0.91. Overall, US EQ-5D-3L population means are also similar between the current sample and the MEPS, with 0.869 versus 0.867 in the MEPS for the overall mean; 0.883 versus 0.880 for men and 0.854 versus 0.850 for women. A similar pattern emerges across age groups [24, 26]. However, the differences across some conditions are slightly larger: diabetes 0.779–0.758, asthma 0.803–0.820 and hypertension 0.801–0.787, respectively [26]. We expect some differences between conditions because the current catalogue is based on doctor-reported conditions instead of the MEPS self-reports. Nevertheless, it is difficult to know to what extent the differences are due to the chronic conditions reporting methodology versus other reasons.

Moreover, other studies have found country population differences regarding mortality and ethnic, income and education composition [76, 77]. Thus, it cannot be ruled out that the US and partly UK unadjusted estimates of the current study could be larger (overestimated) than they would have been using ‘native’ samples, although the precise impact is unclear. Nevertheless, we might get some indication of the possible size of the impact from other studies. For example, research has shown that the UK VAS ratings (83) are close to the Danish estimates (84), compared with 81 in the USA.

Similarly, a cross-country mean EQ-5D-3L comparison based on the European value set gave close estimates between DK (0.866) and the UK (0.856) and less so for the USA (0.825). Although these studies are based on older datasets, this might be the pattern we expect to see due to population differences, whether they are due to health directly or any socioeconomic differences. Thus, given this evidence, we can expect the DK sample to be closer to the UK population-based scores than the US sample. At the same time, the US EQ-5D-3L estimates might be slightly overestimated on the basis of a Danish population than if we had used a US population. Furthermore, using foreign-based sample estimates, although not ideal, might still be more acceptable than other alternatives, such as not having any baseline or reference condition estimates or using other more unreliable sources [28]. Moreover, all regression models adjusted for sex and age, as well as the full regression model for socio-economic variables and health risks, including BMI, thus reducing the impact of possible population differences.

Finally, the present study includes the latest national EQ-5D-3L data available, but the age of data might be a concern due to possible changes in HRQoL and treatments if improvements in technology and treatments, for example, lead to higher HRQoL. However, there does not seem to be evidence of positive changes in HRQoL across years or even decades. For example, comparing the latest available EQ-5L-3L population scores of 2010, 2013 and 2017, there is a slight decline from 0.85 to 0.83 (Danish norms) [78]. Another Danish study identified a slight decline from 0.89 in 2000 to 0.87 in 2010, suggestively due to an ageing population [56]. Other more recent Danish reports also suggest national declines in HRQoL. For instance, HRQoL measured by the SF-12 showed varying and slightly worsening national population health from 2010 to 2021 for physical health (10.0–11.3% having bad health in 2017 and 11.0% in 2021) and mental health dimensions (from 10.0% in 2010 to 13.2% having bad health in 2017, increasing to 17.4% in 2021) [79, 80]. This may indicate that population-based HRQoL is still slightly declining over time, although there seems to be an impact of COVID-19 around 2021 on mental health, in which long-lasting impacts on the population HRQoL are currently unknown. However, if anything, the estimates provided in the current study may be slightly conservative, e.g., lower, compared with the present time. Finally, any decline in HRQoL due to ageing populations over time can be factored in when modelling the population of interest using the provided regression estimates in catalogue two.

5 Conclusion

Along with the rapidly growing burden of chronic diseases in both prevalence, personal health and economic terms [33, 81,82,83,84,85], the future holds an increasing need for policy and decision-makers to prioritise and ensure most ‘health per pound or dollar’. The current study provides updated, larger catalogues of EQ-5D-3L preference scores for the UK and the USA. The catalogues describe aspects of disease burden, including identifying diseases with the lowest HRQoL and which can be used in CEA to allocate healthcare resources. To our knowledge, this is the largest and most comprehensive mean-based regression modelling of EQ-5D-3L scores for chronic conditions for these two countries. The models enable valid comparisons and assessments of the impact of HRQoL between 199 chronic conditions, socioeconomic factors, health risks and lifestyle factors.