Population norms for the EQ-5D-3L: a cross-country analysis of population surveys for 20 countries

This study provides EQ-5D population norms for 20 countries (N = 163,838), which can be used to compare profiles for patients with specific conditions with data for the average person in the general population in a similar age and/or gender group. Descriptive EQ-5D data are provided for the total population, by gender and by seven age groups. Provided index values are based on European VAS for all countries, based on TTO for 11 countries and based on VAS for 10 countries. Important differences exist in EQ-5D reported health status across countries after standardizing for population structure. Self-reported health according to all five dimensions and EQ VAS generally decreased with increasing age and was lower for females. Mean self-rated EQ VAS scores varied from 70.4 to 83.3 in the total population by country. The prior living standards (GDP per capita) in the countries studied are correlated most with the EQ VAS scores (0.58), while unemployment appeared to be significantly correlated in people over the age of 45 only. A country’s expenditure on health care correlated moderately with higher ratings on the EQ VAS (0.55). EQ-5D norms can be used as reference data to assess the burden of disease of patients with specific conditions. Such information, in turn, can inform policy-making and assist in setting priorities in health care.


Introduction
EQ-5D is a standardized health-related quality of life questionnaire developed by the EuroQol Group in order to provide a simple, generic measure of health for clinical and economic appraisal [1]. Applicable to a wide range of health conditions, it provides a simple descriptive profile, a self-report visual analogue scale (EQ VAS) and an index value ('utility') for health status that can be used in the clinical and economic evaluation of health care as well as in population health surveys.
Since EQ-5D was first developed, a substantial amount of research has been carried out worldwide using the instrument [2]. Among this research were surveys conducted in various countries that measured the health-related quality of life of the general population [3]. These EQ-5D surveys have been informative in providing new data on population health characteristics, complementing the traditionally collected morbidity and mortality data.
Although recently an expanded five-level version of the EQ-5D instrument (EQ-5D-5L) has become available and was translated for use across countries, the general population survey datasets available in the EuroQol archive that were analyzed in this study were still based on the original three-level version of the EQ-5D (EQ-5D-3L), here referred to as EQ-5D.
The purpose of the current study is to present EQ-5D population norms for 20 countries, including reported problems by the five EQ-5D dimensions, self-reported EQ VAS ratings (by country, age, and gender), and EQ-5D index values (by country, age, and gender). The index values, presented in country-specific value sets, are a major feature of the EQ-5D instrument. EQ-5D value sets are typically obtained using representative samples of the general public, thereby ensuring that they represent the societal perspective, traditionally based on visual analogue scale (VAS) and time trade-off (TTO) valuation techniques. Apart from VAS- and TTO-based value sets, we also included the European VAS-based value set as a common metric for all countries. We hypothesized that reported health problems will increase by age and will be higher for females. Cross-country analyses of population health based on EQ-5D are presented with the aim of exploring which macroeconomic factors are associated with the self-reported health of the population. Additionally, we performed exploratory analyses on comparing the different value sets.

Data
Datasets per country were generally made available through the central data archive of the EuroQol Research Foundation. Countries included in the analysis were: Argentina, Belgium, China, Denmark, England, Finland, France, Germany, Greece, Hungary, Italy, Korea, the Netherlands, New Zealand, Slovenia, Spain, Sweden, Thailand, United Kingdom, and the United States [4][5][6][7][8][9][10][11][12][13][14][15][16][17][18]. For two countries (Argentina and China), the dataset transfer to the central archive was not possible. For these countries, data were analyzed locally by two collaborating researchers (FA, SS, respectively). All of the surveys included the standardized three-level version of EQ-5D, using the appropriate language version in each country. The Dutch, Swedish, and Finnish versions were translated in 1987 according to a 'simultaneous' process while the remaining versions were translated according to the EuroQol Group's translation protocol based on international guidelines. Table 1 provides a detailed account of the data by country. All datasets were collected in representative samples of the general population for each country. The datasets were structured in a standardized format to facilitate comparative research, although each survey also has its own characteristics and variables specific to the individual research context in which they were conducted. The datasets captured for the current analyses include observations on 163,838 individuals. Sampling weights were applied for Belgium, England, France, Germany, Italy, the Netherlands, and Spain according to a stratified, multistage, cluster-area, probability-sample design [5]. For the United States, sampling weights were applied resulting from a sampling design including stratification, clustering, multiple stages of selection, and oversampling of minority populations [18].
Surveys differed in methods of data collection and sample sizes. Some of the surveys were postal, while others were performed as part of a face-to-face interview or administered by telephone. The Argentinean dataset had the largest sample with over 41,000 respondents, while the Greek and the Swedish national surveys had the smallest sample of around 500 respondents.

Methods of describing population norms
Population norm data were calculated for the five dimensions, self-rated EQ VAS, and EQ-5D index values for the total population, by gender, and the following age groups: 18-24, 25-34, 35-44, 45-54, 55-64, 65-74, and 75+ years. Aggregate EQ-5D dimension results were dichotomized, reporting the proportion of respondents scoring any problem on each dimension (the sum of the proportion of reported level-2 and level-3 problems). EQ-5D index values were calculated using the following value sets: European VAS value set for all countries, country-specific time trade-off (TTO) value set if available (11 countries), and country-specific VAS value set if available (10 countries).
The TTO method has played an important role in generating value sets for the EQ-5D as one of the most widely accepted preference elicitation methods in economic evaluation [19] and the method of choice in the first [20] and several subsequent large-scale EQ-5D valuation studies [21]. The VAS has become the other widely used valuation method to elicit preferences for the EQ-5D, including 9 countries. Note that the VAS valuation method needs to be distinguished from the EQ VAS, which is a self-reported rating of the respondents' own health. The European VAS value set was constructed using data from 11 valuation studies in 6 countries: Finland (1), Germany (3), The Netherlands (1), Spain (3), Sweden (1), and the UK (2). This survey included sufficient data from different European regions to make the European VAS dataset moderately representative for Europe [22,23]. Relevant information on the TTO- and VAS-based value sets, including the scoring algorithms, can be found in Szende et al. [21], Xie et al. [24], and Scalone et al. [25].
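The dichotomization of the dimension results can be illustrated with a minimal sketch. It assumes that each respondent is stored as a mapping from dimension name to response level (1, 2, or 3); the field names, the function name, and the four sample respondents are hypothetical and are not taken from the study datasets.

```python
from typing import Dict, List

# Hypothetical field names for the five EQ-5D dimensions.
DIMENSIONS = [
    "mobility",
    "self_care",
    "usual_activities",
    "pain_discomfort",
    "anxiety_depression",
]


def any_problem_proportions(responses: List[Dict[str, int]]) -> Dict[str, float]:
    """Return, per dimension, the percentage of respondents at level 2 or 3.

    Responses outside the valid levels 1-3 (for example missing values)
    are left out of the denominator for that dimension.
    """
    result: Dict[str, float] = {}
    for dim in DIMENSIONS:
        levels = [r[dim] for r in responses if r.get(dim) in (1, 2, 3)]
        if not levels:
            result[dim] = float("nan")
            continue
        with_problem = sum(1 for level in levels if level >= 2)
        result[dim] = 100.0 * with_problem / len(levels)
    return result


# Four made-up respondents, for illustration only.
sample = [
    {"mobility": 1, "self_care": 1, "usual_activities": 1, "pain_discomfort": 2, "anxiety_depression": 1},
    {"mobility": 2, "self_care": 1, "usual_activities": 2, "pain_discomfort": 3, "anxiety_depression": 1},
    {"mobility": 1, "self_care": 1, "usual_activities": 1, "pain_discomfort": 1, "anxiety_depression": 2},
    {"mobility": 1, "self_care": 1, "usual_activities": 1, "pain_discomfort": 2, "anxiety_depression": 1},
]
proportions = any_problem_proportions(sample)
```

The sketch ignores sampling weights; where a survey supplies them, the counts would be replaced by sums of weights.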
Results were tabulated in alphabetic order.

Cross-country analysis
It is important to note that while results in each age group may be compared across countries, the total population scores cannot be compared directly, as they reflect the unique age structure within each country. Cross-country summary data for reported problems by the five dimensions and EQ VAS were estimated using a standardized population structure for all countries with national EQ-5D surveys. Standardization for age was performed to avoid bias due to the fact that some populations have a relatively higher proportion of elderly people. Age standardization of reported problems by dimension and EQ VAS was based on the European population structure using Eurostat data from 2010 [26], using the following proportions for each age group: 11% (18-24), 17% (25-34), 18% (35-44), 18% (45-54), 15% (55-64), 11% (65-74), and 10% (75+).
To explore reasons for cross-country differences in EQ-5D data, correlations between country-specific EQ-5D data (five dimensions and self-rated EQ VAS) and country-specific macroeconomic indicators were calculated, including indicators of living standards and health system performance. Living standards were estimated by means of gross domestic product (GDP) per capita and unemployment rate. Indicators for health care system performance were health expenditure per capita and health expenditure as a percentage of GDP, number of hospital beds per 1000 people, and number of physicians per 1000 people. The indicators were selected on the basis of a presumed or possible relationship with self-reported health. Data were obtained from the World Health Organization Statistical Information System and the World Bank [27,28]. The data were from 2010 or the closest year with available data (Table 2). An alternative set of macro data was also used to see how results might change when using macro data from the same year as the EQ-5D data collection, including variables on gross national income on purchasing power parity, unemployment rate, and health expenditure data. A non-parametric measure (Spearman rank correlation) was used to assess the association between self-reported health using EQ-5D and the above-mentioned indicators of living standards and health system performance. We expected that poorer populations would show more reported health problems than richer populations, and that countries with a shorter life expectancy would also display more reported health problems. Generally, the positive association of good health with higher health expenditures probably rests on a common explanatory factor, i.e., wealth on the country level. As additional exploratory analysis, we performed linear regression analyses on macroeconomic indicators and mean VAS rating.
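The age standardization described above amounts to a weighted average of the age-group results, using the European proportions as weights. The following sketch shows the calculation for a mean EQ VAS; the weights are those given in the text, whereas the function name and the age-group means are hypothetical values chosen for illustration.

```python
from typing import Dict

# European population proportions by age group, as given in the text.
EUROPEAN_WEIGHTS: Dict[str, float] = {
    "18-24": 0.11,
    "25-34": 0.17,
    "35-44": 0.18,
    "45-54": 0.18,
    "55-64": 0.15,
    "65-74": 0.11,
    "75+": 0.10,
}


def age_standardize(group_values: Dict[str, float]) -> float:
    """Directly standardize age-group results to the European age structure."""
    missing = set(EUROPEAN_WEIGHTS) - set(group_values)
    if missing:
        raise ValueError(f"missing age groups: {sorted(missing)}")
    return sum(EUROPEAN_WEIGHTS[g] * group_values[g] for g in EUROPEAN_WEIGHTS)


# Hypothetical mean EQ VAS by age group for one country (not study data).
example_means = {
    "18-24": 85.0,
    "25-34": 83.0,
    "35-44": 80.0,
    "45-54": 77.0,
    "55-64": 74.0,
    "65-74": 70.0,
    "75+": 62.0,
}
standardized_vas = age_standardize(example_means)
```

The same function applies to the proportion of reported problems on a dimension, since both are age-group summaries weighted by the reference population.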
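As an illustration of the Spearman rank correlation used here, the sketch below ranks both variables (assigning average ranks to ties) and computes the Pearson correlation of the ranks. The study itself used SPSS and Stata; this pure-Python version and its input values are hypothetical and serve only to show the calculation.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical country-level values (not study data).
gdp_per_capita = [5.0, 12.0, 30.0, 41.0, 47.0]   # thousands of dollars
mean_eq_vas = [80.0, 71.0, 78.0, 82.0, 83.0]
rho = spearman(gdp_per_capita, mean_eq_vas)
```

Because only ranks enter the calculation, a country with an extreme GDP value affects the coefficient no more than any other country at the end of the ranking, which suits a small sample of 20 countries.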
The inclusion of both the European VAS value set as well as country-specific VAS value sets allowed for exploring the impact of the preferences of a specific country, using the European VAS value set as a reference. The inclusion of the country-specific TTO value sets also allowed for exploring the effect of valuation method (VAS versus TTO). All data analyses were performed using SPSS version 19 and Stata version 12 statistical software packages.

EQ-5D population norms
Results for reported problems along the five dimensions by gender for each country are presented in Table 3. As hypothesized, reported health problems were generally higher for females, with the exception of Slovenia. Problems with pain/discomfort were generally the most prevalent in each country, while problems with self-care were the least prevalent across countries. Thailand and Slovenia appeared to have generally high reported problems in all dimensions compared to other countries, while China and Korea showed the lowest reported problems. The pattern of reported problems across the five dimensions was rather similar across countries, although the absolute number of reported problems varies. Table 4 shows results for self-rated EQ VAS scores for each country by age and gender and for the total population. EQ VAS ratings decreased with increasing age and were generally lower for females in all countries, which confirmed our hypotheses. Country-specific differences can be observed in the overall level of health (mean EQ VAS ratings), and to a lesser extent in the level of health decrease (age-slope). Korea displayed a very small age slope. The age slope was considerably higher in Southeastern Europe compared to Northwestern Europe. Gender differences were generally more pronounced with increasing age, and stronger for some countries while almost absent in others (New Zealand, Slovenia, and Thailand). For illustrative purposes, Fig. 1 shows the detailed age and gender pattern for the pooled dataset.
EQ-5D index norm values based on the European value set generally decreased with age, with values ranging from [...] Population norms based on the European VAS value set were generally higher than or similar to country-specific VAS value sets (except for Germany), while population norms based on country-specific TTO value sets tended to be higher compared to the same countries using country-specific VAS-based value sets (see Tables 5, 6).
[...] reported by Slovenia and Thailand. It needs to be noted that while Hungary and Korea reported a lower mean EQ VAS than Slovenia and Thailand, generally more problems were reported in Slovenia and Thailand across the five dimensions. At the other end of the spectrum, China reported the lowest proportion of problems but reported average EQ VAS ratings, while Denmark and the UK reported the highest EQ VAS ratings and average proportions of problems. These results indicate that countries also differed in the overall level of health resulting from the more general EQ VAS question relative to the more specific questions on the EQ-5D dimensions. Table 8 shows the association on the country level of the macroeconomic indicators and the EQ VAS rating and reported health problems. As hypothesized, the prior living standards (GDP per capita) and health expenditure per capita in the countries studied were correlated with the mean EQ VAS scores (0.58 and 0.55, respectively). Unemployment significantly correlated in people over the age of 45 only. The number of physicians did not correlate with better EQ VAS data (0.03). Contrary to our expectations, life expectancy showed no significant association.

Cross-country comparison
Table 3 Reported problems by five dimensions (proportions (%) of respondents scoring any problem, not standardized)

Country      Mobility    Self-care   Usual activities   Pain/discomfort   Anxiety/depression
Argentina    13    9      3    2     10    6            36   25           26   19
Belgium      15   10      5    3     15   10            31   26            8    5
China         6    4      3    3      6    4            13    8           10    7
Denmark      12   10      3    2     20   15            40   33           19   12
Finland      29   24     12    9     24   18            52   43           15   12
France       16   11      4    4     11    9            38   33           16   13
Germany      17   15      3    2     11    9            30   25            5    4
Greece       14   13      9    3     12    9            20   14           12

Table 4 Self-reported EQ VAS ratings by age group and total population (mean values, not standardized)

The positive relationship between living standards and self-reported EQ VAS was further examined and is graphically presented in Fig. 2. As shown, EQ VAS correlated well with a country's GDP, although China and Thailand were outliers with an exceptionally low GDP (combined with relatively high EQ VAS scores). The European value set showed a more moderate correlation with GDP with only China as outlier and a smaller slope. Linear regression analyses showed that GDP level explained 29% of EQ VAS at the country level (p = 0.02), but explained 67% of the EQ VAS when excluding 'outliers' China and Thailand. Health expenditure per capita was the only other statistically significant explanatory factor that explained 26% of the country mean VAS (p = 0.03). Another set of regression analyses, which used macro data from the year of EQ-5D data collection in each country on gross national income expressed in purchasing power parity in 2010 values, did not yield statistically significant results. However, health care expenditure remained a statistically significant factor (p = 0.03), explaining 27% of variation in the country mean VAS scores.
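The share of variance explained in these regressions is the R-squared of a simple linear regression of country mean EQ VAS on one indicator. The sketch below shows how such a figure is obtained and how removing low-GDP, high-VAS outliers can raise it. All country labels and numbers are hypothetical and do not reproduce the study data or its reported percentages.

```python
from typing import List, Sequence, Tuple


def simple_ols(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Fit y = intercept + slope * x by least squares.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or ss_tot == 0:
        raise ValueError("x and y must both vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical data: (label, GDP per capita in thousands, mean EQ VAS).
countries: List[Tuple[str, float, float]] = [
    ("A", 4.0, 80.0),   # low GDP, high EQ VAS: treated as an outlier
    ("B", 5.0, 79.0),   # low GDP, high EQ VAS: treated as an outlier
    ("C", 20.0, 72.0),
    ("D", 30.0, 76.0),
    ("E", 40.0, 77.0),
    ("F", 50.0, 81.0),
]
outliers = {"A", "B"}

_, _, r2_all = simple_ols([c[1] for c in countries], [c[2] for c in countries])

kept = [c for c in countries if c[0] not in outliers]
slope_kept, _, r2_kept = simple_ols([c[1] for c in kept], [c[2] for c in kept])
```

In this made-up example the two outliers almost cancel the positive slope seen among the remaining countries, so R-squared rises sharply once they are excluded, which is the same qualitative pattern as the change from 29% to 67% reported above.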

Discussion
The current study generated population norms for self-rated EQ VAS and EQ-5D index values, and for self-reported problems on each of the five dimensions of the EQ-5D descriptive system for 20 countries, all classified by age. These EQ-5D norms are highly relevant for future research initiatives, as they can be used to compare EQ-5D data from patients to the average person in the general population of a certain country in a similar age (or gender) group, which also helps to identify the burden of disease of patients or patient groups. This multi-country analysis is unique in terms of reporting EQ-5D data based on a standard methodology and allowing for comparisons across countries and explaining differences using macroeconomic indicators.
Our hypothesis on age and gender was confirmed by results for both the EQ VAS and reported problems on the five dimensions (where the age effect was visible through the index values, providing a summary score for the five dimensions). Cross-country differences occurred in EQ-5D outcomes in terms of the overall level of health but also in terms of the age slope, which was considerably higher in Southeastern Europe compared to Northwestern Europe. The overall patterns in each country regarding reported problems were remarkably similar in terms of pain/discomfort being the most prevalent and self-care being the least prevalent problem. However, the actual rates of reporting problems differed widely across countries after accounting for demographic differences, and no consistent trend was observed on how countries score in terms of EQ VAS relative to morbidity reported along the five dimensions, which seems to indicate that the EQ VAS is measuring a different (or at least wider) health concept than the five dimensions of EQ-5D, or that countries differ in responses to the various dimensions. An obvious implication of these findings for multi-country studies with the EQ-5D is the need to factor in the country of origin of patients when analyzing and interpreting results. In addition, when examining population norms for EQ-5D index values, results highlighted the importance of also taking into account the value set used to calculate the EQ-5D index when interpreting results or making comparisons across studies. Country-specific value sets are generally recommended for use in the corresponding country, while for comparative purposes, the European value set seems to be the most suitable choice. Country-specific value sets showed differences between valuation methods, which is consistent with previous evidence indicating that TTO methodology leads to higher values than VAS-based techniques [29].
The fact itself that self-reported health differs across countries is not unexpected. Previous studies, such as those based on categorical assessment of self-assessed health [30], or those based on generic quality of life questionnaires [31], found that self-reported health differed across countries. These cross-country differences in the general level of health (EQ VAS) were at least partially explained by looking at macro data on the living standards and health system characteristics of each country. The analysis highlighted that it is the prior living standards of a country that mostly explain cross-country differences in self-reported health. Indeed, the result that GDP level explained 67% of EQ VAS at the country level when excluding two 'outlier' countries underlined the high importance of viewing self-reported health within a broader macroeconomic context. At the same time, health expenditure per capita was also quantified to be an important factor, one that policy-makers at a national level have more control over than determining annual GDP. In addition, while GDP showed a stronger correlation with VAS than health expenditure, a dollar unit of health expenditure had eight times the impact of a dollar unit of GDP on the country mean VAS scores (with coefficients of 0.0001 for GDP and 0.0008 for health expenditure). However, expenditure might be confounded with GDP, since a high GDP might lead to higher health care expenditures, which in turn might influence the number and quality of interventions per capita, and consequently lead to better health in a population.
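The factor of eight follows directly from the ratio of the two regression coefficients. As a worked illustration, assume that each coefficient expresses the change in country mean EQ VAS points per dollar per capita and that the relationships are linear; the increase of 1000 dollars per capita is a hypothetical value chosen only to make the magnitudes readable.

```latex
\begin{align*}
\Delta \mathrm{VAS}_{\mathrm{GDP}} &= 0.0001 \times 1000 = 0.1 \\
\Delta \mathrm{VAS}_{\mathrm{HE}}  &= 0.0008 \times 1000 = 0.8 \\
\frac{\Delta \mathrm{VAS}_{\mathrm{HE}}}{\Delta \mathrm{VAS}_{\mathrm{GDP}}} &= \frac{0.0008}{0.0001} = 8
\end{align*}
```

Under these assumptions, the same dollar increase is associated with 0.8 EQ VAS points when spent on health care versus 0.1 points as general GDP, subject to the confounding between the two indicators noted above.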
The most important limitation of this analysis relates to differences in samples across countries. While all samples were representative samples of the general population of each country, differences exist across study methodologies, such as sample size, administration method, purpose of data collection, and time of the data collection. While adjustments were made for sample structure, some of these factors may have influenced the comparability of the results. In particular, some surveys in the dataset archive were older, and limited evidence suggests that population norms may or may not change over time, depending on the country [3]. Non-response may have introduced a potential bias towards underestimation of self-reported health problems. Some countries applied a sampling design, whereas other countries did not, which might lead to a more accurate reflection of representativeness for the former. Although mode of administration might contribute to observed differences, a recent study showed equivalence between various modes of administration using the EQ-5D [32]. Further variability between countries might be caused by translations of the different versions of the EQ-5D. Another limitation is the use of the European population structure for age standardization, which might not be fully justified for the non-European countries, especially for China, where the population structure is quite different. Finally, influences due to reporting behavior heterogeneity, such as education, might also impact variability between self-reported health problems [33].
While results from these analyses can be used to compare profiles for patients with specific conditions or to assess the burden of disease in question, understanding inequalities in self-assessed health among the population is also important, but fell beyond the aims of this paper. However, more in-depth analyses on contributors to levels of population health could be important.
Finally, this manuscript focused on existing data from the three-level version of the EQ-5D instrument; however, a more refined version of the EQ-5D (EQ-5D-5L), which extends the three response levels in each dimension to five levels, has been introduced [34]. The extra levels are expected to lead to a much more accurate reflection of population health, especially in relation to mild health problems. Further important research in the field would be the reporting of population norms using the EQ-5D-5L version of the questionnaire.