Background

Generic preference-accompanied measures (PAMs) are health status measures that consist of two parts: the first is a descriptive system, and the second is a value set that allows assigning utilities to health profiles defined by the descriptive system. Over the past decades, an increasing number of generic PAMs have been developed, such as the EQ-5D, Short-Form 6-Dimension (SF-6D), Assessment of Quality of Life (AQoL) and Health Utilities Index (HUI) [1]. Despite the abundance of PAMs, the most commonly used one on an international level is the EQ-5D [2, 3]. In over 20 countries, national health technology assessment guidelines recommend the use of this instrument in cost-utility analyses of health interventions [4]. It was developed in the late 1980s by an international organization, the EuroQol Group, and currently, it has two versions for adults, the EQ-5D-3L and the newer EQ-5D-5L [5, 6]. Both versions showed good validity in several countries, languages, and patient populations [7, 8].

The 15D is a 15-dimensional generic PAM, which was developed in Finland starting from the 1970s [9]. The instrument has been validated in numerous patient populations and translated to multiple languages; however, its popularity is predominantly concentrated in the Nordic countries [10]. Country-specific 15D value sets have been developed in Finland [11], Denmark [12], and Norway [13, 14]. Two countries (Norway and Chile) mention the 15D as an acceptable instrument in their health technology assessment guidelines [15, 16]. It has also been used in several cost-effectiveness and cost-utility analyses in different countries and as part of health technology assessment dossiers, in conditions such as hip and knee arthrosis [17], Parkinson’s disease [18], cataract [19], acute and chronic liver failure [20, 21] and anorexia nervosa [22] in Finland, multiple myeloma in three Nordic countries (Denmark, Norway, Sweden) [23], stress urinary incontinence in Canada [24], and breast cancer in Iran [25].

Compared to the EQ-5D, the descriptive system of the 15D is considerably longer, more detailed, and comprehensive. Notwithstanding, the 15D and EQ-5D-5L instruments are similar in many aspects, which offers a strong basis for comparison. Firstly, a range of corresponding dimensions can be found between the two measures with similar wording, such as mobility, usual activities, pain/discomfort, and anxiety/depression/distress. Secondly, on both instruments each dimension of health has one item with five response levels measured on a severity or capability scale. Finally, both instruments investigate the current health status of the respondent. A few studies compared the measurement properties of the EQ-5D-3L and 15D in different patient populations, such as epilepsy [26], HIV/AIDS [27] and stroke [28] in Norway, multiple chronic conditions [29], chronic pain [30], critical care setting [31] and patients after cardiac surgery [32] in Finland. However, to date, only one study has examined the psychometric properties of the EQ-5D-5L and 15D, in a sample of Parkinson’s disease in Spain [33]. Furthermore, no studies have provided a comparison of the measurement properties of either the EQ-5D-3L or EQ-5D-5L and 15D in a general population sample.

Comparing PAMs in different populations is important to inform researchers, analysts and health policy decision-makers about their performance and support the choice of instrument for cost-utility analysis. Although the EQ-5D-5L has proved to be a valid instrument in a multitude of health conditions, it might not capture all important aspects of health, especially in sensory disorders [34] and mental health conditions [35]. Furthermore, a clear need emerged to include extra dimensions in the EQ-5D-5L, so-called “bolt-ons” [36]. In that respect, the 15D with its broader descriptive system may offer advantages over the EQ-5D-5L. On the other hand, the 15D needs to fit many more dimensions into the utility space, allowing on average less space for each separate dimension. Given the abovementioned similarities between the EQ-5D-5L and 15D, one may almost consider the 15D a variant of the EQ-5D-5L expanded with bolt-ons. Interestingly, five of the 15 dimensions of 15D have a corresponding EQ-5D dimension and further eight have earlier been proposed as possible bolt-ons for the EQ-5D (vision, sleeping, hearing, vitality, speech, breathing, mental function, and sexual activities) [37]. A comparative analysis between the two instruments may also provide new evidence that can later support the development of candidate bolt-on dimensions.

Therefore, in this study, we aim to conduct an exploratory analysis that compares the measurement properties of the EQ-5D-5L and 15D in a large general population sample in Hungary. We compare measurement properties of both the descriptive systems and the index values focusing on ceiling and floor effects, informativity, agreement, redistribution properties, convergent and known-groups validity.

Methods

Study design

A cross-sectional survey was conducted with a targeted sample size of 2000 members of the Hungarian adult general population (response rate 77.8%). The broader aim of the survey was to assess the mental health of the population. Permission for conducting the study was granted by the Research Ethics Committee of the Corvinus University of Budapest (no. KRH/166/2021). Participants were recruited in August 2021 from one of the largest available online panels in Hungary by a third-party survey company. Respondents registered voluntarily to complete surveys in return for points, which could be redeemed for rewards. Respondents were included if they were at least 18 years old at the time of completion, gave informed consent, and confirmed that they had understood the terms and were willing to participate. ‘Soft’ quotas were applied to ensure the representativeness of the sample for the general population by age, gender, highest level of education, geographical region, and settlement type.

Outcome measures

A self-administered survey was designed for the study that asked questions about health-related quality of life, well-being, presence of physical and mental health conditions, resource utilization related to mental health care, and sociodemographic characteristics. The list of the physical health conditions was selected according to the 2019 Hungarian results of the European Health Interview Survey (EHIS) [38] complemented by some common chronic diseases. Similarly, the list of mental health conditions was chosen according to the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) [39]. We asked respondents to report any physical and mental health conditions experienced in the past 12 months in two questions. Firstly, they had to state whether they had any self-reported physical or mental health conditions. Secondly, they had to mark those that were also diagnosed by a physician. All participants completed a set of standardized questionnaires, including the validated Hungarian versions of EQ-5D-5L and 15D. The participants answered the questions in a fixed order, starting with the EQ-5D-5L, and multiple questions were placed between the EQ-5D-5L and 15D.

As a base case, we used the Danish value sets for both the EQ-5D-5L [40] and 15D [12], because currently, Denmark is the only country with national value sets for both measures. However, using these value sets may have limitations. They were developed in different decades, using different preference elicitation methods, and thus have largely different value set ranges. Furthermore, using Danish value sets for Hungary may also pose additional problems given the differences in sociodemographic and economic characteristics and cultural values between the two countries [41]. Therefore, to test the robustness of our results, we repeated all analyses using the Hungarian EQ-5D-5L [42] and Norwegian 15D value sets [14]. The former was selected because of the study country, while the latter was selected as the most recently developed 15D value set, with a value set range similar to that of the Hungarian EQ-5D-5L value set.

EQ-5D-5L

The EQ-5D-5L is a generic PAM that consists of two parts: a five-item descriptive system and a 0–100 visual analogue scale (EQ VAS) [5, 6]. The descriptive system contains five dimensions of health: mobility, self-care, usual activities, pain/discomfort and anxiety/depression, each with five response levels (no problems = 1, slight problems = 2, moderate problems = 3, severe problems = 4 and unable to/extreme problems = 5), allowing 5^5 = 3125 unique health states in total [6]. Respondents are asked to recall their current health state (i.e., ‘your health today’). The Danish value set used as a base case in this study is based on a heteroscedastic censored hybrid model using both composite time trade-off (cTTO) and discrete choice experiment (DCE) data from a representative adult general population sample in Denmark (data collection period 2018–19) [40]. The lowest possible value with this value set is −0.758, where negative values describe health states considered to be worse than dead and 1 refers to full health. The Hungarian value set that was used for the sensitivity analysis had been estimated by a pooled heteroscedastic Tobit model using cTTO data from a representative sample of the Hungarian adult general population (data collection period 2018–19) [42]. Index values range from −0.848 to 1 with this value set.

15D

The 15D is another generic PAM that covers 15 dimensions of health-related quality of life: mobility, vision, hearing, breathing, sleeping, eating, speech, excretion, usual activities, mental function, discomfort and symptoms, depression, distress, vitality, and sexual activities [9]. Each of these dimensions has five response levels (1 being the best and 5 the worst), giving 5^15 (more than 30 billion) possible distinct health states. The 15D asks respondents to recall their current health (i.e., ‘present health status’). The Danish value set was selected in this study as a base case. This was developed using an additive model of the multi-attribute utility theory based on valuations on a 0–100 visual analogue scale (VAS). Firstly, respondents were asked to weigh the top and bottom levels of each dimension individually on a VAS, then they were asked to assign a score to each level of each dimension on VAS (‘within dimension tasks’). Data were collected in 2001 and preferences of the non-institutionalized general population of Denmark aged 18–75 were assessed [12]. The index values of the final value set range from 0.160 to 1. The Norwegian value set, used for the sensitivity analysis, also relies on an additive model [14]. However, it only kept the ‘within dimension tasks’ from the original valuation that was supplemented by a pits-task, whereby respondents were asked to rate the worst possible health state on a VAS together with ‘being dead’. Data were collected in 2010 and 2015–16 from a representative sample of the Norwegian general population. The index values range from −0.516 to 1.
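As a quick check of the size of the two descriptive systems, the number of distinct health states follows from raising the number of response levels to the number of dimensions. The short Python sketch below is illustrative only; the function name is hypothetical and not part of either instrument's documentation.

```python
def n_health_states(levels: int, dimensions: int) -> int:
    """Count distinct profiles when every dimension has `levels` response levels."""
    return levels ** dimensions


# EQ-5D-5L: 5 dimensions with 5 levels each.
eq5d5l_states = n_health_states(levels=5, dimensions=5)
# 15D: 15 dimensions with 5 levels each.
d15_states = n_health_states(levels=5, dimensions=15)

print(eq5d5l_states)  # 3125
print(d15_states)     # 30517578125
```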

Statistical analyses

Our analytical framework builds on previous studies that compared the measurement properties of other generic PAMs [43–46]. As a result of a technical problem in the online survey interface, a few respondents’ EQ-5D-5L responses may have been inadvertently recorded as level 5 responses. Therefore, the research team carefully examined all level 5 responses in the EQ-5D-5L and compared them with other information provided by the respondents (i.e. self-reported health status on other measures, physician-diagnosed physical and mental health conditions). As a result, 113 participants were excluded from the sample before the statistical analysis. To compare the two instruments, corresponding dimensions of EQ-5D-5L and 15D were matched, e.g. EQ-5D-5L mobility and 15D mobility. All analyses were performed on the total sample, and also for two subsets of respondents: (1) respondents with physical health conditions, and (2) respondents with mental health conditions. Statistical analyses were performed using R Statistical Software (version 4.1.1; R Foundation for Statistical Computing, Vienna, Austria). All the statistical tests were two-sided, and p < 0.05 was considered statistically significant.

Ceiling and floor

The proportion of participants reporting ‘no problems’ (ceiling) and ‘extreme problems’ (floor) was computed for each dimension of the descriptive systems. In addition, we calculated the ceiling and floor for the EQ-5D-5L and 15D health profiles, i.e. ‘no problems’ and ‘extreme problems’ in all dimensions, respectively. We expected a higher overall ceiling in the EQ-5D-5L than the 15D at an instrument level since the descriptive system of the latter is more detailed [28].
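The ceiling and floor definitions above can be illustrated with a minimal Python sketch. It is not the code used in the study (which was analysed in R); the function names and the toy profiles are hypothetical, and responses are assumed to be coded 1 (no problems) to 5 (extreme problems).

```python
from typing import Sequence, Tuple

BEST, WORST = 1, 5  # assumed coding: 1 = no problems, 5 = extreme problems


def dimension_ceiling_floor(responses: Sequence[int]) -> Tuple[float, float]:
    """Percentage of respondents at the best and worst level of one dimension."""
    n = len(responses)
    ceiling = 100.0 * sum(r == BEST for r in responses) / n
    floor = 100.0 * sum(r == WORST for r in responses) / n
    return ceiling, floor


def profile_ceiling_floor(profiles: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Percentage of respondents at the best / worst level in ALL dimensions."""
    n = len(profiles)
    ceiling = 100.0 * sum(all(r == BEST for r in p) for p in profiles) / n
    floor = 100.0 * sum(all(r == WORST for r in p) for p in profiles) / n
    return ceiling, floor


# Toy EQ-5D-5L-style profiles (made up for illustration, not study data).
toy_profiles = [(1, 1, 1, 1, 1), (1, 1, 1, 2, 1), (1, 1, 1, 1, 1), (5, 5, 5, 5, 5)]
toy_pain = [p[3] for p in toy_profiles]  # fourth dimension only

print(dimension_ceiling_floor(toy_pain))    # (50.0, 25.0)
print(profile_ceiling_floor(toy_profiles))  # (50.0, 25.0)
```

Because a profile counts towards the instrument-level ceiling only if every dimension is at its best level, an instrument with more dimensions gives respondents more opportunities to leave the ceiling.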

Informativity

The informativity of EQ-5D-5L and 15D dimensions, index values, and health state profiles was examined by Shannon’s (absolute informativity, H′) and Shannon’s Evenness (relative informativity, J′) indices [47, 48]. The Shannon index (H′) can be defined as

$$H^{\prime} = -\sum_{i=1}^{L} p_{i} \log_{2} p_{i}$$

where p_i is the proportion of observations in the i-th level (where i = 1, …, L), and L is the number of levels in a dimension of the descriptive system. The greatest amount of information can be gathered if the responses are equally used across the levels. The Shannon Evenness index (J′) measures the evenness of distribution and was calculated as

$$J^{\prime} = \frac{H^{\prime}}{H^{\prime}_{\max}} = \frac{-\sum_{i=1}^{L} p_{i} \log_{2} p_{i}}{\log_{2} L}$$

Thus, H′ ranges from 0 to log_2 L, and J′ ranges from 0 to 1, where a higher value indicates better informativity.
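The two indices can be computed directly from the observed response frequencies. The following Python sketch is a minimal illustration under the definitions given here, not the study's own code; the function name and toy responses are hypothetical. Levels that no respondent selected contribute nothing to the sum, following the usual convention that p log p tends to 0 as p tends to 0.

```python
import math
from collections import Counter
from typing import Sequence, Tuple


def shannon_indices(responses: Sequence[int], n_levels: int = 5) -> Tuple[float, float]:
    """Return (H', J') for one dimension with `n_levels` possible levels."""
    n = len(responses)
    h = 0.0
    # Only observed levels appear in the Counter, so unused levels add 0.
    for count in Counter(responses).values():
        p = count / n
        h -= p * math.log2(p)
    j = h / math.log2(n_levels)  # H' divided by its maximum, log2(L)
    return h, j


# Responses spread evenly across all five levels give the maximum H' and J' = 1.
h_even, j_even = shannon_indices([1, 2, 3, 4, 5])
# Responses concentrated on one level carry no information.
h_flat, j_flat = shannon_indices([1, 1, 1, 1])
# A skewed toy distribution falls in between.
h_skew, j_skew = shannon_indices([1] * 6 + [2] * 2)

print(round(h_even, 3), round(j_even, 3))
print(round(h_skew, 3), round(j_skew, 3))
```

The same function could be applied to index values after grouping them into bins, with `n_levels` set to the number of bins.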

Inconsistencies and agreement

We performed cross-tabulations of the corresponding EQ-5D-5L and 15D dimensions to explore how consistent the responses were. We considered an EQ-5D-5L and 15D response pair inconsistent if the 15D response was at least two levels away from the EQ-5D-5L response [49]. The average size of inconsistencies was assessed according to the following weights: 0 if EQ-5D-5L and 15D responses did not differ more than 1 level, 1 if responses differed by 2 levels, and so forth [49].
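The inconsistency rule and its weights can be sketched as follows. This Python example is illustrative only and uses made-up responses. It assumes that the average size of inconsistency is taken over the inconsistent pairs only (weight = level difference minus 1), which is our reading of the weighting scheme rather than something stated explicitly; the function name is hypothetical.

```python
from typing import Sequence, Tuple


def inconsistency_summary(eq5d: Sequence[int], d15: Sequence[int]) -> Tuple[float, float]:
    """Return (rate of inconsistent pairs in %, average size of inconsistency)."""
    if len(eq5d) != len(d15):
        raise ValueError("both response vectors must have the same length")
    # Weight 0 if responses differ by at most 1 level, 1 if by 2 levels, and so forth.
    weights = [max(abs(a - b) - 1, 0) for a, b in zip(eq5d, d15)]
    inconsistent = [w for w in weights if w > 0]
    rate = 100.0 * len(inconsistent) / len(weights)
    # Assumption: the average is computed over inconsistent pairs only.
    avg_size = sum(inconsistent) / len(inconsistent) if inconsistent else 0.0
    return rate, avg_size


# Toy responses on a matched dimension pair (e.g. mobility), levels 1-5.
toy_eq5d = [1, 1, 2, 3, 1, 5]
toy_15d = [1, 2, 4, 3, 4, 1]

rate, avg_size = inconsistency_summary(toy_eq5d, toy_15d)
print(rate, avg_size)  # 50.0 2.0
```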

The agreement between the EQ-5D-5L and 15D index values was examined using the intraclass correlation coefficient (ICC) [50] and a Bland–Altman plot [51]. A two-way random model with absolute agreement was applied to obtain an ICC value [52]. Agreement was considered poor for 0 ≤ ICC < 0.4, fair for 0.4 ≤ ICC < 0.6, good for 0.6 ≤ ICC < 0.75, and excellent for 0.75 ≤ ICC < 1 [53].
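For illustration, the agreement statistics can be sketched in plain Python. The sketch assumes the single-measures form of the two-way random, absolute-agreement ICC (often written ICC(A,1)); the text does not state whether single or average measures were used, so this is an assumption. The limits of agreement are computed as the mean difference ± 1.96 standard deviations of the differences. Function names and toy index values are hypothetical, and this is not the study's R code.

```python
import statistics
from typing import Sequence, Tuple


def icc_a1(x: Sequence[float], y: Sequence[float]) -> float:
    """Two-way random, absolute agreement, single measures ICC for two instruments."""
    n, k = len(x), 2
    grand = (sum(x) + sum(y)) / (n * k)
    row_means = [(a + b) / 2 for a, b in zip(x, y)]   # per respondent
    col_means = [sum(x) / n, sum(y) / n]              # per instrument
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for v in list(x) + list(y))
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def classify_icc(icc: float) -> str:
    """Apply the cut-offs used in the text."""
    if icc < 0.4:
        return "poor"
    if icc < 0.6:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"


def bland_altman_limits(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Return (mean difference, lower limit, upper limit) of agreement."""
    diffs = [a - b for a, b in zip(x, y)]
    d = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    return d, d - 1.96 * sd, d + 1.96 * sd


# Toy index values for five respondents (made up, not study data).
toy_eq5d = [1.0, 0.9, 0.8, 0.5, 0.2]
toy_15d = [1.0, 0.95, 0.9, 0.8, 0.6]

toy_icc = icc_a1(toy_eq5d, toy_15d)
mean_diff, lower, upper = bland_altman_limits(toy_eq5d, toy_15d)
print(round(toy_icc, 3), classify_icc(toy_icc))
print(round(mean_diff, 3), round(lower, 3), round(upper, 3))
```

Because the model targets absolute agreement, a systematic shift between the two index values lowers the ICC even when the two instruments rank respondents identically.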

Convergent validity

We examined the convergent validity between the EQ-5D-5L and 15D dimensions (Spearman’s correlation) and index values (Pearson’s correlation). The absolute value of the correlation coefficient (r) was interpreted as follows: very weak correlation |r| < 0.2, weak correlation 0.2 ≤ |r| < 0.4, moderate correlation 0.4 ≤ |r| < 0.6 and strong correlation 0.6 ≤ |r| ≤ 1 [54]. We expected higher correlations among the corresponding dimensions covering similar aspects of health [26].
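The interpretation bands for the correlation coefficients can be written as a small helper. This is only an illustrative Python sketch with a hypothetical function name; the sign of the coefficient is ignored because the bands are defined on |r|.

```python
def correlation_strength(r: float) -> str:
    """Label a correlation coefficient using the |r| bands given in the text."""
    a = abs(r)
    if a > 1:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    if a < 0.2:
        return "very weak"
    if a < 0.4:
        return "weak"
    if a < 0.6:
        return "moderate"
    return "strong"


# Illustrative coefficients, including negative ones.
for r in (0.115, -0.346, 0.558, 0.619):
    print(r, correlation_strength(r))
```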

Known-groups validity

Known-groups validity was evaluated for self-reported physician-diagnosed health condition groups in contrast to being healthy. We hypothesized that respondents with a diagnosed physical or mental condition had significantly lower EQ-5D-5L and 15D index values. Student’s t test was used to compare the healthy and non-healthy groups. Effect size (ES, Cohen’s d) and relative efficiency (RE) were calculated. ES values were interpreted as negligible for d < 0.2, small for 0.2 ≤ d < 0.5, medium for 0.5 ≤ d < 0.8, and large for 0.8 ≤ d [55]. The RE was calculated as the ratio of the ESs of the two indices, where the 15D test statistic was used as reference; thus, an RE > 1 indicated that the EQ-5D-5L was more efficient in discriminating between two subgroups. To test whether the RE statistically differs from 1, 95% confidence intervals were calculated using 2000 bootstrap samples with accelerated bias correction.
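The effect size and relative efficiency calculations can be sketched as below. This Python example is a simplified illustration, not the study's R code: it assumes Cohen's d with a pooled standard deviation, and it uses a plain percentile bootstrap rather than the accelerated bias correction applied in the study. All function names and the simulated toy data are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Pair = Tuple[float, float]  # (EQ-5D-5L index, 15D index) for one respondent


def cohens_d(reference: Sequence[float], comparison: Sequence[float]) -> float:
    """Cohen's d using the pooled standard deviation (assumed variant)."""
    n1, n2 = len(reference), len(comparison)
    m1, m2 = sum(reference) / n1, sum(comparison) / n2
    v1 = sum((v - m1) ** 2 for v in reference) / (n1 - 1)
    v2 = sum((v - m2) ** 2 for v in comparison) / (n2 - 1)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd


def relative_efficiency(healthy: Sequence[Pair], condition: Sequence[Pair]) -> float:
    """Ratio of effect sizes with the 15D as reference (RE > 1 favours EQ-5D-5L)."""
    d_eq = cohens_d([p[0] for p in healthy], [p[0] for p in condition])
    d_15d = cohens_d([p[1] for p in healthy], [p[1] for p in condition])
    return d_eq / d_15d


def bootstrap_re_ci(healthy: Sequence[Pair], condition: Sequence[Pair],
                    n_boot: int = 2000, seed: int = 1) -> Tuple[float, float]:
    """Percentile 95% CI for RE; respondents are resampled within each group."""
    rng = random.Random(seed)
    estimates: List[float] = []
    for _ in range(n_boot):
        h = rng.choices(healthy, k=len(healthy))
        c = rng.choices(condition, k=len(condition))
        estimates.append(relative_efficiency(h, c))
    estimates.sort()
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot) - 1]


# Simulated toy data (not study data): index values capped at full health (1.0).
toy_rng = random.Random(42)
toy_healthy = [(min(1.0, toy_rng.gauss(0.95, 0.05)), min(1.0, toy_rng.gauss(0.96, 0.04)))
               for _ in range(60)]
toy_condition = [(min(1.0, toy_rng.gauss(0.70, 0.20)), min(1.0, toy_rng.gauss(0.85, 0.10)))
                 for _ in range(60)]

re_point = relative_efficiency(toy_healthy, toy_condition)
ci_low, ci_high = bootstrap_re_ci(toy_healthy, toy_condition)
print(round(re_point, 3), round(ci_low, 3), round(ci_high, 3))
```

Under this scheme, the RE would be regarded as statistically different from 1 when the interval excludes 1. Resampling respondents as pairs keeps each person's EQ-5D-5L and 15D values together, so the correlation between the two indices is preserved in every bootstrap sample.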

Results

Characteristics of the study population

The distribution of the sample (n = 1887) reasonably approximated that of the general population in terms of sociodemographics (Additional file 1: Supplementary material 1). Altogether 63.4% of the sample responded that they had one or more physical conditions and 35.2% reported at least one mental health condition diagnosed by a physician.

Dimension-level analysis

As for the EQ-5D-5L dimensions, the floor varied between 0.2% (usual activities) and 1.2% (anxiety/depression), while the ceiling ranged from 50.8% (pain/discomfort) to 87.7% (self-care) (Table 1). Regarding the 15D dimensions, the floor reached its lowest at 0.2% (eating) and its highest at 3.9% (sexual activities), while for the ceiling, the values varied between 48.4% (sleeping) and 94.4% (eating). The EQ-5D-5L had a lower ceiling in all corresponding dimension pairs, except for the EQ-5D-5L anxiety/depression vs. 15D distress pair. The largest difference in ceiling was found between EQ-5D-5L pain/discomfort (50.8%) and 15D discomfort and symptoms (68.2%). Similarly, the floor was equal or lower in the EQ-5D-5L for all pairs but EQ-5D-5L anxiety/depression vs. 15D depression. The largest difference in floor was seen between EQ-5D-5L anxiety/depression (1.2%) and 15D distress (1.7%).

Table 1 Floor and ceiling of EQ-5D-5L and 15D

EQ-5D-5L outperformed 15D regarding relative informativity (J′) for all dimensions (ranging from 0.51 to 0.70 for the EQ-5D-5L and from 0.44 to 0.69 for the 15D), except for the EQ-5D-5L anxiety/depression (0.65) vs. 15D distress (0.69) (Table 2). Considering all dimensions of each instrument, the average J′ values showed better results for the EQ-5D-5L (0.56) than for the 15D (0.49).

Table 2 Relative informativity of EQ-5D-5L and 15D (Shannon’s Evenness index)

Responses covered all levels in both the EQ-5D-5L and 15D among the corresponding dimensions (Additional file 1: Supplementary materials 2–5). The rate of inconsistent response pairs ranged from 4.6% (EQ-5D-5L anxiety/depression and 15D depression) to 7.9% (EQ-5D-5L mobility and 15D mobility). The average size of inconsistency was relatively low, ranging from 1.20 to 1.24.

As for the corresponding dimensions, we observed a strong correlation between the EQ-5D-5L and 15D usual activities dimensions (0.619) (Table 3). The EQ-5D-5L anxiety/depression dimension correlated more strongly with 15D depression (0.690) than with 15D distress (0.642). A moderate correlation was found between the two mobility dimensions (0.558), as well as between the EQ-5D-5L dimension pain/discomfort and the 15D dimension discomfort and symptoms (0.583). The non-corresponding dimension pairs were correlated weakly to moderately, ranging from 0.115 (EQ-5D-5L mobility and 15D eating) to 0.541 (EQ-5D-5L pain/discomfort and 15D vitality). The EQ VAS correlated moderately with all EQ-5D-5L dimensions (except for self-care, where the correlation was weak), whereas its correlations with the 15D dimensions were mostly weak or moderate.

Table 3 Correlation coefficients between 15D and EQ-5D-5L items

Analysis of the index values

The distributions of the EQ-5D-5L and 15D index values are presented in Fig. 1, while the main characteristics of the indices can be found in Table 4. Overall, 270 unique health states were observed for the EQ-5D-5L and 1030 for the 15D. The most common health state profile for both instruments was full health, accounting for 36.0% of the EQ-5D-5L answers and 21.0% of the 15D answers. As for the EQ-5D-5L, the second most common profile was slight pain or discomfort with no problems on the other dimensions (6.4%), while for the 15D, slight problems with sleeping and no other problems (3.2%).

Fig. 1 Distribution of EQ-5D-5L and 15D index values

Table 4 Characteristics of EQ-5D-5L and 15D health state profiles and index values

In the total sample, the mean index value was the highest using the Danish 15D (0.91, SD = 0.11), followed by the Hungarian EQ-5D-5L (0.87, SD = 0.21), the Danish EQ-5D-5L (0.86, SD = 0.22), and the Norwegian 15D value set (0.81, SD = 0.22). The floor was negligible for 15D and not present for the EQ-5D-5L. For the Danish EQ-5D-5L, 1.4% of the index values were in the negative range, while for the Danish 15D, the theoretical minimum is higher than 0. However, 1.2% of the Hungarian EQ-5D-5L and 0.9% of the Norwegian 15D index values were negative. When the index value range was split with a bin width of 0.05, the Norwegian 15D showed the best relative informativity (J′) (0.63), followed by the Danish EQ-5D-5L (0.53), the Hungarian EQ-5D-5L (0.49), while the lowest J′ was demonstrated by the Danish 15D (0.44) (Table 4).

Poor agreement was found between the Danish EQ-5D-5L and 15D index values with an ICC of 0.363 (95% confidence interval: 0.342 to 0.385, p < 0.001) but a good agreement was found between the Hungarian EQ-5D-5L and Norwegian 15D index values with an ICC of 0.607 (95%CI 0.516–0.677, p < 0.001). The Bland–Altman plot indicated that 93.3% of the points lay within the 95% limits of agreement between the Danish EQ-5D-5L and 15D (94.2% between the Hungarian EQ-5D-5L and Norwegian 15D). Differences between the EQ-5D-5L and 15D index values increased at lower mean values for both value set pairs (Fig. 2).

Fig. 2 Bland–Altman plot of EQ-5D-5L and 15D index values. The horizontal red line represents the mean of the differences (D) between EQ-5D-5L and 15D index values, while the dashed lines represent the 95% limits of agreement, obtained as D ± 1.96*SD (SD: standard deviation of the differences)

Using the Danish value sets, a strong correlation was found between the EQ-5D-5L and 15D index values (0.671), and between the EQ-5D-5L index value and the EQ VAS (0.604), while a moderate correlation was found between the 15D index value and the EQ VAS (0.534). The EQ-5D-5L index value demonstrated a strong correlation with its dimensions, except for self-care, where the correlation was moderate (−0.482). By contrast, correlation coefficients between 15D dimensions and the EQ-5D-5L index value ranged from −0.596 (vitality) to −0.176 (eating). The 15D index value correlated moderately or strongly with most of its dimensions, but only weakly with the eating dimension (−0.346). Considering the EQ-5D-5L dimensions with the 15D index value, the strongest correlation was observed for the pain/discomfort dimension (−0.629), and the weakest for self-care (−0.369). The convergent validity results were confirmed by the sensitivity analysis (Table 3).

Both the Danish EQ-5D-5L and 15D index values were able to discriminate between all chronic condition groups with moderate or large effect sizes (ranging from 0.688 to 3.810 for the EQ-5D-5L and from 0.623 to 3.018 for the 15D) (Table 5). Overall, the EQ-5D-5L was able to discriminate more effectively between 38/41 (93%) known-groups (RE > 1). Nevertheless, the bootstrap analysis suggested that results were significant in only five condition groups, dementia (RE = 1.465), other physical health conditions (RE = 1.448), bipolar depression (RE = 1.385), thyroid diseases (RE = 1.269), and gastroesophageal reflux disease (RE = 1.251). Using the Hungarian EQ-5D-5L and the Norwegian 15D value sets, effect sizes were large in all condition groups, and RE was > 1 in 36/41 (88%) known-groups. However, according to the results of the bootstrap analysis, the difference was only significant in four condition groups: dementia (RE = 1.672), chronic kidney disease (RE = 1.456), other physical health conditions (RE = 1.454), and urinary incontinence (RE = 1.302) (Table 6).

Table 5 Known-groups validity of the EQ-5D-5L and 15D (Danish value sets)
Table 6 Known-groups validity of the EQ-5D-5L (Hungarian value set) and 15D (Norwegian value set)

Subgroup analysis

The subgroup analysis for the physical and mental health condition subgroups yielded similar results to those of the total sample. A lower ceiling was observed both in the mental (18.7%) and physical health conditions subgroups (25.5%) compared to the total sample (36.0%) for the EQ-5D-5L, while the floor was 0% in both subgroups. Similarly, for the 15D, the ceiling was reduced to a greater extent in the mental health condition subgroup (10.1%) than in the physical health condition subgroup (12.3%) against the total sample (21.0%) (Table 1). In line with previous results, J′ was greater for the EQ-5D-5L than for the 15D in both subgroups (Table 2). The average size of inconsistency was similar for physical and mental health conditions (Additional file 1: Supplementary materials 10–11). The correlation between the Danish EQ-5D-5L and Danish 15D index values was higher in both the physical and mental health condition subgroups (0.736 and 0.702) than in the total sample (0.671). The ICC stood at 0.311 (95% CI 0.285–0.338, p < 0.001) for the physical health conditions subgroup, while it reached 0.336 (95% CI 0.302–0.371, p < 0.001) for the mental health subgroup. As for the corresponding dimensions, correlations between dimensions were, in general, higher in both subgroups than in the total sample (Additional file 1: Supplementary materials 12–13). The sensitivity analyses (Additional file 1: Supplementary materials 6–9, 12–13) with the Hungarian EQ-5D-5L and Norwegian 15D value sets mostly supported these results; however, the agreement was good in both the physical (ICC = 0.653, 95% CI 0.561–0.722, p < 0.001) and mental (ICC = 0.632, 95% CI 0.495–0.725, p < 0.001) health condition subgroups.

Discussion

To our knowledge, this is the first study to compare the measurement properties of the EQ-5D-5L and 15D instruments in a general population sample. The sample showed good representativeness across demographic characteristics and allowed conducting subgroup analyses for physical and mental health conditions. EQ-5D-5L dimensions showed a substantially lower ceiling than those of the 15D in all but one of the corresponding dimension pairs. We identified a considerably larger ceiling in the EQ-5D-5L index value than in the 15D index value, which corroborates earlier findings in various patient populations [26, 28, 29, 31, 32]. The ceiling decreased notably in both the physical and mental conditions subgroups compared to the total sample for both indices. The EQ-5D-5L demonstrated better overall relative informativity. Strong correlations were seen between the index values, which is consistent with previous research [30, 31]. Contrary to our expectations [56], the anxiety/depression composite dimension correlated more strongly with 15D depression than with 15D distress. Both the EQ-5D-5L and 15D were able to discriminate effectively between the healthy and non-healthy respondents with moderate or large effect sizes; however, the EQ-5D-5L produced larger effect sizes in most groups regardless of the value set used.

Although both instruments discriminated effectively between the healthy and non-healthy groups of respondents, the magnitude of the index values differed. Using the Danish value sets, the index values in the healthy subgroup were reasonably similar for the EQ-5D-5L and 15D, but the mean index values of the EQ-5D-5L were substantially lower than those for the 15D in respondents with health conditions. By contrast, the sensitivity analysis suggests that the Norwegian 15D index values were sizeably lower in 15/41 health conditions than the Hungarian EQ-5D-5L index values, while the difference was negligible in the rest. This is mainly attributable to the different value sets of the 15D. The range of the Danish value set is considerably narrower than that of the Norwegian, which has a utility of −0.516 for the worst possible state that is more comparable to either EQ-5D-5L value set used in this study. Therefore, there is less space for potential improvement using the Danish 15D value set and for this reason, the index values of more severe health states are already relatively high. The difference between the value sets is also well indicated by the fact that the ICC is poor between the Danish EQ-5D-5L and 15D index values, but good between the Hungarian EQ-5D-5L and Norwegian 15D.

A few 15D dimensions demonstrated (very) weak correlations with all EQ-5D-5L dimensions, such as vision, hearing, eating, speech, excretion, and mental function, which may indicate potential room for EQ-5D-5L bolt-ons. This is in line with earlier research that acknowledged these health areas as potentially not captured by the EQ-5D and proposed bolt-ons for these, including vision, hearing, speech, and cognition [34, 37, 57,58,59]. As bolt-on identification, development and testing are recommended to be based on mixed-methods evidence from multiple investigations and populations [36], our results support these efforts by informing future EQ-5D bolt-on development studies.

The following limitations should be considered. Firstly, due to the cross-sectional design of our study, we could not test the responsiveness or the test–retest reliability of the instruments. Secondly, according to census data, 48.0% of the Hungarian general population reported having chronic illness [38], whereas in our sample this proportion reached 71.6%. This difference is likely due to the fact that our questionnaire was rather detailed regarding questions about different health conditions and considered addictions (e.g. smoking) as well. Thirdly, clinical data including information on disease severity were not available from our survey, which would have allowed a more comprehensive known-groups validity testing. Finally, we have to acknowledge some linguistic specificities of our findings. For instance, in English, the mobility dimensions of both instruments use the phrase ‘walking’, while the Hungarian version of the 15D uses a different translation with a meaning of ‘moving about’ (‘közlekedés’) that could be responsible for the relatively high proportion of inconsistent response pairs between these two dimensions (7.9%).

Conclusions

In conclusion, our findings may contribute to the discussion of which generic PAM to use in decision-making and provide useful and broad information for health economic evaluations. Despite having 10 fewer dimensions, the EQ-5D-5L performed better than the 15D in many aspects. However, certain 15D dimensions (e.g. vision, hearing, mental function) showed a relatively weak relationship with the dimensions of EQ-5D-5L, which signals room for potential EQ-5D-5L bolt-on dimensions. Future research is recommended to assess the added value of such bolt-on dimensions and compare their measurement properties to other PAMs that include these health areas among their dimensions (e.g. 15D, AQoL). Additionally, longitudinal studies are needed to test the responsiveness of these instruments in relevant patient populations.