Key Points for Decision Makers

This paper presents the EQ-5D-5L value set for Slovenia, which was obtained following the EuroQol Group valuation protocol for EQ-5D-5L. The values were calculated from composite time trade-off (cTTO) data on the preferences of 1012 adults from Slovenia towards EQ-5D-5L health states, using the Tobit model.

The use of EQ-5D-5L in Slovenia is on the rise; health care providers are obliged by law to use EQ-5D-5L as a patient-reported outcome measure in the endoprosthetics registry. It is also one of the compulsory outcome indicators defined in the 2023 National Quality and Safety Strategy draft. The presented value set will allow further analysis and will support outcome-based decision making in Slovenia.

1 Introduction

Health technology assessment (HTA) has been increasingly used in healthcare decision making on resource allocation in many countries. Health technologies in Slovenia are assessed by various bodies that publish guidelines and recommendations on conducting economic evaluations of health interventions [1, 2]. The most common way to express the benefits of a health intervention is in quality-adjusted life-years (QALYs). The QALY is a measure that combines a treatment’s impact on a patient’s length of life and health-related quality of life (HRQoL) into a single outcome. To calculate QALYs, HRQoL must be expressed as a single value, known as health utility, which is scored on a scale that assigns a value of 1 to a state equivalent to full health and 0 to a state equivalent to being dead [3]. The EQ-5D, a preference-based quality-of-life measure, is one of the most widely used measures for the valuation of health [4, 5].

The EQ-5D instrument was developed by the EuroQol Organization. Currently, the following EQ-5D instruments are available: three-level instrument (EQ-5D-3L), five-level instrument (EQ-5D-5L), and youth version (EQ-5D-Y). Each of these instruments can be adapted for the mode of administration (e.g. self-complete, proxy, or interviewer administration) or for use on a different platform (paper or digital). The EQ-5D five-level version (EQ-5D-5L) was developed by the EuroQol Organization in 2009 [6] to avoid the methodological limitations [7] of the three-level version. EQ-5D-5L is currently available in more than 150 languages and in various modes of administration [8]. The descriptive system of the EQ-5D questionnaire consists of five dimensions: mobility (MO), self-care (SC), usual activities (UA), pain/discomfort (PD), and anxiety/depression (AD). In the original 3L version, each dimension had three levels of problems: no, some, or severe [9]; however, in the 5L version, these levels are no, slight, moderate, severe, or unable to/extreme problems [4]. The EQ-5D-5L has shown strong psychometric properties [7]. A systematic review [10] that included 24 studies found that Shannon’s indices were always higher for 5L than for 3L, and all but three studies reported lower ceiling effects (‘11111’) for 5L than for 3L. There was mixed and insufficient evidence on responsiveness and test–retest reliability, although results on index values showed better performance for 5L on test–retest reliability. Other studies also showed higher discriminatory power and more even distribution, with improved informativity and reduced ceiling effect for EQ-5D-5L than EQ-5D-3L [11].

The EQ-5D-5L questionnaire has been translated into the Slovenian language [10]. Because no values were available for its 3125 health states, which hindered its use, an interim EQ-5D-5L value set for Slovenia was derived in 2020 using the crosswalk methodology developed by the EuroQol Group [12]. A crosswalk value set, however, is not based on preferences elicited directly from a representative sample of the general population; obtaining such preferences was the aim of this study. Currently, there are 37 published EQ-5D-5L value sets worldwide, three of which are in Central and Eastern Europe: Romania [13], Poland [14], and Hungary [15].

As Slovenia has had the official translation of EQ-5D instruments as well as value sets and population norms available, EQ-5D has been widely used in studies measuring HRQoL in various patient populations [16, 17]. EQ-5D-5L is also one of the recommended patient-reported outcome measures in the new National Quality and Safety Strategy draft 2023–2031 [18] and was, by law, included in the Registry of Endoprosthetics in 2021 [19]. For all these reasons, we can expect that the new value set will be widely used.

The aim of this study was to develop the EQ-5D-5L value set for Slovenia by eliciting general adult population preferences. The elicited preferences will replace the crosswalk preferences for the EQ-5D-5L-defined health states currently used in the assessment of healthcare interventions.

2 Methods

The EuroQol Group’s valuation protocol was strictly followed throughout the study [20].

2.1 Sampling

For the survey, a representative sample of 1012 Slovenian adults aged 18+ years was obtained. The non-probability quota samples were formed across 12 statistical regions in Slovenia, according to age groups (18–24, 25–29, 30–34, 35–39, 40–44, 45–49, 50–54, 55–59, 60–64, ≥ 65 years) and sex (female/male). Inclusion criteria for the study were age ≥ 18 years and agreement to participation via an online informed consent form. Participation in the study was voluntary; however, participants received compensation in the form of a €10 gift card. Respondents were recruited by the interviewers, who mostly covered their respective areas of residence. A mixed recruitment strategy was used: respondents were recruited through personal contact as well as in public spaces.

2.2 Interviews

The interviewer team consisted of 11 university students studying economics, social sciences, or medicine, as well as one principal investigator. All interviewers underwent one full day of training in accordance with the programme developed by the EuroQol Group. Each interviewer conducted at least 10 training interviews, and more if the results had not yet reached a sufficient quality level. The interviews started in March 2022 and continued until November 2022, when the quotas were full. Each interviewer covered at least one region, and some covered several, depending on the number of respondents in each region. The interviews were conducted face-to-face and used composite time trade-off (cTTO) and discrete choice experiment (DCE) methods. The minimum number of interviews performed per interviewer was 35 and the maximum was 131.

The latest available version of the EuroQol Valuation Technique (EQ-VT) was used for the study (version 2.1). The study received approval from the Commission of the Republic of Slovenia for Medical Ethics (no. 0120-381/2021/6, dated 3 November 2021) prior to data collection. The target sample size was 1000 respondents, as defined in the EuroQol Group’s valuation protocol [20].

The interview consisted of a welcome and an explanation of the purpose of the interview, self-reported health using the 5L descriptive system and EQ-VAS task, cTTO valuation tasks (wheelchair example, three practice states, 10 real tasks, debriefing questions, feedback module), DCE valuation tasks (seven tasks, debriefing questions), demographic questions, comment box, and an additional DCE survey trying to determine the value of QALYs in Slovenia.

2.3 Research Design

Overall, 86 of the 3125 EQ-5D-5L health states were included in two different preference elicitation tasks according to the EQ-VT design [21]: (1) composite time trade-off (cTTO), and (2) DCE without duration. For the cTTO tasks, the health states were grouped into 10 blocks consisting of 10 health states each. Some states were present in multiple blocks, with each mild state (21111, 12111, 11211, 11121, 11112) repeated in two blocks and the pits state (55555) repeated in all 10 blocks. The remaining 80 states were generated using Fedorov’s exchange algorithm. For the DCE tasks, we used 196 pairs of EQ-5D-5L health states, divided into 28 blocks of seven pairs. The assignment of the cTTO and DCE blocks to each of the respondents was random.
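The block structure described above can be verified with simple arithmetic. The following Python sketch is only an illustrative consistency check (all variable and function names are ours, not part of the EQ-VT software): it confirms that 80 states valued once, five mild states placed in two blocks each, and 55555 placed in every block fill exactly 10 blocks of 10 states, and it mimics the random assignment of one cTTO block and one DCE block per respondent.

```python
import random

# cTTO design as described in the text
N_CTTO_BLOCKS, STATES_PER_BLOCK = 10, 10
MILD_STATES = ["21111", "12111", "11211", "11121", "11112"]  # each in 2 blocks
PITS_STATE = "55555"                                          # in all 10 blocks
N_OTHER_STATES = 80                                           # each in 1 block

unique_states = N_OTHER_STATES + len(MILD_STATES) + 1
slots_used = N_OTHER_STATES * 1 + len(MILD_STATES) * 2 + N_CTTO_BLOCKS
slots_available = N_CTTO_BLOCKS * STATES_PER_BLOCK

# DCE design as described in the text
N_DCE_BLOCKS, PAIRS_PER_BLOCK = 28, 7
dce_pairs = N_DCE_BLOCKS * PAIRS_PER_BLOCK


def assign_blocks(rng: random.Random) -> tuple[int, int]:
    """Draw one cTTO block and one DCE block uniformly at random.

    Assumption: uniform, independent draws; the text only says the
    assignment was random.
    """
    ctto_block = rng.randrange(1, N_CTTO_BLOCKS + 1)
    dce_block = rng.randrange(1, N_DCE_BLOCKS + 1)
    return ctto_block, dce_block


if __name__ == "__main__":
    print(unique_states, slots_used, slots_available, dce_pairs)
    print(assign_blocks(random.Random(2022)))
```

The check also shows why each of the 80 remaining states can appear in only one block: after the mild states and 55555 are placed, exactly 80 slots are left.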

The aim of the cTTO is to find a point at which the respondent is indifferent between a longer period of impaired health and a shorter period of full health. The cTTO approach incorporates the lead time in cases when the respondents consider a certain health state as worse than dead [22]. The whole cTTO approach is explained to the respondent at the beginning using the ‘wheelchair example’, where the worse than dead and better than dead health states are valued. The task is then practiced on three practice states (one good health state, one bad health state, and one hard-to-imagine health state). The cTTO values range from −1 (trading whole lead time) to 1 (trading no years in full health). At the end of the cTTO task, all valued health states are shown ranked from best to worst. If the respondent disagrees with the position of a health state in this ranking, the corresponding response is flagged and removed from the valuation data.

DCE tasks without duration consist of seven pairs of EQ-5D-5L health states, where the respondent is required to choose the better one.

2.4 Quality Control

Throughout the data collection period, interviewer performance was monitored as part of the quality-control procedure developed by the EuroQol Group [23]. The EuroQol Group appointed two supervisors of quality, with whom regular meetings were held during the entire period of data collection. The quality criteria were:

  1. no explanation of the lead time in the wheelchair example;

  2. the time used for the demonstration of the wheelchair example was shorter than 3 min;

  3. the time used to complete the 10 cTTO tasks was shorter than 5 min;

  4. inconsistency in the cTTO ratings, i.e. the value of 55555 is not the lowest and is at least 0.5 higher than the lowest value given to any health state.

All interviews flagged under any of the above criteria were discussed with the interviewers in person. They were not necessarily excluded if, in the interviewer’s judgement, the respondent demonstrated an understanding of the cTTO task.
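The four criteria can be read as simple flags computed per interview. The Python sketch below is a hypothetical illustration of that reading, not the EuroQol quality-control tool: the `Interview` record, its field names and the flag labels are our assumptions, and only the thresholds (3 min, 5 min, a gap of 0.5) come from the text.

```python
from dataclasses import dataclass


@dataclass
class Interview:
    """Hypothetical per-interview record; field names are illustrative."""
    lead_time_explained: bool   # was lead time shown in the wheelchair example?
    wheelchair_seconds: float   # time spent on the wheelchair example
    ctto_seconds: float         # time spent on the 10 cTTO tasks
    ctto_values: dict           # health state -> elicited cTTO value


def quality_flags(interview: Interview) -> list[str]:
    """Return the list of quality criteria that the interview triggers."""
    flags = []
    if not interview.lead_time_explained:
        flags.append("no_lead_time_explanation")
    if interview.wheelchair_seconds < 3 * 60:
        flags.append("wheelchair_under_3_min")
    if interview.ctto_seconds < 5 * 60:
        flags.append("ctto_under_5_min")
    # 55555 should be the worst state; flag it if it is valued at least
    # 0.5 above the lowest-valued state in the respondent's block.
    lowest = min(interview.ctto_values.values())
    pits = interview.ctto_values["55555"]
    if pits > lowest and pits - lowest >= 0.5:
        flags.append("inconsistent_55555")
    return flags
```

A flagged interview would then be reviewed with the interviewer rather than dropped automatically, in line with the procedure described above.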

2.5 Data Analysis

While DCE data were collected as part of the EuroQol Group’s valuation protocol, we chose to focus solely on TTO data in our study. This decision was based on standard guidelines for analysing health state preference data, as well as our assumption that the TTO data alone should provide sufficient and logically consistent estimates for our purposes [24].

Descriptive statistics of the sample characteristics and cTTO utilities were computed. No exclusions were made based on data quality. Similar to the previous valuation studies [15, 25,26,27], we excluded the responses flagged by respondents in the feedback module. Data management and statistical analyses were performed using R version 4.0.2 (The R Foundation for Statistical Computing, Vienna, Austria).

The cTTO data were modelled via the Tobit model, also known as the censored regression model, which is used to analyse censored data. An observation is censored when its true value is unknown but is known to lie beyond a limit of the measurement scale. In the cTTO task, this is the case when the respondent is still not indifferent between a longer 20-year period of impaired health and immediate death, and prefers to die immediately; the recorded value of −1 is then only an upper bound on the true value.

The variant of the Tobit model used in this paper is the model with conditional heteroscedasticity, which accounts for the fact that the variance of the error term may vary with the predictor variables. According to the recent systematic literature review, the (heteroscedastic) Tobit model with censoring at –1 is the most commonly estimated model when dealing with cTTO data [28].

If the variance of the error term is not constant but varies with the predictor variables, then failing to account for this in the model may result in a poor fit. By modelling heteroscedasticity, the model can better capture the relationship between the predictor variables and the dependent variable, leading to more accurate predictions.

Another advantage is that it can improve the interpretability of the model. If the variance of the error term is not constant, then the standard errors of the estimated coefficients may not accurately reflect the uncertainty in the estimates. By modelling the heteroscedasticity, the standard errors can be adjusted to account for the varying variance, resulting in more accurate inferences about the coefficients.

The dependent variable in the model was disutility [25, 29], defined as the cTTO value minus 1, where, for states considered better than dead, the cTTO value is the number of years at the indifference point divided by 10. For example, if the respondent was indifferent between 5 years in full health and 10 years of impaired health, that state would have a utility of 0.5 and a disutility of −0.5. As the cTTO values range from −1 (trading whole lead time) to 1 (trading no years in full health), the disutility ranged from −2 to 0, where −2 is the censoring threshold and 0 corresponds to full health because the model has no constant.
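The mapping from an indifference point to a disutility can be made concrete with a short Python sketch. It assumes the usual lead-time convention for worse-than-dead states, in which the value is (x − 10)/10 for x years in full health at the indifference point; the text does not spell out this formula, and the function names are ours.

```python
def ctto_value(years_full_health: float, worse_than_dead: bool = False) -> float:
    """cTTO value for an indifference point of `years_full_health` (0 to 10).

    Better than dead: x years in full health vs 10 years in the state.
    Worse than dead (assumed lead-time convention): x years in full health
    vs 10 years of lead time followed by 10 years in the state.
    """
    if not 0 <= years_full_health <= 10:
        raise ValueError("indifference point must lie between 0 and 10 years")
    if worse_than_dead:
        return (years_full_health - 10) / 10
    return years_full_health / 10


def disutility(value: float) -> float:
    """Disutility on the -2 to 0 scale used as the dependent variable."""
    return value - 1.0


if __name__ == "__main__":
    # The worked example from the text: 5 years in full health.
    print(ctto_value(5), disutility(ctto_value(5)))
    # Trading the whole lead time gives the censoring threshold.
    print(disutility(ctto_value(0, worse_than_dead=True)))
```

With this sign convention, every regression coefficient that describes a worsening of health is expected to be negative.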

In the following Tobit model specification, incremental dummies were used to test whether the effect of a categorical predictor on the disutility differs significantly between the reference category (‘No problems’) and each of the other categories (levels of problems). The variance of the error term was modelled with health dimensions treated as continuous variables (1—‘No problems’, …, 5—‘Unable/extreme problems’). The Tobit model had the following form:

$${Y}^{*}= {\beta }_{1}\mathrm{MO}2+{\beta }_{2}\mathrm{MO}3+{\beta }_{3}\mathrm{MO}4+{\beta }_{4}\mathrm{MO}5+{\beta }_{5}\mathrm{SC}2+{\beta }_{6}\mathrm{SC}3+{\beta }_{7}\mathrm{SC}4+{\beta }_{8}\mathrm{SC}5+{\beta }_{9}\mathrm{UA}2+{\beta }_{10}\mathrm{UA}3+{\beta }_{11}\mathrm{UA}4+{\beta }_{12}\mathrm{UA}5+{\beta }_{13}\mathrm{PD}2+{\beta }_{14}\mathrm{PD}3+{\beta }_{15}\mathrm{PD}4+{\beta }_{16}\mathrm{PD}5+{\beta }_{17}\mathrm{AD}2+{\beta }_{18}\mathrm{AD}3+{\beta }_{19}\mathrm{AD}4+{\beta }_{20}\mathrm{AD}5+u,$$
$$Y=\mathrm{max}\left({Y}^{*},-2\right),$$
$$u=\sigma \left(MO, SC, UA, PD, AD\right)+\varepsilon,$$

where Y* is the latent and Y is the censored dependent variable. MO, SC, UA, PD, and AD are the predictor variables (health dimensions), where the numbering corresponds to the level of problems; β1 to β20 are the model coefficients; −2 is the censoring threshold; and u is the error term. The error term is assumed to have a conditional heteroscedasticity structure, meaning that its variance varies with the predictor variables MO, SC, UA, PD, and AD. σ(MO, SC, UA, PD, AD) is a function of the predictor variables that captures the variance of the error term, and ε is a normally distributed error term with a mean of 0.
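To illustrate how such a model can be estimated, the following Python sketch builds the 20 incremental dummies and evaluates the log-likelihood of a Tobit model censored at −2. It is a sketch under stated assumptions, not the code used in this study (which was written in R): we assume that the error is normal with a standard deviation of exp(γ0 + γ1·MO + … + γ5·AD), a common log-linear choice that the text does not specify, and all function names are ours.

```python
import numpy as np
from scipy.stats import norm

CENSOR = -2.0  # censoring threshold on the disutility scale


def incremental_dummies(levels) -> np.ndarray:
    """Map an (n, 5) array of levels 1..5 (MO, SC, UA, PD, AD) to (n, 20).

    The column for level k of a dimension equals 1 if the observed level
    is >= k, so each coefficient is the extra disutility of moving from
    level k-1 to level k. Column order: MO2..MO5, SC2..SC5, ..., AD2..AD5.
    """
    levels = np.asarray(levels)
    columns = [(levels[:, d] >= k).astype(float)
               for d in range(5) for k in range(2, 6)]
    return np.column_stack(columns)


def neg_loglik(params, X, Z, y) -> float:
    """Negative log-likelihood of a heteroscedastic Tobit censored at CENSOR.

    params = [beta (X.shape[1] values), gamma (Z.shape[1] values)];
    assumed scale model: sigma_i = exp(Z_i @ gamma).
    """
    n_beta = X.shape[1]
    beta, gamma = params[:n_beta], params[n_beta:]
    mu = X @ beta
    sigma = np.exp(Z @ gamma)
    censored = y <= CENSOR
    ll = np.where(
        censored,
        norm.logcdf((CENSOR - mu) / sigma),               # P(Y* <= -2)
        norm.logpdf((y - mu) / sigma) - np.log(sigma),    # density of Y*
    )
    return float(-ll.sum())
```

In this set-up, `Z` would hold a column of ones followed by the five dimension levels treated as continuous variables, and the estimates could be obtained by passing `neg_loglik` to a general-purpose optimiser such as `scipy.optimize.minimize`.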

3 Results

3.1 Respondent Characteristics

A total of 1012 respondents, representative of the Slovenian general population for age, sex and region, were successfully interviewed (Fig. 1). The characteristics of the sample are summarised in Table 1. Overall, 38.5% of the respondents reported no problems in any of the five EQ-5D dimensions. The share of the respondents who reported any problems was highest in the PD dimension (51.9%), followed by AD (32.9%), MO (27.5%), UA (24.2%), and SC (11.2%). No more than 0.5% of all respondents had extreme problems in any of the dimensions.

Fig. 1 Sample of respondents by age and sex quotas. F female, M male

Table 1 Demographics of the respondents in the Slovenian valuation sample

3.2 Data Characteristics

In our study, there were no missing responses for any valuation task, resulting in a total of 10,120 (5L) cTTO responses from 1012 respondents. In the feedback module, 582 (57.5%) respondents flagged at least one health state (n = 356 flagged one health state, n = 174 flagged two, n = 48 flagged three, and n = 4 flagged four health states). Overall, 864 (5L) cTTO responses were removed by respondents in the rank ordering. Thus, data analysis included 9256 (5L) cTTO observations from 1012 respondents. Overall, 29.6% of cTTO values were negative, and the largest cluster of these worse-than-dead responses was at −1 (8%). The proportion of values clustered at 0 was 2.7%, and 8.8% were clustered at 1 (Fig. 2). The higher the severity level (i.e., sum of levels across dimensions), the lower the mean cTTO value, whereby the standard deviation increases with the severity level (Fig. 2). The observed mean cTTO values ranged from −0.700 for health state 55555, to 0.959 for health state 21111.
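The response counts reported above are internally consistent, as the following Python sketch shows. It simply reproduces the arithmetic from the figures given in the text; the variable names are ours.

```python
n_respondents = 1012
tasks_per_respondent = 10

# number of respondents who flagged exactly k health states
flag_counts = {1: 356, 2: 174, 3: 48, 4: 4}

total_responses = n_respondents * tasks_per_respondent
respondents_flagging = sum(flag_counts.values())
responses_removed = sum(k * n for k, n in flag_counts.items())
responses_analysed = total_responses - responses_removed
share_flagging = round(100 * respondents_flagging / n_respondents, 1)

if __name__ == "__main__":
    print(total_responses, respondents_flagging, responses_removed,
          responses_analysed, share_flagging)
```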

Fig. 2 Observed cTTO value distribution. cTTO composite time trade-off

3.3 EQ-5D-5L Value Set for Slovenia

Results of the Tobit model are shown in Table 2 and Fig. 3. The incremental dummy variables in the model were used to test whether the effect of a categorical predictor on the disutility differs significantly between the reference category (‘No problems’) and each of the other categories (levels of problems). All regression coefficients are negative, so the disutility never decreases as the level of problems increases; the estimates are therefore logically consistent. With the exception of UA5 (unable to perform usual activities), all levels on all health dimensions are statistically different from 0 and from each other. In the case of ‘usual activities’, a move from level 4 to level 5 was seen, on average, as a worsening of health, although the difference was not statistically significant at the 5% level. Additionally, variance increased with any increase in severity level on all dimensions. An overview of the combined values of the incremental dummies is shown in Fig. 3.
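The way incremental coefficients combine into a value for a health state can be sketched in Python as follows. The coefficients below are HYPOTHETICAL placeholders chosen purely for illustration; they are not the estimates in Table 2, and the function names are ours. The value of a state is 1 plus, for each dimension, the sum of the incremental coefficients up to the reported level.

```python
from itertools import product

DIMENSIONS = ("MO", "SC", "UA", "PD", "AD")

# HYPOTHETICAL incremental coefficients for levels 2, 3, 4 and 5.
# Placeholders only; replace with the published estimates before any use.
PLACEHOLDER_INCREMENTS = {
    "MO": (-0.05, -0.03, -0.10, -0.12),
    "SC": (-0.04, -0.02, -0.08, -0.06),
    "UA": (-0.04, -0.03, -0.09, -0.01),
    "PD": (-0.06, -0.05, -0.20, -0.25),
    "AD": (-0.05, -0.06, -0.15, -0.10),
}


def state_value(state: str, increments=PLACEHOLDER_INCREMENTS) -> float:
    """Value of a state such as '21345': 1 plus the cumulated increments."""
    value = 1.0
    for dim, level in zip(DIMENSIONS, state):
        value += sum(increments[dim][: int(level) - 1])
    return value


def is_logically_consistent(increments) -> bool:
    """True if no step to a worse level ever increases the value."""
    return all(c <= 0 for coefs in increments.values() for c in coefs)


# Enumerate all 5^5 = 3125 health states and score them.
all_states = ["".join(map(str, s)) for s in product(range(1, 6), repeat=5)]
values = {s: state_value(s) for s in all_states}
```

Because the model has no constant, state 11111 receives a value of exactly 1, and when every incremental coefficient is negative the lowest value is necessarily that of 55555.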

Table 2 Parameter estimates for the model

Fig. 3 Disutility estimates according to the EQ-5D-5L

3.4 Comparison of EQ-5D-3L and EQ-5D-5L Values

The kernel density plot of the 3125 values in the EQ-5D-5L value set shows a left-skewed distribution, whereas the EQ-5D-3L and crosswalk value sets are characterised by two peaks (bimodal distribution). The EQ-5D-5L value set, estimated without a constant for any deviation from full health, covers a wider range of values (−1.090 to 1) than the EQ-5D-3L and crosswalk value sets (−0.495 to 1) (Fig. 4).

Fig. 4 Kernel density plot of all possible dimensions of the EQ-5D-3L and EQ-5D-5L

4 Discussion

In this study, a Slovenian value set for the EQ-5D-5L was estimated. In the estimation process, the latest EQ-VT protocol approved by the EuroQol Research Foundation was used. The Tobit model with conditional heteroscedasticity, based solely on cTTO data, produced logically consistent parameter estimates, all of which were statistically significant except UA5.

To date, there is no agreement on which modelling strategy might be the best in estimating value sets [30]. In some countries, only cTTO data were used to derive a value set [15, 31]. Because using only cTTO data can deliver logically inconsistent estimates, other researchers have used DCE scoring algorithms [32] anchored on cTTO data or a so-called hybrid model [25,26,27, 33, 34], which uses both types of data, i.e. DCE and cTTO, to derive a value set.

There is no solid theoretical justification to combine both elicitation methods as they represent two very distinct valuation methods [30]. Namely, there are fundamental differences between the two methodologies that may exclude linking the DCE and cTTO data. A researcher can either assume that utility can be observed directly (as with cTTO) or that it cannot be, as it is latent, unobserved (DCE), but not both. While DCE, rooted in random utility theory, is a superior methodology for preference elicitations, its current design and protocol at EuroQol do not enable the estimation of a value set on its own.

The initial idea was to combine the cTTO and DCE data to address the issues that occurred with previous TTO data studies and led to logically inconsistent parameter estimates [28]. The pooling of TTO and DCE data is based on the assumption that there is a relationship between them and a constant proportionality assumption implied by the cTTO. In a study published in 2022, Augustovski et al. [35] found that it was not appropriate to combine the data. After estimating the value sets separately, the equivalence of their parameters was rejected and the DCE rejected the constant proportionality assumption implied by the cTTO [28]. Moreover, it has been shown that individuals were willing to give up more years of their life to avoid severe health states in TTO than in DCE (TTO tended to produce higher valuations for severe health states) [36].

The Slovenian EQ-5D-5L value set was also compared with the Romanian, Polish and Hungarian EQ-5D-5L value sets. These countries were selected for the comparison as they are located in Central and Eastern Europe (CEE) and have certain similarities regarding their history and culture. Nevertheless, differences were noted between the four value sets in terms of values assigned to the worst health state or value range, model approach, and the relative importance of the five EQ-5D-5L dimensions. The value range was largest in Slovenia as Slovenians assigned the lowest value to the worst health state 55555 (Slovenia −1.090, Hungary −0.848, Poland −0.590 and Romania −0.323). The PD dimension was ranked highest in Slovenia, Romania and Poland, and came second in Hungary, where mobility was the most important dimension. The least important dimension in Slovenia was self-care, followed by usual activities, while in all other CEE countries, the least important dimension was usual activities [13,14,15].

Finally, the modelling approaches used to arrive at the final value set differ among countries: Hungary and Slovenia used only cTTO data, whereas Poland and Romania used both cTTO and DCE data. All these differences stress the importance of having a national value set for EQ-5D health states.

The distinct feature in the Slovenian value set is high disutility for the fifth level of PD dimension (extreme pain). In comparison with the EQ-5D-3L set, the disutility connected to PD was lower. High disutility could be connected to the translation of extreme pain and discomfort. The Slovenian translation of this dimension is more in terms of unbearable pain and discomfort instead of extreme. Although this does not impact the relative position of the pain and discomfort levels, some respondents might have felt that pain that someone cannot bear is worse and that it is only possible to choose ‘dead’ over unbearable pain/discomfort. The disutility attached to the extreme pain could hence be exaggerated.

The EQ-5D-5L questionnaire has been widely used in population studies in Slovenia in the last few years. Users will benefit from a better descriptive system and the use of high-quality valuation data, which were derived from a more representative sample of the adult population. Furthermore, the EQ-5D-3L value set for Slovenia was obtained in 2005 and 2006. The EQ-5D-5L value set is a robust and up-to-date value set and should be the preferred value set used in adults in Slovenia and in neighbouring countries without their own value set.

5 Conclusions

The Slovenian cTTO-based EQ-5D-5L value set is recommended for use as an up-to-date EQ-5D value set in Slovenia. It is recommended for use in population studies, as well as in cost-utility studies, for decision making in clinical assessments and HTAs. EQ-5D is the only generic instrument with its own value set in Slovenia, enabling a refined preference-based HRQoL measurement to describe patients’ health. The set shows the relative importance that the Slovenian adult population places on different EQ-5D dimensions: greater importance is placed on the PD dimension followed by the AD dimension. The so-called physical EQ-5D dimensions (MO, SC, UA) seem to be less important for the Slovenians. Such societal preferences have implications for the assessment of treatments and should be taken into account in fund allocation decision making in health policy.