Key Points

This article presents the first value set for health states experienced by Belgian children and adolescents between the ages of 8 and 15.

This value set will make it possible to include appropriate utility values and quality-adjusted life-year calculations in health technology assessments of new and existing pediatric treatments in Belgium.

The EQ-5D-Y-3L value set will be included in the updated pharmacoeconomic guidelines for Belgium published by the Belgian Health Care Knowledge Centre.

1 Introduction

Since 2002, pharmaceutical companies applying to the Belgian Health Technology Assessment authority for reimbursement of a new drug have had to include a pharmacoeconomic evaluation [1]. The Belgian Health Technology Assessment guidelines recommend the use of quality-adjusted life-years to express the benefit of new treatments, which can be gains in life expectancy and/or in quality of life. These gains need to be valued in a comparable way across disease areas, which can be achieved using health-state utility values. Utilities represent the relative value of life in a particular health state and are measured on a scale anchored at “1” (representing the value of full health) and “0” (representing dead). All health states of different severity levels are allocated a utility value relative to these two points of reference. Some health states may even be considered worse than dead and will therefore take a negative value.
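As a brief illustration of how utilities feed into quality-adjusted life-years, the sketch below multiplies each utility by the time spent in the corresponding health state and sums the results. The function name and the example profile are hypothetical and are not taken from this study; discounting is deliberately ignored.

```python
def qalys(periods):
    """Sum utility x duration over (utility, years) pairs, without discounting."""
    total = 0.0
    for utility, years in periods:
        if utility > 1:
            raise ValueError("utility cannot exceed 1 (full health)")
        if years < 0:
            raise ValueError("duration cannot be negative")
        # Negative utilities (states worse than dead) are allowed.
        total += utility * years
    return total


# Hypothetical profile: 5 years at utility 0.8, then 2 years at utility 0.5.
example = qalys([(0.8, 5), (0.5, 2)])
```

Because states worse than dead carry negative utilities, time spent in such a state reduces the total.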

These utility values are computed using (national) value sets, which reflect preferences of the general population with regard to different aspects of health-related quality of life (HRQoL). In those valuation studies, health states are described by a validated HRQoL instrument. The EQ-5D-3L was developed in 1990 by the EuroQol Group with the aim of creating a generic standardized tool to measure HRQoL in different disease areas, and is now the most widely used instrument [2, 3]. Most value sets refer to the valuation of adult health states, described by either the EQ-5D-3L or the EQ-5D-5L version of the instrument [4], but these instruments were not developed for use in pediatric populations. Indeed, previous research has demonstrated that perspective matters when valuing health states, i.e. that the same health state is valued differently when it reflects the HRQoL of an adult or a child [5, 6]. This has been observed with a variety of valuation methodologies, including the visual analogue scale (VAS), various time-trade-off variants and discrete choice experiment (DCE) methods [6–10]. The EQ-5D-Y-3L (Y for youth and 3L referring to the three levels of response categories) was first introduced in 2010 [11] and was developed for use in younger populations aged 8–15 years.

In Belgium, an EQ-5D-5L value set was published in 2021 [12] with nationally representative data for use in economic evaluations of new treatments for adults and youth from age 16 years. This value set complements the EQ-5D-3L value set from 2001, which only reflected Flanders’ preferences for health states, omitting Brussels and Wallonia [13]. As no value set reflecting preferences for health states in Belgian children aged 8–15 years old is available, guidelines recommend using adult weights in pediatric Health Technology Assessments, even though this is a sub-optimal solution [12].

This study aimed to estimate an EQ-5D-Y-3L value set for children using preferences of the Belgian general adult population according to an international valuation protocol developed by Ramos-Goñi et al. [14]. Knowing national tax-payers’ preferences for child and adolescent health states will allow appropriate price setting of new drugs for this population and help in the decision making on the optimal allocation of resources in Belgium.

2 Methods

This study used the EQ-5D-Y-3L descriptive system to describe the health states presented in the composite Time Trade-Off (cTTO) and DCE questions. To make utilities as comparable as possible between countries, all recent EQ-5D-based valuation studies follow published protocols with a fixed study design and strict quality-control procedures developed and imposed by the EuroQol Group. These procedures increase the likelihood that differences between the EQ-5D national value sets mostly reflect different preferences and cultures, and are not due to different designs, data quality or other methodological aspects. Below, a brief introduction is given to the EQ-5D-Y-3L instrument and the valuation protocol; full details of the study design and interview protocol can be found in Ramos-Goñi et al. [14].

2.1 EQ-5D-Y-3L

EQ-5D-Y-3L was presented in 2010 as an instrument suitable for measuring HRQoL in children and adolescents aged 8–15 years [11]. The EQ-5D-Y-3L descriptive system has the same five dimensions as the EQ-5D-3L, but is worded for children: walking about, looking after myself (washing and dressing), doing usual activities (going to school, hobbies, sports, playing, meeting friends and family), having pain/discomfort, and feeling sad, worried or unhappy; and three levels per dimension: no problems, some problems or a lot of problems. The order in which the five dimensions are asked, reported and analyzed is always the same. This allows health states to be summarized by sequencing the levels assigned to each dimension. A total of 243 (3^5) unique health states can be generated, with “11111” being the best (no problems in any dimension) and “33333” being the worst health state or pits state (a lot of problems in all dimensions) [2].
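The five-digit coding described above can be sketched in a few lines of Python. This is only an illustration: the helper names are hypothetical, and the level labels are the generic ones quoted in the text rather than the exact wording of the questionnaire.

```python
from itertools import product

# Dimensions in the fixed order used by the descriptive system.
DIMENSIONS = (
    "walking about",
    "looking after myself",
    "doing usual activities",
    "having pain/discomfort",
    "feeling sad, worried or unhappy",
)
LEVELS = {1: "no problems", 2: "some problems", 3: "a lot of problems"}


def all_health_states():
    """Return every five-digit state code, from '11111' to '33333'."""
    return [
        "".join(str(level) for level in combo)
        for combo in product(LEVELS, repeat=len(DIMENSIONS))
    ]


def describe(state):
    """Map a state code such as '11121' to a {dimension: level label} dict."""
    if len(state) != len(DIMENSIONS) or any(ch not in "123" for ch in state):
        raise ValueError(f"not a valid health state code: {state!r}")
    return {dim: LEVELS[int(ch)] for dim, ch in zip(DIMENSIONS, state)}


states = all_health_states()
```

Here `len(states)` is 3^5 = 243, and `describe("11121")` reports some problems on pain/discomfort and no problems elsewhere.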

2.2 Study Design

The international EQ-5D-Y-3L protocol prescribes a two-pronged approach [14], of which the first component is an online DCE survey administered to 1000 adult members of the general public in order to determine the relative importance of the five EQ-5D-Y-3L dimensions and their three levels. In this DCE, respondents are presented with one out of ten blocks of 15 questions, each comprising two health states described by the EQ-5D-Y-3L descriptive system, and are asked to state which of the two health states they would prefer for a hypothetical 10-year-old child. In addition to the 15 questions from the study design, three “dominated alternative” quality-checking questions were added at the beginning, middle and end, in which one choice was objectively better than the other (e.g. better on one dimension with all other dimensions being equal). Respondents completed the DCE as an online survey, in which they were first asked to complete the EQ-5D-Y-3L for themselves and a small set of demographic questions, followed by the 15 DCE questions, and subsequently, a more elaborate medical and demographic questionnaire.

The second component of the protocol consists of face-to-face interviews in which cTTO valuation tasks [15] are conducted in a different sample of 200 adult members of the Belgian public. The cTTO interviews included a self-completed EQ-5D-Y-3L, demographic questions (age, sex, education, whether they have children, living situation, experience with severe illness), warm-up tasks (valuation of two health states described as “being in a wheelchair” and “worse than being in a wheelchair”, and three practice EQ-5D-Y-3L states) and ten cTTO valuation tasks. In the valuation tasks, respondents were asked to carefully consider the description of a life in less-than-full health (in the health state being valued) spent by a 10-year-old child during a period of 10 years. Respondents were then asked to compare this life to another life that is spent in full health, and they were asked to modify the duration of the life spent in full health between 0 and 10 years. This was done until equivalence was found between the (shorter) life in full health and the 10 years of life in the impaired health state. Of the ten health states to be valued, three were mild health states (11112, 11121, 21111), two were of moderate severity (22223, 22232) and five were severe (31133, 32223, 33233, 33323, 33333). An important purpose of the cTTO interviews was to obtain a precise estimate for the equivalent healthy life-years for the worst health state (33333), as this provides an estimate of the minimum value of the utility range. This minimum value, together with the maximum value (1 = full health) and the intermediate value (0 = dead), provides the anchors for rescaling the relative importance of the domains and levels of the EQ-5D-Y-3L obtained in the DCE.
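The trade-off described above translates into a utility by simple arithmetic, sketched below. The better-than-dead branch follows directly from the text (years in full health judged equivalent to 10 years in the impaired state). The worse-than-dead branch assumes the lead-time arrangement commonly used in the composite approach, with 10 years of full health placed in front of the 10 years in the impaired state; the text does not spell this out, so that branch should be read as an assumption. The function name is hypothetical.

```python
def ctto_utility(years_full_health, worse_than_dead=False):
    """Convert a cTTO indifference point (0 to 10 years) into a utility.

    Better than dead: x years in full health ~ 10 years in the state,
        so utility = x / 10 (range 0 to 1).
    Worse than dead (ASSUMED lead-time task): x years in full health ~
        10 years in full health followed by 10 years in the state,
        so utility = (x - 10) / 10 (range -1 to 0).
    """
    if not 0 <= years_full_health <= 10:
        raise ValueError("indifference point must lie between 0 and 10 years")
    if worse_than_dead:
        return (years_full_health - 10) / 10
    return years_full_health / 10


# A respondent indifferent between 5 years in full health and
# 10 years in the impaired state values that state at 0.5.
half = ctto_utility(5)
```

Under this convention the lowest attainable value is −1, which is why values at that bound are treated as censored in the analysis.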

2.3 Participants

Participants were recruited between July and December 2020 from a nationwide online panel with 150,000 Belgian respondents with profiles based on over 300 criteria, managed by a marketing research company, Bilendi. This panel is constantly being updated, and panel respondents are recruited from different channels, including television, radio and newspapers adverts; each medium contains a variety of psychographic profiles (Het Nieuwsblad, De Standaard, Knack, De Morgen, HLN, VTM, Radio 1, Sudpresse, RTL TVI, Radio Contact and others). Panel members are rewarded with points each time they fill in a questionnaire, which they can exchange for gifts. Selected members of this panel received either a unique link to the DCE survey by e-mail or an e-mail invitation for a cTTO interview, after which they were contacted by an interviewer over the phone to schedule an interview.

2.4 Interviewers

The cTTO interviews were conducted by three Master’s degree students from three universities (one in each region of Belgium) in French or Dutch. Interviewers followed Time Trade-Off (TTO) training given by the EuroQol Office and received a written script in their own language. Interviews were conducted face to face between July and October 2020 but switched to online video conference interviews between October and December 2020 because of the coronavirus disease 2019 (COVID-19) pandemic limiting proximity to others and restricting travel. The DCE data were collected between January and March 2021. A separate analysis on the effect of mode of administration (face-to-face versus online video conferencing) on the quality of the cTTO data in this valuation study has been published elsewhere [16]. This analysis showed that there was no evidence suggesting that the quality of cTTO data was compromised when using video conferencing compared with face-to-face interviews. This conclusion was reached based on the absence of differences in interviewer and respondent engagement, task duration and number of moves during the cTTO valuation. Another national valuation study using video conferencing supported the organisational and protocol feasibility of this mode of administration and concluded that the data quality was good [17].

2.5 Quality Control

Weekly meetings were held between the Principal Investigator and the EuroQol Office in order to monitor data quality [18]. To identify and exclude data from respondents who failed to indicate their preferences with sufficient understanding, precision and consideration, minimum criteria were established.

All data from respondents who chose the dominated alternative at least once (out of three) in a DCE choice pair were removed from the analysis set, as were the data from respondents who completed the DCE tasks in under 150 seconds. Furthermore, data from a cTTO interview were removed if either the interviewer or the respondent failed one of the following quality-checking criteria: the interviewer spent less than 3 minutes on, or failed to explain all elements of, the cTTO task (including explaining worse-than-death tasks); the respondent spent less than 5 minutes completing ten cTTO valuations; or the respondent gave severely inconsistent responses (the worst health state did not receive the lowest value and was valued at least 0.5 higher than the lowest value for that respondent).
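The exclusion rules above can be expressed as two small filters. This is a minimal sketch, not the study's actual data-cleaning code: the record layouts and field names are hypothetical, and the inconsistency rule is read as "worst state valued at least 0.5 above the respondent's lowest value", which is an interpretation of the wording.

```python
from dataclasses import dataclass


@dataclass
class DceRecord:
    dominated_choices: int    # times a dominated alternative was chosen (0 to 3)
    seconds_on_tasks: float   # total time spent on the DCE tasks


@dataclass
class CttoRecord:
    explanation_minutes: float    # interviewer time spent explaining the task
    all_elements_explained: bool  # including worse-than-death tasks
    valuation_minutes: float      # respondent time for the ten valuations
    values: dict                  # state code -> cTTO value


def keep_dce(record):
    """Keep a DCE respondent who never chose a dominated option and did not speed."""
    return record.dominated_choices == 0 and record.seconds_on_tasks >= 150


def severely_inconsistent(values, worst="33333", margin=0.5):
    """True if the worst state sits at least `margin` above the lowest value."""
    return values[worst] - min(values.values()) >= margin


def keep_ctto(record):
    """Keep a cTTO interview that passes interviewer and respondent checks."""
    return (
        record.explanation_minutes >= 3
        and record.all_elements_explained
        and record.valuation_minutes >= 5
        and not severely_inconsistent(record.values)
    )
```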

2.6 Data Analysis

The data analysis was performed in STATA and consisted of three parts. First, the DCE data were modelled under random utility theory with a statistical framework accounting for preference heterogeneity with latent class models (up to n latent classes after the minimum Bayesian information criterion [BIC] was reached [19]); conditional logit and mixed logit models were also tested. All models had the choice of the health state (A or B) as the dependent variable and the difference between health state A and B in the level 2 and level 3 of each domain as the main effects. The preferred DCE model was selected based on BIC being the minimum value, while Akaike information criterion and mean squared error (MSE) were also presented for demonstrational purposes. For latent class models, the final DCE coefficients were obtained by calculating weighted averages of the coefficients of each class, each weighted by their class share, whereas for the conditional and mixed logit DCE models this step was not necessary.
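Two of the steps above, choosing the model with the lowest BIC and averaging latent class coefficients by class share, can be sketched as follows. The analysis itself was run in STATA; this Python fragment is only illustrative, all function names are hypothetical, and the choice of what counts as the number of observations in the BIC (choices or respondents) is left to the caller as an assumption.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k * ln(n) - 2 * lnL (lower is better)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def pick_lowest_bic(candidates, n_obs):
    """candidates maps a model name to (log_likelihood, n_params)."""
    return min(candidates, key=lambda name: bic(*candidates[name], n_obs))


def weighted_average(class_coefs, class_shares):
    """Average per-class coefficient vectors, weighting each by its class share."""
    if len(class_coefs) != len(class_shares):
        raise ValueError("one share is needed per class")
    if abs(sum(class_shares) - 1.0) > 1e-9:
        raise ValueError("class shares must sum to 1")
    n_coefs = len(class_coefs[0])
    return [
        sum(share * coefs[i] for coefs, share in zip(class_coefs, class_shares))
        for i in range(n_coefs)
    ]


# Made-up example: two classes with shares 0.75 and 0.25, two coefficients each.
averaged = weighted_average([[1.0, 2.0], [3.0, 4.0]], [0.75, 0.25])
```

With these made-up inputs the averaged coefficients lie closer to those of the larger class, which is the intended effect of weighting by class share.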

Second, the cTTO data were analyzed with four different statistical models, all with the utility complement (= 1 − utility value) as the dependent variable: a separate censored Tobit model for each health state; a six-parameter model with a parameter for each of the five domains, plus an extra parameter characterizing the relative difference across all domains between level 2 versus level 3 utilities; the same six-parameter model but with censoring of values that were − 1; and the same six-parameter model with a random intercept. It was not possible to estimate the standard ten-parameter additive model with a separate parameter for each level 2 and each level 3 domain with this study design, as there were only ten health states that were evaluated. The preferred TTO model was selected based on the lowest MSE and a model resulting in the widest range of possible TTO values (i.e. close to the mean censored value for 33333).

The DCE results are on a latent scale, meaning that they represent only the relative importance of the levels of the dimensions. Therefore, third, the anchoring of the DCE data on the utility scale was explored in two ways: (1) anchoring with mapping between predicted values for the preferred DCE statistical model and predicted values from the preferred cTTO six-parameter model; and (2) anchoring with the cTTO censored Tobit value for 33333. The preferred anchoring method was selected based on predictions being in the plausible range of utility values (not above 1) and based on the lower bound of the range of utilities being close to the mean censored value for 33333. Anchoring rescaled the coefficients of the DCE model with the range of the utilities to obtain the coefficients for the value set; standard errors of these rescaled coefficients were obtained with bootstrapping. Finally, the resulting Kernel distribution of the utility values was compared with Kernels from other EQ-5D-Y-3L studies and with two adult value sets for Belgium.

3 Results

3.1 Participants and Data Quality

Out of the 1000 respondents, 972 contributed to the DCE dataset with 14,580 DCE choices (the data from 28 respondents were removed as they failed at least one dominated-alternative question). The cTTO dataset consisted of the data from 200 adults. The data from 19 other respondents were excluded because of either protocol non-compliance issues by the interviewer (N = 7) or speeding by the respondent (N = 12). None of the included respondents had inconsistencies in their responses (i.e. none of the milder health states had lower valuations than more severe health states, and the worst health state always had the lowest valuation). The included cTTO data fulfilled all quality-control criteria [18].

The 972 DCE respondents were representative in terms of age, sex and region for the Belgian population (Table 1). A larger proportion of the 200 cTTO respondents were in younger age categories than the national average, although a reasonable spread was obtained across all age groups; and this sample was also equally distributed over the three regions rather than being representative, for organisational reasons. In both samples, respondents were more highly educated than the national average. Generally, respondents’ HRQoL was similar to the national average in terms of reporting problems on EQ-5D-3L domains, with one exception: they reported considerably more often having problems with anxiety/depression.

Table 1 Description of the sample characteristics

3.2 Description of cTTO Data

Figure 1a presents the Kernel distribution of the 2000 cTTO valuations (ten per respondent), with peaks at values − 1 (9.4%), − 0.5 (3.0%), 0 (3.5%), 0.5 (6.9%) and 1 (17.2%). In total, 22.4% of all valuations were considered worse than dead. In Fig. 1b, the cumulative average of the raw mean cTTO value of health state 33333 is plotted over time (from September to November 2020). This shows that the mean utility value for the worst health state was stable from the second week of the data collection onwards. Table 2 describes the raw and censored mean utility values and their distribution for the ten health states collected with cTTO. The raw cTTO values for the EQ-5D-Y-3L ranged from − 0.348 to 0.951, and censoring the values extended the utility range downwards, to − 0.475 to 0.951.

Fig. 1 a Kernel distribution of all composite Time Trade-Off health-state valuations. b Cumulative average of the composite Time Trade-Off utility of the 33333 health state over time during the data collection period

Table 2 Description of composite Time Trade-Off data

3.3 Description of the DCE Statistical Model

Comparing the goodness of fit of the different statistical models with the BIC showed that the latent class analysis (LCA)-type models had the best fit compared with the conditional logit model and the mixed logit model. Among LCA models, the one with four latent classes had the lowest BIC and was selected as the best model (Fig. 2a). Conceptually, the latent class model was also preferred as this type of model explicitly explores and models the heterogeneity of preferences in the population. In an LCA, the scale of the class reflects the agreement within each class, whereas the class share indicates the size of the classes. In this model, Class 2 was the largest class (63% of respondents) and paid the most attention to pain/discomfort, followed by worried/sad/unhappy and usual activities; Class 3 (16%) gave the most importance to mobility, usual activities and pain/discomfort; whereas Class 4 (13%) gave near-equal importance to looking after oneself, usual activities and pain/discomfort. Class 1 (8%) was considered as “noise” (Fig. 2b). Looking at the total column, which equals the weighted average of the four class results with the proportion of responders in each class as weights, the relative size of the level 2 and level 3 parameters indicated that pain/discomfort was the most important dimension, followed by worried/sad/unhappy and usual activities. Looking after oneself was the dimension that was considered the least important (Fig. 2c). Despite their different underlying parameterization and fit, all DCE models ultimately resulted in a similar ranking of importance of the different domains and levels, which confirmed the robustness of the findings.

Fig. 2 a Overview of model fit criteria. b Description of the four latent classes. c Relative importance of the level 2 and level 3 parameters of the five dimensions

3.4 Description of the cTTO Statistical Model

All of the proposed statistical models had a reasonable fit to the cTTO data, although the only comparisons that could be made were with the ten valued health states included in the cTTO design. We chose the six-parameter censored model without intercept as a good conceptual model for the cTTO values over the uncensored six-parameter model and the six-parameter model with a random intercept, because it had the lowest mean squared error (MSE) and produced a relatively wide range of cTTO values (i.e. a low value for the 33333 health state). This model, as well as the censored Tobit model, was subsequently tested for anchoring the DCE data. The ranking of the importance of the domains in the six-parameter censored model was the same as in the DCE statistical model, with pain/discomfort being the most important dimension, followed by worried/sad/unhappy, and looking after oneself being the least important dimension.

3.5 Anchoring the DCE Results with cTTO Data

Combining the LCA4 model for the DCE with the six-parameter censored TTO model for anchoring the DCE predictions resulted in predicted values above 1 for mild health states, which led to the need to rescale the mapped utility values with an intercept and a slope parameter. This caused an overfitting of the TTO model and resulted in poorly predicted values in the middle range; therefore, this modelling option was not selected. The best results were obtained by anchoring the DCE values with the censored Tobit value for 33333 (= − 0.475); based on this anchoring method, the final value set was produced (Table 3). This was done by dividing the estimated (weighted average LCA) coefficients by the overall utility range (= 1 − (− 0.475)) and rescaling them to the weighted censored average value of the worst health state (33333: − 0.475). Using this value set, the Kernel density of all possible utility values was plotted in Fig. 3a. The utility values of all 243 health state profiles can be found in Appendix 1 of the Electronic Supplementary Material (ESM). A worked example on how to use the value set to calculate utilities is available in Appendix 2 of the ESM. SAS, STATA and SPSS codes for programming the utilities based on the responses to the five domains are found in Appendices 3–5 of the ESM.
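One way to read the anchoring step above is sketched below: the latent-scale coefficients are rescaled so that the five level 3 decrements together span the utility range from 1 down to − 0.475, and the utility of a state is 1 minus the decrements of its levels. The anchor value is taken from the text; the latent coefficients, the function names and the sign convention (decrements stored as positive numbers) are illustrative assumptions. For real analyses, the published coefficients in Table 3 and the code in the ESM appendices should be used.

```python
WORST_STATE_VALUE = -0.475  # censored cTTO value for 33333, as reported in the text


def anchor(latent, worst_value=WORST_STATE_VALUE):
    """Rescale latent decrements so that state 33333 lands on `worst_value`.

    `latent` maps (dimension_index, level) to a positive latent decrement,
    for dimension_index 0..4 and level 2 or 3.
    """
    latent_range = sum(latent[(d, 3)] for d in range(5))
    scale = (1.0 - worst_value) / latent_range
    return {key: value * scale for key, value in latent.items()}


def utility(state, decrements):
    """Utility of a five-digit state: 1 minus the decrement of each level above 1."""
    return 1.0 - sum(
        decrements[(d, int(ch))] for d, ch in enumerate(state) if ch != "1"
    )


# HYPOTHETICAL latent coefficients, invented for illustration only.
latent_example = {}
for d, (level2, level3) in enumerate(
    [(0.3, 1.2), (0.2, 0.9), (0.4, 1.4), (0.5, 2.0), (0.4, 1.5)]
):
    latent_example[(d, 2)] = level2
    latent_example[(d, 3)] = level3

decrements = anchor(latent_example)
```

By construction, `utility("33333", decrements)` returns the anchor value and `utility("11111", decrements)` returns 1; every other state falls in between as long as each level 2 decrement is smaller than the corresponding level 3 decrement.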

Table 3 EQ-5D-Y-3L value set for Belgium

Fig. 3 a Histogram and Kernel density of the utilities of the 243 health states based on the Belgian EQ-5D-Y-3L value set. b International comparison of densities associated with the EQ-5D-Y-3L value sets. c Comparison of the density of the utilities based on the Belgian EQ-5D-5L, EQ-5D-3L and EQ-5D-Y-3L value sets

3.6 Comparing Results with Other Valuation Studies

The results of this valuation study were compared with results from published child and adolescent valuation studies from other countries, and with adult Belgian valuation studies. Comparing the resulting Kernel densities of all 243 health state utilities with Spanish [20] and Slovenian Kernels [21], it was observed that the distribution of the Belgian values has a similar shape to that of the two other countries, but Belgian child and adolescent utilities are slightly higher. This may be due to the re-scaling method, as the worst state value was lower in Spain (− 0.539) and in Slovenia (− 0.691), compared with Belgium (− 0.475) (Fig. 3b). Furthermore, comparing the Belgian general population’s preferences for child and adolescent versus adult health states showed that the Kernel density of the EQ-5D-5L valuation study for adults [12] has the same shape and scale as, and lies close to, the curve of the EQ-5D-Y-3L valuation study (Fig. 3c), and both differ from the older EQ-5D-3L VAS-based curve [13]. The width of the utility range in the child and adolescent EQ-5D-Y-3L study (− 0.475 to 0.954) was similar to that in the adult EQ-5D-5L study (− 0.532 to 0.939).

4 Discussion

4.1 Summary

In this study, we followed the international protocol for conducting youth valuation studies based on the EQ-5D-Y-3L, with the aim of estimating a value set for Belgium. We used the data of 1172 respondents across three regions in Belgium to estimate utility values that reflect preferences of the Belgian public for child and adolescent HRQoL. First, data on the relative importance of the five domains of the EQ-5D-Y-3L were obtained with DCE, and we observed that for children and adolescents, respondents attached the most importance to the domains of pain/discomfort and worried/sad/unhappy. Looking after oneself was considered the least important dimension, which may be explained by the fact that children are still learning to take care of themselves, and, especially at younger ages, it is normal that they still need help from an adult. The importance of pain and mental health in child and adolescent HRQoL identified in this study is similar to findings from child and adolescent valuation studies from other countries, such as Germany [22], the Netherlands [23], Spain [20], Slovenia [21], Hungary [24] and Japan [25], and similar to findings from the Belgian adult EQ-5D-5L study [12].

Second, data used for anchoring these relative preferences on the utility scale came from 200 respondents who participated in the cTTO interviews. We used the censored value of the worst health state to rescale preferences. The resulting distribution of utilities based on the value set was similar to other child and adolescent valuation studies, and similar to the Belgian EQ-5D-5L adult value set. This value set will be included in the updated pharmacoeconomic guidelines in Belgium (personal communication, Belgian Health Care Knowledge Centre) and will inform future cost-utility analyses for child and adolescent treatments.

4.2 Study Limitations

This study also has limitations that may have impacted the results. First, although we aimed to obtain DCE and cTTO samples representative of the Belgian population and the larger DCE sample was representative in terms of age, sex and region, representativeness was more difficult to obtain in the face-to-face interview sample (including video conferencing) during the COVID-19 pandemic. There were fewer elderly participants over the age of 70 years in the cTTO sample, and a slightly higher proportion of respondents was female, which may bias the results. Furthermore, the sample was not stratified on a number of characteristics (such as having children and level of education), which may impact the values of the survey. Indeed, respondents with children living at home may have different views on children’s HRQoL [9], and the responses of people with higher education may also be more consistent and their opinions may differ from people with fewer years of education.

Another limitation of the study induced by the protocol design is that views of respondents were collected on health states described for a 10-year-old child, but the data will be used to calculate utility values for children aged 8–15 years. It is unclear whether results based on a 10-year-old child are representative for adolescents up to age 15 years. Recent research [5] indicates that the interpretation of the five dimensions might be different for different age groups, and this, in turn, may affect preferences for health states described for those age groups. Adolescence is a transition period during which physical health, mental health, emotional health and usual activities are changing. It is possible that the relative importance of the different domains, the anchoring value, or both, may change when respondents are considering an adolescent instead of a 10-year-old child during the valuation tasks.

Furthermore, on the technical side, the protocol design did not allow for diminishing marginal utilities to be modeled. Therefore, we could not capture a scenario in which the existence of multiple problems across dimensions lessens the impact of each additional health decline. This inevitably impacted our study results through the anchoring method. Even using a six-parameter TTO model (instead of a 10-parameter model) resulted in poor predictions, either in the top end or the middle part of the utility values. Indeed, the study protocol with ten cTTO states was not designed to estimate a full hybrid model (as more health states would have to be included in the cTTO design), but instead was better suited to obtaining a precise value for the worst health state in order to use that for anchoring [14]. Although there was good agreement between the six-parameter cTTO and the DCE models in terms of preference ranking of the domains, anchoring with this model led to an over-parameterized model with a poor fit. Anchoring on the worst health state was therefore the option selected to produce the value set for the EQ-5D-Y-3L. This limited our capacity to use all collected cTTO data when producing the value set. However, given the agreement between cTTO and DCE models, we believe that this limitation is minor.

4.3 Impact of the COVID-19 Pandemic

Importantly, conducting the data collection during the COVID-19 pandemic may have affected the survey on different levels. First, many respondents indicated having mental health problems (48% and 54% reported being a bit or a lot worried/sad/unhappy in the cTTO and DCE samples, respectively, compared with 6.5% in the general population in non-pandemic times [26]), which may have affected their perceptions and preferences on HRQoL. To put this effect in perspective, the Belgian adult EQ-5D-5L valuation study began collecting data before the pandemic and the last months of the data collection coincided with the first lock-down period in Belgium. Analyses including and excluding all data collected during the pandemic revealed a minimal impact of these data on the estimates of the value set (average difference of the coefficients of the value set was 0.002). Furthermore, the ranking of the domains remained identical with and without the data collected during the pandemic [12]. However, a dedicated study on the effect of the COVID-19 pandemic on the VAS values [27] attributed to the best health state (11111), dead and the worst health state (55555 in the five-level EQ-5D-5L) indicated that the best health state and dead were valued lower, whereas the worst health state was valued higher post-pandemic compared with before the pandemic. Differential effects were also found by age, sex and education on the raw VAS values, and particularly relevant is the age effect on the worst health-state valuation, with older participants giving higher scores post-pandemic. If this effect was present in our study, it would impact our rescaling factor and affect all the parameter estimates from the value set; this cannot be excluded.

The second impact of the pandemic was that the face-to-face cTTO data collection had to switch mid-way to online video conferencing, which was still interviewer led, but not in person. This led to selection bias in that participants in the online interviews were more often female and generally younger. With video-conferencing survey methods, it is more difficult to reach older respondents or respondents without Internet access. Furthermore, it can be debated whether respondents engage as well in online interviews as in face-to-face interviews. A separate analysis [16] found minimal to no impact of online video conferencing on cTTO data quality in this study in terms of interviewer and respondent engagement, time and consistent responses. Another study investigating the feasibility of using video conferencing to conduct TTO interviews also found this mode of administration feasible and reported similar results to face-to-face interviews [28].

5 Conclusions

This study presents an EQ-5D-Y-3L value set for Belgium that follows the international EQ-5D-Y-3L valuation protocol. The value set presents Belgian adults’ preferences for health states applicable to children and adolescents aged 8–15 years. This will allow for the calculation of age-appropriate quality-adjusted life-years for Belgian children and adolescents in pharmacoeconomic evaluations and will help further research on child and adolescent HRQoL. This value set will be included in the forthcoming updated pharmacoeconomic guidelines in Belgium and will inform future cost-utility analyses in child and adolescent treatments.