Key Points for Decision Makers

This study developed a national EQ-5D-Y-3L value set based on the preferences of Brazilian adults.

The EQ-5D-Y-3L value set allows quality-adjusted life-years to be calculated in economic evaluations of health technologies for children and adolescents in Brazil.

This is the first study to estimate the EQ-5D-Y-3L value set in North, Central, and South America.

1 Introduction

Interest in developing generic preference-based measures specific to children and adolescents has grown in recent decades [1, 2]. Examples available in the literature include the Adolescent Health Utility Measure (AHUM) [3], Assessment of Quality of Life-6 Dimensions (AQoL-6D) [4], Child Health Utility 9 Dimensions (CHU9D) [5], EQ-5D-Y-3L [6], Health Utility Index 2 (HUI2) [7], and Health Utility Index 3 (HUI3) [8]. The EQ-5D-Y-3L is a generic instrument developed by the EuroQol Group to measure health-related quality of life (HRQoL) in children and adolescents aged 8–15 years [6, 9]. This instrument was adapted from the adult instrument EQ-5D-3L, but with wording appropriate for children and adolescents [6, 10]. Its scoring system is based on value sets, which allow utilities to be calculated for conducting economic evaluations [11]. In addition, according to the literature, this instrument has been one of the most widely used in children and adolescents [12].

The process of recommending coverage and reimbursement of new medicines, devices, equipment, or interventions in the Brazilian Unified Health System (SUS) is conducted by the National Commission for Technology Incorporation, known in Brazil as CONITEC [13]. This commission, which reports to the Ministry of Health, advises on incorporation, exclusion, or alteration based on health technology assessment (HTA) for adults and children, which includes measures of safety, efficacy, effectiveness, and cost-effectiveness [13]. The Brazilian Health Technology Assessment guideline recommends that cost-utility analysis using quality-adjusted life-years (QALYs) be employed as the main outcome [14,15,16]. However, the HTA guideline recognizes that decisions in the health system for adults and children should not be based on QALYs alone, as other factors are also relevant, including safety, efficacy, evidence, and social, economic, and organizational aspects [15, 16]. The commission formulates recommendations for clinical protocols and therapeutic guidelines to be published through legal documents, including publicly available economic analysis guidelines [15, 16]. Since April 2022, when a policy made economic evaluation mandatory for decision-making related to the incorporation of health technologies in the Brazilian Public Healthcare System [17], the commission has made 123 recommendations (76 for incorporation and 47 for nonincorporation) [18]. Updated numbers of health technologies submitted to the commission in the Brazilian Unified Health System can be found elsewhere [18].
Furthermore, in November 2022, the Ministry of Health established a cost-effectiveness threshold of US$ 15,834.49 (R$40,000) per QALY gained for health decision-making by the National Commission for Technology Incorporation; an alternative threshold of up to US$ 47,503.47 (R$120,000) may be considered for health conditions affecting children and adolescents [15, 19]. This willingness-to-pay threshold represents a significant advancement for healthcare in Brazil, as it provides an additional parameter for the incorporation of new technologies [15]. The QALY is a measure that combines length of life and quality of life, and it can be calculated using a preference-based measure of health-related quality of life to describe patients’ health states [20,21,22,23]. These health states are converted into utilities using national value sets [14, 21,22,23]. Utilities represent individuals’ preferences for a given health state on a scale anchored at 0 (representing “dead”) and 1 (representing “full health”), with negative values representing health states considered “worse than dead” [21].

Several valuation methods are available to elicit preferences, such as the visual analog scale [24], discrete choice experiment (DCE) [25], time trade-off (TTO) [26], and standard gamble [27]. Over the last decades, these valuation methods have been investigated and used to elicit preferences, especially in adults, with the TTO being the main method used for the EQ-5D [28]. However, although these methods are considered robust and mature for the adult population, valuation methods for children and adolescents present unique challenges, as they are emergent and still evolving [29, 30]. The International Valuation Protocol of the EQ-5D-Y-3L is a first step in setting out the minimum requirements, and the learnings and findings from valuation studies are expected to inform method updates in the future [29]. Thus, it is important to understand the complexity of eliciting preferences and valuing health states for children and adolescents, which underscores the importance of using appropriate methods [31]. The main challenges relate to how well children and adolescents understand the tasks from the taxpayer’s perspective, as the values are used to inform health technology assessment and resource allocation [30]. Whereas for adults the challenge is to imagine being in a child’s health state, for children it can be challenging to understand abstract concepts, hypothetical scenarios, or trade-offs between health states. Furthermore, there are ethical implications in asking children and adolescents about health states worse than “dead” [32, 33]. Therefore, the EuroQol Group has developed a protocol for valuing the EQ-5D-Y-3L in which adults complete valuation tasks from the perspective of what they would prefer for a 10-year-old child [10, 26, 30, 34].

In contrast to the valuation methods of the EQ-5D-5L, the EQ-5D-Y-3L valuation consists of a two-step approach: an online DCE to estimate the relative importance of the levels and dimensions of the EQ-5D-Y-3L, and composite time trade-off (cTTO) data collected in a smaller sample to anchor the DCE values onto a scale where 1 represents full health and 0 represents dead [30]. In the cTTO, a conventional TTO is used to value the health states considered better than dead and a lead-time TTO is used to value the health states considered worse than dead (negative values) [30]. Several countries, including Japan [35], Indonesia [36], Hungary [37], the Netherlands [38], Slovenia [39], Germany [40], Belgium [41], Spain [42], and China [43], have developed their EQ-5D-Y-3L value sets using the minimum requirements of the International Valuation Protocol with different anchoring methods, such as anchoring on the worst health state only, hybrid modelling, or mapping approaches. The International Valuation Protocol provides guidelines that serve as minimum requirements for valuing the EQ-5D-Y-3L. Each respondent should complete 15 DCE pairs from 150 pairs distributed over 10 blocks, with a minimum sample size of 1,000 respondents. For the cTTO, the interview should collect ten cTTO tasks/health states with 200 respondents [30]. Furthermore, authors can choose to recruit larger samples or to use more than the ten health states suggested for the cTTO by the International Valuation Protocol. Unlike the study conducted in Germany, which considered ten health states, studies from Japan, China, the Netherlands, and Indonesia added more health states to the cTTO to improve the estimate of the DCE-cTTO relationship [35, 36, 38, 40, 43].

In Brazil, only the EQ-5D-3L value set is available; the values for the 243 health states of the EQ-5D-3L instrument were based on a sample of 3,362 literate individuals aged 18–64 years who lived in urban areas [44, 45]. There is no EQ-5D-Y-3L value set available for the Brazilian population that would allow QALYs to be calculated in economic evaluations. Therefore, this valuation study aimed to develop the value set of the EQ-5D-Y-3L instrument for Brazilian children and adolescents.

2 Methods

2.1 Study Design

This valuation study was conducted following the International Valuation Protocol of the EQ-5D-Y-3L and reported according to the Checklist for Reporting Valuation Studies of the EQ-5D (CREATE) [30, 46]. Details of the CREATE can be found in the Supplementary Information 1. A two-step valuation approach using independent surveys with different samples and modes of administration was conducted in this study: an online structured self-completed survey with discrete choice experiment (DCE) tasks and face-to-face computer-assisted personal interviews with cTTO tasks. The language of DCE and cTTO surveys was Brazilian Portuguese. Both surveys included information on the study, informed consent, three introductory questions (i.e., age, gender, and experience with severe illness), the self-reported version of the EQ-5D-Y-3L, the valuation task, feedback questions, and complementary questions (e.g., educational level, marital status, socioeconomic level, religious beliefs, general health, health conditions, and pain conditions).

This study was approved by the Human Ethics Committee of the Universidade Cidade de São Paulo (UNICID) (CAAE: 45241321.6.0000.0064). DCE and cTTO surveys were conducted in a sample of adults aged 18 years or above from the general population who provided informed consent to participate in this study. Respondents were asked to complete the tasks from a child’s perspective (i.e., adults imagined the health state of a 10-year-old child). This choice was made on the basis of previous studies to avoid the ethical issues that could arise from asking a sample of children to consider death [27, 44].

2.2 EQ-5D-Y-3L

This study used the Brazilian Portuguese version of the EQ-5D-Y-3L [6]. The EQ-5D-Y-3L is a child-friendly version adapted from the EQ-5D-3L instrument for measuring general health-related quality of life. The EQ-5D-Y-3L is composed of two parts: a descriptive system and a Visual Analog Scale (EQ-VAS). The descriptive system consists of five dimensions with age-appropriate wording: walking about (mobility), looking after myself (self-care), doing usual activities (usual activities), having pain or discomfort (pain/discomfort), and feeling worried, sad, or unhappy (anxiety/depression). Each dimension has three levels of severity: (1) no problems/no pain or discomfort/not worried, sad, or unhappy; (2) some problems/some pain or discomfort/a bit worried, sad, or unhappy; and (3) a lot of problems/a lot of pain or discomfort/very worried, sad, or unhappy [6]. The responses generate a five-digit numeric code that expresses the child’s health state, ranging from 11111 (full health, representing no problems in any dimension) to 33333 (the worst health state, representing a lot of problems in all dimensions), for a total of 243 health states. The EQ-VAS ranges from 0 to 100, where 0 represents “the worst health state you can imagine” and 100 represents “the best health state you can imagine” [6].
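The count of 243 health states follows from five dimensions with three levels each (3^5 = 243). As a minimal illustrative sketch, the full set of five-digit codes can be enumerated as follows; the dimension abbreviations and function name are our own labels for illustration, not part of the instrument.

```python
from itertools import product

# Illustrative abbreviations (assumed labels): mobility, self-care,
# usual activities, pain/discomfort, anxiety/depression.
DIMENSIONS = ("MO", "SC", "UA", "PD", "AD")
LEVELS = (1, 2, 3)


def all_health_states():
    """Return every five-digit EQ-5D-Y-3L health state code as a string."""
    return [
        "".join(str(level) for level in combo)
        for combo in product(LEVELS, repeat=len(DIMENSIONS))
    ]


states = all_health_states()
print(len(states), states[0], states[-1])
```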

2.3 Valuation Tasks

The DCE and cTTO tasks were conducted using the EuroQol Valuation Technology (EQ-VT) software (v2.1) [28]. Two valuation methods were used in this study: DCE and cTTO. The DCE and cTTO tasks were collected in two samples considering the perspective of the respondents for a 10-year-old child. In the DCE tasks, respondents expressed their preference between two different EQ-5D-Y-3L health states, classified as options A or B, and no specification of the duration of living in the health states was included [30].

The cTTO task consisted of a conventional time trade-off to elicit values for the health states considered “better than dead” (i.e., 10-year time trade-off) and a lead-time time trade-off to elicit values for the states considered “worse than dead” [47] (i.e., 20-year time trade-off, 10 years in full health followed by 10 years in the health state); the smallest tradable unit of time was 6 months. Respondents were asked to trade off life-years in a hypothetical situation to identify the number of life-years in full health at which they would be indifferent between a child living a shorter period in full health and a longer period in a specific health state [26, 30]. In this hypothetical situation, the respondent can prefer “life A,” “life B,” or consider “life A and B about the same” [26]. “Life A” corresponds to living in full health for a few years before death, and “life B” corresponds to living in full health before living a few years in a hypothetical health state from the EQ-5D-Y-3L before death [26]. Depending on the answer, the time in full health in “life A” changed, and the task finished when the respondent answered that “life A and B are about the same” (i.e., the respondent’s point of indifference).
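To make the link between the point of indifference and the resulting value concrete, the sketch below assumes the scoring conventionally used with composite TTO, which the text above does not spell out: t/10 for the conventional task and (t − 10)/10 for the lead-time task, where t is the number of years in full health in “life A” at indifference. The function name and arguments are hypothetical.

```python
def ctto_value(years_full_health_life_a: float, worse_than_dead: bool) -> float:
    """Convert a point of indifference into a cTTO value (assumed scoring).

    years_full_health_life_a: years in full health in "life A" at indifference
        (0 to 10, in 0.5-year steps given the 6-month tradable unit).
    worse_than_dead: True when the lead-time task (20-year frame) was used.
    """
    if not 0.0 <= years_full_health_life_a <= 10.0:
        raise ValueError("years in full health must lie between 0 and 10")
    if worse_than_dead:
        # Lead-time TTO: 10 lead years in full health + 10 years in the state.
        return (years_full_health_life_a - 10.0) / 10.0
    # Conventional TTO: 10 years in the health state.
    return years_full_health_life_a / 10.0


print(ctto_value(7.5, worse_than_dead=False))
print(ctto_value(4.0, worse_than_dead=True))
```

Under this assumed scoring, values range from −1 to 1.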

2.4 Health State Selection

The DCE design consisted of 150 pairs of health states distributed over 10 blocks of 15 pairs. Each respondent was randomly allocated to complete one of the ten blocks. The order of the health state pair presentation and the right/left presentation were randomized in each block. A two-dimension overlap was imposed for all pairs. Thus, the health states in each pair presented two dimensions with the same level of severity and three dimensions with different levels of severity. To help respondents identify the differences between health states and reduce attribute non-attendance, a bold font was applied in the dimensions with level overlap [48]. The bold font was used as there is no color coding in the International Valuation Protocol [30]. Each respondent also completed three fixed dominant pairs for quality control. These dominant pairs were presented as the first pair, as the last pair, and at a random position. A dominant pair is one in which one health state is obviously better than the other (e.g., three dimensions with equal severity levels and two dimensions with worse severity levels in one of the states). Thus, each respondent completed a total of 18 DCE tasks (15 health state pairs for valuation and three dominant pairs for quality control, the latter being an extension of the minimum requirements of the International Valuation Protocol). As the DCE generates values on a latent scale, another valuation method is needed to anchor the utilities on a scale where 1 represents “full health”, 0 represents “dead”, and negative values represent health states considered “worse than dead” [30]. The anchored utilities were developed through the cTTO task. The samples of the DCE and cTTO tasks were different, and those answering one did not participate in the other.
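The two design properties described above, level overlap and logical dominance, can be checked mechanically. The following is a minimal sketch with hypothetical helper names; it is not the procedure used to generate the study design.

```python
def overlap_count(state_a: str, state_b: str) -> int:
    """Number of dimensions on which two health states share the same level."""
    return sum(a == b for a, b in zip(state_a, state_b))


def dominates(state_a: str, state_b: str) -> bool:
    """True if state_a is no worse on every dimension and differs from state_b.

    Levels are digits 1-3, so a lower digit means a less severe problem.
    """
    return state_a != state_b and all(a <= b for a, b in zip(state_a, state_b))


# A pair with two-dimension overlap in which neither state dominates.
print(overlap_count("21131", "12132"), dominates("21131", "12132"))
# A dominant pair: the first state is at least as good on every dimension.
print(dominates("11231", "12332"))
```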

In the cTTO design, an orthogonal design including 18 health states was used [49, 50]. In addition to the orthogonal array, we considered the worst health state (33333), five mild health states (21111, 12111, 11211, 11121, and 11112), and four moderate health states to maintain near-orthogonality. In this study, the cTTO design consisted of 28 health states divided into 3 blocks of 10 health states (i.e., 9 health states per block, and the worst health state “33333” was included in all blocks). At least one mild health state was included in each block. Each respondent was randomly allocated to complete one of the three blocks. The order of the health states was also randomized. Before completing the ten health states for the valuation study, each respondent valued two wheelchair examples, one considering a situation better than dead and another worse than dead, as well as three practice health states: a mild state (21112), a severe state (32323), and a state that is difficult to imagine (13311). After the cTTO tasks, a ranking of all ten health states was presented to the respondents as an opportunity to verify whether they agreed with the rank ordering inferred from their responses. If necessary, respondents could flag one or more inconsistent cTTO valuations at this stage [51]. The average time for completing the cTTO task was around 60 min.

2.5 Sampling, Recruitment, and Data Collection

The International Valuation Protocol for the EQ‐5D‐Y‐3L [30] recommends sample sizes of 1000 respondents for the DCE survey and 200 respondents for the cTTO survey. For the DCE, the respondents were recruited from the general population, considering the proportional distribution of Brazilian regions according to data from the Brazilian Institute of Geography and Statistics (Instituto Brasileiro de Geografia e Estatística, IBGE) (i.e., of the total sample, 8.9% were recruited from the north region, 27% from the northeast region, 7.8% from the central-west region, 14.3% from the south region, and 42% from the southeast region) [52]. For the cTTO, the respondents were recruited from the general population in three urban centers from three Brazilian states: Sao Paulo (69.5% of the total sample, representing the southeast region), Rio Grande do Sul (17% of the total sample, representing the south region), and Ceara (27% of the total sample, representing the northeast region) [52]. The total population of these three urban centers represents 31.6% of the whole Brazilian population. Additionally, 45.5% of the population of Sao Paulo consisted of migrants from other Brazilian states, indicating good cultural representativeness of different states of Brazil [53]. Furthermore, quotas were used for age (18–24 years, 25–34 years, 35–44 years, 45–54 years, 55–64 years, > 65 years), sex (male and female), educational level (primary, middle, and high), and socioeconomic level based on the household income per month according to the Brazilian Association of Survey Companies [classification ranges from A to D/E, where A represents a household income higher than $9678 (R$25,000)/month and D/E represents a household income lower than $278 (R$719)/month] [54]. All quotas were considered for the recruitment of both surveys (DCE and cTTO).

Respondents were invited through a recruitment panel company for both surveys (DCE and cTTO). In the DCE approach, the respondents received a unique link by e-mail to access the DCE task. In the cTTO, the respondents were asked to attend a face-to-face interview at the local university according to their geographical location (i.e., Universidade Cidade de São Paulo in the Southeast region; Universidade Federal de Ciências da Saúde de Porto Alegre in the South region; and Universidade Federal do Ceará in the Northeast region). The interviews were conducted in private rooms, and the respondents received compensation for their transport and a food voucher.

The cTTO interviews were conducted by five interviewers, all of them working in research positions in the health area. The principal investigators of this study (GCM and TPY) received training from the EuroQol group on the valuation methods, EQ-VT protocol, and the quality control procedure. After the training, each principal investigator conducted ten pilot interviews with family and friends before offering standardized training for the interviewers. Interviewers received a 1-day hybrid training and a written script on the cTTO approach in Brazilian Portuguese. After the interviewers’ training, each interviewer conducted 10 pilot interviews with family and friends followed by 20 pilot interviews with the general population (ensuring diverse socioeconomic and educational backgrounds) before the data collection commenced. Although the interviewers in Ceara and Rio Grande do Sul states did not each conduct 40 interviews during data collection, because these states required fewer interviews than Sao Paulo state, our three interviewers from Sao Paulo state [CMES (n = 50), VSS (n = 49), and GCM (n = 40)] each completed at least 40 interviews. Our interviewers received intensive training, conducting more pilot interviews than is usual in other studies, and the data collection started once the interviewers reached adequate quality according to the quality control criteria from the EuroQol group. Thus, we ensured that all interviewers (with a background in the health area) had sufficient training and were guided by the principal investigators and by the EuroQol group to maintain the quality of the interviews. During the pilot interviews and data collection, the interviewers shared their experiences as a group, and the principal investigators provided daily feedback on their performance. During the data collection period, each interviewer performed a maximum of ten interviews per week to avoid overloading the interviewers.
Data collection of the DCE and cTTO was performed from October to December 2022.

2.6 Quality Control

Following recommendations from previous studies, all respondents in the DCE tasks who failed any of the three dominant tasks (i.e., respondents who preferred the obviously worse health state) and “fast respondents” (i.e., respondents who completed the DCE survey in less than 150 s, equivalent to 8.3 s per DCE task) were excluded from the final sample [28, 39, 51].

For the cTTO, the quality control process consisted of two main steps: (1) ensuring protocol compliance and (2) investigating the presence of interviewer effects. In the first step (protocol compliance), we assessed four main criteria: (1) whether there was no explanation of the worse than dead task (lead-time) in the wheelchair example to the respondent; (2) whether a short time was spent on explaining the wheelchair examples (less than 3 min); (3) whether a short time was spent to complete all cTTO tasks (less than 5 min total time for completing the ten health states); and (4) whether there was any inconsistency in the cTTO responses (i.e., the value for 33333 was not the lowest and was at least 0.5 higher than the value of the health state with the lowest value) [51]. If at least one of the four criteria was met, the interview was flagged as being of potentially poor quality. In the second step (presence of interviewer effects), we monitored interviewer effects to ensure data quality. The quality of the cTTO data was assessed after every ten interviews completed by each interviewer, using the quality control process defined by the EuroQol group [51]. This process assesses interviewers’ performance in terms of interviewer effects and cTTO protocol compliance. Every week, the data quality was discussed with the EQ-VT support team until cTTO data collection was completed. Following the EQ-VT support team’s considerations, the principal investigators gave weekly feedback to each interviewer on the performance and quality of the interviews to improve their skills.
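The four protocol compliance criteria lend themselves to a simple rule-based check. The sketch below is illustrative only: the function name, argument names, and flag labels are hypothetical, and it does not reproduce the EQ-VT quality control tooling.

```python
def quality_flags(lead_time_explained, wheelchair_seconds, ctto_seconds, values):
    """Return the protocol-compliance flags raised by one cTTO interview.

    values: dict mapping health state codes (including "33333") to the
        cTTO values given by the respondent.
    """
    flags = []
    if not lead_time_explained:
        flags.append("no_lead_time_explanation")
    if wheelchair_seconds < 3 * 60:  # less than 3 min on wheelchair examples
        flags.append("short_wheelchair_time")
    if ctto_seconds < 5 * 60:  # less than 5 min on the ten cTTO tasks
        flags.append("short_ctto_time")
    lowest = min(values.values())
    # 33333 is not the lowest and sits at least 0.5 above the lowest value.
    if values["33333"] - lowest >= 0.5:
        flags.append("inconsistent_33333")
    return flags


example = {"33333": 0.6, "21111": 0.9, "12321": -0.2}
print(quality_flags(True, 400, 900, example))
```

An interview would be flagged as being of potentially poor quality whenever the returned list is non-empty.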

2.7 Data Analysis and Modeling

All statistical analyses were conducted with STATA v.15 (StataCorp, College Station, Texas, USA). Descriptive analysis was used to describe the characteristics of the respondents, the descriptive system of the EQ-5D-Y-3L, the EQ-VAS, and the cTTO values. Statistical analyses were conducted in two steps. In the first step, two different models were used to estimate the DCE values on a latent scale derived from the DCE responses, namely (1) a conditional logistic model and (2) a mixed-logit model. The mixed-logit model was included because it handles unobserved preference heterogeneity by allowing the model parameters to vary across individuals [55]. The dependent variable was coded as 1 for the chosen option in each DCE task (e.g., option A) or as 0 for the alternative (e.g., option B). Dummy variables for response levels 2 and 3 in each EQ-5D-Y-3L dimension were included as covariates in both models. The Bayesian Information Criterion (BIC) was used to select the best DCE model [56]. The coefficients derived from both models were used to calculate the predicted DCE values for all 243 health states of the EQ-5D-Y-3L (i.e., the DCE values).

In the second step, a mapping function was used to map the predicted DCE values onto the mean observed cTTO values for the 28 health states in the cTTO design exploring two mapping approaches: linear mapping (Eq. 1) and non-linear mapping (Eq. 2). The mapping function consisted of a regression model to identify the relationship between the predicted DCE values for the 243 health states and the mean observed cTTO values for the 28 health states considered in the cTTO survey design [36, 37, 57].

$$V_{\text{cTTO}h} = \gamma_{0} + \gamma_{1} V_{\text{DCE}h} \tag{1}$$
$$V_{\text{cTTO}h} = \gamma_{0} + \left( \gamma_{1} V_{\text{DCE}h} \right)^{\delta} \tag{2}$$

“VcTTOh” is the mean observed cTTO value for health state “h”, “VDCEh” is the predicted DCE value for the same health state “h”, “γ0” is the regression intercept, “γ1” is the slope between the cTTO and DCE values, and “δ” is a power parameter that may adjust for any possible non-linearity in the relationship between cTTO and DCE data. Linear and non-linear mapping models with and without intercept (including or excluding γ0, respectively) were applied. We structured the cTTO and DCE values as “disutilities” instead of “utilities” to facilitate the analysis (i.e., the cTTO and DCE values were scaled so that 0 was the lowest value, indicating the least disutility, and larger positive values indicated greater disutility). To estimate the utilities of the 243 EQ-5D-Y-3L health states, we rescaled the coefficients of both DCE models based on rescaling parameters (i.e., the coefficients of the mapping approaches were considered with and without intercept) [58]. The intercept in the mapping model indicates an estimated mean utility when all EQ-5D health dimensions are zero. In general population samples, a relatively high proportion of respondents report no problems in all EQ-5D dimensions, leading to a large intercept, which can result in a gap between full health and the second-best health state. To partially deal with this gap, the mapping can be performed without the intercept, in which case utility decrements depend only on the size of the health dimension coefficients [59]. The preferred mapping approach was selected based on model fit. A relatively higher R2 and lower Akaike Information Criterion (AIC), root mean squared error (RMSE), and mean absolute error (MAE) values indicate a better model fit and thus better prediction accuracy [58].
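As an illustration of the linear mapping in Eq. 1 and of two of the fit statistics, the sketch below fits the model with and without intercept by ordinary least squares and computes RMSE and MAE. It is a simplified sketch, not the STATA analysis used in this study; the function names are hypothetical and the numbers in the usage lines are invented solely to show the calls.

```python
from math import sqrt


def fit_without_intercept(x, y):
    """Least-squares slope for y = g1 * x (regression through the origin)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def fit_with_intercept(x, y):
    """Least-squares (g0, g1) for y = g0 + g1 * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    g1 = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / sum(
        (a - x_bar) ** 2 for a in x
    )
    return y_bar - g1 * x_bar, g1


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def mae(observed, predicted):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Invented example: latent-scale DCE disutilities and cTTO disutilities.
dce = [0.5, 1.8, 3.9, 8.0]
ctto = [0.05, 0.25, 0.50, 1.28]
g1 = fit_without_intercept(dce, ctto)
predicted = [g1 * v for v in dce]
print(g1, rmse(ctto, predicted), mae(ctto, predicted))
```

In the no-intercept case, the fitted slope plays the role of the rescaling parameter applied to the DCE coefficients.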

3 Results

3.1 Respondent Characteristics

In total, 3006 individuals agreed to participate in this valuation study, 2790 for the DCE and 216 for the cTTO. For the DCE survey, 1638 individuals were excluded (1365 respondents did not complete all DCE tasks and 273 respondents completed the DCE tasks in less than 150 s or failed the logically dominant pairs). Data from 1152 respondents were analyzed in the DCE. For the cTTO survey, five scheduled respondents did not attend the interview and two interviews were flagged as being of poor quality [in both interviews, the respondents did not receive an explanation of the worse than dead task (lead-time) in the wheelchair example; note that these interviews were still included in the final analyses]. Data from 211 respondents were analyzed in the cTTO.

Table 1 shows the sociodemographic characteristics, experience with severe illness, religious beliefs, general health, health conditions, and pain conditions. Both samples (DCE and cTTO) were similar in all characteristics, except for experiences with severe illness. Most respondents did not present any health or pain conditions at the moment of the data collection. Table 2 reports the EQ-5D-Y-3L responses for both samples. The EQ-VAS and response distribution of the EQ-5D-Y-3L were also similar in both samples.

Table 1 Characteristics of respondents
Table 2 EQ-5D-Y-3L responses for the DCE and cTTO samples

3.2 Description of the cTTO Data

Table 3 describes the mean cTTO values for the 28 health states of the EQ-5D-Y-3L. The highest mean cTTO value was 0.955 [standard deviation (SD): 0.050] for the health state “12111,” and the lowest mean cTTO value was −0.280 (SD: 0.577) for the health state “33333”, which was the only negative observed mean value. The distribution of cTTO values is presented in Fig. 1.

Table 3 Block design for composite time trade-off
Fig. 1 The density distribution of the 28 observed composite time trade-off (cTTO) values

3.3 Description of the DCE Data

Table 4 shows the results from the conditional logistic regression model and the mixed-logit model on the DCE data. In both models, all coefficients were statistically significant (p < 0.001). In both models, the coefficient magnitude in each dimension was larger for the more severe response level (i.e., level 3, indicating a lot of problems/a lot of pain or discomfort/very worried, sad, or unhappy). For the conditional logistic model, walking about (mobility) received the largest weight, followed by having pain or discomfort, looking after myself, doing usual activities, and feeling worried, sad, or unhappy. For the mixed-logit model, having pain or discomfort received the largest weight, followed by walking about (mobility), looking after myself, doing usual activities, and feeling worried, sad, or unhappy. In both models, feeling worried, sad, or unhappy at the moderate response level (i.e., level 2, indicating some problems/some pain or discomfort/a bit worried, sad, or unhappy) received the smallest weight. The best DCE model was the mixed-logit model, as it demonstrated better goodness of fit than the conditional logistic model, with a lower BIC (22613 versus 23979, respectively).

Table 4 Discrete choice experiment models

3.4 Mapping Results

Table 5 shows the results of four different mapping functions used to define the relationship between predicted DCE values and mean observed cTTO values: linear mapping with (column 1) and without (column 2) intercept, and non-linear mapping with (column 3) and without (column 4) intercept. Figure 2 shows the relationship between the mapped DCE-derived values and the observed cTTO values across all tested mapping functions. The mapping functions were estimated using only the data derived from the mixed-logit model, as it accounts for the correlation of responses within participants. The mixed-logit model also considers preference heterogeneity, providing a more precise estimate given that respondents are allowed to differ in their preferences. The linear mapping without intercept presented a higher R2 and AIC value but lower RMSE and MAE compared with the linear mapping with intercept, indicating a relatively better fit. Despite a slightly lower R2 and a higher AIC value, the linear mapping without intercept presented lower RMSE and MAE compared with the non-linear mapping approaches, suggesting a relatively better fit, especially in predicting moderate health states, as shown in Fig. 2. The linear mapping without intercept was, therefore, selected as the preferred approach to describe the relationship between the predicted DCE values and the observed cTTO disutilities (Table 5 and Fig. 2).

Table 5 Estimation results for linear and non-linear mapping using mixed logit model
Fig. 2 Scatterplots showing the relationship between the predicted DCE values and observed cTTO disutility values. A) Linear mapping with intercept; B) Linear mapping without intercept; C) Non-linear mapping with intercept; and D) Non-linear mapping without intercept. cTTO composite time trade-off, DCE discrete choice experiment, cTTOd observed cTTO disutility values

The linear mapping without intercept was selected as the preferred approach to rescale the predicted DCE values. All mixed-logit coefficients were multiplied by the rescaling parameter identified by the linear mapping without intercept, namely 0.125148. Therefore, the final model used for the value set is presented in the following algorithm:

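$$U = 1 - \big((0.9982155 \times 0.125148 \times MO2) + (1.827159 \times 0.125148 \times MO3) + (0.8202576 \times 0.125148 \times SC2) + (1.525397 \times 0.125148 \times SC3) + (0.8257411 \times 0.125148 \times UA2) + (1.484434 \times 0.125148 \times UA3) + (0.87543136 \times 0.125148 \times PD2) + (1.891593 \times 0.125148 \times PD3) + (0.4842359 \times 0.125148 \times AD2) + (1.309006 \times 0.125148 \times AD3)\big)$$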

For example, in this value set, the health state 13231 (which has decrements for SC3, UA2, and PD3) would be assigned the value U = 1 − (0.1908 + 0.1033 + 0.2367), which equals 0.4692. The final model ranged from 1 for “full health” (11111) to −0.0059 for “the worst health state” (33333), which was the only negative utility in the whole value set. Table 6 provides the utilities for all 243 health profiles of the EQ-5D-Y-3L.
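
The algorithm can be sketched in Python as follows. This is a minimal illustration, not the syntax supplied in the Supplementary Information: the names `utility` and `all_utilities` and the dictionary layout are assumptions, while the coefficients and the rescaling parameter are copied from the algorithm as printed. Each digit of a five-digit health state is assumed to follow the order MO, SC, UA, PD, AD, with level 1 carrying no decrement.

```python
from itertools import product

# Rescaling parameter from the linear mapping without intercept.
RESCALE = 0.125148

# Mixed-logit coefficients on the latent scale, as printed in the algorithm.
COEFFICIENTS = {
    "MO": {2: 0.9982155, 3: 1.827159},
    "SC": {2: 0.8202576, 3: 1.525397},
    "UA": {2: 0.8257411, 3: 1.484434},
    "PD": {2: 0.87543136, 3: 1.891593},
    "AD": {2: 0.4842359, 3: 1.309006},
}
DIMENSIONS = ("MO", "SC", "UA", "PD", "AD")  # assumed digit order of a state


def utility(state: str) -> float:
    """Return the utility of a five-digit EQ-5D-Y-3L state such as '13231'."""
    if len(state) != 5 or any(ch not in "123" for ch in state):
        raise ValueError("state must be five digits, each 1, 2, or 3")
    disutility = 0.0
    for dimension, ch in zip(DIMENSIONS, state):
        level = int(ch)
        if level > 1:  # level 1 (no problems) has no decrement
            disutility += COEFFICIENTS[dimension][level] * RESCALE
    return 1.0 - disutility


def all_utilities() -> dict:
    """Return the utilities of all 3**5 = 243 health states."""
    states = ("".join(levels) for levels in product("123", repeat=5))
    return {state: utility(state) for state in states}
```

With the full-precision coefficients, `utility("13231")` is approximately 0.469 and `utility("33333")` is approximately −0.0059; small differences in the fourth decimal place relative to values built from rounded decrements are possible.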

Table 6 Utilities for all 243 health profiles of the EQ-5D-Y-3L

4 Discussion

4.1 Main Findings

Following the International Valuation Protocol, this valuation study generated an EQ-5D-Y-3L value set for Brazil [30]. We collected the DCE tasks to generate the latent scale values and 28 cTTO health states to anchor the utilities (i.e., five health states more than the 23 recommended in the Indonesian valuation study) [30]. We investigated four different mapping approaches to estimate the Brazilian EQ-5D-Y-3L value set. The linear mapping without intercept was chosen as the preferred approach for rescaling predicted DCE values, resulting in a final model for the value set. The respondents assigned the highest weights to having pain or discomfort, followed by walking about (mobility), looking after myself, doing usual activities, and feeling worried, sad, or unhappy.

4.2 Comparison with other EQ-5D-Y-3L value set studies

Although the International Valuation Protocol did not recommend a specific modeling technique for estimating DCE values from the responses, the mixed-logit model has been the most employed method among the valuation studies for the EQ-5D-Y-3L [35,36,37, 39, 40]. Our findings are aligned with such studies, as the mixed-logit model outperformed the conditional logit model when predicting DCE values on a latent scale. The International Valuation Protocol also did not recommend a specific modeling technique for mapping DCE values onto cTTO values [30]. Similar to the analyses conducted in the Indonesian EQ-5D-Y-3L valuation study [36], we used four different mapping approaches to evaluate the relationship between DCE and cTTO data and selected the one that presented a better overall fit [33]. The main differences between our study and the Indonesian value set were that we had 28 health states for the cTTO and the best mapping approach was linear, whereas the Indonesian study considered 23 health states and the best mapping approach was non-linear [36]. The valuation studies in Hungary (ten health states) and the Netherlands (28 health states) explored two and three mapping approaches, respectively. In both studies, the ordinary least squares regression without constant was chosen as the preferred mapping model [37, 38]. The valuation studies in China and Spain adopted a hybrid model that combined DCE and cTTO data [42, 43]. Overall, these methodological choices (e.g., mapping approaches) highlight the complexity of eliciting preferences, particularly in health valuation research in pediatric populations and country-specific contexts. Further research is needed to evaluate the magnitude and impact of such methodological choices on eliciting preferences and decision-making to provide clear guidance on this matter.

We found a discrepancy between the predicted value for the worst health state (33333) of −0.006 and the mean observed value of −0.28. The value for the worst health state (33333) could not be predicted well from the values assigned to other health states; however, this may not be very problematic because this health state is rare in child and adolescent patient populations. As mild and moderate health states are more prevalent in this population, it would possibly be more relevant to predict those health states more accurately than the severe ones. On the other hand, despite being rare, severe health states are usually those that drive high costs (e.g., children and adolescents with cancer). Thus, it is relevant to validly predict preferences for those severe health states, and users should be aware of this limitation when using the Brazilian EQ-5D-Y-3L value set. In the Indonesian valuation study, the authors reported a similar issue. The preferred model was the non-linear one (without intercept), which demonstrated lower misprediction for the milder health states [36]. The findings suggested that anchoring the worst health state (33333) may generate lower minimum values than other anchoring approaches (i.e., hybrid model). This highlights the importance of (1) considering different anchoring approaches in valuation studies and (2) understanding whether this reflects the non-linearity in the preferred model or results from the choice of methods [29].

In the EQ-5D-3L Brazilian value set, the worst health state (33333) was valued at −0.176, which is lower than the lowest value of the EQ-5D-Y-3L [45]. This difference may be explained by the descriptive systems and the valuation methods used in the two studies. Our results were similar to those observed in previous valuation studies [35, 36, 39, 43], highlighting the differences in the importance of dimensions between the adult and child EQ-5D-3L versions. For instance, pain/discomfort (having pain or discomfort) was the highest weighted dimension in the child valuation, whereas mobility was the highest weighted dimension in the adult valuation. Using the QALY in HTA for children and adolescents poses several challenges compared with adults. While we acknowledge the importance of considering age-specific QALY values and the issues associated with their incorporation into models spanning the lifetime, it is essential to remember that recommendations are not solely based on cost-effectiveness analyses [29]. Other factors such as available resources, characteristics and impact of the condition, equity, and context are also taken into account, especially for the population of children and adolescents [29, 60]. Valuation studies with the EQ-5D-Y-3L are still being developed around the world, and new methodologies with different versions (e.g., age-dependent valuation) can be considered, such as the proxy version for younger children (< 8 years), the EQ-5D-Y-5L version, and/or the experimental Toddler and Infant Populations version (EQ-TIPs) [29]. Furthermore, new methods for conducting valuation studies with the EQ-5D instrument in children and adolescents should be considered, as is also the case with other similar instruments (e.g., CHU9D). In this scenario, the adolescents (rather than adults) would state their own preferences, taking into account their own perspectives, and this could potentially minimize the likelihood of having the QALYs for different age groups [29, 61]. Thus, stakeholders, decision-makers, and researchers should be cautious about how to interpret and use, in practice, the results derived from the current literature.

4.3 Strengths and Limitations

This study has some strengths. Firstly, all the cTTO interviews were conducted by experienced healthcare professionals who work in research. Furthermore, only two out of 211 interviews were flagged as poor quality (less than 1%), which supports the quality control process applied during the cTTO data collection. Secondly, we included quotas for age, sex, educational level, and socioeconomic level for both the DCE and cTTO surveys to support the representativeness of the Brazilian population in our sample. Finally, the online DCE survey included respondents from all Brazilian regions (south: 14%; southeast: 42%; north: 9%; northeast: 27%; and central west: 8%), which aligns with the latest population estimate according to the Brazilian Institute of Geography and Statistics' Demographic Census of 2022 [62]. Thus, our findings support a good geographic representation of the Brazilian population, which can be considered a great challenge as Brazil is a country with continental dimensions [63].

This study also has some limitations. Despite the good representativeness of the DCE data, attention is needed to possible low engagement and random responses in the online DCE survey. Our findings demonstrated that 58% (1638 of 2790) of respondents were excluded from the DCE analysis. Of those 1638 excluded, 1365 (83%) did not complete all DCE tasks and 273 (17%) completed the tasks in less than 150 s or failed the logically dominant pairs according to our quality criteria. Furthermore, although internet access in Brazil has increased in recent years, the DCE sample may present a selection bias due to the low levels of income, education, and digital literacy of a proportion of the Brazilian population [63, 64]. Although we carefully selected quotas for the DCE and cTTO to support representativeness, the cTTO was conducted in only three Brazilian states, representing about 31.6% of the whole Brazilian population. This choice was made for feasibility reasons, as it is complex to fully represent the five regions of a country like Brazil in the cTTO, given that the territorial extension of Brazil is 8,510,418 km² according to the Brazilian Institute of Geography and Statistics [63]. Another limitation is that the DCE and cTTO were conducted in different samples and with different modes of administration.

An important discussion has been raised around several key points regarding conducting the cTTO method in adults when eliciting preferences for an instrument designed for children and adolescents aged 8–15 years [31]. Firstly, there is a debate regarding the appropriateness of framing the task as “considering your views about a 10-year-old child”, as it may introduce potential bias or limitations when adults are asked to imagine the perspective of children and adolescents. It has been discussed that there are important differences between the health preferences for a 10-year-old child and those for a 17-year-old adolescent [65]. Secondly, the effect of alternative perspectives, such as imagining one’s own child, a known child, or simply another child, remains unknown, as these could alter values and potentially influence preferences [66]. Lastly, there is an ongoing discussion on the feasibility of including adolescents in the valuation tasks for the EQ-5D-Y-3L, as has been done in previous studies of the CHU9D, for example [33, 67,68,69]. These discussions aim to enhance the validity and reliability of preference elicitation for child and adolescent-specific instruments.

4.4 Future Studies and Implications for Practice

This study opens possibilities for future studies, such as randomized controlled trials, observational studies (e.g., cross-sectional and cohort), and methodological studies (psychometric studies) to measure HRQoL in children and adolescents using the EQ-5D-Y-3L value set in Brazil. As with the EQ-5D-3L and EQ-5D-5L, future research should be conducted with the EQ-5D-Y-3L to validate it in youth populations with specific health conditions (e.g., obesity, asthma, cancer, musculoskeletal, and mental conditions) in Brazil. Our findings raise the opportunity for researchers to use the Brazilian EQ-5D-Y-3L value set to estimate QALYs in economic evaluations (i.e., cost-utility analysis) that inform the incorporation of health technologies for the youth population in Brazil. The EQ-5D-Y-3L value set can be used in clinical studies, psychometric studies, and population-based studies as well as by researchers, clinicians, and policy/decision makers interested in valuing health-related quality of life using utilities [70]. We provide the descriptions and syntax for estimating utilities using R, STATA, SPSS, and Microsoft Excel in the Supplementary Information 2. The use of the Brazilian EQ-5D-Y-3L value set should be recommended in the updated version of the Brazilian Health Technology Assessment guideline [16]. Thus, in the same way that the EQ-5D-3L has been used as a preferred decision-making instrument for the incorporation of new health technologies in Brazil [16], we believe the EQ-5D-Y-3L value set can be widely used in the decision-making process for the Brazilian population of children and adolescents.

5 Conclusion

To conclude, this study provided a national value set of the EQ-5D-Y-3L for Brazilian children and adolescents based on the preferences of the adult population considering the viewpoint of a 10-year-old child. The utilities ranged from 1 for “full health” to −0.0059 for “the worst health state.” The dimension with the largest weight was having pain or discomfort, followed by walking about (mobility). The elicited Brazilian EQ-5D-Y-3L value set can be used to estimate the quality-adjusted life-years for economic evaluations of health technologies targeting the Brazilian youth population.