A Norwegian 15D value algorithm: proposing a new procedure to estimate 15D value algorithms

Purpose So far there is no Norwegian value algorithm to inform healthcare decision making. The 15D health state values estimated with the original 15D valuation procedure tend to be higher than the values of other generic preference-based health-related quality of life (HRQoL) instruments. The main purpose of this study was to use a new 15D valuation procedure to estimate Norwegian 15D health state values and to explore their empirical performance. Methods The visual analogue scale was used to collect 15D valuation data in a representative sample of the Norwegian general population. The new procedure used fewer valuation tasks and anchored the 15D health state values in an empirically assessed range. The Norwegian 15D health state values were compared to the values of five HRQoL instruments which were provided by Norwegian residents belonging to seven disease groups and a healthy population. Results The Norwegian 15D health state values ranged from 1 to − 0.52. Compared to 15D health state values estimated with the original procedure, the Norwegian 15D health state values were lower and more in line with values of other HRQoL instruments. Conclusions The new 15D valuation procedure is simpler, links the 15D health state values better to the requirements of the QALY model, and provides an empirically-based range. We recommend using the new valuation procedure in future 15D valuation studies, and the Norwegian health state values for use in 15D-based health economic analyses in Norway.


Introduction
The 15D is a generic preference-based health-related quality of life (HRQoL) instrument that can be used to provide preference-weights for quality-adjusted life-year (QALY) calculations when paired with valuation data [1,2]. It is used internationally [1] and was recently a part of a large multi-instrument comparison study (MIC) [3,4]. The 15D assesses HRQoL via a descriptive system with 15 dimensions, covering physical, mental, and social aspects of health [2,5]. Originally Finnish, the 15D has been translated into 30 languages, including Norwegian [1,6]. 15D valuation studies have been conducted in Finland [7] and Denmark [8]. These studies used the same study design involving three valuation tasks, based on the visual analogue scale (VAS), and were carried out through postal administration [7,8]. In this study, we addressed a selection of problems known to be related to the original 15D valuation system. We reviewed the original 15D valuation tasks with a focus on how the information they provide is combined to estimate 15D algorithm values. An earlier study suggested that the way information from the three original 15D valuation tasks is combined increases the likelihood of larger error terms which makes the link between the VAS tasks and the health state values less transparent [9]. We propose a new 15D value algorithm estimation procedure that uses information from only one of the original valuation tasks [9] and anchors the worst possible health state in an empirically assessed value.
There is no value algorithm (or tariff, or value set) based on preferences of the Norwegian general adult population for any generic preference-based HRQoL instrument. It is the main aim of this study to use a new 15D valuation procedure to estimate a Norwegian 15D value algorithm. Further, we compared the resulting Norwegian 15D health state values with corresponding values of other generic preference-based instruments collected in seven disease groups and a healthy sample.

The 15D instrument
The 15D health state space comprises 15 dimensions, each with five levels of function [2,5]. As such, a 15D health state can be represented as a vector = l 1 , … , l 15 , where each l j specifies the level of function on dimension j . The 15D has a large number of dimensions and levels and its full health state space contains 5 15 health states. It is cognitively challenging to value 15D health states, consisting of 15 attributes and unfeasible to directly evaluate a representative set of 15D health states [7]. Instead, each level of function of the 15D descriptive system was valued separately with VASbased valuation tasks and simplifying assumptions from multi-attribute utility theory were used to derive 15D health state values in the original 15D algorithm estimation procedure [7,10].
A value algorithm estimation procedure details the steps needed in order to convert valuation data from valuation tasks into a value algorithm (Fig. 1). A 15D value algorithm consists of a look-up table with algorithm values which allow estimating a value to each 15D health state (Fig. 1). As the 15D health state space is considerably large, it is common to present a look-up table which allows to estimate 15D health state values by adding up 15 algorithm values (Fig. 1). We thus describe a procedure that converts valuation data, i.e., the participants' responses to the VAS-based valuation tasks, into a value algorithm that yields health state values V H ( ) for all 15D health states .

The original 15D valuation tasks and the original value algorithm estimation procedure
The original 15D valuation procedure consisted of three VAS-based valuation tasks: the top task, the bottom task, and the within dimension tasks (Fig. 1). The top and the bottom tasks each used a single VAS on which respondents were asked to evaluate the best and the worst levels of all 15 dimensions. These tasks were intended to provide information about how important each dimension was perceived. In the 15 within dimension tasks, the respondents were asked to assign a VAS-score to each level of impaired function, referred to as L2 through L5, and "being dead" for each dimension separately, while L1 was fixed at 100 (Appendix 1). In the original procedure, the resulting level scores were multiplied with the importance weight for each level, which was "extrapolated linearly" [7] based on information from the top and bottom tasks [2]. A step-by-step description of the original procedure can be found in Appendix 3 in Michel et al. [9].

The new 15D valuation tasks and the new value algorithm estimation procedure
The new 15D algorithm estimation procedure used information of the within dimension tasks and a pits-task (Fig. 1). While the within dimension tasks were identical with the tasks used in the original valuation procedure, the pits-task was only part of the new valuation procedure. Of the original three tasks, only information from the within dimension tasks was retained. Thereby problems from combining information from different valuation tasks were avoided, and the link between the VAS-scores and the resulting health state values has become more transparent [9]. The within dimension tasks provided relative VAS-scores for all levels. The pits-task provided an average score for the worst possible 15D health state, L5 on all 15 dimensions, that has been directly scored on a VAS together with "being dead" (Appendix 2). Besides the pits-task, no interactions between the 15D levels were assessed.
The QALY model assumes that preferences for health states can be expressed on a ratio-scale, where zero corresponds to "being dead" [11], where "being dead" is a proxy for health states without an intrinsic (QALY-) value, since "being dead" suspends time [12][13][14]. Using the pits-score's relation to "being dead" and "perfect health" to determine the range of the 15D health state values provided a plausible link to the ratio-scale used in the QALY model.
The new value algorithm estimation procedure was performed on averaged within dimension VAS-scores that were weighted to match the general Norwegian population. Sixteen task-specific sets of weights have been estimated (one for each of the fifteen within dimension tasks, and one for the pits-task). Post-stratification weights have been applied (see Appendix 3 for information on which variables were used for post-stratification and a detailed description of the weighting procedure).
Health state values represent disutility values via the identity v = 1 − u . The new value algorithm estimation procedure yields a 15D health state value V H ( ) for a 15D health state = l 1 ⋯ l 15 using the following functional form: • Step 1: For each dimension j and level i (L2-L5 and "being dead"), compute the weighted means s j i over the respondents' raw VAS-scores from the within dimension tasks (  ∶T j,i def =̂ ⋅Ŝ j,i . This normalizing constant ensures that the final range of health state utilities is bounded by 1 for "perfect health" and 1 −V Pits for the pits-state. A 15D health state value V H ( ) is a simple sum of fifteen of the disutility values presented in Table 1.
We estimated a Norwegian 15D value algorithm using this new procedure. We present the range and the dimensions with the largest and the smallest disutility values. The original 15D algorithm estimation procedure was applied to the Norwegian valuation data in an earlier study [9] for comparison, and the resulting values were not recommended for use in healthcare decision making. We refer to the values resulting from the previous comparison as original 15D

The Norwegian valuation studies
The main Norwegian 15D valuation study was conducted in 2010 through the marked research firm TNS Gallup. The survey was self-completed and consisted of demographic variables, the 15D descriptive system (self-reported health), and the original 15D valuation tasks. Parallel Web and postal surveys were conducted. The postal survey was sent to a random sample of about 5000 postal addresses from the Norwegian National Population Registry. Participants received a prepaid response envelope. The Web sample was recruited by sending mails to 1936 individuals pre-registered in an online panel maintained by TNS Gallup (TNS-Gallup Panel). Several waves of emails were sent out to reach a responding sample of 1000+ individuals that resembled the Norwegian general population in terms of age, gender, educational level, and geographic distribution. Due to technical limitations in the software used for the Web data collection, the task presentation differed between the Web and postal sample (see Appendix 1 for details). For details about how the different survey versions were randomized and for the exclusion criteria applied in this study, see [9]. Sensitivity analyses for testing the influence of case exclusion on within dimension task scores and 15D algorithm values were performed and did not indicate any differences between the unselected and the selected sample [15]. An additional face-to-face data collection was conducted in 2015-2016 in order to directly assess the value associated with the worst possible 15D health state. We asked 120 members of the Norwegian general population to provide information on age and gender and to fill in the 15D descriptive system. Further, the participants performed two within dimension tasks to get familiar with the 15D valuation tasks. Finally, the pits-task was presented in which participants were instructed to score the worst possible 15D health state and "being dead" on one VAS, ranging from 0 to 100, anchored in the best and worst imaginable health state (Appendix 2). Two participants had to be excluded due to missing data in this task. Sintonen performed a similar task in the original Finnish 15D valuation study but did not use the value of the worst possible 15D health state when finally estimating the Finnish 15D value algorithm [7].

Empirical performance of the Norwegian 15D health state values
To demonstrate the properties of the Norwegian 15D health state values, we used self-reported health in the Norwegian MIC data set (N = 1177) [4,16]. Details about the data Table 1 The Norwegian 15D algorithm disutility values The Norwegian 15D algorithm values are presented as disutility values, which can be interpreted as the value loss that is associated with the respective level of function. Level  collection were described elsewhere [4,16]. Respondents filled in the descriptive systems of AQoL-8D [17], EQ-5D-5L [18], HUI 3 [19], and SF-36v2 to derive SF-6D [20], and the 15D [2]. We visually compared the mean health state values of the different instruments in a healthy sample and in seven different disease groups, all collected in Norway. The existing value sets were used to estimate health state values per instrument for a healthy sample and seven disease groups: asthma, cancer, depression and anxiety, diabetes, hearing disability, arthritis, and heart disease. Note that except for the Norwegian 15D value algorithms, none of the other instruments had a value set that is based on preferences of the Norwegian general population. We applied both the Norwegian 15D value algorithm estimated in this study and the one estimated earlier [9] to self-reported health, using the 15D descriptive system, in the Norwegian MIC data set. Table 2 provides an overview of the characteristics of the Norwegian general population in 2010 and the unweighted Norwegian valuation study sample. Responses were weighted using post-stratification to better reflect the makeup of the Norwegian adult general population [21].

Descriptive statistics
Weights were calculated separately for each of the 15 within dimension tasks, and for the pits task. Respondents were categorized by age, gender, and education, and weights were calculated by dividing the proportion of the general population in each category by the corresponding proportion of the respondent sample (see Appendix 3 for details).
Of 1936 individuals, 1003 finished the Norwegian 15D valuation survey in the Web sample (response rate 52%), while 1276 out of 4899 contacted individuals in the postal sample returned completed surveys (response rate 26%). An overview of the excluded participants can be found in Appendix 5. Participants were aged between 19 and 101 years (mean = 51.6, SD = 16.5), 52% were female, and the most common educational degree was a high school degree (43%). The sample largely matched the underlying population's characteristics, although responders of the postal survey tend to be older and better educated. Of 118 individuals who provided complete data in the pits-task, 52 participants assigned a higher ("better") score to "being dead" than to the worst possible 15D health state. While 48 participants chose the same score for "being dead" and the worst possible 15D health state, namely zero.

The Norwegian 15D value algorithm
In terms of disutility values, the Norwegian 15D health state values ranged from 0 to 1.52, being anchored in an empirical estimate of the worst possible 15D health state.
A Norwegian 15D health state value can be computed with the following formula: The values for Ŝ j,i can be found in Table 7 in Appendix 4 and the values for T j,i are shown in Table 1 (see also Appendix 4).
The 15D health state values resulting from using the new procedure are lower than the original 15D values (Figs. 2,  3). The dimensions with the largest L5 disutility values were mobility (0.1083), discomfort (0.1063), and eating (0.1061), whereas hearing (0.0959), speech (0.0960), and sleep (0.0968) had the smallest disutility values on the worst level of function (Table 1; Fig. 4).

Empirical performance of the Norwegian 15D health state values
In accordance with all other instruments, the largest mean health state disutility value using the Norwegian 15D value algorithm was found for the disease group Depression and Anxiety (Fig. 3). Compared to the original 15D values, the new Norwegian values are in general more in line with corresponding values of other instruments in seven disease groups and a healthy sample (Fig. 3). The same is true when assessing the ranges of the value algorithms (Fig. 2).

Summary of results
We estimated a Norwegian 15D value algorithm using a new value algorithm estimation procedure. Compared to the 15D health state values that were calculated with the original valuation procedure, the Norwegian 15D health state values were lower and closer to corresponding values of other generic preference-based instruments. With this study, we aimed to amend the available information on the 15D valuation procedure by increasing transparency and replicability of the valuation procedure used to estimate 15D health state values. We aimed to facilitate future research that empirically compares health state values resulting from different instruments, as well as enabling instrument users and decision makers to make more informed choices between generic preference-based instruments.

The new 15D value algorithm estimation procedure
The new 15D value algorithm estimation procedure provides a more transparent link between the valuation tasks and the resulting 15D health state values. Using fewer valuation tasks reduces the burden and costs related to data collection. Anchoring the 15D algorithm values in an empirically assessed value for the worst possible 15D health state has several advantages. First, this is the most direct way to determine the range of health state values for the instrument. The range is a crucial feature of an instrument. Anchoring the range in an empirical value for the worst possible health state increases the range's validity compared to the original value range which was based on extrapolation from single dimensions. Second, as the value for the worst possible health state is assessed in relation to "being dead," the resulting health state values fulfill the requirement of the QALY model better than the original 15D health state values as they are on a ratio-scale with a clearly defined zero. Third, the value of the worst possible health state is the only 15D health state that potentially captures interactions between the levels. This is a valuable amendment to the within dimension tasks scores, which were assessed for each dimension separately. The empirical value for the worst possible 15D health state assessed in the Norwegian general population is comparable to a similar Finnish value (− .334) [7]. However, this estimate was not used in the Finnish valuation study. The anchoring approach chosen in the new 15D valuation procedure is not without alternatives. Although it possibly is very challenging to value the worst possible 15D health state, having all dimensions at the same worst level of function might still be easier to conceptualize than a health state including different levels of function. Given that it seems unfeasible to value 15D health states that contain different levels of function, we argue that the estimate of the worst possible 15D health state is the most suitable score for anchoring, given the 15D specific constraints due to its large number of health states. One alternative approach would have been to anchor the values in the sum of all L5 disutility values. We did not use this approach as it would have resulted in an unacceptable health state value-range of 0-14.4 in terms of disutility values.

Empirical performance of the Norwegian 15D health state values
The Norwegian 15D health state values are lower than the original 15D values due to a lower value that was assigned to the worst possible 15D health state. As a consequence, Norwegian 15D health state values are more in line with the values of other generic preference-based instruments (Fig. 2). This will reduce the chance for drawing different conclusions about the cost-effectiveness of health interventions due to using different instruments to assess HRQoL

Remaining challenges and limitations
The original 15D valuation system has been criticized on several grounds. A widely hold criticism is that VAS is used as a stand-alone valuation method [23][24][25][26]. It has been questioned if VAS can assess respondent's preferences as the resulting valuations are not the result of a trade-off [27,28]. A potential concern with the new 15D valuation procedure is that it relies on trade-offs between the levels of different dimensions compared to death to determine the relative weight of dimensions, rather than direct trade-offs between dimensions. The primary strength of this method over direct trade-offs between dimensions is that it reduces the cognitive burden of the tasks, allowing respondents to focus on the impact of each specific level of each dimension compared to death. However, the absence of direct trade-offs could attenuate differences between dimensions. There are also a number of biases related to VAS that might influence health state valuation [23,29]. However, the VAS has been defended as a preference elicitation method for eliciting health state values under certainty [30] and valuation methods that are similar to those of the 15D have been recently applied in a new method for valuing health [31].
Another aspect that has been criticized is that an additive model is used to estimate 15D health state values. The additive model has been chosen by the 15D instrument developer as other models have not been considered to be feasible due to the number of potential interactions [7]. Using an additive model means to assume that all dimensions are structurally independent. To our knowledge, the structural independence of the 15D dimensions has not been tested. While this study identified and addressed a selection of methodological shortcomings of the 15D valuation system, others remain to be addressed in future research. Although we were aware of VAS' shortcomings, we kept using VAS as a valuation method in the new 15D valuation procedure. The use of other valuation methods, or allowing trade-offs between the dimensions, is complicated by the large descriptive system of the 15D. The advantage of keeping the same valuation method as in the original valuation procedure is that already existing 15D value algorithms can be converted into health state values that are more in line with requirements of the QALY model. This can be done by assessing an empirical value for the worst possible 15D health state and anchoring the already existing 15D algorithm values in this pits-estimate. In these kind of studies, the estimate of the worst possible 15D health state could also be assessed with other valuation methods than VAS. However, it has to be kept in mind that valuing a health state with 15 attributes is challenging, especially when assessing worse than death values [32]. Furthermore, we kept using the additive model in the new 15D procedure. It is unsatisfying that no explicit tests have been performed to empirically support this model choice. However, testing the structural independence of the 15 domains and assessing the interactions between them are extensive tasks which were beyond the scope of this study. Additionally, we recommend that future studies use identical task presentations of the within dimension tasks in the Web and the postal sample. Finally, it is worth noticing that the disease classification in the MIC data set was based on self-reports rather than on medical diagnosis. This may limit the clinical accuracy of these groups. For the purpose of comparing different instruments, however, this is not a central concern.

Conclusions
The Norwegian 15D value algorithm is the result of applying a new 15D valuation procedure that uses fewer valuation tasks and is anchored in an empirically assessed value for the worst possible 15D health state. The Norwegian 15D health state values are more in line with the requirements of the QALY model and are more comparable to the values of other HRQoL instruments in seven disease groups and a healthy sample. This study presents the first Norwegian value set for a generic preference-based instrument. We recommend using the new 15D valuation procedure when estimating future 15D value algorithms in general, and the Norwegian 15D health state values specifically for 15D-based health economic analyses in Norway.
In the postal sample, a vertical VAS was used, with L1 being fixed to 100 (see "Visual analogue scale used for the within dimension tasks in the postal sample"). In the Web sample, a horizontal VAS was used, and the indicators for L1 were not fixed (see "Vertical visual analogue scale used for the within dimension tasks in the Web sample"). In order to improve comparability with the postal sample, where L1 was fixed to 100, we rescaled the cases of the Web sample to 100 if the VAS-score assigned to L1 was less than 90. No formal education Appendix 2: Instruction and task presentation of the pits-task "In the following task, two health states are described. We want you to compare these health states with each other. First, read the entire description of both health states. Then, we kindly ask you to first assign a number to the health state that you consider to be worst and then to the remaining health state. You can assign all numbers between 0 and 100 as you consider it to be most suitable."

Appendix 3: Post-stratification weighting
The aim of applying weights to the valuation data was to make the valuation sample representative of the Norwegian general population at the time of data collection. We applied a weighting method referred to as "post-stratification weighting" in the literature [21]. The participants of the Norwegian 15D valuation study were randomized to perform different subsets 15D valuation tasks (for details see [9]). As also case exclusions were performed separately by task (see Appendix 5 for details), the precise makeup of respondents varied by task. Therefore we estimated 16 task-specific sets of weights: one for each of the fifteen within dimension tasks, and one for the pits-task.
The following describes the technical details of how we estimated the weights for post-stratification. First, we assigned each member of the Norwegian general population to categories that were defined by different combinations of age, gender (and education) (see Tables 3, 4 in Appendix 3). The respondents of the Norwegian valuation studies have been assigned to the same categories. The weight for the respondents of one category is defined by the proportion of the category in the population divided by the proportion of the same category in the sample (see Table 5 in Appendix 3).
For additional clarity, a specific example of how the weight of one category was estimated follows. In this example we estimated weights for the within dimension task for the mobility dimension. Consider the category included all respondents who were female, aged 30-39, with education equivalent to BA level. This group was represented by 122,516 individuals in the Norwegian general population, and by 67 persons in the respondent sample that performed the within mobility task. The mobility-specific weight for this category is estimated by dividing the proportion in the population by the corresponding proportion in the mobilityspecific subsample, i.e.: The population proportions were retrieved from Statistics Norway (http://www.ssb.no). We used the population makeup from the year of study, i.e. 2010 for the within dimension task weights, and 2016 for the pits-state weights. The variable coding used to define the stratification categories can be found in Tables 3 and 4 in Appendix 3. Table 5 in Appendix 3 provides weights for the pooled sample of the Norwegian valuation study. We also included the proportions of each category in the population and the pooled Norwegian 15D valuation sample respectively. Each cell of Table 5 in Appendix 3 contains the following information: Proportion of the respective category in the Norwegian general population (in % ) Proportion of the respective category in the pooled Norwegian valuation sample (in %) = Pooled sample weight of respective category   Table 6 Ŝ j,i can be estimated. j corresponds to dimensions and can be found in the rows of Table 6, while i corresponds to the levels which can by found in the columns of Table 6. For example, the weighted and averaged VAS scores for level four of the hearing dimension ( Ŝ 1,4 ) can be found in row 1, column 4 of Table 6 Within dimension task scores  Table 7 Rescaled within dimension disutility scores With the scores in Table 7 T j,i can be estimated. j corresponds to dimensions and can be found in the rows of Table 7, while i corresponds to the levels with can by found in the columns of Table 7. For example, the rescaled dimension disutility score for level four of the hearing dimension ( T 1,4 ) can be found in row 1, column 4 of