Introduction

Economic evaluations typically compare interventions in terms of their costs and benefits, in order to assess their value for money (Drummond et al., 2005). Ultimately, such evaluations can inform optimal allocations of scarce resources within and across different sectors. Within the healthcare sector, economic evaluations often take the form of a cost-utility analysis, in which benefits are expressed in terms of Quality-Adjusted Life-Years (QALYs), a health utility index encompassing both length and quality of life. Health utilities are expressed on a scale anchored on the health state “dead” with a value of 0 and the health state “perfect health” with a value of 1. Being in perfect health for one year represents 1 QALY. An intervention that provides a patient with a quality-of-life improvement of 0.1 QALY per year for 10 years produces a benefit of 1 QALY (abstracting from discounting). The quality-of-life component is typically confined to health-related quality of life (HRQoL), measured by generic instruments such as the EuroQol instrument (EQ-5D) (EuroQol Group, 1990), the Short-Form Health Survey (SF-36) (Ware & Sherbourne, 1992) or the Health Utilities Index (HUI) (Furlong et al., 2001). This closely conforms to the notion that healthcare decision makers would be especially (or even exclusively) interested in producing health (measured as HRQoL) from the available healthcare budget.
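The QALY arithmetic in the paragraph above can be made explicit with a short sketch. It is illustrative only: the function name `qaly_gain`, the convention of leaving the first year undiscounted, and the 3% discount rate in the second call are assumptions, not values taken from this paper.

```python
def qaly_gain(utility_gain_per_year, years, discount_rate=0.0):
    """Sum yearly utility gains, optionally discounted.

    Year t (t = 0, 1, ..., years - 1) is weighted by 1 / (1 + r)**t,
    so with discount_rate = 0 the result is simply gain x years.
    """
    return sum(
        utility_gain_per_year / (1.0 + discount_rate) ** t
        for t in range(years)
    )


# The example from the text: 0.1 per year for 10 years, no discounting.
undiscounted = qaly_gain(0.1, 10)

# Same stream of gains with a hypothetical 3% annual discount rate.
discounted = qaly_gain(0.1, 10, discount_rate=0.03)

print(round(undiscounted, 6))      # 1.0
print(discounted < undiscounted)   # True
```

With a positive discount rate, later gains count for less, so the same intervention yields somewhat less than 1 QALY.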

In recent years, however, it has been asserted that the evaluative space commonly adopted in economic evaluations might be too narrow. Health (and social) care interventions may not always aim to improve (only or primarily) health, but (also) broader elements of quality-of-life, or well-being (Coast, 2004; Payne et al., 2013; Weatherly et al., 2009). The fact that such interventions are commonly funded from healthcare budgets may signal that healthcare decision makers thus also consider broader benefits from interventions than only health. Areas in which this seems especially relevant include social care, palliative care, long-term care and elderly care, but prevention and cure may also have effects above and beyond health (Coast, 2014; Hackert et al., 2021; Makai et al., 2014; Milte et al., 2014). In such cases, an adequate comparison of costs and benefits of interventions and a fair assessment of their value for money requires that the instruments used to measure the benefits from interventions capture all the relevant outcomes. Failing to do so may lead to a misrepresentation of the societal value of interventions, suboptimal decision making and, ultimately, to misallocation of scarce public resources.

Over the past years, the recognition that the evaluative space of economic evaluations in healthcare needs to be broadened has stimulated the development of instruments to measure well-being. Although many such instruments exist across disciplines, only a few have been developed for use in the context of (health) economic evaluations and fulfil the necessary criteria for this purpose (Makai et al., 2014). Most importantly, several multi-dimensional measures of well-being lack preference-based weights to summarize the dimension scores into a single utility index. Prominent examples of measures with such preference-based utility weights, which are hence suitable for use in economic evaluations, include the ICEpop CAPability (ICECAP) measures (Al-Janabi et al., 2012; Coast et al., 2008; Grewal et al., 2006), the Adult Social Care Outcome Toolkit (ASCOT) (Netten et al., 2012), and the Well-being of Older People (WOOP) measure (Hackert et al., 2021). These instruments, however, differ in their conceptualisation and operationalisation of well-being. While all these measures are (or seem to be) grounded in capability theory, which focuses on people’s functionings and their capabilities (Binder, 2014; Robeyns, 2005, 2006; Sen, 1985), the ASCOT and the WOOP seem to focus more on the functionings of people in different life domains (i.e., what they are and do), while the ICECAP measures focus on capabilities (i.e., the freedoms or opportunities they have to be and do their potential functionings). In addition, these instruments differ in scope and target population, that is, whether they measure well-being generically or in specific subgroups or contexts (e.g., ICECAP-O and WOOP for older people, ASCOT for health and social care users). They also differ in whether they measure well-being partially or comprehensively (e.g., ICECAP does not measure health directly, but also does not seem to capture all elements of health indirectly (Hackert et al., 2017)).
Finally, only for the WOOP have the utility weights been anchored on a scale from dead (0) to perfect well-being (1), which facilitates combining length and quality of life effects in computing the well-being benefits of an intervention for use in cost-utility analyses (Himmler et al., 2022). Recently, the EuroQol Group introduced the EuroQol Health and Well-being (EQ-HWB) instrument (Brazier et al., 2022), which, in terms of included domains and scope, appears to be primarily based on existing quality of life measures aimed at users of health and social care services and carers (Carlton et al., 2022) and also focuses on measuring functionings.

Taken together, one may argue that a generic measure of well-being for use in economic evaluations that captures all relevant domains of well-being in the adult population comprehensively is not yet available. In addition, there seems to be scope for a measure with an alternative conceptual framework. As mentioned above, the currently most used measures are (or seem to be) grounded in capability theory (Robeyns, 2005, 2006; Sen, 1985). Although the capability approach has been praised for offering a broad conceptual framework to the assessment of quality of life, it does not specify a (core) set of capabilities or functionings that have value to people and could be used for the operationalisation of the approach in practice. Moreover, the approach is unclear about how any chosen set of capabilities or functionings can be valued, aggregated and traded off for assessments of overall well-being (Binder, 2014; Hasan, 2019; van der Deijl et al., 2023). Subjective well-being (defined here as enduring satisfaction with life-as-a-whole, or happiness (Veenhoven, 2012)) offers an alternative framework in which assessments of well-being are grounded in people’s individual judgements of their life situation, which include (but are not confined to) their capabilities and functionings. Important critiques of subjective well-being as a framework include the multitude of measures, the lack of evidence about the validity and reliability of these measures for policy evaluation (Hausman, 2015) as well as the role of adaptation in well-being measurement (Frederick & Loewenstein, 1999; Stöckel et al., 2023). Commonly used single-item measures of subjective well-being preserve the sovereignty of individuals to incorporate whatever they find important in their assessment of their overall quality of life, which at the same time means that it remains unclear what was incorporated in the assessment.
Multi-item measures of subjective well-being aim for a common measurement of the concept across people but are susceptible to criticism of paternalism, as the domains included in the instrument eventually determine what is measured and valued (van der Deijl, 2017).

The aim of this paper is to describe the development of a well-being instrument that measures individual welfare in terms of people’s own assessment of their life-as-a-whole (Veenhoven, 2012). This new instrument, the 10-item Well-being instrument (abbreviated as WiX, with ‘W’ referring to ‘well-being’, ‘i’ to ‘instrument’, and ‘X’ to the 10 domains of well-being that the instrument covers), thus aims to capture overall (or general) quality-of-life in terms of subjective well-being by measuring how satisfied people are in a number of important domains of life. To relate the conceptualisation of the WiX to the capability approach, satisfaction with life can be seen as fulfilment (with the achievement) of a set of happiness-relevant functionings (Binder, 2014; Sen, 1985) (see Footnote 1). The development of the WiX was theory driven, building systematically on several theories of well-being and explorative empirical work (van der Deijl et al., 2023) as well as existing generic instruments to ensure its comprehensiveness. In addition, the conceptualisation of the WiX as a multi-item instrument follows the notion of pragmatic subjectivism (Haybron & Tiberius, 2015), which states that confining measurement to a limited set of indicators is legitimate for policy evaluation if the indicators are based on personal well-being values, that is, what citizens consider important for their well-being (see also Alexandrova, 2016; van Exel, 2017). Given its broad evaluative space, the WiX is intended to be suitable for use in (economic) evaluations of interventions not only in health and social care, but also in other sectors and across sectors. This also allows intersectoral comparisons. With its focus on measuring satisfaction with life, the WiX is complementary to existing measures of well-being for use in economic evaluations.

The remainder of this paper describes the development of the WiX, a multi-dimensional instrument that aims to measure well-being in the adult general population comprehensively for use in (economic) evaluation studies, and reports the findings from a content validation study.

Methods & Data

Development of the Instrument

Figure 1 provides an overview of all steps taken to develop the draft version of the WiX.

Fig. 1 Overview of 12 steps in development of a draft version of the WiX

As the first step in developing the new instrument, one of the authors (A1; blinded for review) conducted a scoping review (Grant & Booth, 2009) to identify existing instruments that aim to measure well-being. A total of 16 instruments were identified (in alphabetical order): Control, Autonomy, Self-Realization and Pleasure-19 (CASP-19) (Hyde et al., 2003); Extending the Quality-Adjusted Life Year (E-QALY) (Mukuria et al., 2018; see Footnote 2); ICEpop CAPability measure for Older people/ Adults (ICECAP-O/-A) (Grewal et al., 2006)/(Al-Janabi et al., 2012); Living Standards Framework (LSF) (Treasury, 2018); Office for National Statistics (ONS) four subjective wellbeing questions (ONS-4) (ONS, 2017); Personal Well-being Index Scale (PWI) (Group, 2013); Psychological Well-being Scale (PWS) (Ryff, 1989); Quality of Life – Aged Care Consumers (QOL-ACC) (Ratcliffe et al., 2019); Quality of Life at the End of Life (QUAL-E) (Steinhauser et al., 2002); Quality of Well-being Scale self-administered (QWB-SA) (Sieber et al., 2008); Self-Evaluated Quality of Life Questionnaire (SEQOL) (Ventegodt et al., 2003); Social Production Function Instrument for the Level of well-being (SPF-IL) (Nieboer et al., 2005); The Adult Social Care Outcomes Toolkit (ASCOT) (Netten et al., 2012); The World Health Organization Quality of Life (WHOQOL) (Group, 1997); Well-being Adjusted Life Years (WALY) (Birkjær et al., 2020); Well-being of Older People (WOOP) (Hackert et al., 2020).

In the second step, four authors (A1, A3, A4, A5) jointly assessed these 16 instruments and selected those with an aim similar to that of the planned new instrument, namely: (1) multi-domain (or multi-attribute) instruments; (2) measuring (fulfilment with) functionings or satisfaction with life; and (3) focused on measuring well-being in the adult general population. In the end, 8 of the 16 instruments were retained for further analysis: E-QALY, ONS-4, PWI, PWS, QWB-SA, SPF-IL, WALY and WHOQOL. For example, the WOOP was not selected because it was specifically developed for the older population, and the ICECAP-A was not selected because it is focused on measuring capability well-being.

In the third step, one author (A1) assessed these eight instruments more thoroughly by identifying the key publications underlying each instrument and creating an overview of their aim, approach to development, domain (or item) and level structure, instructions for users, available languages, and sources of funding for the development of the instrument. This information was summarized in a large table (not reported here because of its size). Next, in the fourth step, two authors (A1, A5) cross-tabulated the domains and items of these eight instruments against the domains of a theoretical framework outlined by van der Deijl et al. (2023) that synthesized the main existing theories of well-being. This framework distinguished 11 domains of well-being, namely: physical health; safety; recreation and leisure; mental well-being; political representation; mental development; environmental conditions; social relations; material well-being; labour conditions; and achievements (see Table S1 in online Supplementary Information 1). This showed that the domains of the eight selected instruments covered 10 of the 11 domains of the theoretical framework, the exception being political representation. In addition, some domains of the eight instruments could not be matched unambiguously to any of the 11 domains of the theoretical framework and were provisionally categorized as ‘other’ (in an additional row of Table S1).

In the fifth step, one of the authors (A1) created a table with an overview of all domains (or items) of the selected instruments and their descriptions per remaining domain of the theoretical framework (i.e., excluding political representation) (table not reported because of its size). In the sixth step, two authors (A1, A5) reviewed this table and synthesized the identified domains and items from the selected instruments by merging domains and items with similar meaning and harmonizing the wording of the resulting items and their descriptions (see Table S2 in online Supplementary Information 2, columns 1–3). Based on this, an initial version of the domain structure of the new instrument was drafted, consisting of the ten domains from the theoretical framework, the synthesis of items and their descriptions for each domain, and draft names and descriptions for the 10 items of the new instrument (see Table S2 in online Supplementary Information 2, columns 4–5).

In the seventh step, two other authors (A3, A4) independently reviewed the approach, decisions, and results of steps three to six. Their feedback was discussed and implemented in a joint meeting with the whole research team. We agreed that the domain political representation would be excluded from the new instrument, because it was not represented in any of the selected instruments and was also not considered an important constituent of well-being among the adult population of the Netherlands (van der Deijl et al., 2023). This decision was, however, flagged as an item to be verified with experts and members of the public in the content validation phase. In addition, we decided that none of the domains or items of the eight selected instruments categorized as ‘other’ in the fourth step (see bottom row of Table S1 in online Supplementary Information 1) needed to be included in the new instrument in addition to the already distinguished ten domains. Appendix 1 provides more details on these changes. Finally, several changes were made to the wording of the names and descriptions of the domains, and draft names were created for the ten items of the new instrument, hence each item corresponds to one specific domain (see Table S2 in online Supplementary Information 2, columns 4–6; see Appendix 1).

In the eighth step, to evaluate the comprehensiveness of the draft instrument, two authors (A1, A5) compared the ten domains and the corresponding items of the WiX to the findings of a study into what constitutes a good life for adults in the Netherlands, which identified five views on well-being, namely: Health and feeling well; Hearth and home; Freedom and autonomy; Social relations and purpose; and Individualism and independence (van der Deijl et al., 2023) (see Footnote 3). By inspecting the characterizing and distinguishing aspects for these five views, we determined that no aspects that adult citizens in the Netherlands identified as important for their well-being were missing from the draft instrument. Therefore, the original selection of ten domains and their corresponding items was retained for further development of a draft version of the WiX. In the ninth step, two authors (A1, A5) formulated draft descriptions for the ten items of the WiX, based on the draft descriptions of the domains (see Table S2 in online Supplementary Information 2, column 7). In the tenth step, the same two authors drafted item levels for the ten items of the WiX, taking the number and wording of levels of available instruments (collected in step 5) as a starting point. Accordingly, each item is accompanied by a description conveying its meaning to respondents, and five response levels measuring the level of satisfaction of the respondent on that well-being domain, distinguishing between “I’m very satisfied”; “I’m satisfied”; “I’m reasonably satisfied”; “I’m dissatisfied”; and “I’m very dissatisfied” (on the specific domain) (see Table S2 in online Supplementary Information 2, column 8).

In the eleventh step, two other authors (A3, A4) independently reviewed the approach, decisions, and results of steps eight to ten, and their feedback was discussed and implemented in a joint meeting with the whole research team. After several iterations of changes to the wording of the draft domain names, descriptions, and levels of the WiX, with particular focus on consistency and comprehensibility of formulations, consensus was achieved about a final draft version of the instrument. These adjustments were incorporated into Table S2 in online Supplementary Information 2 (see columns 6 to 8). Lastly, the recall period in the instruction for completion was set to “today”, and the order in which the items in the draft version of the WiX were presented was adapted to minimize confounding of meaning between the items (see Table S2 in online Supplementary Information 2, column 9).

The twelfth and final step in developing a draft version of the WiX was a forward–backward translation of the WiX from English into Dutch, which was commissioned to a certified translation company. The differences between the original and back-translated English versions of the WiX were discussed and resolved in a meeting with the full research team, in coordination with the translator. This resulted in final draft versions of the new instrument in English and Dutch to be used in the content validation study discussed next.

Content Validity

To assess the content validity of the WiX, we followed the COSMIN methodology (Mokkink et al., 2010), with content validity defined as the degree to which the content of the new instrument adequately reflects the construct that we intend to measure, namely well-being. Following this definition, content validity consists of three aspects: (1) relevance, meaning that all items of the instrument should be relevant for the construct of interest; (2) comprehensiveness, meaning that no important aspects of the construct should be missing; and (3) comprehensibility, meaning that all items of the instrument should be understood as intended. For this purpose, a qualitative and a quantitative assessment of the content validity of the WiX were conducted.

Qualitative Assessment of Content Validity

To assess the content validity of the WiX, two authors (A1, A2) conducted interviews with experts and members of the general population. The interviews were conducted online and via telephone because, at the time, COVID-19 measures did not allow for in-person interviews. As suggested by Beatty & Willis (2007), the interviews were conducted in several rounds until saturation was reached. After each round, interview answers were analysed and discussed by three members of the research team (A1, A2, A5), and, if needed, the instrument was adapted accordingly.

Eight experts in the field of health care, health technology assessment, well-being, and outcome measurement were interviewed. These experts worked in the Netherlands at governmental agencies, (semi-)academic institutions or in the healthcare sector. All interviews were conducted online in November and December 2020 by an experienced researcher (A1), following a semi-structured interview protocol (see Appendix 2). Experts received the draft version of the WiX beforehand. During the interview, they were asked general questions about well-being measurement and specific questions regarding the relevance, comprehensiveness, and comprehensibility of the WiX.

Individuals from the general population were interviewed by two experienced researchers (A1, A2), in three rounds. Individuals aged 18 years or older and able to communicate in Dutch were eligible to participate. To achieve a diverse representation of the general population, respondents were purposively sampled based on age, sex, education level, migration background, health status and religion. For the first round of interviews, ten respondents were recruited via the snowball sampling method. For the second and third rounds of interviews, a sampling agency recruited the respondents based on specified sampling criteria. In rounds 1 (January 2021), 2 (March 2021) and 3 (April 2021), 10, 6 and 4 interviews were conducted, respectively. Recruitment of respondents stopped once no new issues were brought forward during the interviews.

Cognitive interviewing techniques (Willis, 2005) were used to interview respondents in the general population sample. Specifically, a think-aloud strategy combined with verbal probing was applied to study the relevance, comprehensiveness and comprehensibility of the draft instrument, its items and their descriptions and levels. Having respondents verbalize their thoughts gives insight into how they understand and answer questions, and aids in checking whether the questions and answer options are well understood. In practice, this strategy implied that respondents were asked to read and answer each item of the WiX out loud, after which they were asked, for example, to explain whether they found it hard to select an answer and, if so, for what reason. Finally, to check the relevance and comprehensiveness of the WiX, respondents were presented with the entire instrument and asked whether any item of the WiX was redundant or any aspect that they considered important for their well-being was missing.

To ensure that the interviews were conducted in a consistent manner, a semi-structured interview protocol was prepared by the two interviewers (A1, A2) and discussed with the rest of the research team before commencing the interviews. After the first two interviews, the interview protocol was evaluated. As no significant changes were required, these two interviews were included in the analysis. Respondents received a small financial compensation for their time.

All interviews were recorded and transcribed verbatim and analysed by the two researchers who conducted the interviews. An analysis scheme corresponding to the interview guides was developed to identify issues regarding the items, descriptions and response levels, the recall period, and the relevance, comprehensiveness, and comprehensibility of the instrument. After each round (i.e., one round of interviews for the experts, three rounds of interviews for the general population), interviews were deductively analysed using the analysis scheme, and retrieved issues were condensed into discussion points. These points were then discussed with the whole research team and, if needed, adjustments were made to the instrument.

Quantitative Assessment of Content Validity

After completion of the interviews, the draft version of the WiX was used for a quantitative content validation in a larger sample of the general population. A sampling agency recruited 501 respondents, quota-sampled to be representative of the adult general population of the Netherlands based on age, sex, education level and country region.

In the online survey, respondents first completed the Cantril ladder (Cantril, 1965), which asks them to rate their life on a scale from 1 (“the worst possible life for you”) to 10 (“the best possible life for you”). Next, they were asked an open question: “Could you explain in a couple words what well-being means to you?”, followed by questions about their age, sex, level of education, country region, migration background and self-reported health. After these questions, they were asked to complete the WiX. Then, they were consecutively shown each item of the WiX with its description and their score on the item (based on their answers when completing the WiX in the previous part of the survey) and asked how important this item was for their well-being (on a five-point scale ranging from “very important” to “very unimportant”), and to explain this in an open text field. After rating all items according to importance, respondents were shown all items of the WiX and asked to indicate whether any items they considered important to their well-being were missing from this list. If they answered “yes”, they could insert up to three items in an open text field; for each item inserted, they were asked to indicate how important this item was for their well-being (on a five-point scale ranging from “very important” to “very unimportant”). Finally, respondents were shown one randomly selected item of the WiX with its description and their score on the item (from the question before). For this WiX item, respondents were asked how clear the description of this item was to them (on a five-point scale ranging from “very clear” to “very unclear”), followed by an open question for suggestions to improve the clarity of the description. 
In addition, respondents received three questions about the response levels of the item: (1) how clear the response levels were to them (on a five-point scale ranging from “very clear” to “very unclear”); (2) how difficult it was to select the response option that was most applicable to them (on a five-point scale ranging from “very easy” to “very difficult”); and, (3) whether they could as well have chosen one response category higher or lower referring to being more or less satisfied with this item (on a five-point scale ranging from “completely agree” to “completely disagree”). In the randomization procedure for the question above, four items of the WiX (i.e., ‘Personal and social safety’, ‘Self-worth’, ‘Independence’ & ‘Social relations’) were shown twice as often as the other items, because these items most frequently raised issues in terms of comprehensibility during the interviews with members of the general population (as discussed later).
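The oversampling described above amounts to a weighted random draw. The sketch below is a minimal illustration, assuming weights of 2 for the four oversampled items and 1 for the remaining six; the survey software actually used is not described in the text, and the labels in `OTHER` are hypothetical placeholders rather than the real item names.

```python
import random

# The four items that the text says were shown twice as often.
OVERSAMPLED = [
    "Personal and social safety",
    "Self-worth",
    "Independence",
    "Social relations",
]
# Hypothetical placeholder labels for the remaining six WiX items.
OTHER = ["Other item %d" % i for i in range(1, 7)]

ITEMS = OVERSAMPLED + OTHER
WEIGHTS = [2] * len(OVERSAMPLED) + [1] * len(OTHER)


def selection_probabilities():
    """Probability that each item is the one shown to a respondent."""
    total = sum(WEIGHTS)
    return {item: w / total for item, w in zip(ITEMS, WEIGHTS)}


def draw_item(rng):
    """Draw one item for a respondent using the 2:1 weights."""
    return rng.choices(ITEMS, weights=WEIGHTS, k=1)[0]


probs = selection_probabilities()
rng = random.Random(2021)  # arbitrary seed, only for reproducibility
shown = draw_item(rng)
```

Under these assumed weights, each oversampled item is drawn with probability 2/14 (about 14%) and each other item with probability 1/14 (about 7%).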

Frequencies were calculated for the responses to the closed questions regarding the relevance, comprehensiveness, and comprehensibility of the items of the WiX and their descriptions and response levels. Responses to the open question about what well-being meant to respondents were open coded into aspects of well-being (e.g., “not having to worry about money” or “no worries about expenditures for shelter or food” into the aspect ‘no financial worries’) using inductive content analysis (Elo & Kyngäs, 2008). Next, these aspects were matched to the ten domains of well-being included in the WiX (e.g., aspects like ‘financial stability’ and ‘no financial worries’ to the domain ‘Financial situation’) or a category ‘other’. A similar approach was used for coding the responses to the other open questions. Incomplete or unclear answers and mentions of “don’t know” were coded as missing. Because all questions were mandatory, there were no other missing values. Respondents with very short answers to any of the open questions were seen as potential speeders, but after excluding these respondents from the data as a robustness check, it was concluded that their answers did not affect the results presented here.
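The second stage of this coding (matching aspects to domains and tabulating frequencies) can be sketched as follows. This is only an illustration: the coding in the study was done by researchers using inductive content analysis, not by a lookup table. Only the two financial aspects in `ASPECT_TO_DOMAIN` come from the text; the respondent data and the unlisted aspect are hypothetical.

```python
from collections import Counter

# Aspect -> WiX domain. Only these two mappings are taken from the text;
# any aspect not listed here falls into the category 'other'.
ASPECT_TO_DOMAIN = {
    "financial stability": "Financial situation",
    "no financial worries": "Financial situation",
}


def match_domain(aspect):
    """Match a coded aspect to a WiX domain, or to 'other'."""
    return ASPECT_TO_DOMAIN.get(aspect, "other")


def domain_shares(coded_answers):
    """Share of respondents who mention each domain at least once.

    coded_answers holds one set of coded aspects per respondent.
    """
    counts = Counter()
    for aspects in coded_answers:
        # A set ensures each respondent counts once per domain.
        counts.update({match_domain(a) for a in aspects})
    n = len(coded_answers)
    return {domain: c / n for domain, c in counts.items()}


# Hypothetical coded answers for four respondents.
example = [
    {"no financial worries"},
    {"financial stability", "no financial worries"},
    {"an unlisted aspect"},
    {"financial stability", "an unlisted aspect"},
]
shares = domain_shares(example)
print(shares["Financial situation"])  # 0.75
print(shares["other"])                # 0.5
```

Because a respondent can mention several aspects, the shares across domains can sum to more than 100%.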

Translation of the WiX

The interviews and surveys were administered using consecutive draft versions of the WiX in Dutch. The final version of the instrument after content validation was translated into English. The forward–backward translation was commissioned to a certified translation company. The differences between the original and back-translated Dutch versions of the instrument were discussed and resolved by the research team. The final English and Dutch versions of the WiX are included in Appendix 3 and 4.

Ethics

The study protocol was approved by the Research Ethics Review Committee of the Erasmus School of Health Policy & Management (case number 21–001). Participation in the study was voluntary and could be terminated at any point. All respondents provided informed consent for participation in the study and use of their responses for academic research and publication purposes.

Results

Qualitative Assessment of Content Validity

Below we describe the most important revisions to the initial draft version of the WiX following the consecutive rounds of interviews with experts and members of the general population. A detailed overview of the frequency of reporting issues regarding the relevance, comprehensibility, and comprehensiveness of the WiX, per interview round, is included in Appendix Table 4.

The interviews with the experts resulted in two important changes to the instrument. First, the third (or middle) response option was changed from “I’m reasonably satisfied” to “I’m satisfied nor dissatisfied”, to represent the true middle. In addition, the items, descriptions, and levels of the instrument were checked by a language specialist to meet comprehensibility at language level B1 (intermediate), and the instrument was revised accordingly. Table S3a (in online Supplementary Information 3) lists the issues identified, quotes from the interviews with experts, and the corresponding changes that were made to the draft version of the instrument.

Table 1 presents an overview of the main characteristics of the members of the general population who participated in the qualitative and quantitative validation study.

Table 1 Descriptive statistics of the members of the general population included in the qualitative interview and quantitative survey samples

The interviews with members of the public demonstrated that the instrument worked well; no issues were identified regarding the relevance and comprehensiveness of the WiX, but some minor issues were reported regarding the comprehensibility of the item descriptions, which resulted in the following changes. First, some respondents indicated that they disliked the negative description of the items for physical and mental health. They mentioned that the descriptions were worded too negatively, which could potentially influence their answers. Therefore, the descriptions of the items for physical and mental health were changed from negatively worded statements (e.g., “Consider feelings of anxiety”) to positively worded statements (e.g., “Consider feeling mentally well and not suffering from feelings of anxiety”). Second, some respondents reported difficulties answering the item about safety, stemming from difficulties understanding a specific part of the item description related to social safety: “…where everyone is treated with dignity and respect”. However, they did acknowledge it to be an important aspect of safety. Throughout the interviews, alternative descriptions were explored and discussed with respondents, eventually resulting in describing social safety as: “…that others accept you and that you are not harassed because of who you are or what you think or believe”. Tables S3.2 to S3.4 (in online Supplementary Information 3) list the issues identified per interview round with members of the public, quotes from the interviews, and the corresponding changes that were made to consecutive draft versions of the instrument.

Quantitative Assessment of Content Validity

After completing the draft version of the WiX, 447 respondents (89%) provided a meaningful answer to the question about what well-being means to them. The most frequently mentioned aspects related to health (73%) and emotional well-being (46%), followed by financial situation (26%) and social contacts (19%) (see Table 2). Nearly all the mentioned aspects were clearly linked to the domains included in the WiX, supporting the relevance and comprehensiveness of the instrument. The exceptions were (1) the well-being of others and (2) personal development/having a certain goal or purpose in life, which were both mentioned by about 3% of the respondents. Few respondents (N = 20; 4%) reported missing an item in the instrument. The aspects they most often mentioned as missing were ‘personal development’ (e.g., future outlook, skills) (1%), ‘spirituality/religion’ (1%) and ‘society/political system’ (e.g., politics, norms, climate) (1%).

Table 2 Definition of well-being according to respondents (N = 446)

The relevance of the items of the WiX was further investigated by asking respondents how important they considered the items of the WiX to be for their well-being. Most respondents indicated that they found all items (very) important, with the highest proportion for ‘Mental health’ (94%) and the lowest proportion for ‘Activities’ (77%) (see Fig. 2). When asked why they considered an item (not) to be important for their well-being, respondents reported a broad range of arguments (see Table 3). The few respondents who indicated that an item was not important mostly mentioned that the item did not apply to them. For example, respondents who indicated that the item ‘Activities’ was not important to their well-being explained that they are not, or are unable to be, involved in activities like work or household chores.

Fig. 2 How important is this item for your well-being? (N = 501)

Table 3 Synthesis of provided answers to the question: Why do you consider this item (not) relevant for well-being?

Regarding comprehensibility, most respondents (83% or more, per item) found the item descriptions and response levels (very) clear (see Figs. 3 and 4). In addition, very few found it difficult to select the right answer (i.e., level) to the items (see Fig. 5). The high level of comprehensibility of the items and item descriptions was also evident from the written feedback provided by respondents, who reported only few suggestions for improvement. Based on these suggestions, we slightly revised the wording of the description of the ‘Relaxation and leisure time’ item. Table S3e (in online Supplementary Information 3) lists the changes that were made to the draft version of the instrument based on the quantitative validation.

Fig. 3 How clear is the description of this item to you? (70 to 74 respondents for items ‘Relationships’, ‘Safety’, ‘Independence’ and ‘Self-worth’, 35 to 37 respondents for the other items)

Fig. 4 How clear are the answer options for this item to you? (70 to 74 respondents for items ‘Relationships’, ‘Safety’, ‘Independence’ and ‘Self-worth’, 35 to 37 respondents for the other items)

Fig. 5 How difficult was it to select the right answer to this item? (70 to 74 respondents for items ‘Relationships’, ‘Safety’, ‘Independence’ and ‘Self-worth’, 35 to 37 respondents for the other items)

On average across items, 35.3% of respondents (completely) disagreed that they could just as well have chosen one answer level higher or lower (i.e., higher or lower satisfaction on that item); this proportion was lowest for the item ‘Activities’ (24.3%) and highest for the item ‘Physical health’ (45.7%). Conversely, 23.6% of respondents (completely) agreed, with the lowest proportion for the item ‘Safety’ (13.7%) and the highest for the item ‘Activities’ (32.4%).

Finally, based on differences between the Dutch language version of the instrument after the content validation and the forward–backward translation into English, we slightly revised the wording of the description of the item ‘Living environment’ in the Dutch language version (see Table S3f in online Supplementary Information 3).

Discussion

When interventions have effects beyond health, the evaluative space of common HRQoL instruments may be considered too limited to capture all the benefits relevant to individuals. In such cases, instruments measuring well-being comprehensively are required. While a few well-being instruments exist that could be used in the context of economic evaluations, all seem to have some limitations. Therefore, in this paper a new instrument for the adult general population was introduced that aims to capture overall (or general) quality-of-life in terms of people’s subjective well-being by measuring how satisfied people are in ten important domains of well-being: the 10-item Well-being instrument (WiX). This paper presented its development and content validation.

The development of the WiX was based on a theoretical framework synthesizing leading theories of well-being and an exploratory study into what adults in the Netherlands consider important for a good life (van der Deijl et al., 2023) as well as a scoping review of existing instruments with a similar aim. The final version of the instrument covers ten domains of well-being and measures satisfaction in these domains. The content validity of the WiX was investigated following the COSMIN methodology (Mokkink et al., 2010), addressing its relevance, comprehensiveness, and comprehensibility in a qualitative and a quantitative validation study. The results of these studies confirmed that the WiX covers all relevant well-being domains, does not include irrelevant domains, and is considered clear and feasible by the target population. These results are encouraging and highlight that the WiX is a promising instrument to measure well-being in the adult general population.

Considering the development process of the WiX described in this paper, the results for its content validity support that this well-being instrument is comprehensive and truly generic, meaning that it is not confined to a specific subgroup in the adult population (e.g., care users) or to a subset of the relevant well-being domains. These strengths of the WiX are especially important when interventions in the healthcare sector are expected to have broad effects on people’s well-being. In addition, they make the instrument more relevant for use in the evaluation of interventions across sectors and settings, as well as outside the healthcare sector.

By measuring the satisfaction of respondents in ten distinct domains of well-being, the informational density of the WiX is high. This makes it possible to offer an indication of overall well-being, but also helps to identify the domains in which satisfaction may not be optimal and, hence, understand the sources of reduced well-being (or deprivation). In addition, this information can be directly relevant for the development and implementation of policy interventions. While the ten items required to make the WiX comprehensive make it longer than most other existing instruments, the results of this study indicate that the WiX seems to be clear and concise, and, therefore, still feasible to be used for self-completion in the context of evaluation studies.

A few issues regarding the instrument and its content validity deserve further discussion here. First, in both the development and content validity phases some elements of well-being were encountered that may be important for well-being but, after deliberation and content validation, were not included in the final instrument. Two examples of this are political participation and spirituality/religion. These aspects are mentioned in the literature (van der Deijl et al., 2023) and were reported by (very small proportions of) respondents in the content validation study, but at this time we found insufficient evidence to support their inclusion in the final instrument as additional domains. Future research should explore the role of these aspects for well-being further.

Secondly, we developed the WiX and conducted the content validity study in the Netherlands. While the instrument was based on broad, international theories of well-being and available well-being instruments from the international literature, and we tried to represent the multi-cultural environment in the Netherlands in the content validation phase, future studies need to confirm the (content) validity of the instrument in other countries. Binder (2014) stated that it is doubtful that a single list of domains is valid “once and for all” but argued that the selection of most important domains (a “skeleton list”) would probably be similar across cultures and time. Nonetheless, further validation of the WiX is recommended especially in countries where the economic, political and cultural environments differ considerably from the Netherlands, both regarding the current ten domains and potential complementary domains. In addition, the content validation of the WiX presented here took place during the COVID-19 pandemic. While it is difficult to say whether and how this may have influenced the results, we expect that respondents may have been more aware of well-being (issues) in general as a result of the pandemic and the subsequent governmental measures, and that it may have impacted the relative importance attached to certain well-being domains (like health or social activities).

Thirdly, despite the theory-driven, systematic development process and the extensive content validation, several additional development steps are needed before the WiX can be recommended for use in evaluation studies. These steps include further validation of the instrument, including its feasibility, reliability, construct validity and sensitivity/responsiveness, ideally in different contexts, populations and countries. Moreover, for use in economic evaluation studies, preference-based utility weights need to be determined, representing the relative importance of the different domains of well-being and levels of satisfaction in those domains for overall well-being. Such utility weights can then be used to compute well-being scores for the approximately 10 million (= 5^10) different well-being states described by the WiX.
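
To make the size of this descriptive system and the role of utility weights concrete, the following minimal Python sketch counts the possible well-being states and scores a state with a purely illustrative weight set. Everything beyond the ten items and five response levels is an assumption made for illustration: the additive scoring form, the equal weights, the coding of levels as 1 (lowest satisfaction) to 5 (highest), and the names `WEIGHTS` and `wellbeing_score` are hypothetical and do not represent a valuation of the WiX, for which no preference-based weights have been determined yet.

```python
N_DOMAINS = 10  # the WiX has ten items (domains)
N_LEVELS = 5    # five response levels per item, implied by the 5^10 state count

# Number of distinct well-being states the instrument can describe.
n_states = N_LEVELS ** N_DOMAINS
print(n_states)  # 9765625, i.e. approximately 10 million

# HYPOTHETICAL weights, for illustration only: every domain counts equally and
# each step up in satisfaction adds the same amount, so that the worst state
# scores 0 and the best state scores (approximately, in floating point) 1.
# Real weights would come from a preference-based valuation study.
WEIGHTS = [
    {level: (level - 1) / (N_LEVELS - 1) / N_DOMAINS
     for level in range(1, N_LEVELS + 1)}
    for _ in range(N_DOMAINS)
]


def wellbeing_score(state, weights):
    """Sum the weight attached to the chosen level in each domain.

    `state` is a sequence of ten integers in 1..5, one per domain.
    The additive form is an assumption of this sketch.
    """
    if len(state) != len(weights):
        raise ValueError("state must contain one level per domain")
    return sum(domain_weights[level]
               for domain_weights, level in zip(weights, state))


# Example: middle level on every domain except the first, which is at the top.
example_state = (5, 3, 3, 3, 3, 3, 3, 3, 3, 3)
print(round(wellbeing_score(example_state, WEIGHTS), 3))
```

Under these illustrative weights, the per-domain contributions also show where a respondent's score falls short of the maximum, which mirrors the point made earlier that item-level information can help identify the sources of reduced well-being.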

Concluding, the thorough development and content validation phases reported in this paper have resulted in a new instrument to measure well-being in terms of satisfaction on ten important domains of life in the adult general population: the 10-item Well-being instrument (WiX). The results for the relevance, comprehensiveness and comprehensibility of the WiX are encouraging, but further validation and valuation steps are necessary before the WiX can be effectively employed in (economic) evaluation studies. Conditional on the results of these steps, the WiX seems to be a promising complement to existing measures of well-being, with an alternative conceptual approach. Multi-instrument comparison studies are recommended to further inform analysts and decision makers on the relative performance of instruments in assessing the full impact and value for money of interventions in health and social care (and beyond) with impacts broader than only health.