1 Introduction

The common denominator for social programs is the desire to improve individual well-being. However, "individual well-being" consists of many dimensions that are difficult to measure, and the relative contribution of each of the various aspects to overall well-being is unknown.

In recent decades, many researchers have taken the position that individual well-being can be measured by collecting measurements of “subjective well-being” (SWB) (see, for example, Frey and Stutzer 2002; Kahneman and Krueger 2006; Diener et al. 2018; Adler 2019 and references cited therein). Formally, SWB can be defined as “good mental states, including all of the various evaluations, positive and negative, that people make of their lives and the affective reactions of people to their experiences” (OECD 2013, page 10). In practice, SWB is most often assessed by asking survey respondents to provide subjective ratings of different aspects of their lives and of their levels of satisfaction.

In general, SWB research falls under an approach to well-being referred to as “hedonic,” “psychologic” or “experiential,” under which the mental states measured in these surveys serve as an approximation of an individual’s true utility or well-being. The specific mental states that contribute to well-being include physical and emotional sensations, such as pain, pleasure, happiness, and feelings of satisfaction, among others. This approach has received both support and criticism in the literature (Frey and Stutzer 2002). In particular, some economists have maintained in recent years that these subjective measures of wellbeing are not sufficient to capture people’s overall wellbeing because people may have preferences for outcomes that extend beyond their own mental states, such as one’s impact on the world, or the happiness of others (see, e.g. Adler 2019).

As an alternative to the experientialist approach, these researchers have adopted a broader “preference based satisfaction” approach, which calls for inclusion of additional measures into measures of overall wellbeing (Adler 2019; Van der Deijl 2018). The preference satisfaction approach, rooted in classical economics, evaluates well-being based on the extent to which different situations align with an individual's preferences. If an individual prefers situation A over situation B, then their well-being is considered higher in situation A. The key distinction is that the preference satisfaction approach defines well-being as the fulfillment of preferences over an arbitrary set of outcomes, while the experientialist approach identifies the quality of subjective experiences as the determinant of well-being.

A significant step towards the empirical estimation of a preference-based approach to well-being measurement was undertaken by Benjamin et al. (2014, 2017), whose work serves as the basis for the present study. The researchers first compiled a list of 136 aspects of well-being that could be measured through survey questions and combined into a single index. The list draws on many dimensions from the academic literature, covering both subjective well-being measures and other well-being measures, in an attempt to offer a complete index of well-being. An important advantage of their method is this breadth: including as many aspects as possible allows as broad a view as possible of the well-being of the individual. The aspects include indicators of economic status, health, education, a sense of security, community life, social networks, the quality of public services, and more. Then, using a “stated preference” approach, the authors asked 4600 U.S. survey respondents about their preferences between pairs of aspects and used the responses to compute relative marginal utilities for the 136 aspects, which included measures of happiness and life satisfaction in addition to aspects related to family, health, security, values and freedoms.

This method of Benjamin et al. may estimate the well-being of the individual more accurately than the more standard approach of asking a single SWB question, or even than other multifaceted methods found in the literature (e.g. the ICECAP-A method of Al-Janabi et al. 2012, further described in Sect. 2), because it includes a wide range of aspects, so that the potential to lose information due to the exclusion of a certain aspect is significantly reduced.

Another important advantage of this method is the ability to build a ranking of the variety of aspects according to the preferences of the individuals, thus determining the marginal contribution of each aspect. These marginal utilities represent an important input for the construction of weights that could ultimately be used to build a full index of well-being suitable for applied purposes and for the evaluation of the effect of government programs on overall wellbeing. This contribution is important because, while researchers have done substantial work towards understanding how best to measure various aspects of wellbeing, there is little consensus about how to combine the different aspects into a single overall measure. For example, a set of recent OECD guidelines explains that “there is no clear basis for determining the relative weights to assign to different dimensions or sub-dimensions of subjective well-being” (OECD 2013). In practice, when multidimensional measures of wellbeing need to be aggregated into a single measure, a commonly reported approach is to compute a simple, equally weighted sum. Surveying the literature, Decancq and Lugo (2013) write that “equal weighting has often been defended by its simplicity or from the recognition that all indicators are equally important, or from an agnostic viewpoint.”

With all its advantages, the Benjamin et al. methodology also has limitations. First, it relies on interpersonal comparisons, or more exactly, on the assumption that the stated preferences of different individuals can be aggregated. Second, Van der Deijl (2018) suggests that applying the Benjamin et al. (2014) methodology for policy preferences is too data demanding. This is the case especially if the researcher assumes individualism (i.e. that each individual needs to be represented by his or her own preferences) and unrestrictedness (i.e. that any possible preferences need to be considered). Such assumptions may be appropriate from a purely theoretical perspective, but from an empirical point of view, where welfare measures are averaged and aggregated, they seem too restrictive.

The present study adapts the methodology of Benjamin et al. (2014) and conducts a similar survey among the population in Israel, the first time this methodology has been applied in a setting other than the work in the United States conducted by that original research group. Starting with the original 136 aspects, we first reduce the list to 27 aspects that were determined to be most important to capturing wellbeing in the Israeli setting. We then conduct an online survey among a representative sample of 1032 respondents in order to measure the relative weight of each aspect. Among the several studies conducted by Benjamin et al., our methodology is actually closer to their 2017 paper, in that we use cardinal rather than ordinal ratings for each aspect, as will be described below.

The various adaptations we make to the Benjamin et al. methodology are intended to address some of the criticisms leveled against their approach. First, our focus on a shorter list of aspects allows us to offer a less data demanding methodology. Second, by exploring heterogeneity in preferences between different sub-populations, we are taking initial steps towards addressing the question of whether individual preferences can properly be aggregated. Nevertheless, we follow the existing literature and pool together the observations from different respondents, effectively assuming a certain level of interpersonal comparability between responses. Finally, as mentioned in the previous paragraph, we follow the methodology from Benjamin et al. (2017), employing cardinal measures of wellbeing rather than the ordinal measures used in the authors’ earlier research. This addresses the concerns of Bond and Lang (2019), who discuss the challenges of interpreting wellbeing data collected using an ordinal scale. Cardinal measures remain subject to the critique that different respondents likely assign different cardinal ratings to identical levels of wellbeing, a challenge we do not address in the current paper. We discuss these issues further when we present the methodology.

In our baseline specification, we find that the aspects with the highest estimated coefficients are "the happiness of your family", "your physical health", "your mental health" and "the quality of your marital relationship". Overall, we find a relatively strong agreement between our rankings of the different aspects and those reported in Benjamin et al. (2014).

Moving beyond the crude baseline coefficients, we also calculate separate estimates for specific subgroups of the population. When we divide the population by gender, we find that men assign greater weights to the quality of the romantic relationship, to general life satisfaction and to their spirituality and connection to God, while women assign greater weight to personal safety and security, to physical health, and to the extent that the health condition allows them to carry out the activities most important to them. When we stratify the population by age, we find that those over the median age in our sample (42 years) place notably higher weight on health-related aspects while the degree of spirituality was more important for younger respondents. Finally, we stratify the Jewish population by reported level of religiosity, a characteristic that divides the Israeli population into distinct and important subgroups. We find that secular, non-religious individuals give greater weight to health-related aspects and to freedom of choice. An interesting finding is the difference between the two groups in their rating of the aspect of spirituality and connection to God. For self-reported religious individuals, this aspect received one of the highest weights, as expected. However, for the non-religious, the estimated weight was negative and statistically significant, meaning that a greater degree of spirituality actually reduced their well-being. These results offer new, quantitative facts regarding the complicated nature of religion in Israeli society, which is otherwise beyond the scope of this article.

We also document heterogeneity in preferences for respondents with different current levels of well-being. For some aspects of well-being, we find that respondents who already report higher levels of the aspect rate it as less important, suggesting a decreasing marginal utility. In other words, such aspects become less important to the individual as their reported level rises. For other aspects, we find evidence for what appears to be an increasing marginal utility, but we argue that this result likely reflects an endogenous determination of the level of well-being. For example, someone who particularly values their relationships with family members may have invested more in those relationships, resulting in a higher reported level of that aspect.

Finally, we discuss the policy implications of our findings and show how our results could be used to aggregate all the different aspects into a single index of well-being, which could be used to evaluate and compare the effects of different policy interventions.

2 Literature review

There is an extensive literature on wellbeing that includes a wide variety of different measures, survey questions and results.Footnote 1,Footnote 2 One recent review (Linton et al. 2016) included 99 separate instruments used to measure wellbeing, including 27 new instruments developed between 1990 and 1999. On an international level, the largest surveys of this type are the Gallup World Poll, which was started in 2005 and included over 150 countries as of 2022, and the World Values Survey, which started in 1981 and covered 80 countries as of its seventh wave in 2017–2021.

A common challenge in these measures has been the decision about which aspects of wellbeing to measure. For example, the 99 instruments surveyed by Linton et al. (2016) included questions about 196 different dimensions. In order to address this challenge, and to develop a standardized list of measures that could be compared across countries, the OECD published a set of guidelines in 2013 as part of its broader “Better Life Initiative.” In addition to offering guidance on methodology for collecting and reporting on wellbeing measures, these guidelines include a set of “core measures” that they recommend all countries adopt. The recommended measures include a primary measure of life evaluation, as well as measures of affect (a person’s feelings or emotional states, typically measured with reference to a particular point in time) and eudaimonia (having a sense of meaning or purpose in life). As mentioned above, Benjamin et al. (2014) propose 136 aspects, which are intended to provide full coverage of individual well-being. Another advantage of their method is the ability to build a ranking of the variety of aspects according to the preferences of the individuals, thus determining the marginal contribution of each aspect.

Despite these efforts at standardization, there is still no consensus about which aspects of wellbeing researchers should measure. Even within the OECD framework, in which questions about life satisfaction in different life “domains” are one component of the recommended survey, the guidelines describe the challenges in identifying which domains to include: “Ideally, the questions [evaluating satisfaction in different life domains] would meet two key criteria. First, they would be independently meaningful as measures of satisfaction with a particular aspect of life; and second, they would collectively cover all significant life domains.” (OECD 2013, p. 169) We follow these guidelines in our choice of aspects (see Sect. 4.2). A major practical challenge to this sort of approach, however, is that there is no generally agreed framework for identifying how to divide well-being as a whole into different life domains. In the United Kingdom, the Office for National Statistics settled on 77 measures of wellbeing that have been systematically collected since 2011 (HM Treasury 2021).

The literature has long struggled with the question of how to combine these many measures of different aspects of wellbeing into one overall measure. In practice, the difficulties raised by this question are often avoided and, in many surveys, the overall well-being of the individual is simply measured through a single question about satisfaction with the quality of life. The answer to this question is then interpreted as an overall level of "happiness", and some claim that it is even related to the general level of well-being of the individual. However, using only one dimension to measure individual well-being has often been criticized because it may not satisfactorily capture all aspects of individual well-being (e.g. Adler 2012; Adler et al. 2017).

The natural alternative to using a single question about life satisfaction, and the approach used by the current study, is to attempt to combine the responses to the many aspects of wellbeing into one combined measure. As described above, the most commonly reported approach to combining the different aspects of wellbeing is to compute a simple, equally weighted sum, either based on arguments of simplicity or for lack of a convincing alternative (Decancq and Lugo 2013). However, several strands of the literature have attempted to measure the relative importance of different aspects, similar to the goal of the current study. Decancq and Lugo (2013) classify these attempts into three classes: data driven, normative, and hybrid. Since we want to refrain from normative judgements, we take the data driven approach. Other studies also take this approach. For example, the OECD’s Better Life Index (Balestra et al. 2018) asks people to rate the importance of different aspects on a 0–5 scale. Van Praag et al. (2003) develop a model in which general satisfaction is expressed as a combination of domain satisfactions, so that the relative weights of each domain can be estimated. Another alternative to the single-question approach is the ICECAP-A (Al-Janabi et al. 2012; Flynn et al. 2015; Van der Deijl 2018). The ICECAP-A index assumes five domains of well-being and tries to estimate their weights based on subjective assessments, an approach somewhat more similar to the methodology of Benjamin et al. (2014). Each of these different methodologies, including the one that we have adopted in the current paper, has its advantages and disadvantages. A broader discussion about which methodology is best suited for which context is a complicated one that lies beyond the scope of this paper.

3 The Israeli context

The present study measures wellbeing within the population of the State of Israel. Compared to other countries, Israel routinely collects relatively good information about the wellbeing of its population. The most important tool for wellbeing measurement in Israel has been the General Social Survey, administered by the Israeli Central Bureau of Statistics (CBS). This routine population-based survey includes measures of people’s overall life evaluations plus a wide range of other aspects from the OECD well-being framework. Using 70 main indicators, the CBS then constructs measures based on the 11 dimensions selected by the OECD (CBS 2019).Footnote 3 As measured by a question that asks respondents to rate their overall satisfaction with their lives, life satisfaction in Israel has consistently been above the OECD average (OECD 2015).

This Israeli initiative follows a December 2012 resolution by the Israeli government to develop measures of “well-being, resilience and sustainability,” essentially adopting international best practices regarding the measurement of wellbeing and related metrics.Footnote 4 The National Academy of Sciences was also required to address the issue and in 2017 established an expert committee to examine the quality of life in Israel. The committee stresses the importance of measuring a multitude of various aspects of well-being, including aspects related to sustainability (The Israeli Academy of Sciences and Humanities 2021).

Recent studies in Israel on wellbeing include a study on the psychometric properties of a wellbeing scale among a sample of Israeli Arabs and Palestinians from the West Bank and Gaza Strip (Veronese et al. 2017), which found that the questionnaire provided an accurate assessment of the well-being of respondents from this population. Shavit et al. (2021) found that the average respondent reported unchanged life evaluation through the COVID-19 pandemic even as negative feelings rose and positive feelings fell, implying a structural change during this period in the weighting of feelings and self-rated health in determining life satisfaction. To the best of our knowledge, no research has been conducted along the lines of Benjamin et al. (2014) in Israel, nor indeed in any other country outside the United States. Thus, an important part of our contribution is to measure individual well-being in Israel, allowing the weights on the different aspects to reflect the preferences of the Israeli population. Moreover, our results could be applied to construct a well-being index that could be used to evaluate and compare the effects of different policy interventions.

4 Methodology

4.1 Theory behind methodology

In classical economic theory, total utility is conceived of as a function of the goods and services consumed. If all of these goods are priced by the market, and we assume that people optimize consumption to maximize their utility, then changes in utility from consumption can be measured by the change in the total value of all the goods and services consumed. In other words, to measure the total change in utility, we can add the changes in the consumption of the different goods together, using their market prices as weights in the summation. (Formally, the marginal utility for each good is proportional to its market price, with the coefficient of proportionality equal to the Lagrange multiplier on the budget constraint.) This assumption underlies the use of traditional measures such as GDP, which treats the total value of the goods and services produced in a country as a measure of its economic well-being.
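The parenthetical statement about the Lagrange multiplier can be spelled out in one line. The notation here is introduced only for illustration and does not appear in the original text: \(U\) is the utility function, \(x_k\) and \(p_k\) are the quantity and market price of good \(k\), \(m\) is income, and \(\lambda\) is the multiplier on the budget constraint. Assuming an interior optimum, the first-order conditions imply that a small change in utility is proportional to the change in the market value of consumption:

$$\max_{x}\; U(x_{1},\dots ,x_{K})\;\; \text{s.t.}\;\; \sum_{k}{p}_{k}{x}_{k}=m \quad \Longrightarrow \quad \frac{\partial U}{\partial {x}_{k}}=\lambda {p}_{k}\;\; \text{for all } k, \qquad \text{so}\qquad dU=\sum_{k}\frac{\partial U}{\partial {x}_{k}}\,d{x}_{k}=\lambda \sum_{k}{p}_{k}\,d{x}_{k}$$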

According to the theory developed in Benjamin et al. (2014), measuring changes in well-being can be thought of as similar to measuring the change in utility from consumption. The vector of consumption goods is replaced by a vector of aspects of wellbeing, and the authors develop a methodology based on stated preferences to calculate the appropriate weights to be used in the summation. This leaves the researcher two tasks: first, to determine the list of aspects that should be included; and second, to implement the proposed methodology in order to compute the marginal utility of each aspect. A sum of changes in aspects, each weighted by its marginal utility, then forms a local linear approximation to the change in the utility function.
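Written out, the local linear approximation takes the following form. The symbols are assumptions made for this illustration: \({a}_{j}\) denotes the level of aspect j and \(\Delta {a}_{j}\) a small change in it. The expression is a first-order Taylor expansion, so it should be read as accurate only for small changes around the individual's current situation:

$$\Delta U\approx \sum_{j}\frac{\partial U}{\partial {a}_{j}}\,\Delta {a}_{j}$$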

4.2 Selecting the aspects

As noted above, there is no consensus about which aspects of wellbeing researchers should be measuring. The starting point for the current study was the aspect list used in Benjamin et al. (2014). There, the researchers sought to formulate as wide a list of aspects as possible, and therefore collected aspects from a very wide variety of sources: academic studies, social surveys, consultations with researchers and work of research institutes, public organizations such as the OECD, and governments around the world. The final list comprised 136 aspects, of which 113 were personal aspects of individual well-being (e.g. personal health) and 23 were public aspects of societal well-being (e.g. "equal opportunities in your country").

Unlike the study by Benjamin et al., the ultimate purpose of the current study was to formulate a set of aspects that could feasibly be added to settings such as program evaluation studies. For this purpose, including 136 aspects would be impractical. Therefore, the first stage focused on reducing the number of aspects, using the results of Benjamin et al.'s studies and the insights that emerged from them. First, we focused our attention on the 113 personal aspects. Our purpose is to construct an index for program evaluation, and since most government programs aim to improve the personal situation of the individual rather than society as a whole, we wish to observe changes in personal well-being and its components, not changes at the level of society. Second, we restricted our list to the 37 aspects that received the highest weights in the analyses by Benjamin et al. (2014, 2017). We also consulted with the authors of those studies, from whom we received a list of 33 leading aspects, which in their opinion were most accepted in the literature and among policy makers in countries that use well-being indicators. These lists were very similar, resulting in a combined list of 35 aspects.Footnote 5 We received further input from content experts from a variety of social research disciplines at the Myers-JDC-Brookdale Institute, who also helped us adapt the list of aspects for the Israeli population. Ultimately, this process produced a final list of 27 aspects (shown in Table 1) that we used in our survey.Footnote 6

Table 1 List of aspects

4.3 Survey

After formulating the list of aspects, the next step was to estimate their relative weights, which could ultimately be used as weights in the construction of a well-being index. This assessment was carried out through an online survey, following the methodology of Benjamin et al. (2017). Following several rounds of pre-testing, an internet survey was commissioned by the research team from a commercial company that specializes in online panel surveys (i-Panel). The survey was conducted in October 2022 among a stratified sample of 1032 respondents—a representative sample of the Israeli population. The survey company maintains an online panel consisting of an extensive pool of approximately 100,000 Israeli participants from diverse backgrounds. This large-scale panel serves as a valuable resource for conducting comprehensive surveys and data collection, ensuring that the research findings accurately reflect the perspectives and experiences of Israel's heterogeneous population. The survey response rate was 45%. The representativeness of the sample was ensured through stratification based on six key demographic factors: gender, age, income, ethnic groups (Arabs and Jews), religiosity, and geographic district. Table 2 shows that our sample matches the overall population in many dimensions.

Table 2 Demographic group proportions

The survey was divided into four rounds, each presenting three of the 27 aspects. In each round respondents were asked to rate each of the three aspects on a scale of 0 to 100. For example, for the aspect “The happiness of your family,” respondents were asked:

Think about your life over the past year. On average, how would you rate the happiness of your family over the past year? Please use a scale of 0 to 100, where 0 is the lowest level you can imagine for life and 100 is the highest level you can imagine for life.

Respondents indicated their responses by moving a digital on-screen slider up or down a visual scale (depicted in Fig. 1).

Fig. 1 An example of a game

Respondents were then presented with a series of six "games." In each game, the respondent received two alternative scenarios involving two of the three aspects presented at the start of the round. In one of the alternatives, the level of one of the aspects was increased or decreased by 2, 4, 6, or 8 units, while the rating of the other aspect remained the same. In the other alternative, the level of the other aspect was changed while the level of the first aspect remained the same. The respondent was then asked which of the two alternatives they preferred. The respondent also had the option to respond "don't know." In all cases, the level of each aspect was restricted to be between 0 and 100. If the randomly chosen 2–8 unit increase or decrease would have resulted in a value outside of this range, the respondent was instead presented with a value truncated to 0 or 100.
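The construction of a single game can be sketched in a few lines of Python. This is only an illustration of the description above, not the survey software used in the study: the function names, the example ratings, and the assumption that the size and direction of each change are drawn independently and uniformly are ours.

```python
import random

STEP_SIZES = (2, 4, 6, 8)  # magnitudes of the change, as described in the text


def clamp(value, low=0, high=100):
    """Truncate a rating to the 0-100 range."""
    return max(low, min(high, value))


def make_game(ratings, rng):
    """Build one game from the baseline ratings of the aspects in a round.

    ratings: dict mapping aspect name -> the respondent's own 0-100 rating.
    Returns two alternatives; each one changes exactly one of two aspects.
    """
    first, second = rng.sample(sorted(ratings), 2)

    def shifted(aspect):
        # Assumption: magnitude and direction are drawn independently.
        delta = rng.choice(STEP_SIZES) * rng.choice((-1, 1))
        return clamp(ratings[aspect] + delta)

    return {
        "A": {first: shifted(first), second: ratings[second]},
        "B": {first: ratings[first], second: shifted(second)},
    }


if __name__ == "__main__":
    rng = random.Random(2022)  # arbitrary seed, for reproducibility only
    # Hypothetical baseline ratings for the three aspects of one round.
    round_ratings = {"family happiness": 97, "physical health": 60, "meaning": 3}
    for _ in range(6):  # six games per round
        print(make_game(round_ratings, rng))
```

Ratings close to 0 or 100 show the truncation rule at work: a baseline of 97 with an increase of 8 is displayed as 100 rather than 105.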

Figure 1 shows an example of a typical game. In the example, the respondent is asked about two aspects: "the feeling that your life is meaningful and valuable" and "the level of your physical security and personal safety". In alternative A, the rating of the sense of meaning aspect increases by 6 points, while the physical security aspect does not change. In alternative B, the rating of the physical security aspect increases by 4 points, while the rating of the sense of meaning aspect remains unchanged. Previous studies using this methodology (Benjamin et al. 2014, 2017) suggest that respondents understood the way these games are structured and were able to give meaningful answers. The selection of aspects presented to each respondent, the pairing of aspects in each game, and the size of the increase or decrease in the aspect rating in each alternative were all generated randomly.

In addition to these questions about the different aspects of wellbeing, the survey also included standard questions regarding the socio-economic and demographic characteristics of the individual: age, gender, education, income, marital status, level of religiosity, etc. The current study, including the survey and all of the study methods, was approved by the Myers-JDC-Brookdale Institute Ethics Committee.

4.4 Econometric model

Based on the preferences expressed through these games, we estimate the weights on each of the 27 aspects. We use a linear probability model (i.e. OLS) where the respondent's choice is a linear function of the difference in the various aspects between the two alternatives. In this regression, game i appears as an observation in the form:

$${choice}_{i}=\alpha +\sum_{j}\left[{\beta }_{j}\times {change}_{ij}^{1} -{\beta }_{j}\times {change}_{ij}^{2} \right] + {\varepsilon }_{i}$$

where \({choice}_{i}\) is equal to 1 if the respondent chose the first alternative in game i and zero if the respondent chose the second; \({change}_{ij}^{1}\) and \({change}_{ij}^{2}\) are the changes in aspect j in game i in alternatives 1 and 2 respectively; the sum is over all 27 aspects, but by construction only two of the changes are not equal to zero, namely the one aspect that is changed in alternative 1 and the one aspect that is changed in alternative 2; \({\beta }_{j}\) is the weight on aspect j that is being estimated; and \({\varepsilon }_{i}\) is the random error term.

In our baseline model, we assume that there is no variation in preferences between respondents, so that the results from all the games played by all the respondents can be pooled into a single regression.
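To make the pooled estimation concrete, the following Python sketch fits a linear probability model of this kind on simulated games using only the standard library. It is a toy illustration under stated assumptions, not the code used for the results reported below: it uses three hypothetical aspects instead of 27, invented "true" weights, and plain OLS through the normal equations without standard errors. The sketch codes each regressor as the change under alternative 1 minus the change under alternative 2, so that a positive weight raises the probability of choosing alternative 1; this sign convention is an assumption of the sketch.

```python
import random


def design_row(n_aspects, aspect1, change1, aspect2, change2):
    """Intercept, then (change in alt. 1) - (change in alt. 2) for each aspect."""
    row = [1.0] + [0.0] * n_aspects
    row[1 + aspect1] += change1
    row[1 + aspect2] -= change2
    return row


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """OLS via the normal equations; returns [alpha, beta_1, ..., beta_J]."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def simulate_games(true_beta, n_games, seed=0):
    """Simulate pooled games in which choice probabilities are linear."""
    rng = random.Random(seed)
    steps = [-8, -6, -4, -2, 2, 4, 6, 8]
    rows, choices = [], []
    for _ in range(n_games):
        a1, a2 = rng.sample(range(len(true_beta)), 2)
        row = design_row(len(true_beta), a1, rng.choice(steps), a2, rng.choice(steps))
        p_first = 0.5 + sum(b * x for b, x in zip(true_beta, row[1:]))
        p_first = min(1.0, max(0.0, p_first))
        rows.append(row)
        choices.append(1.0 if rng.random() < p_first else 0.0)
    return rows, choices


if __name__ == "__main__":
    hypothetical_beta = [0.03, 0.02, 0.01]  # invented weights, illustration only
    rows, choices = simulate_games(hypothetical_beta, n_games=5000)
    print(fit_ols(rows, choices))  # [intercept, weight_1, weight_2, weight_3]
```

In applied work a statistics package would normally be used instead, since it also provides standard errors that can account for the repeated games played by each respondent.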

One challenge of studies that ask for numerical ratings of wellbeing is that there are no physical units to promote interpersonal comparability. Benjamin et al. (2023) refer to this issue as “scale use” (i.e. the idea that different people may be using different numerical scales), and propose to use calibration questions to correct for it. In our methodology, the weights for the aspects are derived from the comparisons respondents are asked to make between different aspects. Therefore, as long as each respondent uses the same scale to rate all the aspects, scale use should not affect these comparisons and we would expect it to have only a minor impact on our results. Nevertheless, differences in scale use would affect the levels of aspect ratings even if they do not confound the marginal utility (MU) estimates. This is a minor issue since the paper focuses on MU estimates, but it may have a small effect when aspect levels are used (e.g. as in Table 9).

5 Survey results

5.1 Summary statistics

The survey sample included 1032 respondents. We compare the summary statistics of the main demographic variables of our sample to the 2022 Statistical Yearbook of Israel (Central Bureau of Statistics 2023) as well as to the Social Survey for 2021 (Central Bureau of Statistics 2022). Table 2 shows the distributions of the key characteristics examined in the study sample and the corresponding distributions in the general population. While the sample distributions for most of the demographic attributes are generally similar to those of the general Israeli population, secular and more educated people are somewhat over-represented in the sample, a common occurrence in online surveys.Footnote 7 Over the four rounds, each respondent played 24 games, yielding a total of 24,768 observations. In 2875 games (approximately 12% of the total), respondents chose the "don't know" option. No aspect was notably more likely than the others to trigger a "don't know" response. Games in which "don't know" was selected were omitted from the analysis, producing 21,893 valid observations in the final sample.

5.2 Descriptive statistics of aspect ratings

Over the four rounds of the survey, each respondent was asked to provide ratings for a total of 12 randomly selected aspects out of the 27. Pooling all of these observations together, Table 3 shows the descriptive statistics for each of the 27 aspects, in descending order of the overall mean rating. The aspects with the highest ratings included "the extent to which you live your life as a good person" (mean rating 79.6), "the extent to which your health allows you to do the activities most important to you" (76.6), and "the quality of your relationships with your children, parents, siblings and relatives" (75.7). In contrast, the aspects with the lowest average ratings include "the degree of spirituality in your life or your connection to God" (57.8), "the quality of your marital relationship" (63.1) and "your satisfaction with the balance between work and leisure" (64.1). It is of note that the levels of some aspects (mental health, the amount of love and the happiness of your family) are rated fairly high (mean ratings between 72 and 74), while aspects such as belonging to society and peace and calm are rated relatively low (mean ratings between 64 and 67). Notably, the economic situation, which receives a medium weight, is also rated relatively low (65).

Table 3 Summary statistics of aspect ratings

5.3 Baseline regression results

Table 4 shows the estimated regression coefficients of the 27 aspects, arranged in decreasing order by magnitude. Most of the coefficients are significant and positive, indicating that respondents did appear to value these aspects of life. The aspects with the highest estimated coefficients were "the happiness of your family", "your physical health", "your mental health" and "the quality of your marital relationship". For three of the variables, we estimated coefficients that were not statistically distinguishable from zero: "the degree of spirituality in your life or your connection to God", "your level of satisfaction with the area where you live" and "your level of satisfaction with belonging to society".

Table 4 Baseline regression results

In the third column of Table 4 we convert the estimated coefficients into a set of weights that could be used to aggregate all the different aspects into a single index of well-being. First, we set to zero the weights on three aspects whose coefficients were found not to be statistically different from zero. Then, we normalize the coefficients on the remaining variables so that they sum to 100%. Since respondents were asked to rate the different aspects on a 0–100 scale, this weighting will produce an index that is also on a 0–100 scale.
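The two-step conversion described above can be illustrated with a minimal Python sketch. The function names, aspect labels and coefficient values below are hypothetical and chosen only for illustration; they are not the estimates in Table 4.

```python
def coefficients_to_weights(coefs, significant):
    """Convert regression coefficients into index weights that sum to 100.

    coefs: dict mapping aspect name -> estimated coefficient
    significant: dict mapping aspect name -> True if the coefficient is
        statistically distinguishable from zero
    """
    # Step 1: set the weight of non-significant aspects to zero.
    kept = {a: (c if significant[a] else 0.0) for a, c in coefs.items()}
    total = sum(kept.values())
    if total <= 0:
        raise ValueError("no positive significant coefficients to normalize")
    # Step 2: rescale the remaining coefficients so they sum to 100 (percent).
    return {a: 100.0 * c / total for a, c in kept.items()}


def wellbeing_index(ratings, weights):
    """Weighted average of 0-100 ratings; the result is also on a 0-100 scale."""
    return sum(weights[a] * ratings[a] for a in weights) / 100.0


# Hypothetical inputs, for illustration only.
coefs = {"family_happiness": 0.030, "physical_health": 0.020, "area": 0.002}
significant = {"family_happiness": True, "physical_health": True, "area": False}
weights = coefficients_to_weights(coefs, significant)

ratings = {"family_happiness": 80, "physical_health": 70, "area": 50}
index = wellbeing_index(ratings, weights)
```

Because the weights sum to 100 and every rating lies between 0 and 100, the resulting index is a weighted average and cannot leave the 0 to 100 range.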

Looking at column 3 of Table 4, we can see that aspects which are ranked highest relate to health, emotions and family. For example, the weight for family happiness is highest (8.3%), followed by physical health (6.7%) and mental health (6.5%). In fact, 7 out of the top-10 aspects are related to health and family. The other three in the top-10 list are related to emotions (joy—6.1%, and love—5.5%) and general well-being (5.7%).

To place our results in the context of the literature, we compared our results to those in Benjamin et al. (2014) with the aim of determining whether the aspects rated as more important by our respondents were the same ones rated as more important in their survey. Out of 27 aspects included in our aspect list, 22 were also among the aspects included in the analysis of Benjamin et al. (2014). Looking at the relative rankings of these 22 aspects in the two studies, we estimated a Spearman correlation coefficient of 0.55 between the two sets of rankings (see Table 10). This indicates a relatively strong agreement between the two sets of results, perhaps surprisingly so given that the two surveys were in different languages, at different points in time, and of course, in different countries.
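As an illustration of the rank comparison, the following Python sketch computes a Spearman correlation as the Pearson correlation of the two rank vectors, assigning tied values their average rank. The scores are hypothetical and are not the rankings compared in Table 10; the function names are likewise our own.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical importance scores for five shared aspects in two studies.
study_a = [50, 40, 30, 20, 10]
study_b = [40, 50, 30, 10, 20]
rho = spearman(study_a, study_b)
```

Only the ordering of the aspects within each study enters the calculation, which is why the measure remains usable when the two studies report importance on different scales.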

To better understand which categories of wellbeing were most important, we divided the aspects into seven groups: satisfaction, health, emotions, security and economy, meaning, social, and housing. As shown in Table 5, we find that aspects related to health, emotions, and personal and family satisfaction tend to be rated as more important, while social aspects and those related to housing tend to be rated as less important. The average weights of aspects related to health, emotions, and personal and family satisfaction are 6.5, 5.6 and 5.3, respectively. Security and economy have an average weight of 3.9, whereas aspects related to meaning, social situation, and housing receive the lowest average weights (1.9, 1.1 and 0.4, respectively).

Table 5 Aspects divided by category
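The category averages are simple means of the aspect weights within each group. The Python sketch below shows the calculation for four aspects whose weights are quoted in the text; the function name and short aspect labels are our own, and because only a subset of aspects is included, the resulting means are not expected to match Table 5.

```python
from collections import defaultdict


def mean_weight_by_category(weights, category_of):
    """Average the aspect weights within each category."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for aspect, weight in weights.items():
        category = category_of[aspect]
        totals[category] += weight
        counts[category] += 1
    return {c: totals[c] / counts[c] for c in totals}


# Weights (in percent) quoted in the text for four aspects; a partial list only.
weights = {"physical_health": 6.7, "mental_health": 6.5, "joy": 6.1, "love": 5.5}
category_of = {
    "physical_health": "health",
    "mental_health": "health",
    "joy": "emotions",
    "love": "emotions",
}
means = mean_weight_by_category(weights, category_of)
```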

In addition to the baseline results discussed above, we performed several robustness checks. In one robustness check, we test for a ceiling effect by omitting games in which the respondent rated one of the aspects above a threshold (80 or, alternatively, 95). In another robustness check, we include demographics as control variables in the regression. Results in both cases (not shown) are very similar to the baseline estimates. These tests support the stability and validity of our baseline results.

5.4 Heterogeneity by demographics

In our baseline regression, we pool the observations of all the different respondents together, effectively assuming that all respondents have identical preferences. Perhaps more accurately, pooling all the observations together produces parameter estimates that represent averages across the population. In the current section, we estimate separate parameters for different demographic subgroups to explore potential differences in preferences between subgroups of the Israeli population. In each set of results, we identify differences between the subgroups that are statistically significant at the p < 0.05 level.

In Table 6, we show results calculated separately for men and women. For most of the aspects, results are quite similar for the two genders and the estimated coefficients are not statistically different for the two groups. However, there are four aspects on which the preferences of men and women differ to a statistically significant degree: men assign greater weights to the quality of their romantic relationship and to connection to God/spirituality; women assign greater weights to personal safety and security and to physical health.

Table 6 Results by gender

In Table 7, we show results calculated separately for older and younger respondents. The respondents were divided into two groups: below the median age of the sample (ages 18–42) and above the median age (ages 43+). Results are broadly similar for the two groups, though less so than in the gender comparison presented above. We find statistically significant differences in 10 out of the 27 aspects. Specifically, striking differences emerge in the health-related aspects "your physical health" and "the extent to which your health allows you to engage in activities important to you," which received higher weights among older respondents than among younger respondents. In addition, the aspects "the degree of spirituality in your life or your connection with God" and "your ability to raise children" were more important among younger respondents.

Table 7 Results by Age Group

Finally, we estimated parameters for groups stratified by their self-reported level of religiosity. In Israel, segments of the population with different levels of religiosity constitute distinct and important subgroups, with significant differences in terms of employment rates, number of children, geographic location, and even different educational systems. For this sub-analysis, we restrict ourselves to the Jewish population, which constitutes 79% of the overall Israeli population. (The Arab community, which makes up the remaining 21% of the population, is largely Muslim, with small Druze and Christian minorities.) As shown in Table 2, 45% of the population describe themselves as secular, 19% as traditional not-religious, 14% as traditional-religious, 11% as religious and 11% as ultra-orthodox.

For simplicity, we divide the Jewish respondents in our sample into two groups by level of religiosity: “religious” (comprising respondents who self-identified as either “ultra-Orthodox”, “religious”, or “traditional-religious”) and “non-religious” (respondents who self-identified as “secular” or “traditional-non-religious”). A comparison of results for the two groups is shown in Table 8. Among the religious, the aspect of spirituality and religion contributes strongly to the well-being of the individual, whereas among the non-religious the estimated weight on spirituality is negative (7% and −2%, respectively). This is an interesting finding that may indicate a reluctance among secular individuals to have a connection to God, to the point that strengthening their connection to God detracts from their well-being. This result is consistent with the negative effect of the level of spirituality reported by Benjamin et al. (2012). We also found that religious people assign more weight to their ability to raise their children and to the quality of their marital relationships. Another difference lies in the higher weight assigned by non-religious individuals to freedom of choice.

Table 8 Results by level of religiosity

Our results on heterogeneity in the weights given to each aspect are broadly similar to those of Balestra et al. (2018). In particular, like Balestra et al. (2018), we find that women value health and safety more than men and that older people rank health and safety higher, while younger people put more weight on work-life balance. Unlike Balestra et al. (2018), we do not find that men place a higher value on income, which may be due to differences in wording (we ask about “economic situation” while they ask about income) or may simply reflect a difference in preferences among Israelis. In addition, we offer results on many aspects of well-being, such as relationships and spirituality, that are not studied in Balestra et al. (2018).

5.5 Heterogeneity by level of well-being

In addition to examining heterogeneity based on demographic characteristics, we investigated whether respondents who reported a higher baseline level of a particular attribute placed more or less value on changes in that attribute. On the one hand, preferences surrounding these attributes might exhibit decreasing marginal utility, so that those with a higher baseline level of the attribute would place less value on further increases. On the other hand, many of these attributes are endogenous, so that people may currently have a high level of some attribute because they value it more. For example, someone who particularly values health may have invested more in a healthy life-style, resulting in a higher current level of health. In our analysis, this would appear as people with higher levels of health placing a larger value on further increases in their health. (Alternatively, preferences exhibiting increasing marginal utility could also result in those with higher levels of an aspect placing more weight on that aspect. Preferences with increasing marginal utility are, of course, rare in economic theory.)

In order to test for heterogeneity by the baseline level of the attribute, we add to our regression model an interaction term between the change in the attribute in each scenario and the baseline level of that aspect. The coefficient on this interaction term is properly interpreted as the additional weight placed on the aspect associated with a unit increase in the current reported level of that aspect. In this regression model, we also control for the (un-interacted) level of the aspect itself.

Results from the interaction term regression model are shown in Table 9. As shown in the table, the coefficients for a number of the interaction terms are negative, suggesting that these aspects display decreasing marginal utility. The coefficient is negative for 16 out of the 27 aspects, though about half of these coefficients are not statistically significant. The aspects showing the most pronounced decreasing marginal utility are “your satisfaction with the area where you live”, and “the extent to which you have people you can turn to in times of need”. Other aspects with such behavior include being proud of oneself, appreciation by others, freedom, satisfaction with dwelling and with work-life balance. The magnitudes of these effects are substantial. As an example, consider the aspect “the extent to which you have people you can turn to in times of need.” For respondents who give an average rating of this aspect (67.9), the marginal utility of the aspect is 0.014. However, for an individual whose rating is one standard deviation below the mean (i.e. has a rating of 43.1), the marginal utility increases to 0.025, which is 70% higher (see Footnote 8).

Table 9 Regression with interaction terms
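To make the interpretation of the interaction term concrete, the sketch below assumes that the weight placed on a one-unit change in an aspect is linear in the baseline level, i.e. weight = beta + gamma × level, where gamma is the interaction coefficient. Under that assumption, the two rounded figures quoted in the text for “people you can turn to in times of need” pin down an implied beta and gamma. The symbols and function names are ours, and because the inputs are rounded, the implied values are approximate and need not match Table 9.

```python
def marginal_weight(beta, gamma, level):
    """Weight on a one-unit change in an aspect at a given baseline level."""
    return beta + gamma * level


def line_through(level_1, mw_1, level_2, mw_2):
    """Recover (beta, gamma) from two points on the marginal-weight line."""
    gamma = (mw_2 - mw_1) / (level_2 - level_1)
    beta = mw_1 - gamma * level_1
    return beta, gamma


# Rounded figures quoted in the text for this aspect.
mean_level, mw_at_mean = 67.9, 0.014   # respondent with the average rating
low_level, mw_at_low = 43.1, 0.025     # one standard deviation below the mean

beta, gamma = line_through(mean_level, mw_at_mean, low_level, mw_at_low)

# A negative gamma means the weight falls as the baseline level rises,
# which is the pattern of decreasing marginal utility.
is_decreasing = gamma < 0
```

A positive gamma would instead correspond to the pattern discussed next, in which respondents reporting higher levels of an aspect place more weight on further increases.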

At the same time, the coefficients for some of the attributes are significantly positive, suggesting that the level of these aspects may be endogenously determined, with higher levels coming about as a result of some individuals placing more value on acquiring these attributes. Specifically, there are four such aspects: spirituality, quality of marital relationship, ability to raise children and quality of relationships with close relatives (see Footnote 9). Our interpretation that aspects which are valued more by those who rate them higher are endogenous seems to be in line with the nature of these aspects. These aspects are all related to the most personal situation of the individual: marital relationship, relationships with close relatives, spirituality and raising children. In contrast, most aspects with decreasing marginal utility are external to the individual, such as the quality of area and dwelling, freedom and people you can turn to in times of need.

These results differ somewhat from the findings of Balestra et al. (2018), who measure whether current satisfaction with 11 aspects affects people’s preferences regarding these aspects. They find mostly positive associations between the level of satisfaction with a particular aspect and the strength of the preference for that aspect. Unlike the present results, they do not identify any aspects for which the association is negative. The difference in findings between studies based on different methodologies (among other differences) highlights the importance of continuing to explore these questions using a range of different approaches.

6 Discussion and policy implications

6.1 Usefulness for policy makers

The current research aims to help policy makers understand which aspects of well-being are most important to measure, the relative value of those different aspects to individuals, and ultimately, how those aspects can be combined into a single, more comprehensive measure of well-being. The results of our research reinforce the newly emerging perspective that policy makers should focus on a wide range of measures and that monitoring an individual’s income is not the best way to track his or her well-being (Stiglitz et al. 2009). Consistent with other research, we find that respondents rated their economic situation as less important than other aspects of well-being related to emotions such as joy and love.

An important contribution of our paper is the ability to offer an applied multidimensional view on well-being, as opposed to the more common approach of relying on a single well-being question. While the multidimensionality of well-being is not new to the literature, we are the first to offer an applied approach which allows for a consistent estimation of several dimensions of well-being, including their relative weights. Our endeavor is designed to provide policy-makers with an index which could be used in practice to measure the outcomes of government policies. The importance of looking at many aspects becomes clear from the results we present, since many different aspects receive very high weights.

In particular, having multiple aspects allows policymakers to understand the specific effects of a program. Estimating the aspects which comprise the well-being of individuals can help tailor programs to the needs of specific individuals. Our methodology goes beyond an examination of overall well-being: it identifies the specific aspects that matter most to individuals. This can help in identifying the policy interventions with the greatest impact on wellbeing. Furthermore, our approach enables comparisons over time, across sub-populations, and across different contextual settings.

6.2 Implications of heterogeneity results for policy

In addition to applying the methodology of Benjamin et al. (2014) to a new population, our main contribution is the exploration of heterogeneity in preferences over these aspects of well-being. In particular, we document differences in preferences based on gender, age, and level of religiosity within the Israeli population. Naturally, heterogeneity in preferences implies heterogeneity in the weights that should be used to tailor well-being indexes to different population subgroups. In trying to help a particular population subgroup, policy makers will achieve the greatest social benefit by offering services and interventions that improve the aspects of well-being most important to that particular group of people. In addition, population groups that are the targets of specific government interventions are likely to be concerned with very different aspects of well-being than the general population. On the other hand, policy makers aiming to reduce inequality often seek to target services towards less well-off groups. Such comparisons require a common metric to determine which groups have lower levels of well-being and should therefore be the focus of additional government support.

Similar to the discussion of demographic heterogeneity, there are important policy questions raised by our analysis of the difference in preferences based on the current level of well-being. In classical economic theory, the marginal utility of consumption decreases with increasing consumption, implying that providing additional resources to those with low consumption is a more efficient policy choice. When we generalize the analysis to non-economic measures of well-being, we indeed find a decreasing marginal utility for some aspects, including satisfaction with the area in which one lives and the extent to which people have someone they can turn to in times of need. For these aspects of well-being, efficient policies would seek to increase the level of these aspects among individuals who currently report a low level of satisfaction with regard to these aspects.

However, for other aspects, such as relationships with family members, the ability to raise children, the quality of marital relationships and the degree of spirituality, we find what appears to be an increasing marginal utility. Respondents who report a higher level of these aspects place more value on marginal changes in the value of the aspect. Rather than reflecting utility functions that exhibit increasing marginal utility, we suspect that this relationship emerges because the levels of these aspects are endogenously determined and the higher levels of these aspects are partially the result of those individuals placing more value on acquiring these attributes in the first place.

Regardless of the explanation for the increasing marginal utility of some aspects, these results have important and somewhat counter-intuitive policy implications. They imply that policy interventions aimed at increasing these aspects of well-being would actually be more efficient if focused on segments of the population who already report high levels of these aspects. These are the individuals who place the most value on these aspects, while those who report lower levels of satisfaction may do so precisely because they place less value on these particular aspects. It stands to reason that the latter individuals would therefore not benefit as much from efforts to increase these specific aspects of their well-being. In this case, the efficiency-based argument for providing more services to those with higher levels of satisfaction conflicts with the inequality-based argument that more services should be provided to those with lower levels of satisfaction. Perhaps the best overall policy would be to identify members of society with the lowest overall welfare and then to provide them with the services that increase their satisfaction in the areas they most value. Such a policy would require understanding both how to measure people’s overall welfare and also how to measure their marginal utility from improvements in particular aspects of their lives. Developing this understanding is precisely the goal of the present research.

7 Conclusion

We see the current study as an intermediate step towards the ultimate goal of constructing a practical index for individual well-being that will allow policy makers, program operators and their evaluators to directly measure the impact of a social program on the welfare of its participants, as well as to compare the benefit that a program provides to its participants with that provided by similar programs. For example, we envision that this index could be measured among program participants before and after the intervention so as to measure the effect of the program on participants’ overall well-being. In addition, the components of the index could be used to understand the specific effects of a given program and can help tailor programs to the needs of specific individuals.
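As a rough illustration of how such a before-and-after comparison might work, the Python sketch below splits the change in a weighted 0–100 index into per-aspect contributions. The weights, aspect labels and ratings are hypothetical, and the function name is ours; the sketch also sets aside questions of causal attribution, such as the need for a comparison group.

```python
def index_change_by_aspect(before, after, weights):
    """Split the change in a weighted 0-100 index into per-aspect contributions.

    before, after: dicts mapping aspect name -> rating on a 0-100 scale
    weights: dict mapping aspect name -> weight in percent (summing to 100)
    """
    return {a: weights[a] * (after[a] - before[a]) / 100.0 for a in weights}


# Hypothetical weights and ratings for one participant, for illustration only.
weights = {"health": 50.0, "joy": 30.0, "economic_situation": 20.0}
before = {"health": 60, "joy": 70, "economic_situation": 50}
after = {"health": 70, "joy": 70, "economic_situation": 45}

contributions = index_change_by_aspect(before, after, weights)
total_change = sum(contributions.values())
```

The per-aspect contributions sum to the change in the overall index, so the same calculation serves both purposes described above: measuring the overall effect and showing which aspects account for it.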

An index of individual well-being would need to be composed of a relatively short list of components, so that, for example, questions regarding the aspects in the index could be included in questionnaires given to program participants. For this purpose, even the 27 aspects we have focused on in this research may be too many. The challenge of how to further restrict this list of aspects, while not missing important aspects of people’s well-being, remains an important area of further research and a topic that we are actively pursuing.

A further challenge is the problem of overlap between the various aspects. Some of the aspects are similar to each other—for example "the general well-being of you and your family" and "the joy in your life"—and may describe parts of respondents’ well-being that overlap with one another. If so, and if all the aspects enter separately into the total index, then failing to account for this overlap would lead to mismeasurement of the weights. In particular, elements of well-being that contribute to several of the measured aspects would receive too much weight in the index. This is an additional question we plan to address in the next stages of our research.