Towards a measure of desirance and choiceless utility
In what follows, we start from the premise that there are circumstances in which a policymaker wishes to measure which of two alternative policies, if imposed on the population, would maximize the utility of that population. According to the conceptual framework set out in Sect. 2.1.1, the appropriate measure for this task is a desirance. As evidenced by the empirical results set out in Sect. 2.2, using preference measures instead of desirance measures could bias the policymaker’s conclusions. The remainder of the paper considers the practicalities of measuring desirance.
The non-responsibility criterion
To elicit choiceless utility alone, there must be no scope for responsibility utility to enter the utility function. We refer to this as the non-responsibility criterion. To see how difficult it is for a procedure to meet the non-responsibility criterion, consider the following scenario. Imagine we have a survey respondent from whom we wish to elicit a desirance ordering over goods \(x\) and \(y\), and that we ask the respondent the following question: “The State is deciding which of goods \(x\) and \(y\) to impose on the community. Please indicate which good would give you more utility if it were to be imposed on you by the State.” Even though the question asks for a desirance ordering, the answer that is elicited might well be a preference ordering. This will happen if the respondent believes that indicating good \(x\) increases the probability that the State imposes good \(x\); this belief that her response might make her partially responsible for whichever good is provided is a sufficient condition for responsibility utility to influence her response. To elicit a desirance, then, we need an elicitation procedure that the respondent believes cannot influence the likelihood that a given outcome is realized. This is the non-responsibility criterion. [Footnote 9]
Trading off hypothetical bias against bias induced by responsibility utility
Any measure that fulfils the non-responsibility criterion cannot rely on an incentivized procedure, because any incentivized elicitation method, such as a Becker-DeGroot-Marschak mechanism or observed choice, relies on consequential choices. The currently dominant view in welfare economics is that incentive-compatible choice is more precise and less prone to bias than unincentivized ratings as an indicator of welfare (Bernheim and Rangel 2009; Gul and Pesendorfer 2005). Welfare economists often worry that hypothetical bias is too strong to rely on unincentivized procedures when making policy evaluations and recommendations (e.g., Hausman 2012). [Footnote 10] Nevertheless, hypothetical stated preference measures have become influential in informing policy making, especially in domains such as the evaluation of environmental goods (Atkinson et al. 2018).
Whenever there is a need to measure choiceless utility uncontaminated by responsibility utility, the welfare economist faces a dilemma: should she be more worried about the hypothetical bias of unincentivized procedures, or about the bias created by responsibility utility when preferences are used to measure desirances? We suggest that this is a positive question, not a normative one. The welfare economist should seek to minimize bias, regardless of its source. There may exist circumstances in which an unincentivized desirance measure delivers a less biased measure of utility than an incentivized choice; this will happen whenever responsibility utility imposes a bias that is larger than hypothetical bias. Future research is needed to identify the circumstances in which this occurs.
Experienced Subjective Well-Being (SWB)
There is an existing welfare measure that, under specific circumstances, can fulfil the non-responsibility criterion: experienced SWB, e.g. life satisfaction. Recall the definition of choiceless utility: “the utility that we would experience from an outcome if it were imposed on us” (Loomes and Sugden 1982, p. 807). Imagine that an outcome that was exogenously imposed on a population is reliably demonstrated to cause a change in SWB. The observed change in SWB cannot reflect responsibility utility, because an exogenously imposed outcome cannot deliver responsibility utility; it can only be picking up the choiceless utility delivered by the outcome. Some examples of exogenously imposed outcomes that have been valued using SWB measures include the relative costs of inflation and unemployment (Di Tella et al. 2001), the impact of the Chernobyl disaster (Berger 2010), and the negative externality induced by a neighbour’s wage (Luttmer 2005). If policy makers are interested in the net benefits delivered by the imposition of these outcomes on a population, then it follows that SWB data is conceptually more appropriate for this analysis than is choice data. [Footnote 11]
While experienced SWB data delivers information about desirances when comparing situations that differ only in exogenously imposed outcomes, experienced SWB data can also provide data on agentic utility when comparing outcomes that people have chosen themselves. The SWB question “how satisfied are you with your life, all things considered?” invites respondents to step back and consider the story of their life (Baerger and McAdams 1999; Steptoe et al. 2015). It invites respondents to consider the responsibility utility they derive from the choices that have brought them to this point, in addition to considering the choiceless utility they derive from living their lives. Experienced SWB data can deliver measures of the agentic utility derived from chosen outcomes as well as measures of the choiceless utility derived from exogenously imposed outcomes.
A number of practical considerations limit the degree to which experienced SWB can be used as a measure of choiceless utility. First, the goods in question must always be exogenously provided; otherwise experienced SWB might include the responsibility utility engendered by having chosen the good. Second, the goods must already exist; otherwise their impact on experienced SWB cannot be observed. Third, there must be as-good-as-random variation in the level of provision of the goods; otherwise their causal impact on experienced SWB cannot be estimated. Since it is rare that all these conditions are satisfied, an alternative measure of desirances is needed.
Measuring desirances using anticipated Subjective Well-being
A leading candidate for a generalizable measure of desirances is anticipated SWB. Questions of the type used in Choshen-Hillel and Yaniv (2011), such as “Which would make you more satisfied if you were to receive it through random assignment?”, fulfil the non-responsibility criterion and can be used to evaluate any good, even those that do not yet exist. In this section, we show that anticipated SWB measures can be used to measure desirances. Future research will have to quantify the trade-off between the hypothetical bias that unincentivized procedures such as anticipated SWB measures induce and the bias that results from disregarding responsibility utility. Here, we present a simple experiment to illustrate that it is possible to measure preferences and desirances and to compare them to each other using anticipated satisfaction data. The section also illustrates that it is possible to measure responsibility utility and to test whether it explains differences between preferences and desirances.
Background and predictions
The study by Choshen-Hillel and Yaniv (2011) reported in Sect. 2.2.5 elicited desirance by asking participants to rank anticipated satisfaction, but elicited preference by a different method (choice). To deliver an uncontaminated comparison of preference and desirance, the elicitation method must be held constant. Hence, we conducted an experiment that modified the design of Choshen-Hillel and Yaniv (2011) and elicited both preference and desirance for a higher or lower payment to another participant using anticipated satisfaction. Specifically, we randomly assigned participants to either a non-agentic condition, which asks which outcome would make them more satisfied if it were randomly generated, or an agentic condition, which asks which outcome would make them more satisfied if they were to have chosen it. Additionally, we used three survey items to measure each participant’s responsibility utility.
We can apply the predictions presented in Sect. 2.1.4 to this experiment. Prediction 3 suggests that as responsibility utility increases in absolute size, so too will the discrepancy between preference and desirance. Hence, adding responsibility utility to our regressions should render the difference between agentic and non-agentic measures insignificant, suggesting that the discrepancy across agentic and non-agentic procedures is explained by responsibility utility. Prediction 4 suggests that an intervention that grants choice can alter the rank ordering of outcomes relative to a no-choice situation. As such, respondents’ rankings of outcomes should differ across the agentic and non-agentic conditions in our experiment. Prediction 5 suggests that agentic (vs. non-agentic) rankings place more positive weight on attributes that deliver positive responsibility utility. In our experiment, the relevant attribute with positive responsibility utility is “payment to the other participant”, where we assume that a positive payment to the other participant provides positive responsibility utility via warm glow or self-signalling. As such, relative to those in the non-agentic condition, respondents in the agentic condition should favour the pro-social outcome.
Participants
We conducted the experiment through Amazon Mechanical Turk, where it was advertised as a “ten minute survey for academic research” paying $0.35. The informed consent page told respondents that “we are interested in your preferences” but gave no further specifics on the survey content. Given concerns about data quality on Amazon Mechanical Turk (e.g. Chmielewski and Kucker 2020), we employed an instructional manipulation check: text buried within a question instructed respondents to click the fourth option to demonstrate that they had read the question (Oppenheimer et al. 2009). Between March 8th and 11th 2018, we recruited 332 respondents, 203 of whom passed the instructional manipulation check. Of these 203 participants, 103 were randomly assigned to be “participant B”, who would passively receive the payment that is the focus of the current study. The dependent variable was not elicited from those 103 respondents, and they were not aware that they might receive an extra bonus depending on other participants’ answers. The remaining 100 survey respondents are the subjects of our analysis (mean age = 35; 40% female).
Materials
At the close of an unrelated survey, we explained to our 100 subjects that they had been matched with another participant (Participant B). We asked our subjects which of two outcomes would lead them to be more satisfied—if participant B were paid the same as them ($0.35, equal compensation), or if participant B were paid more ($0.50, pro-social compensation). Table 2 displays the relevant questions and answers in the order that subjects encountered them.
Table 2 The phrasing of the questions and answers in both conditions

Our subjects were randomly assigned to either a non-agentic condition (n = 51) to elicit a desirance, or to an agentic condition (n = 49) to elicit an agentic satisfaction ranking and, ultimately, a choice. The non-agentic condition told subjects that the other participant’s compensation would be determined by a random process. The agentic condition told subjects that on a later screen they would determine the level of compensation for the other participant, and asked whether they would be more satisfied if the other participant were paid $0.35 “because of your choice” or $0.50 “because of your choice”. The decision about the level of compensation for the other participant was consequential: we did match other participants with subjects and, if matched to a subject in the agentic condition, paid them in accordance with the choice made by that subject. For those matched to participants in the non-agentic condition, we randomly assigned a bonus of $0.15 to half so that they were paid $0.50 instead of $0.35. Of course, we could not make the desirance elicitation consequential, because doing so would have breached the non-responsibility criterion.
On the screen after subjects had indicated which outcome would make them more satisfied, we asked three questions to measure their responsibility utility. We asked subjects how good they would feel about themselves for paying the other participant $0.35 (virtue35) and how good they would feel about themselves for paying the other participant $0.50 (virtue50), each on a 5-point Likert scale from “bad” to “good”. We also asked subjects how responsible they felt for the other participant’s payment on a 5-point Likert scale from “not responsible at all” to “entirely responsible”. We modelled the responsibility utility of a pro-social payment for each subject as the product of how responsible she felt for the payment (coded 0 to 4) and how much better she would feel about paying the pro-social compensation than the equal compensation (the difference virtue50 − virtue35, ranging from −4 to +4). Hence, our responsibility utility measure is calculated as \(\text{Responsibility utility}_{\text{prosocial}} = \text{How responsible} \times (\text{virtue}_{50} - \text{virtue}_{35})\) and can take on values between −16 and +16.
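The measure is a simple product of two survey responses. A minimal sketch of the calculation (the function name and variable coding below are ours, following the scales just described):

```python
def responsibility_utility_prosocial(how_responsible, virtue50, virtue35):
    """Responsibility utility of the pro-social payment.

    how_responsible: felt responsibility for the other participant's
        payment, coded 0 ("not responsible at all") to 4 ("entirely
        responsible").
    virtue50, virtue35: how good the subject would feel about paying
        $0.50 / $0.35, each coded 0 ("bad") to 4 ("good"), so their
        difference ranges from -4 to +4.
    """
    return how_responsible * (virtue50 - virtue35)

# The measure is bounded: 4 * 4 = +16 at most, 4 * (-4) = -16 at least.
assert responsibility_utility_prosocial(4, 4, 0) == 16
assert responsibility_utility_prosocial(4, 0, 4) == -16
# A subject who feels no responsibility scores zero, however virtuous
# the pro-social payment would feel.
assert responsibility_utility_prosocial(0, 4, 0) == 0
```

Note that the multiplicative form means responsibility utility vanishes whenever felt responsibility is zero, which is exactly the non-responsibility criterion expressed at the level of the individual measure.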
Results
The probability of ranking the pro-social compensation as giving greater satisfaction is 73 percent in the agentic condition and 53 percent in the non-agentic condition (p = 0.034, see Model 1 of Table 3), which suggests that non-agentic and agentic SWB measures produce different rankings. [Footnote 12]
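As a rough cross-check, the difference between the two conditions can be approximated with a two-proportion z-test. The cell counts below (36 of 49 agentic, 27 of 51 non-agentic) are inferred from the reported percentages and condition sizes; the paper's p-value comes from the probit in Model 1 of Table 3, so this is an approximation rather than a replication:

```python
import math

def two_proportion_z_test(k1, n1, k2, n2):
    """Two-sided z-test for the difference of two proportions,
    using the pooled standard error."""
    p1, p2 = k1 / n1, k2 / n2
    p_pool = (k1 + k2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# 36/49 = 73% agentic vs. 27/51 = 53% non-agentic.
z, p = two_proportion_z_test(36, 49, 27, 51)
```

The resulting p-value lands in the same region as the probit result, which is reassuring but expected, since both tests compare the same two proportions.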
Table 3 Results of probit regressions on the likelihood of giving a higher SWB rating to the pro-social payoff

Our manipulation made subjects in the agentic condition responsible for the outcome that the other participant received and made explicit that whichever payment the other respondent received would be “because of your choice”. This manipulation had the intended effect: subjects in the agentic condition gave higher ratings in response to the follow-up question “how responsible do you feel for the other participant’s payment?” than did those in the non-agentic condition (M = 3.18 vs. M = 2.22, t = 4.34, p < 0.001). [Footnote 13] As a result, the responsibility utility of making the pro-social payment was higher in the agentic condition than in the non-agentic condition (M = 3.45 vs. M = 0.98, t = 2.30, p = 0.024).
We also asked the 49 subjects in the agentic condition to choose whether participant B would get the lower or the higher payment. The overwhelming majority of these consequential choices (46 of 49) were in line with subjects’ agentic SWB ratings, in the sense that subjects chose the payment they had ranked higher in answer to the satisfaction question. [Footnote 14] Thirteen respondents had ranked the $0.35 payment to participant B as giving greater satisfaction than the $0.50 payment, and twelve of these chose to pay $0.35. Conversely, thirty-six respondents had ranked the $0.50 payment to participant B as giving greater satisfaction than the $0.35 payment, and thirty-four of these chose to pay $0.50.
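The agreement between satisfaction rankings and subsequent choices can be tallied directly from the counts in the paragraph above:

```python
# 2x2 tally: rows are the payment ranked higher on satisfaction,
# columns are the payment actually chosen. Counts from the text.
ranked_035 = {"chose_035": 12, "chose_050": 1}   # 13 subjects
ranked_050 = {"chose_035": 2,  "chose_050": 34}  # 36 subjects

n = sum(ranked_035.values()) + sum(ranked_050.values())
consistent = ranked_035["chose_035"] + ranked_050["chose_050"]
agreement_rate = consistent / n  # 46 of 49, about 94 percent
```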
Model 2 tests whether responsibility utility explains the agentic/non-agentic discrepancy. It finds that responsibility utility is highly predictive of rating the pro-social payment as giving greater satisfaction (z = 3.85, p < 0.001) and that the inclusion of responsibility utility in the model reduces the effect of the elicitation procedure to non-significance (z = 1.41, p = 0.158; see Model 2 of Table 3). In other words, responsibility utility fully explains the discrepancy across the agentic and non-agentic SWB measures.
Discussion
The results of the experiment illustrate one core message of this paper: people’s rank-orderings of outcomes can differ systematically depending on whether they are passive recipients versus whether they choose the outcome (see Prediction 4 in Sect. 2.1.4). Moreover, the discrepancy in rank-ordering across agentic and non-agentic procedures was fully accounted for by our survey measure of responsibility utility (in line with Prediction 3). Finally, the results are in line with our Prediction 5, that attributes which deliver responsibility utility (in this case, the higher amount others are paid) will be weighted more heavily in preference than in desirance.
The study has some limitations, however. First, respondents whom we had assigned to the non-agentic condition nonetheless reported feeling some agency (see footnote 13). Second, anticipated SWB data might not exclude forms of expressive utility other than responsibility utility, e.g. social desirability bias. For example, subject A might report that participant B receiving a higher payment than subject A will herself receive would make her more satisfied, while secretly resenting participant B’s good fortune. These two limitations suggest that accurately capturing a desirance is difficult, and future work is needed to identify the best measures of desirances.
A third limitation concerns our analysis in Model 2. There was potential for an order effect in our study design. We asked the three questions on responsibility utility directly after the satisfaction questions, and, as a result, some subjects might have answered the “how good would you feel” questions to align with the ranking that they had expressed just a moment before. It is not possible to measure our key variables within a single survey in a way that eliminates the risk of such an order effect. [Footnote 15] However, any potential order effect can have no bearing on our primary dependent variable, the satisfaction rankings that the subjects made, because we elicited these rankings before presenting any other questions. The concern is rather that an order effect may have inflated the correlation between the satisfaction question and the (virtue50 − virtue35) measure. As a result, it is possible that our Model 2 result overstates the degree to which responsibility utility explains differences across the agentic and non-agentic satisfaction rankings.
Can marginal rates of substitution be inferred from happiness data? A reconsideration
A recent stream of research by Benjamin et al. (2012, 2014, 2020) compares the trade-offs implied by choice (marginal rates of substitution) with those implied by anticipated SWB. They find that SWB trade-offs differ from marginal rates of substitution, but do not propose a theory to explain this difference. To identify differences between SWB trade-offs and marginal rates of substitution, the 2012 paper presents a series of hypothetical alternative outcomes and asks “which do you think would give you a happier life as a whole?” and “which do you think you would choose?” (pp. 2087–8). The 2014 paper asks medical students to report SWB based on their anticipated experiences at their top-ranked residencies. The paper then compares these SWB ratings with the students’ incentive-compatible rankings of residencies and identifies differences across the two measures. Benjamin et al. (2020) summarise these papers, writing that the “findings from these two papers suggest that people care about more than just what is measured by standard, single-question survey measures of ‘happiness’ or even ‘life satisfaction’” (2020, p. 4). As such, they argue, policy making should not yet rely on SWB data.
Our framework and the experiment presented in the previous section suggest that responsibility utility provides one potential explanation for the differences Benjamin and co-authors find between anticipated SWB trade-offs and marginal rates of substitution: it may be the case that, for some participants in these studies, responsibility utility was greater in choice than in anticipated SWB.
Whether responsibility utility informs choice more than it informs anticipated SWB is a question for future research. The answer will depend in part on whether the anticipated SWB questions are agentic or not. For example, we do not know whether participants in Benjamin et al. (2012) interpreted the anticipated SWB questions as non-agentic (along the lines of “which do you think would give you a happier life as a whole if it were to happen to you via some exogenous mechanism?”) or as agentic (along the lines of “which do you think would give you a happier life as a whole if you were to have chosen it?”). For a systematic difference to arise between choice and SWB, it would be sufficient that just a subset of participants interpreted the question as non-agentic. Even if most respondents interpreted the question such that responsibility utility was equal in choice and in SWB, when trade-offs are inferred from sample averages, that majority would dilute but not neutralise the systematic discrepancies induced by respondents for whom responsibility utility was stronger in choice than in anticipated SWB.
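The dilution-without-neutralisation point can be made concrete with a stylized calculation; the share of affected respondents and the size of their discrepancy below are invented purely for illustration:

```python
def average_discrepancy(share_affected, discrepancy):
    """Sample-average choice-vs-SWB discrepancy when only a subset of
    respondents derives more responsibility utility from choice than
    from anticipated SWB.

    share_affected: fraction of respondents (0 to 1) for whom
        responsibility utility is stronger in choice.
    discrepancy: size of the discrepancy for those respondents;
        everyone else contributes zero.
    """
    return share_affected * discrepancy + (1 - share_affected) * 0.0

# Illustration: 20% of respondents carry a discrepancy of 1.0 unit.
avg = average_discrepancy(0.2, 1.0)
# The sample average is diluted (smaller than 1.0) but not
# neutralised (still strictly positive).
assert 0 < avg < 1.0
```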
Similarly, in Benjamin et al. (2014), it is not clear how responsible participants would have felt for ending up at the various residencies they were asked about in the SWB question. On the one hand, the survey highlighted to respondents their role in ending up at the residency by reminding them of the choice ranking they had previously submitted and through question wordings that referred to “your chosen residency”, “the programs you ranked”, and “the preference ordering you submitted” (our italics). On the other hand, when asked the SWB question “Thinking about how your life would be if you matriculate into the residency program in [residency you have just indicated you ranked in second position], please answer…”, the student will have known that the only mechanism that could place them at their second-ranked residency is the matching algorithm sending them there. In other words, students were entirely responsible for the ranking of residencies that they submitted, but may not have considered themselves entirely responsible for the scenarios over which they were estimating SWB. This mechanism could potentially explain why attributes that contribute toward responsibility utility (e.g., desirability for significant other) predicted trade-offs in rankings more strongly than trade-offs in anticipated SWB.
Where responsibility utility differs systematically across choice and anticipated SWB, SWB trade-offs should not be interpreted as marginal rates of substitution. However, our framework suggests that SWB trade-offs might be equivalent to marginal rates of substitution when responsibility utility is identical across the two procedures. For example, if participants interpreted anticipated SWB questions as agentic, i.e. to mean something like “which do you think would give you a happier life as a whole if you were to have chosen it?”, our framework would predict trade-offs similar to those implied by the question “which do you think you would choose?”. A second example concerns situations in which responsibility utility is zero. This reasoning is in line with the result in Benjamin et al. (2012) showing that choice data and SWB data did not differ systematically when participants were presented with the alternatives of an apple and an orange, a choice that is very unlikely to deliver any responsibility utility. Future research should investigate the extent to which variations in responsibility utility account for discrepancies across choice and SWB. It remains possible that this research will show that current SWB measures can provide comprehensive welfare measures that correspond to MRSs.
We close by observing that there may exist other sources of discrepancy across choice and SWB, and future research ought to examine how these relate to responsibility utility. For example, Benjamin et al. (2012) show that people are systematically more likely to choose options that provide more money than they are to indicate that this “money option” will lead to higher SWB. Responsibility utility could explain this if people anticipate greater negative responsibility utility from missing out on money than from missing out on higher levels of well-being. Alternatively, people may choose the option with higher monetary payoffs more frequently than they anticipate higher SWB from it for reasons entirely unrelated to responsibility utility, e.g. because tit-for-tat or lay rationalist rules inform decision making more readily than they inform SWB forecasts (e.g., Amir and Ariely 2007; Comerford and Ubel 2013; Hsee et al. 2003, 2015). With the existing data, we cannot distinguish between these explanations. However, one contribution of the current paper is to suggest simple survey questions that measure responsibility utility and so allow a test of whether it explains the observed results.
Implications for policy evaluation
The distinction between preferences and desirances has several implications for the way welfare economists evaluate policies. The first is that variation in responsibility utility must be considered when extrapolating from one choice situation to another. The literature review presented in Sect. 2.2 demonstrates that the preferences revealed by choice data are sensitive to features of the choice environment that alter responsibility utility. When extrapolating from one choice context to another, it is important to account for responsibility utility if choice predictions and welfare estimates are to be accurate. Relatedly, accounting for responsibility utility might expand the applicability of choice-based welfare measures. Choice-based welfare analysis grinds to a halt when confronted with a preference reversal, because the economist then faces contradictory preference orderings (Bernheim and Rangel 2009). Variations in responsibility utility offer a promising explanation for some preference reversals reported in the literature, as demonstrated by our examples in Sect. 2.2. The upshot is that a more coherent ordering in terms of expected utility is likely to be retrieved from revealed preferences if responsibility utility is accounted for.
A second implication is that choice data may not always and everywhere be the gold-standard measure of utility. When the aim is to measure the welfare effects of outcomes imposed on a population of passive recipients, [Footnote 16] choice data is expected to deliver utility measures that are biased to the extent that they are contaminated by responsibility utility. Hence, alternative measures of choiceless utility are needed in situations where responsibility utility should not be included in the welfare evaluation of a policy. We suggest experienced and anticipated SWB as promising candidates for such welfare evaluations and look forward to further work on this important topic.