The diversity in the definitions of risk found in the judgment and decision making literature is matched only by the number of measures that have been proposed so far (Aven 2012). But very often, researchers rely on a so-called weak definition of risk attitude (Pratt 1964), which is based on individuals’ preferences towards the presence versus absence of risk, instantiated by a lottery and its expected value, respectively. In this paper, we aim to show that adopting an alternative definition can have important implications for understanding common patterns of results that have been the target of considerable research, such as the reflection effect (Kahneman and Tversky 1979) or individual and age differences in risky choice (Best and Charness 2015; Mata et al. 2011). More specifically, the goal of the present paper is to argue for a wider use of a strong definition of risk, which focuses on attitudes towards degrees of risk. As discussed in detail below, some researchers have expressed concerns regarding the generalizability of weak risk attitudes to their strong counterparts (see Mather et al. 2012; Schneider & Lopes 1986). Moreover, unlike other definitions of risk attitudes, the behavioral definition that we advocate can be derived from a general set of theoretical principles (Luce 2010a).

In this paper, we begin by providing formal definitions and drawing clear distinctions between weak and strong risk attitudes. We then analyze data from a past experiment that utilizes a strong definition of risk as well as data from a new experiment investigating age differences in risk attitudes. Both studies show that very few individuals express (strong) risk attitudes that are in line with the reflection effect – a mainstay finding in the field of judgment and decision making (Kühberger, 1998). Furthermore, we show that for some lottery-pair types, older adults appear to be more (strong) risk seeking than younger adults, again in contrast to past findings that have adopted the weak definition of risk taking (e.g., Best & Charness 2015). These results demonstrate the limited generalizability of weak risk attitudes as well as yield new insights into phenomena such as the reflection effect in the study of individual and age differences.

Towards the use of strong risk attitudes

Consider the following two options with the same expected value: A) a lottery that yields $1000 with probability .50, otherwise nothing, and B) the sure receipt of $500. According to the weak definition of risk, a risk-averse individual would prefer the sure outcome, whereas a risk-seeking individual would prefer the lottery. A risk-neutral individual would express indifference between the two options. Weak risk attitudes are known to depend on the choice context or framing: In the example given above, in which only potential gains are involved, the vast majority of individuals tend to be risk averse and prefer the sure outcome (e.g., 84 %; Kahneman & Tversky 1979, Problem 11). However, when asked to choose between options involving losses, for example C) a lottery that yields a loss of $1000 with probability .50, otherwise nothing, and D) a sure loss of $500, most individuals manifest risk-seeking preferences by choosing the lottery (e.g., 69 %; Kahneman & Tversky 1979, Problem 12). This difference in risk attitudes, known as the reflection effect, is one of the most prominent phenomena in decision-making research (Kahneman & Tversky 1979; for a review, see Kühberger 1998). Its prominence is such that important lines of research have relied upon a weak definition of risk attitudes and the reflection effect. For example, in the study of age differences in decision making, many researchers have focused on whether the reflection effect increases with age (e.g., Mather et al. 2012; Tymula et al. 2013; for a review and meta-analysis, see Best & Charness, 2015). Overall, results indicate that older adults focus more on maintenance of the status quo and loss avoidance, leading to an increase of the reflection effect (see Best & Charness 2015).

A concerning aspect of weak risk attitudes is that they are defined via a comparison between a lottery and a sure outcome with the same expected value. A large body of work, including classic results such as Allais’ (1953) paradoxes as well as more recent findings such as the so-called “uncertainty effect” (Gneezy et al. 2006), indicates that individuals manifest a disproportionate preference for certain outcomes that is categorically distinct from preferences for risky options. Traditional theoretical accounts such as Prospect Theory attempt to characterize this distinction via their subjective representation of probabilities (see Kahneman & Tversky 1979, p. 282). However, there is still the notion that such a characterization is insufficient, leading to continuous efforts towards the development of theories of choice that explicitly capture the special status given to certain outcomes (e.g., Diecidue et al. 2004). Moreover, there is concern that the contrast between the absence and presence of risk is not a good proxy for the different scenarios a decision maker might face. For example, in empirical research on economics and finance, many scenarios involve risky options with equiprobable outcomes (e.g., stock portfolios; see Levy, 2010), not risky versus riskless options. A categorical distinction between risky and riskless options can be found in several empirical studies: For instance, Schneider and Lopes (1986) found the reflection effect to be extremely irregular when individuals had to choose between pairs of multiple-outcome lotteries (see also Levy, 2010; but see Budescu & Weiss 1987). More recently, Mather et al. (2012) showed that younger and older adults manifested the reflection effect when comparing between lotteries (in line with Budescu and Weiss 1987), but noted that the two age groups only differed in their risk attitudes when one of the options available was a certain outcome.

The concerns associated with a reliance on riskless options suggest that the use of weak risk attitudes should be complemented by another formal definition of risk. In the present work, we propose the use of a strong definition of risk attitude (Rothschild and Stiglitz 1970), which hinges on the comparison of two lotteries with the same expected value (e.g., $220), with one of the lotteries having a larger variance (e.g., A=($300,.50;$140,.50)) than the other (e.g., B=($240,.50;$200,.50)). When facing such options, strong risk-averse individuals would prefer the safer, lower-variance option, whereas strong risk-seeking individuals would prefer the riskier, higher-variance option. Note that weak risk attitudes are concerned with the presence versus absence of risk, whereas strong risk attitudes deal with changes in the degree of risk. To better understand this difference, consider that an individual might want to avoid risk whenever possible (i.e., weak risk-averse preferences), but prefer the riskier option when unable to avoid risk altogether in the hope of obtaining the better outcome (i.e., strong risk-seeking preferences).
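The example pair above can be checked numerically. The following is a minimal Python sketch, assuming lotteries are represented as lists of (outcome, probability) pairs; the helper names `expected_value` and `variance` are illustrative and do not come from the original studies.

```python
def expected_value(lottery):
    """Expected value of a lottery given as (outcome, probability) pairs."""
    return sum(outcome * prob for outcome, prob in lottery)


def variance(lottery):
    """Outcome variance of a lottery, used here as the measure of riskiness."""
    ev = expected_value(lottery)
    return sum(prob * (outcome - ev) ** 2 for outcome, prob in lottery)


# The two example lotteries from the text (amounts in dollars).
lottery_a = [(300, 0.5), (140, 0.5)]  # riskier option
lottery_b = [(240, 0.5), (200, 0.5)]  # safer option

for name, lottery in (("A", lottery_a), ("B", lottery_b)):
    print(name, expected_value(lottery), variance(lottery))
```

Both lotteries have an expected value of $220, but A has the larger variance, so under the strong definition a preference for B indicates risk aversion and a preference for A indicates risk seeking.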

Previous empirical tests of strong risk attitudes have often relied on multiple-outcome lotteries (Baucells and Heukamp 2006; Schneider and Lopes 1986), which can be difficult for participants to understand (and for researchers to interpret accurately). For example, Schneider and Lopes (1986) relied on lotteries with at least 10 outcomes. In our work described below, we rely on 50:50 binary lotteries (e.g., win $100 with probability .50, otherwise $20), avoiding confounds related to individual and age differences in the ability to deal with many pieces of information (Frey et al., 2015).

The present reliance on 50:50 lotteries for testing strong risk attitudes is not arbitrary. In fact, it is in line with the axiomatic work of R. Duncan Luce on p-additive utility representations, which we briefly outline here (for reviews, see Luce 2000; 2010a, 2010b). One fundamental aspect of the utility representations U considered in the last century is their additive nature. Let ⊕ be the operator for joint receipt such that X⊕Y denotes a new outcome in which goods X and Y are received together (e.g., receiving coffee and cake, different birthday gifts). According to an additive utility representation:

$$ U(X\oplus Y) = U(X) + U(Y). $$
(1)

Luce (2000) argued that multiplication could also play a role in the utility representation, similar to the assumed multiplicative relationship between outcomes and probabilities (e.g., when computing expected values or expected utilities). Under the same general assumptions used to derive the additive representation, Luce (2000) showed that this generalization yields a so-called p(olynomial)-additive representation of utility:

$$ U(X\oplus Y) \,=\, U(X) + U(Y) + \delta U(X)U(Y), \text{with} \; \delta \!= \! -1,0,1. $$
(2)

The p-additive representation of utility assumes that there are three kinds of people, corresponding to the value of δ (−1, 0, or 1).Footnote 1 One intuitive way of understanding Luce’s p-additive representation is to consider that the utility of the joint receipt, U(X⊕Y), is given by the utility of each good separately, U(X) and U(Y), but also by their “interaction,” δU(X)U(Y). The value of δ determines whether the utility of the joint receipt is equal to (δ=0), smaller than (δ=−1), or greater than (δ=1) the sum of its parts.Footnote 2
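To make the role of δ concrete, Eq. 2 can be evaluated for each of the three types. This is a minimal Python sketch; the function name `joint_receipt_utility` and the utility values 0.2 and 0.3 are hypothetical and chosen only for illustration (both positive, so that the sign of the interaction term follows the sign of δ).

```python
def joint_receipt_utility(u_x, u_y, delta):
    """p-additive utility of a joint receipt: U(X) + U(Y) + delta * U(X) * U(Y)."""
    if delta not in (-1, 0, 1):
        raise ValueError("delta must be -1, 0, or 1")
    return u_x + u_y + delta * u_x * u_y


# Hypothetical utilities of two goods received together (e.g., coffee and cake).
u_coffee, u_cake = 0.2, 0.3

for delta in (-1, 0, 1):
    print(delta, joint_receipt_utility(u_coffee, u_cake, delta))
```

With positive utilities, δ=−1 yields a joint utility below the plain sum, δ=0 reproduces the additive case of Eq. 1, and δ=1 yields a joint utility above the sum.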

More recently, Luce (2010a, p. 10) showed that a p-additive representation of utility yields predictions for the case of risky options, such as (X,p;Y,1−p). Specifically, under the assumption that the joint receipt of money is additive (e.g., $5⊕$10 = $5 + $10 = $15), the sign of δ for each individual necessarily implies a particular preference pattern in binary 50:50 lotteries with equal expected value but different variances (i.e., it necessarily implies a specific strong risk attitude):

$$ \delta = \left\{\begin{array}{c} -1 \\ 0 \\ 1 \end{array}\right\} \Leftrightarrow \left( X + \xi, .50;\, Y - \xi, .50\right) \left\{\begin{array}{c} \prec \\ \sim \\ \succ \end{array}\right\} \left( X, .50;\, Y, .50\right), $$
(3)

with ξ>0 and X≿Y.

In the context of decisions under risk, the three types of persons defined by the values taken by δ imply risk-averse (δ=−1), risk-neutral (δ=0), and risk-seeking (δ=1) preferences. The logical equivalence between Luce’s p-additive utility representations and preferences regarding 50:50 lotteries establishes a theoretical foundation for risk attitudes, something that is not typically found (Aven 2012). Luce (2010b) argued that the classification of individuals according to this definition of risk attitude is extremely important given that the testing of many important theoretical properties requires individuals to be partitioned into distinct classes according to their respective δ. Note that many of the axioms underlying p-additive utility have been supported empirically (see Luce 2010a, 2010b).

Individual classification of strong risk attitudes

We will now classify individual strong risk attitudes using data previously published by Davis-Stober and Brown (2013) and data from a new experiment that enables a comparison between younger and older adults. The classification procedure used here is in many ways similar to those adopted in previous studies (Davis-Stober and Brown 2011; Hilbig and Moshagen 2014). Let \(P_{R}\) and \(P_{S}\) denote the probability of a preference for the riskier and the safer lottery, respectively, and \(P_{I}\) the probability of indifference being expressed between the two lotteries, with \(0 \leq P_{R}, P_{S}, P_{I} \leq 1\) and \(P_{R} + P_{S} + P_{I} = 1\). Moreover, let response vector \(\mathbf{x} = (x_{R}, x_{S}, x_{I})\) denote the frequencies with which the riskier option or the safer option was chosen (\(x_{R}\) and \(x_{S}\), respectively) or an indifference judgment was made (\(x_{I}\)). Under standard assumptions (e.g., independence between trials, response probabilities are stationary) the likelihood of vector \(\mathbf{x}\) for a given lottery-pair type follows a three-outcome multinomial distribution:

$$ f(\mathbf{x} \mid P) = \frac{(x_{R} + x_{S} + x_{I})!}{x_{R}! \times x_{S}! \times x_{I}!} \times P_{R}^{x_{R}} \times P_{S}^{x_{S}} \times P_{I}^{x_{I}}. $$
(4)
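As an illustration of Eq. 4, the likelihood of a response vector can be computed directly. This is a minimal Python sketch; the function name `multinomial_likelihood` and the example counts and probabilities are hypothetical and not taken from either data set.

```python
from math import factorial


def multinomial_likelihood(x, p):
    """Likelihood of frequencies x = (x_R, x_S, x_I) given probabilities p = (P_R, P_S, P_I)."""
    x_r, x_s, x_i = x
    p_r, p_s, p_i = p
    # Number of distinct response sequences that produce the same frequencies.
    coefficient = factorial(x_r + x_s + x_i) // (
        factorial(x_r) * factorial(x_s) * factorial(x_i)
    )
    return coefficient * p_r ** x_r * p_s ** x_s * p_i ** x_i


# Hypothetical case: 12 trials with 7 risky, 4 safe, and 1 indifferent response.
print(multinomial_likelihood((7, 4, 1), (0.5, 0.4, 0.1)))
```

Summing this function over every possible response vector with the same number of trials gives 1, which is a quick check that the probabilities and the coefficient are specified consistently.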

Now, let \(\mathcal {M}\) denote a model representing the stochastic specification of a given risk attitude. The response probabilities of risk-averse, risk-seeking, and risk-neutral individuals are assumed to follow the constraints below:

$$ \mathcal{M}_{\text{Risk Averse}}: \; P_{R} < P_{S}, $$
(5)
$$ \mathcal{M}_{\text{Risk Neutral}}: \; P_{R} = P_{S}, $$
(6)
$$ \mathcal{M}_{\text{Risk Seeking}}: \; P_{R} > P_{S}. $$
(7)

These simple stochastic specifications focus on whether or not there is a systematic relative preference towards one of the two alternatives. This simple approach has the advantage of sidestepping long-standing problems concerning how algebraic expressions of preference or indifference are mapped onto choice probabilities (e.g., Blavatskyy & Pogrebna 2010; Davis-Stober & Brown 2011) and the possibility of individuals sometimes “switching” between types.Footnote 3

The evidence for the different \(\mathcal {M}\) will be quantified by means of the Normalized Maximum Likelihood (NML), a statistic coming from the Minimum Description Length framework (MDL; Grünwald 2007). According to MDL, models can be seen as codes that compress data according to the observed regularities. The best-performing model is the one that provides the shortest characterization of the data, providing the best tradeoff between descriptive accuracy and parsimony. The NML index for an arbitrary model \(\mathcal {M}\) is given by

$$ \text{NML}_{\mathcal{M}} = \frac{f(x \mid \mathcal{M})}{\sum\nolimits_{y} f(y \mid \mathcal{M})}, $$
(8)

where the numerator is the maximum likelihood of the observed data x given a model \(\mathcal {M}\), and the denominator is the sum of maximum likelihoods (given the same model \(\mathcal {M}\)) for all data y that could be observed within the established experimental design. In other words, the numerator quantifies model \(\mathcal {M}\)’s goodness of fit for the observed data whereas the denominator penalizes the model for its ability to account for data in general. Models with higher NML values are considered to outperform their competitors. Several studies have shown the superiority of NML in comparison to other statistics and its close relation to Bayesian model-selection approaches (see Kellen et al. 2013).
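The NML computation in Eq. 8 can be sketched for the three models in Eqs. 5 to 7. The Python code below is an illustrative sketch rather than the implementation used for the reported analyses. It assumes that the strict inequalities are relaxed to weak ones when maximizing the likelihood, so that a response vector violating a model's constraint is fitted at the boundary \(P_{R} = P_{S}\). All function names and the example counts are hypothetical.

```python
from math import exp, lgamma, log

MODELS = ("averse", "neutral", "seeking")


def log_multinomial(x, p):
    """Log of the multinomial likelihood in Eq. 4 (terms with zero counts are skipped)."""
    out = lgamma(sum(x) + 1) - sum(lgamma(k + 1) for k in x)
    return out + sum(k * log(q) for k, q in zip(x, p) if k > 0)


def constrained_mle(x, model):
    """Maximum-likelihood probabilities (P_R, P_S, P_I) under a model's constraint."""
    x_r, x_s, x_i = x
    n = x_r + x_s + x_i
    free = (x_r / n, x_s / n, x_i / n)      # unconstrained estimate
    pooled = (x_r + x_s) / (2 * n)
    tied = (pooled, pooled, x_i / n)        # estimate on the boundary P_R = P_S
    if model == "neutral":
        return tied
    if model == "averse":                   # assumption: fitted as P_R <= P_S
        return free if x_r <= x_s else tied
    if model == "seeking":                  # assumption: fitted as P_R >= P_S
        return free if x_r >= x_s else tied
    raise ValueError(f"unknown model: {model}")


def max_likelihood(x, model):
    return exp(log_multinomial(x, constrained_mle(x, model)))


def nml(x, model):
    """Maximum likelihood of x divided by the summed maximum likelihoods of all possible y."""
    n = sum(x)
    total = sum(
        max_likelihood((y_r, y_s, n - y_r - y_s), model)
        for y_r in range(n + 1)
        for y_s in range(n + 1 - y_r)
    )
    return max_likelihood(x, model) / total


def classify(x):
    """Return the model with the highest NML together with all NML values."""
    scores = {model: nml(x, model) for model in MODELS}
    return max(scores, key=scores.get), scores


# Hypothetical participant: 10 risky, 1 safe, and 1 indifferent response in 12 trials.
print(classify((10, 1, 1)))
```

Because the neutral model can only mimic data with balanced risky and safe choices, its denominator is smaller, which is why it wins for balanced response vectors even though all three models then fit equally well.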

The classification of risk attitudes will be conducted independently for each individual as well as for each of the different lottery-pair types (e.g., comparison between lotteries involving gains vs. lotteries involving losses). Specifically, for each individual and lottery-pair type, we evaluate which model \(\mathcal {M}\) performs best according to NML.Footnote 4 One advantage of this independent classification is that it sidesteps the need to adopt specific assumptions on how risk attitudes generalize across lotteries and the problems that emerge when such assumptions fail. For instance, Davis-Stober and Brown (2013) attempted to classify individuals across lottery-pair types according to a set of “risk profiles” based on previous empirical research and theory. The choices from about one third of the participants were not well accommodated by any of these risk profiles.

Davis-Stober & Brown’s (2013) study

Twenty-four younger adults evaluated a series of 50:50 lottery pairs with equal expected value (26 unique trials replicated 12 times each) in exchange for a $10 flat fee. At each trial, participants expressed their preference for one of the lotteries in the pair or their indifference towards them. The lottery pairs used can be divided into five types:

  • GG: Two lotteries comprised of gain outcomes

  • LL: Two lotteries comprised of loss outcomes

  • MM: Two mixed lotteries, each comprised of a gain and a loss outcome

  • MG: A mixed (riskier) lottery paired with a gain-only lottery

  • ML: A mixed (riskier) lottery paired with a loss-only lottery

Results

The NML-based classification results reported in Tables 1 and 2 show a clear variability in risk attitudes across all lottery pairs, with considerable proportions of both risk-seeking and risk-averse individuals. Table 2 shows that very few individuals (8 %) actually expressed risk attitudes that were in line with the reflection effect, according to which individuals should be risk averse in GG and risk seeking in LL (4 % manifested the opposite pattern; see Table 2). However, individuals did manifest a risk-attitude reversal at the level of MG and ML, being risk averse in MG and risk seeking in ML. Note that in MG, an individual can choose between a safer lottery that ensures a gain and a riskier lottery that includes the possibility of losses. In contrast, in ML the safer lottery ensures a loss. Individuals seem to avoid the possibility of losses altogether in MG, but take on the risk of a larger loss in ML in order to have the chance to gain something. Together with the large proportion of risk-averse individuals in MM, choices involving mixed lotteries are in line with the notion that individuals are loss averse (“losses loom larger than gains”; Kahneman & Tversky 1979).

Table 1 Percentages (rounded) of participants classified as risk averse, risk neutral, and risk seeking for the different lottery-pair types (%; Davis-Stober & Brown 2013; N=24)
Table 2 Percentages of individual joint risk-attitude classifications (N=24) across selected lottery-pair types (Davis-Stober and Brown 2013)

New experiment

The new experiment enables us to attempt to replicate the risk-attitude patterns found in the study by Davis-Stober and Brown (2013) as well as test whether patterns of age differences that have been documented in previous aging research studies (Best and Charness 2015) extend to the strong definition of risk taking.

Table 3 Percentages (rounded) of participants classified as risk averse, risk neutral, and risk seeking for the different lottery-pair types (Present Experiment; N younger = N older=30)

Subjects, materials, and procedure

Thirty younger adults (median age = 22, range = 18 to 34, twenty-two females) and 30 older adults (median age = 65.5, range = 61 to 78, fifteen females) took part in this study in exchange for an outcome-dependent monetary compensation. Participants first completed a vocabulary task (Lehrl 1999) and a digit-symbol substitution task (Wechsler 1981). These two tasks were conducted in order to enable a comparison between the two age groups in terms of crystallized and fluid cognitive abilities, respectively, as well as to permit a comparison with the participant samples used in other studies (e.g., Frey et al. 2015). The subsequent choice task comprised 300 unique trials (60 per type), diverging from Davis-Stober and Brown’s (2013) use of replicated trials. In each trial, participants were presented with two 50:50 lotteries of equal expected value and were requested to express their preference for one of them or indicate indifference. Participants were told that there were no correct/incorrect responses and that they should express their preferences as accurately as possible. The two options presented at each trial had the same expected value but one was more risky (had a larger variance) than the other. The five different lottery-pair types (with 60 trials each) were randomly intermixed.Footnote 5

As depicted in Fig. 1, two lotteries were shown side by side in each trial, with the value 50 % presented together with pictures of the two sides of a coin in order to emphasize the fact that the events in each of the lotteries were equiprobable. This visual format is virtually identical to the one used by Davis-Stober and Brown (2013). Responses were given by selecting the desired response option (“Left”, “Right”, or “Indifferent”) using the mousepad and then confirming it (by pressing the “OK” button). The instructions indicated that all values in the lotteries corresponded to Swiss Francs (CHF), and that at the end of the experiment, one of the trials would be selected at random and the preferred lottery would be played out (or a random lottery in the case of an indifference judgment). Participants were also informed that a fraction of the selected lottery’s occurring outcome (up to CHF 7.50) would be added/subtracted to/from their CHF 20 payment.

Fig. 1 Depiction of a Choice Trial. The numbers correspond to the possible payoffs (in Swiss Francs) associated with each 50:50 lottery

Results

Younger and older adults did not differ in terms of their performance in the vocabulary test (Med. = 85 % and 89 % correct, respectively; Wilcoxon W=330, p=.11) but younger adults performed better in the digit-symbol task (Med. = 69 % and 45 %; Wilcoxon W=733.50, p<.001). These results are in line with previous studies comparing similar age groups (e.g., Frey et al. 2015).

The overall risk-attitude classifications (see Tables 3 and 4) are in line with the results obtained with Davis-Stober and Brown’s (2013) data: In terms of the reflection effect (in GG and LL), only 3 % of the individuals in both age groups manifested such a pattern (10 % and 7 % of the younger and older adults manifested the opposite pattern). Younger adults manifested a risk-attitude reversal between MG and ML (57 % are risk averse in the former and risk seeking in the latter), but the same pattern was not found for older adults (only 13 %), a difference that was found to be statistically significant (ΔG²(1) = 13.08, p<.001). Crucially, in terms of age differences, older adults tended to be more risk seeking than younger adults in all choice contexts other than ML (see Table 3), a difference that was deemed statistically significant when restricting the proportion of risk-seeking relative to risk-averse individuals to be equal across age groups (ΔG²(5) = 18.42, p=.002).Footnote 6 In the specific case of MM and MG, the smaller relative proportion of risk-averse individuals among older adults (see Table 3) also suggests that they are less loss averse.
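The age difference in the MG/ML reversal can be illustrated with a likelihood-ratio (G²) statistic for a 2×2 table. The Python sketch below rests on an assumption: the counts 17 of 30 younger adults and 4 of 30 older adults are inferred from the rounded percentages (57 % and 13 %) rather than taken from the raw data, and the reported test may have been specified differently. The function names are illustrative.

```python
from math import erfc, log, sqrt


def g_squared(table):
    """Likelihood-ratio statistic G^2 for independence in a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    g2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # empty cells contribute nothing
                expected = row_totals[i] * col_totals[j] / n
                g2 += 2 * observed * log(observed / expected)
    return g2


def chi2_sf_df1(value):
    """Upper-tail probability of a chi-square distribution with 1 degree of freedom."""
    return erfc(sqrt(value / 2))


# Rows: younger, older adults. Columns: reversal pattern shown, not shown.
# Counts are assumed from the rounded percentages (57 % and 13 % of N = 30).
table = [[17, 13], [4, 26]]
g2 = g_squared(table)
print(round(g2, 2), chi2_sf_df1(g2))
```

Under these assumed counts, the statistic comes out close to the reported value and the associated p-value falls below .001.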

Table 4 Percentages (rounded) of joint risk-attitude classifications across selected lottery-pair types (Present Experiment; N younger = N older=30)

Discussion

Empirical evaluations of risk attitudes are typically based on a weak definition of risk. The reliance on such a definition can be seen as limited in scope because it involves only the comparison between riskless and risky options. Consequently, we asked whether past results generalize to choices between two options that involve some degree of risk. In particular, we focused on whether the reflection effect (Kahneman and Tversky 1979) and older adults’ reduced risk taking relative to younger adults (e.g., Best & Charness 2015) generalize to such situations involving two risky options with binary outcomes. Among several definitions of risk attitudes (Aven 2012), we focused on one for which there is a well-established theoretical foundation (Luce 2000; 2010a, 2010b).

The present results show that only a minority of participants, young and older adults alike, expressed a preference pattern that is consistent with the reflection effect (risk aversion in GG and risk seeking in LL). These results corroborate previous work showing that the reflection effect is less general than typically thought (Schneider & Lopes, 1986). One important feature of Davis-Stober and Brown’s (2013) study and the novel experiment we report is the reliance on binary 50:50 lotteries, which minimizes the impact of task complexity and subjective probability representations (see Quiggin 1982, p. 328). Also, all outcomes were monetary and the notion of riskiness (outcome variance) was transparent. These features address some of the concerns with previous work involving complex lotteries (e.g., Schneider & Lopes 1986; see also Kühberger 1998). However, our findings are somewhat at odds with the results reported by Budescu and Weiss (1987), who found a systematic risk-attitude reversal when comparing choices between gain-only lotteries and choices between their loss-only counterparts. The discrepancy between their study and ours could be attributed to the fact that they considered lotteries with different probability levels, which requires some integration of the subjective representations of outcomes and probabilities (unlike our studies, which attempted to minimize the role of probabilities). Given that risk attitudes can also be captured at the level of the subjective probability representations (Schmidt and Zank 2005), it is possible that discrepant results can emerge from a differential reliance on the latter.

Regarding age differences, our results suggest that a greater proportion of older adults are strong risk seeking relative to young adults in at least some choice contexts. The discrepancy between the present findings and older adults’ risk aversion typically observed when relying upon a weak definition (Best and Charness 2015) confirms concerns about relying solely on the latter to provide a complete picture of how both situational and individual factors affect risk taking. Crucially, the result suggests that a comprehensive view of the life span development of risk-taking propensity must consider different conceptions of risk aversion. There are a number of possible explanations for the pattern of age differences found that we cannot completely rule out at this point. First, the age differences found are in line with one motivational theory of aging, Socioemotional Selectivity Theory (e.g., Carstensen et al. 1999), which suggests that older adults tend to place a greater weight on the best possible outcomes within a given environment, outcomes which in the present design were always associated with the riskier lottery. Future work is needed to test this account against other motivational theories (see Depping & Freund 2011). Second, another possibility is that age differences were driven by a “peanuts effect” (Prelec and Loewenstein 1991), according to which a greater relative prevalence of risk-seeking preferences among older adults is expected given that they are typically wealthier than younger adults (especially college students), making the payoffs more consequential to the latter group than to the former. This explanation seems implausible given previous failures to find a link between income/wealth and risk-taking propensity or age differences thereof (see Josef et al. 2016). Moreover, it is unclear how such wealth effects could simultaneously lead to increases in risk aversion when riskless options are involved (Best and Charness 2015). Third, other factors such as cognitive ability and numerical competency could underlie some of the individual and age differences observed (e.g., Peters, 2012). We note, however, that our individual measures of fluid and crystallized cognitive abilities did not show a reliable association with any of our individual classifications.Footnote 7 All in all, our results suggest that future work will need to uncover the exact mechanisms and boundary conditions that are involved in tasks relying on the two definitions of risk, as well as their respective predictors (e.g., cognitive, motivational, wealth-related).

The present work focused on a formal definition of risk attitude that can be derived from basic axiomatic principles (Luce 2010a, 2010b), an approach that contrasts with typical operationalizations of risk attitudes by means of a pre-specified parametric model (e.g., some iteration of Prospect Theory). However, the present results yield relevant implications regarding the use of such parametric models. We mention a couple of them in decreasing order of generality. First, the considerable heterogeneity found in the data reinforces the notion that data aggregation is inappropriate (see Luce 2010a). It also suggests that established methods for modeling individual differences in decision-making under risk should focus on the presence of qualitatively distinct classes of persons rather than simply assuming that all individual differences consist of some smooth variation around a central tendency (for a discussion, see Bartlema et al. 2014). Second, non-negligible proportions of individuals were strong risk averse or neutral in LL, implying a concave or linear utility function for losses under Prospect Theory. Previous work has argued that utility functions for losses are convex, but analyses of individual-level data have provided a somewhat mixed picture that warrants further scrutiny (for similar results, see Abdellaoui et al. 2008).