1 Introduction

Despite its notorious lack of empirical validity, expected utility (EU) theory continues to prevail as the standard normative model of economic rationality in theoretical and applied risk and decision analyses. The significance of the model largely rests on the independence axiom of EU theory, which ensures dynamically consistent choices in multiple-stage decision making under risk. Dynamic consistency of risky choice, in turn, is a basic criterion of economic rationality in the sense that it requires an individual's choices at later stages of a decision process to conform with this individual's preferred course of action planned at earlier stages (Karni and Schmeidler 1991; for a related concept of dynamically consistent behaviour, see Hammond 1998; Hammond and Zank 2014). In recent decision research, dynamic choice theory has indeed been extended beyond EU maximisation from various perspectives, especially those of risk, uncertainty and ambiguity (Machina 1989; Sarin and Wakker 1998; Siniscalchi 2011; Nebout 2014). But the extensions have remained essentially restricted to descriptive models of decision making in the sense that they violate, or “sacrifice”, other normative principles of rational choice where necessary to preserve dynamic consistency.

The present analysis goes one critical step beyond the descriptive intent of non-EU models. It refutes the normative interpretation of EU theory. To serve as a universally valid principle of rational decision making, EU maximisation would have to take implications of static choice into account that arise in comparisons between compound lotteries, but are usually ignored in normative interpretations of EU theory. Preferences for compound lotteries will be shown not only to violate the independence axiom under well-defined conditions, but also to exclude any possibility of postulating independence consistently in models of static decision making under risk. Theoretically, this inconsistency arises from conditional preferences induced by alternative mixtures of lotteries with one common mixture component and identical mixture weights. The independence axiom is specifically concerned with convex combinations of lotteries of this type. In practice, the inconsistency occurs in assessments of one risk in the presence of others, commonly called “background risks”. An example will be given in terms of conditional preferences for compound lotteries in which the component lotteries (or “sublotteries”) are resolved at uncertain times.

In more technical terms, the scope of the present research note can be delineated as follows. We consider preference comparisons between two mixtures of lotteries p and r, and q and r, respectively. The mixture weights of p and q are the same and, hence, so are those of the common component r. Contrary to what the independence axiom of EU theory proposes, we admit choices between the two compound lotteries to depend not only on the component lotteries, but also on the particular nature and numerical values of the mixture weights. A typical example of the impact of the mixture weights on preference arises where the mixture weight assigned to p (or q) is the probability with which p (or q) is resolved first (i.e., prior to the resolution of r) or else persists and is rejected with the complementary probability in the alternative case of premature resolution of r. In this example, the compound lotteries are, by construction, one-stage and, hence, give rise to static choices, but involve uncertainty of the timing of risk resolution. Realistic examples are familiar from applied statistics (multivariate survival-time analysis) and static investment decision making with uncertain time horizon. Assume now that the decision maker first ranks p and q in preference and then makes his choice between the two compound lotteries. Then, the presence of the common component r in either mixture induces a reference risk so that the decision maker is not, in effect, concerned with the advantage of p over q when considering the impact of p and q on his choice. Rather, he must assess the advantage of p in the presence of r over q in the presence of r, for independently of his choice, he confronts r in case p (or q) fails to be resolved first. 
The conditionalisation of risk thus entails a conditionalisation of preference, which we will show to be an order-preserving isomorphism: the conditional preferences satisfy the independence axiom exactly if the unconditional preferences from which they are derived do. Yet it turns out that, given a few obvious properties of the induced preferences, the latter necessarily violate the independence axiom if they satisfy stochastic dominance preference. Stochastic dominance preference, for its part, is widely viewed as an indispensable normative requirement of rational risky choice weaker than independence.

To demarcate the present analysis from previous approaches to conditional risk preferences available in the literature, a few remarks are in order. Although we will represent lotteries as probability distributions (or, equivalently, random variables associated with them), our concept of conditional risk—unlike that used by Pratt (1988), for instance—is different from conditional probability, nor are the conditional preferences we consider of the kind which arise in multi-attribute preference analyses (Keeney and Raiffa 1976; Geiger 2012). Other than in many previous approaches to the subject, background risk is not treated as an exogenous, non-tradable risk (e.g., Franke et al. 2006; Malevergne and Rey 2010) but as one endogenously induced by convex combination of the lotteries to be compared. Likewise, we may ignore preferences induced in choices involving “temporal risk” and “temporal preference”, that is, sequential risk and decision problems in which preference depends on when uncertainty resolves (for review and references, see Machina 1984). There is no interference with the normative view of the independence axiom in the temporal risk case since dynamic consistency, as a criterion of EU preferences, deals with “atemporal” risk preferences (Kreps and Porteus 1979; Machina 1984; Karni and Schmeidler 1991). This non-interference is most obvious in representations of EU preferences as special, atemporal boundary cases of temporal risk preference (Kreps and Porteus 1979; Machina 1984) arising under conditions which also ensure the dynamic consistency of choice (Kreps and Porteus 1978). Indeed, the present analysis, too, applies to preferences depending on when uncertainty resolves, but choices are static, while risks are resolved at uncertain times. 
As in our approach, convex mixtures of lotteries have previously been modelled as probability-weighted averages of bivariate lotteries (uncertainty of resolution time and monetary outcome) and applied in static-decision analyses (for examples and review of the literature, see Martellini and Urošević 2006). It seems, however, that their normative implications for EU theory have not been examined so far.

After sketching out the conceptual framework of EU theory and some of its most basic normative implications in the next section, we develop formal representations of background risk and conditional preference in Sects. 3 and 4. Section 5 presents an application of the formalism. In Sect. 6, the invalidity of the independence axiom as a normative principle of static, rational risky choice is proved. Section 7 concludes, with a few remarks concerning the meaning and significance of the results obtained and their possible extensions to more broadly circumscribed domains of rational risky choice.

2 Normative implications of EU theory

Let P be the convex set of simple probability distributions, or lotteries, defined on a compact real interval I of lottery outcomes. A degenerate lottery which gives x, x ∈ I, with certainty is denoted by \( \hat{x} \), \( \hat{x} \) ∈ P. Lottery outcomes are evaluated as gains (x > x0) or losses (x < x0) with reference to some neutral point, or “aspiration level” x0, x0 ∈ I (Kahneman and Tversky 1979; Diecidue and van de Ven 2008), which is normalised to zero without loss of generality (x0 = 0, 0 ∈ I). A preference relation ≿ on P exists which satisfies the familiar axioms of weak order and continuity. If, in addition, the independence axiom is postulated,

$$ p\, {\mathbf{ \succsim }}\,q \Rightarrow \alpha p + \, (1 - \alpha )r\,{\mathbf{ \succsim }}\,\alpha q + \, (1 - \alpha )r,\quad 0 < \alpha \le 1,\; p \in P,\;q \in P,\;r \in P $$
(1)

the more restricted version of independence (replacement of ≿ by strict preference ≻), the indifference version of (1) (replacement of ≿ by ~) and, eventually, EU theory follow (e.g., Hammond 1998). EU theory implies the principle of stochastic dominance preference. This principle states that p ≻ q if p has first-order stochastic dominance over q, that is, if p ≠ q and Fp(x) ≤ Fq(x), x ∈ I, where Fp and Fq denote cumulative distribution functions. As a postulate weaker than the independence axiom, stochastic dominance preference is a fundamental normative requirement of rational choice, but as such is compatible with violation of the independence axiom (Tversky and Kahneman 1986, p. S253; Machina 1989, pp. 1634–1635).
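The dominance criterion just stated can be checked mechanically for simple lotteries. The following is a minimal sketch, assuming lotteries are represented as Python dictionaries mapping outcomes to probabilities; the names `cdf`, `dominates` and `mix` and the numerical lotteries are illustrative assumptions, not part of the original analysis.

```python
def cdf(lottery, x):
    """Cumulative probability F(x) of a simple lottery {outcome: probability}."""
    return sum(prob for outcome, prob in lottery.items() if outcome <= x)


def dominates(p, q, tol=1e-12):
    """True if p first-order stochastically dominates q:
    F_p(x) <= F_q(x) at every outcome, with strict inequality somewhere (so p != q)."""
    support = sorted(set(p) | set(q))
    weakly = all(cdf(p, x) <= cdf(q, x) + tol for x in support)
    strictly = any(cdf(p, x) < cdf(q, x) - tol for x in support)
    return weakly and strictly


def mix(alpha, p, r):
    """Convex combination alpha*p + (1 - alpha)*r as a single-stage lottery."""
    support = set(p) | set(r)
    return {x: alpha * p.get(x, 0.0) + (1.0 - alpha) * r.get(x, 0.0) for x in support}


# Hypothetical lotteries: p shifts probability mass from the loss -5 to 0.
p = {0: 0.5, 10: 0.5}
q = {-5: 0.5, 10: 0.5}
r = {-2: 0.3, 4: 0.7}

print(dominates(p, q))                            # dominance of p over q
print(dominates(mix(0.4, p, r), mix(0.4, q, r)))  # dominance between the two mixtures
```

Because the cumulative distribution function of a mixture is the same convex combination of the component distribution functions, dominance of p over q carries over to the mixtures with a common component r, which is why the second check agrees with the first.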

The convex combination αp + (1 − α)r entering (1) is often viewed as the single-stage representation of a two-stage lottery which, in the first stage, gives an α : (1 − α) chance of receiving a ticket for the second-stage lottery p or r. When viewed as a single-stage lottery in P, however, αp + (1 − α)r gives the outcome x, x ∈ I, with probability αp(x) + (1 − α)r(x). Choices between multiple-stage lotteries can be dynamic or static depending on whether or not they involve contingent decisions to be taken between particular lottery stages. Dynamic consistency of preferences implies that, in a choice between the two-stage lotteries corresponding to αp + (1 − α)r and αq + (1 − α)r, the preferred two-stage lottery gives the preferred second-stage lottery p or q, respectively, when the initial α-lottery gets resolved, with r being rejected. As a normative principle of rational choice, the independence axiom rests on the result that, under a few reasonable restrictions, independence is equivalent to dynamic consistency (Karni and Schmeidler 1991; Hammond 1998). One such reasonable restriction is reduction of compound lotteries, which will be addressed in the concluding section. It means that decision makers are indifferent between multiple-stage lotteries and their single-stage, or “reduced”, representations in P (Segal 1990, 1992).

3 Background risk in compound lotteries

The independence axiom also raises a static-choice problem, which has been largely ignored in normative interpretations of EU theory, however. To show that this neglect leaves normative interpretations of the independence axiom inconsistent, we first introduce the concept of background risk into the analysis of (1). A lottery with random outcome variable X2 is usually called the background risk of another lottery with random outcome variable X1 if the decision maker faces the risk of X2 while assessing the risk of X1. Typical examples are additive and multiplicative risk-background interactions involving outcome variables X and functions χ of the form X = χ(X1, X2) = X1 + X2 and χ(X1, X2) = X1X2, respectively (Pratt 1988; Tsetlin and Winkler 2005; Franke et al. 2006; Malevergne and Rey 2010). Here, we admit more generally defined functions χ(X1, X2) as well as functions of more than two random variables. Comparison of X and X′, where X′ = χ(X1′, X2) for some X1′, induces a preference ranking between X1 and X1′ conditional on the background risk X2 which can be expressed symbolically as X1 \( \succsim_{\chi ,X_{2}} \) X1′,

$$ X_{1}\, {\mathbf{ \succsim }}_{\chi ,X_{2}}\, X_{1}^{{\prime }} \Leftrightarrow \chi \left( {X_{1} ,X_{2} } \right)\,{\mathbf{ \succsim }}\,\chi \left( {X_{1}^{{\prime }} ,X_{2} } \right) $$
(2)

and similarly for the more restrictive relations \( \succ_{{\chi ,X_{2} }} \) and \( \sim_{{\chi ,{X_{2}} }} \). We refer to the random variable X = χ(X1, X2) as X1 in the presence of X2, or, briefly, X1 given X2. In the literature, \( \succsim_{\chi ,X_{2}} \) has almost invariably been treated as an EU preference relation, excluding violation of independence from the outset (for rare exceptions, see Quiggin 2003; Geiger 2008). However, the equivalence (2) suggests that whether \( \succsim_{\chi ,X_{2}} \) has, or has not, the EU property depends, besides on ≿, on χ and X2: if χ and X2 can be shown to exist so that \( \succsim_{\chi ,X_{2}} \) violates the independence axiom, (2) excludes ≿ as an EU preference relation. This is what our results obtained below demonstrate (see remark following Proposition 4 below).

Background risks do not only exist where decision makers face one risk while assessing others. They also arise in weaker cases where the decision maker, while assessing some risks, confronts another risk, but only with some finite probability. Examples of this broader notion of background risk can be found where the timing of the resolution of risk is uncertain, for instance, in the management of risky investments with uncertain time horizon or reliability engineering (Sect. 5). We begin to analyse this situation by developing a suitable conceptual framework for risk assessment and choice under constraints that are themselves uncertain.

4 Formal representation of background risk

Where not otherwise stated, we denote the probability distributions of X1, X1′ and X2 by p, q and r, respectively, throughout this analysis. The ranges of X1, X1′ and X2 are thus subsets of I. Without further explicit reference, we anticipate that, by construction, all multivariate, composite random functions χ, ζ, … considered below, which will typically be of the form ζ(χ(X1, X2), …), have ranges in I and, therefore, probability distributions in P. We are especially concerned with functions of X1, X1′ and X2 and other variables giving rise to joint probability distributions of the form αp + (1 − α)r. We aim to establish how the comparison αp + (1 − α)r ≿ αq + (1 − α)r translates into the preference between X1 and X1′ given X2. This task amounts to determining the probability distribution of χ(X1, X2), which is derived, but different, from αp + (1 − α)r. More precisely, it is derived from the joint probability distribution of X1, X2 and another real random variable, or, more generally, a random vector T which is supposed to range over a finite subset S of an m-dimensional real interval IT, m ≥ 1. The joint probability distribution of T has finite support S, S ⊂ IT. For every partition of S into subsets S1 and S2 and given T, there clearly exists some α, 0 ≤ α ≤ 1, so that the overall joint probability of t, T = t and t ∈ S1, is α, and analogously for T, S2 and 1 − α. T being constructed in this way shows that the α-lottery can be associated with any suitably distributed T. The latter can be any quantitative constraint on risk measurement, assessment and comparison specifying the impact of a given background risk on risk preference. An example of a vector T of random time variables with m = 2 will be presented in the next section; it is clearly distinct from “temporal” risk in dynamic risky choice problems, however.

Let f(x1, x2, t) be the joint probability distribution of X1, X2 and T with the marginal distributions f\( _{{X_{i} }} \)(xi), fT(t), etc. in the usual notation, and assume a partition of S into subsets S1 and S2 as mentioned above. Consider the random variable Y = ψ(X1, X2, T) with probability distribution g(y),

$$ \psi (X_{1} ,X_{2} ,\varvec{T}) = \left\{ {\begin{array}{*{20}c} {X_{1} \quad {\text{if}}\;\varvec{T} = \varvec{t},\;\varvec{t} \in \varvec{S}_{1} } \\ {X_{2} \quad {\text{if}}\;\varvec{T} = \varvec{t},\;\varvec{t} \in \varvec{S}_{2} } \\ \end{array} } \right. $$
(3)
$$ g\left( y \right) = \sum_{{x_{1} ,x_{2},\varvec{t}}} [f(x_{1} ,x_{2} ,\varvec{t})]_{\psi (x_{1},x_{2},{\varvec{t}}) = y} $$
(4)

where “\( \sum_{{x_{1} ,x_{2} ,{\varvec{t}}}} \left[ \ldots \right]_{{\psi (x_{1} ,x_{2} ,{\varvec{t}}) = y}} \)” means summation over all outcome values of X1, X2 and T satisfying ψ(x1, x2, t) = y for given y. In particular, given ψ(x1, x2, t) = y, it follows that g(y) = f\( _{{X_{i} }} \)(xi) from Eq. (3) for t ∈ Si, i = 1, 2, so that Eq. (4) simplifies to g(y) = f(x1, x2, t). Putting p = f\( _{{X_{1} }} \), r = f\( _{{X_{2} }} \), and α = \( \sum\nolimits _{{\varvec{t}} \in {{{\varvec{S}}_{1} }}} \)fT(t), one trivially has,

Proposition 1

For i = 1, 2, assume stochastic independence of Xi and T. Then,

$$ f_{{X_{1} X_{2} }} \left( {x_{1} ,x_{2} } \right) = \sum_{{\varvec{t} \in \varvec{S}}} f\left( {x_{1} ,x_{2} ,\varvec{t}} \right) = \alpha p(x_{1} ) + (1 - \alpha )r(x_{2} ),\quad 0 \le \alpha \le 1 $$
(5)

Since equal random variables have the same probability distribution, one has Y = X1 and g(y) = p(y) with probability α, and Y = X2 and g(y) = r(y) with probability 1 − α. Thus, Eq. (3) and Proposition 1 imply g = αp + (1 − α)r. Note that this construction of Y as a random variable with a compound probability distribution makes no reference to the notion of sequential lottery.
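The identity g = αp + (1 − α)r can be verified numerically by building Y directly from Eqs. (3) and (4). The sketch below assumes, for simplicity, a scalar T (m = 1) and mutual independence of X1, X2 and T, which is stronger than the condition of Proposition 1; the function name `distribution_of_Y` and all numerical values are hypothetical.

```python
from itertools import product


def distribution_of_Y(p, r, f_T, S1):
    """Eq. (4): sum the joint probabilities f(x1, x2, t) over all triples
    with psi(x1, x2, t) = y, where psi selects X1 on S1 and X2 otherwise (Eq. (3)).
    Assumption of this sketch: X1, X2 and T are mutually independent,
    so f(x1, x2, t) = p(x1) * r(x2) * f_T(t)."""
    g = {}
    for (x1, p1), (x2, p2), (t, pt) in product(p.items(), r.items(), f_T.items()):
        y = x1 if t in S1 else x2
        g[y] = g.get(y, 0.0) + p1 * p2 * pt
    return g


# Hypothetical lotteries and timing distribution.
p = {-1: 0.2, 0: 0.3, 2: 0.5}
r = {0: 0.4, 3: 0.6}
f_T = {1: 0.10, 2: 0.25, 3: 0.65}
S1 = {1, 2}  # values of T for which X1 is obtained

alpha = sum(prob for t, prob in f_T.items() if t in S1)
g = distribution_of_Y(p, r, f_T, S1)
mixture = {x: alpha * p.get(x, 0.0) + (1.0 - alpha) * r.get(x, 0.0)
           for x in set(p) | set(r)}

for x in sorted(mixture):
    print(x, round(g[x], 6), round(mixture[x], 6))
```

The two columns printed for each outcome agree, and no sequential (two-stage) structure is used anywhere in the construction.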

The probability distribution of X1 in the presence of X2 can now be computed on the basis of Eq. (5). Similarly to the definition of Y, the dependence of X on the α-lottery can generally be expressed as a random function ζ(X, T) determined differently for T-values in S1 and S2, respectively. To define ζ(X, T) in a fashion similar to Eq. (3) means to represent the probability distribution of X1 given X2 as a convex mixture of lotteries. As such, it is in P if the mixture components are. This is a requirement necessary for ζ and χ to have finite ranges in I, and for their probability distributions to be in P, since the preference relations concerned are defined on P. Now, let ζ1 and ζ2 be real functions so that

$$ \zeta (X,\varvec{T}) = \left\{ {\begin{array}{*{20}c} {\zeta_{1} (X)\quad {\text{if}}\;\varvec{T} = \varvec{t},\;\varvec{t} \in \varvec{S}_{2} } \\ {\zeta_{2} (X)\quad {\text{if}}\;\varvec{T} = \varvec{t},\;\varvec{t} \in \varvec{S}_{1} } \\ \end{array} } \right. $$
(6)

Let z(ζ1(x), ζ2(x), t) be the probability distribution of ζ(X, T). The marginal distribution zζ1ζ2(ζ1(x), ζ2(x)) is the probability function of X = χ(X1, X2) to be determined. Since Xi and T are independent, so are ζi(χ(X1, X2)) and T (as for this stochastic-independence property of composite definitions of functions of random vectors, see, e.g., Pfeiffer (1990), esp. p. 251 and Theorem 11.3.3). Equations (3) to (6) imply zT(t) = fT(t) so that

$$ z_{{\zeta_{1} \zeta_{2} }} \left( {\zeta_{1} \left( x \right),\zeta_{2} \left( x \right)} \right) = (1 - \alpha )z_{\zeta_{1} }(\zeta_{1} (x)) + \alpha z_{\zeta_{2}} (\zeta_{2} (x)) $$
(7)

Equation (7) follows immediately from Eq. (6) in a fashion similar to the derivation of Eq. (5) from Eq. (3), with the replacement of the stochastic independence condition for Xi and T by that of ζi(χ(X1, X2)) and T.

The dependence of z\( _{\zeta_{1} \zeta_{2}} \) on p and r needs to be determined next. Put ω = 1 − α in Eq. (7) and define

$$ (p{\mid }_{\omega } r)\left( x \right) = z_{{\zeta_{1} \zeta_{2} }} \left( {\zeta_{1} \left( x \right),\zeta_{2} \left( x \right)} \right),\quad r \ne \hat{0},\;x \in I $$
(8)

Consistently with the interpretation of zζ1ζ2(ζ1(x), ζ2(x)) as the probability distribution of X1 given X2, p∣ωr means p given r (“p in the presence of r”), with r being the background risk of p. The condition r ≠ \( \hat{0} \) (i.e., r(0) < 1) excludes the trivial case in which r gives zero with certainty, that is, the case of no background risk at all. Let the conditionalisation operation ∣ωr: P → Pω, r be a function operating to the left on the function variable p in “p∣ωr”, where Pω, r is the range of ∣ωr, Pω, r ⊂ P. From Eqs. (7) and (8) follows

$$ p{\mid }_{\omega } r = \omega p{\mid }_{1} r + \, \left( {1 \, {-}\omega } \right)p{\mid }_{0} r,\quad r \ne \hat{0},\;0 \le \omega \le 1 $$
(9)

It remains to determine p∣1r and p∣0r. To this end, a few more properties of p∣ωr have to be specified. First, consider the boundary case ω = 1. In this special case, one has α = 0 so that the comparison of αp + (1 − α)r and αq + (1 − α)r reduces to the trivial indifference r ~ r for all p, q and r. Hence, p and q in the presence of r are unconstrained by r, meaning

$$ p{\mid }_{1} r = p, \quad p \in P $$
(10)

Secondly, recall that the aspiration level x0 has been assumed to be an exogenously fixed parameter. As such, it is not related to probability and does not vary with the background risk (Diecidue and van de Ven 2008). In the normalisation x0 = 0 adopted above, this implies \( \hat{0} \)∣ωr = \( \hat{0} \), and conversely, if p∣ωr = \( \hat{0} \) for some p and every ω, 0 ≤ ω ≤ 1, then p∣1r = p = \( \hat{0} \) as a special case for ω = 1 so that altogether

$$ p{\mid }_{\omega } r = \hat{0} \, \Leftrightarrow p = \hat{0} $$
(11)

Thirdly, if p = r, this means that p obtains with certainty. This is equivalent to ω = 0 so that

$$ p{\mid }_{0} p = p, $$
(12)

Considering Eqs. (3) and (9), one has

Proposition 2

For given r, r(0) < 1, and 0 ≤ ω ≤ 1, ∣ωr: P → Pω, r is a linear function.

The proof of Proposition 2 is outlined in the Appendix together with the proofs of Propositions 3–5 stated below. From Proposition 2, it follows immediately that, if {x1, …, xn} is the support of p and pi = p(xi), then

$$ p{\mid }_{\omega } r = \left(\sum_{i \le n} p_{i} \hat{x}_{i} \right){\mid }_{\omega } r = \sum_{i \le n} p_{i} (\hat{x}_{i} {\mid }_{\omega } r),\quad r \, \ne \hat{0}, $$
(13)

for p as a finite convex combination of the degenerate distributions \( \hat{x} \)1, …, \( \hat{x} \)n. Given Eqs. (9) and (10), our next result completes the formal representation of p∣ωr,

Proposition 3

For every r, r(0) < 1, |0r has the idempotent property (p∣0r)∣0r = p∣0r, and

$$ \begin{aligned} (p{\mid }_{0} r)\left( 0 \right) & = p\left( 0 \right) \\ (p{\mid }_{0} r)(x) & = r(x)\frac{1 - p(0)}{1 - r(0)},\quad x \ne 0 \\ \end{aligned} $$
(14)

Because of this idempotence, which is a familiar characteristic of projection operations (e.g., Yanai et al. 2011), the transformation ∣0r: P → P0, r can be understood as a parallel projection which maps P onto P0, r along hypersurfaces p(0) = constant in convex subsets Δ ⊂ P with r ∈ Δ and \( \hat{0} \) ∈ Δ (Fig. 1).
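Equation (14) and the idempotence stated in Proposition 3 lend themselves to a direct numerical check. The following is a minimal sketch, assuming lotteries are dictionaries mapping outcomes to probabilities; the function name `project` and the numerical lotteries are illustrative assumptions.

```python
def project(p, r):
    """Eq. (14): the lottery p|_0 r. The probability of the outcome 0 is kept
    from p; the remaining mass 1 - p(0) is distributed over the non-zero
    outcomes of r in proportion to r(x) / (1 - r(0))."""
    r0 = r.get(0, 0.0)
    if r0 >= 1.0:
        raise ValueError("the background risk must differ from the degenerate lottery at 0")
    p0 = p.get(0, 0.0)
    scale = (1.0 - p0) / (1.0 - r0)
    projected = {0: p0}
    for x, prob in r.items():
        if x != 0:
            projected[x] = prob * scale
    return projected


# Hypothetical lotteries with outcomes below, at and above the aspiration level 0.
p = {-1: 0.2, 0: 0.3, 2: 0.5}
r = {-4: 0.1, 0: 0.5, 6: 0.4}

once = project(p, r)
twice = project(once, r)

print(once)
print(once == twice)  # idempotence: projecting a second time changes nothing
```

Since the projection only reads p(0) from its first argument and leaves that value unchanged, applying it twice reproduces the same lottery, and applying it to r itself returns r, in line with Eq. (12).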

Fig. 1 Convex set Δ of lotteries with possible outcomes x1 < x2 = 0 < x3, given background risk r. The dashed line is P0, r

Inserting the results (10) and (14) into Eq. (9) gives the desired representation of p in the presence of r,

$$ p{\mid }_{\omega } r = \omega p + \, \left( {1 \, {-}\omega } \right)p{\mid }_{0} r \quad r \ne \hat{0},\;0 \le \omega \le 1 $$
(15)
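Equation (15), combined with Eq. (14), gives a fully explicit recipe for p∣ωr, so the linearity asserted in Proposition 2 and the boundary cases (10) and (12) can be checked on examples. The sketch below is an illustration under assumed inputs; the names `mix`, `project`, `conditionalise` and `close` and all numbers are hypothetical.

```python
def mix(weight, a, b):
    """Convex combination weight*a + (1 - weight)*b of two simple lotteries."""
    support = set(a) | set(b)
    return {x: weight * a.get(x, 0.0) + (1.0 - weight) * b.get(x, 0.0) for x in support}


def project(p, r):
    """Eq. (14): p|_0 r, defined for r(0) < 1."""
    r0 = r.get(0, 0.0)
    if r0 >= 1.0:
        raise ValueError("the background risk must differ from the degenerate lottery at 0")
    p0 = p.get(0, 0.0)
    scale = (1.0 - p0) / (1.0 - r0)
    projected = {0: p0}
    for x, prob in r.items():
        if x != 0:
            projected[x] = prob * scale
    return projected


def conditionalise(p, r, omega):
    """Eq. (15): p|_omega r = omega * p + (1 - omega) * p|_0 r."""
    return mix(omega, p, project(p, r))


def close(a, b, tol=1e-12):
    """Compare two lotteries outcome by outcome, up to rounding error."""
    return all(abs(a.get(x, 0.0) - b.get(x, 0.0)) < tol for x in set(a) | set(b))


p = {-1: 0.2, 0: 0.3, 2: 0.5}
q = {0: 0.6, 5: 0.4}
r = {-4: 0.1, 0: 0.5, 6: 0.4}
omega, lam = 0.25, 0.7

# Linearity: conditionalising a mixture equals mixing the conditionalised lotteries.
lhs = conditionalise(mix(lam, p, q), r, omega)
rhs = mix(lam, conditionalise(p, r, omega), conditionalise(q, r, omega))
print(close(lhs, rhs))

# Boundary cases: Eq. (10) for omega = 1 and Eq. (12) for p = r, omega = 0.
print(close(conditionalise(p, r, 1.0), p))
print(close(conditionalise(r, r, 0.0), r))
```

Linearity holds here because the projection depends on p only through p(0), and it does so affinely, so convex combinations pass through both terms of Eq. (15).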

Recall that q is the probability distribution of X1′. The comparison X1 \( \succsim_{\chi ,X_{2}} \) X1′ implicitly defines a preference ranking of p and q conditional on r. It can be written as p \( \succsim_{\omega ,r} \) q, while (2) goes over into

$$ p\,{\mathbf{ \succsim }}_{\omega , \, r}\, q \Leftrightarrow p{\mid }_{\omega } r\,{\mathbf{ \succsim }}\,q{\mid }_{\omega } r,\quad r \ne \hat{0},\;0 \le \omega \le 1 $$
(16)

This definition means that in a choice between αp + (1 − α)r and αq + (1 − α)r, the mixture component r induces a preference ranking \( \succsim_{\omega ,r} \) between p and q conditional on the background risk r. For independently of his choice, the decision maker always faces r while assessing p and q, except in case ω = 1. He is not concerned with the advantage of p over q, but the relative advantage of p given r over q given r, or, formally, p∣ωr ≿ q∣ωr.

5 Example: uncertain risk resolution time

In practice, decision makers often cannot be sure when, exactly, the risks they are facing will be resolved. This situation is critical for the management of risky investments with uncertain time horizon, for example (Martellini and Urošević 2006; Blanchet-Scalliet et al. 2008). It also induces time duration of risk as a random variable in many static decision tasks. In fact, “static” in the sense of “non-dynamic” means non-sequential, but not necessarily time-independent and definitely not deterministic in the sense that the date when a static risk resolves is known to the decision maker in advance with certainty. Correspondingly, time duration of risk as a random constraint on static choice is studied in many operational and applied sciences besides financial management. Survival analysis in reliability engineering is a prevalent example (e.g., Marshall and Olkin 2007; Finkelstein 2008). We adopt a few basic concepts of failure risk analysis for illustrative purposes.

Since in comparisons of the form (2) uncertain resolution time may characterise the lotteries X1, X1′ and the background risk X2 alike, the preference χ(X1, X2) ≿ χ(X1′, X2) also depends on the probability that X2 persists at least until (i.e., is not resolved prior to) the resolution of X1 and X1′. The exception is the limiting case in which the background risk prematurely vanishes with certainty, that is, the decision maker knows for sure that X2 is resolved first while X1 and X1′ persist. In this case, the comparison of X1 and X1′ is in effect independent of the background risk, in agreement with Eq. (10). To provide a bivariate survival time model of (2), each Xi entering (2) is assigned to a random time variable Ti and the joint probability that Xi = xi at time Ti = ti ≥ 0, where ti ∈ I\( _{{T_{i} }} \), I\( _{{T_{i} }} \) is a real interval, and i = 1, 2. Put T = (T1, T2) and IT = I\( _{{T_{1} }} \) × I\( _{{T_{2} }} \) so that ζ(X, T) = ζ(X, T1, T2), and assume finite support S of z\( _{{T_{1}T_{2} }} \), S ⊂ I\( _{{T_{1} }} \) × I\( _{{T_{2} }} \). Let S1 = {(t1, t2)∣ T1 = t1 ≤ t2 = T2} and S2 = {(t1, t2)∣ T1 = t1 > t2 = T2} in Eqs. (3) and (6). The overall probability that T1 = t1 ≤ t2 = T2 is \( \sum_{{t_{1}}\le{t_{2}}} \)z\( _{{T_{1} }{T_{2}}} \)(t1, t2) = α, where “\( \sum_{{t_{1}}\le{t_{2}}} \)” means summation over all pairs t1, t2 in S1. Likewise, \( \sum_{{t_{1}}>{t_{2}}} \)z\( _{{T_{1} }{T_{2}}} \)(t1, t2) = ω, which is to be inserted into Eq. (15). By construction, α can be viewed as the relative persistence of the background risk X2, while the complementary probability ω measures the relative persistence of p in the presence of r. In other words, p∣ωr implies an ω:(1 − ω) chance of r being resolved first or else rejected, and conversely for p.

A case in point is the failure of a single component of a technical system that may disrupt the operation of the entire system. Random time to failure, T1, of the component part will then have to be assessed in the presence of the risk of system failure at uncertain time T2 due to continuous wear-out or any other kind of deterioration of the overall system over time. One may also reasonably assume that the (economic) consequences X1 and X2 respectively incurred by continued operation or failure of the system are different. For instance, repair or replacement of a single component will normally be less expensive than that of the entire system. If each Xi is stochastically independent of (T1, T2), T1 and T2 are stochastically independent and, in a continuous approximation, exponentially distributed with constant failure rates τ1 and τ2, respectively, one straightforwardly finds

$$ \sum_{{t_{1} \le t_{2} }} z_{{T_{1} T_{2} }} \left( {t_{1} ,t_{2} } \right) = 1 \, {-}\,\omega \approx \tau_{1} /(\tau_{1} + \tau_{2} ) $$

for the overall probability that T1 ≤ T2, and ω ≈ τ2/(τ1 + τ2) for the complementary probability that T1 > T2 (see, e.g., Pitman 1993, p. 352; Finkelstein 2008, Chap. 2; as for the accuracy of the approximation required to use the exponential probability density distribution within a simple-probability framework, see Pitman 1993, p. 300). In particular, if τ1 ≪ τ2, the failure rate associated with X1 is rather low, and the relative persistence of p given r is high and close to 1. Conversely, if τ2 ≪ τ1, one has ω ≪ 1. In less simple situations, the approximations made do not obtain, especially the assumption of exponential time-dependence of the failure probability density function with constant failure rate. The computation of ω then requires more complex integrations (e.g., Finkelstein 2008, Chap. 2), or summations in the finite discrete probability case, but the role of ω as a parameter quantifying the relative persistence of risk given a background risk of course remains the same.
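The closed-form value ω ≈ τ2/(τ1 + τ2) can be cross-checked by simulation. The sketch below assumes independent, exponentially distributed resolution times, as in the continuous approximation above; the function name `estimate_omega`, the failure rates, the sample size and the seed are illustrative assumptions.

```python
import random


def estimate_omega(tau1, tau2, n=200_000, seed=0):
    """Monte Carlo estimate of omega = P(T1 > T2) for independent
    exponential resolution times with constant rates tau1 and tau2."""
    rng = random.Random(seed)
    later = 0
    for _ in range(n):
        t1 = rng.expovariate(tau1)  # resolution time of X1
        t2 = rng.expovariate(tau2)  # resolution time of the background risk X2
        if t1 > t2:
            later += 1
    return later / n


tau1, tau2 = 1.0, 3.0  # hypothetical failure rates
simulated = estimate_omega(tau1, tau2)
closed_form = tau2 / (tau1 + tau2)
print(round(simulated, 3), closed_form)
```

With a low rate τ1 relative to τ2, the estimate moves towards 1, which corresponds to the high relative persistence of p given r described in the text.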

The example is instructive not only because it illustrates the significance of the present approach to background risk in a realistic application setting. It also demonstrates that αp + (1 − α)r can be interpreted as a single-stage lottery by definition rather than by reduction of compound lotteries. The ambiguous nature of αp + (1 − α)r is due to the α-lottery, which, in the present account, gives the probability of p or r being obtained by being resolved first. Thus, endogenously induced violation of independence, as analysed in the next section, can well arise in settings in which decision making exhibits a critical dependence on time, but is in no way related to the dynamics of choice.

6 Violation of independence

The equivalence (16) entails various simple, closely related results, which we summarise in

Proposition 4

For every r, r ≠ \( \hat{0} \), and ω, 0 < ω ≤ 1: (i) Pω, r is a convex set; (ii) the preference structures (P, \( \succsim_{\omega ,r} \)) and (Pω, r, ≿) are isomorphic under ∣ωr: P → Pω, r; (iii) (Pω, r, ≿) is an EU preference structure if and only if (P, \( \succsim_{\omega ,r} \)) is an EU preference structure; (iv) (Pω, r, ≿) satisfies stochastic dominance preference if (P, ≿) does.

Proposition 4(iii) means that if \( \succsim_{\omega ,r} \) violates the independence axiom, (16) excludes ≿ as an EU preference relation, and conversely. This result mutatis mutandis confirms the remark following (2) that, if χ and X2 can be shown to exist so that \( \succsim_{\chi ,X_{2}} \) violates the independence axiom, (2) excludes ≿ as an EU preference relation. Proposition 4(iv) states that ∣ωr preserves stochastic dominance preference. This result is a trivial consequence of Pω, r ⊂ P but does not necessarily include the stochastic dominance preference of \( \succsim_{\omega ,r} \) given that of ≿. On the other hand, \( \succsim_{\omega ,r} \) must satisfy stochastic dominance preference in order for \( \succsim_{\omega ,r} \) and ≿ to be EU preference relations because of Proposition 4(iii). Our final result clarifies this point.

Proposition 5

For every r, r ≠ \( \hat{0} \), and ω, 0 < ω < 1, if \( \succsim_{\omega ,r} \) satisfies stochastic dominance preference, then \( \succsim_{\omega ,r} \) violates the independence axiom.

Propositions 4(iii) and 5 together exclude ≿ as an EU preference relation if \( \succsim_{\omega ,r} \) satisfies stochastic dominance preference. This result means that, with stochastic dominance preference holding in the presence of r, the comparison p ≿ q does not generally entail αp + (1 − α)r ≿ αq + (1 − α)r, contrary to (1). But Proposition 5 goes further still. It strictly rules out any possibility of postulating independence consistently. To see this, assume that ≿ satisfies the independence axiom together with the other EU axioms on P and, hence, on Pω, r since this is a convex subset of P. Then \( \succsim_{\omega ,r} \), too, is an EU preference relation on P, by Proposition 4(iii). As such, it satisfies the independence axiom. Now Proposition 5 rules out stochastic dominance preference for \( \succsim_{\omega ,r} \) by indirect reasoning. This violation of stochastic dominance preference by \( \succsim_{\omega ,r} \) excludes the EU property of \( \succsim_{\omega ,r} \) since together with the other EU axioms the independence principle implies stochastic dominance preference. The assumption of an EU preference relation ≿ thus entails a contradiction: there is no room for joint interpretations of Propositions 4 and 5 other than rejection of the independence principle.
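The tension described in this argument can be made concrete with a single numerical instance. The sketch below assumes an EU preference on P with the risk-neutral utility u(x) = x, a degenerate background risk at the gain 10 and ω = 0.5; these choices, like the function names, are purely illustrative and do not reproduce the example used in the proof of Proposition 5.

```python
def mix(weight, a, b):
    """Convex combination weight*a + (1 - weight)*b of two simple lotteries."""
    support = set(a) | set(b)
    return {x: weight * a.get(x, 0.0) + (1.0 - weight) * b.get(x, 0.0) for x in support}


def project(p, r):
    """Eq. (14): p|_0 r, defined for r(0) < 1."""
    r0 = r.get(0, 0.0)
    if r0 >= 1.0:
        raise ValueError("the background risk must differ from the degenerate lottery at 0")
    p0 = p.get(0, 0.0)
    scale = (1.0 - p0) / (1.0 - r0)
    projected = {0: p0}
    for x, prob in r.items():
        if x != 0:
            projected[x] = prob * scale
    return projected


def conditionalise(p, r, omega):
    """Eq. (15): p|_omega r = omega * p + (1 - omega) * p|_0 r."""
    return mix(omega, p, project(p, r))


def expected_utility(lottery, u):
    return sum(prob * u(x) for x, prob in lottery.items())


def cdf(lottery, x):
    return sum(prob for outcome, prob in lottery.items() if outcome <= x)


def dominates(p, q, tol=1e-12):
    """First-order stochastic dominance of p over q."""
    support = sorted(set(p) | set(q))
    weakly = all(cdf(p, x) <= cdf(q, x) + tol for x in support)
    strictly = any(cdf(p, x) < cdf(q, x) - tol for x in support)
    return weakly and strictly


def u(x):
    return x  # assumed risk-neutral utility; any EU utility could be substituted


zero = {0: 1.0}   # the degenerate lottery at the aspiration level
loss = {-1: 1.0}  # a sure loss, dominated by `zero`
r = {10: 1.0}     # hypothetical background risk, r != degenerate lottery at 0
omega = 0.5

eu_zero_given_r = expected_utility(conditionalise(zero, r, omega), u)
eu_loss_given_r = expected_utility(conditionalise(loss, r, omega), u)

print(dominates(zero, loss))             # zero dominates the sure loss
print(eu_zero_given_r, eu_loss_given_r)  # induced ranking via Eq. (16)
```

Under these assumptions the sure loss is ranked above the dominating lottery once both are assessed in the presence of r, so the induced preference fails stochastic dominance preference when the underlying preference is EU. A single instance of this kind illustrates the conflict but is no substitute for the general proof in the Appendix.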

The proof of Proposition 5 involves some technicalities, so an informal remark on the basic argument on which it builds would seem appropriate. Stochastic dominance preference in combination with the axioms of weak order and continuity is well-known to ensure the existence of a real-valued utility representation of preference. The representation is unique up to strictly increasing transforms of the utility scale, but its axiomatic basis is still too weak to determine its explicit functional form (see, e.g., Becker and Sarin 1987). In proof of Proposition 5, one can therefore start from the premise that utility functionals representing ≿ and \( \succsim_{\omega ,r} \) do exist. But they do not necessarily provide EU representations since p∣ωr is generally not a convex mixture of p and r (except in special cases p∣0r = r), nor does p ≿ q ⇒ p∣ωr ≿ q∣ωr generally hold as an analogue or substitute of (1). Now, Proposition 5 states that \( \succsim_{\omega ,r} \) and, by Proposition 4(ii), ≿ not only are not necessarily EU preferences, but also that EU preferences are excluded in principle, even if ≿ and \( \succsim_{\omega ,r} \) satisfy stochastic dominance preference. On the other hand, the assumption of stochastic dominance preference is strong enough to ensure a non-EU utility representation of the underlying preference relation ≿, which the proof of Proposition 5 widely exploits.

The isopreference structures of ≿ω, r on Pω, r and of ≿ on P illustrate these results. They are depicted in Fig. 2 on convex sets Δω, r ⊂ Pω, r and Δ ⊂ P, respectively, as low-dimensional examples. The indifference lines of ≿ (solid lines, Fig. 2b) show the characteristic “fanning-out” familiar from violations of independence observed in risky choice experiments (e.g., Starmer 2000).

Fig. 2 Indifference lines of (a) ≿ω, r on convex set Δω, r ⊂ Pω, r (shaded area, dashed lines) and (b) ≿ on convex set Δ ⊂ P (solid lines). The indifference pattern shown is based on the example used in proof of Proposition 5
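
For comparison with the fanning-out pattern, it may help to recall the textbook EU benchmark in a three-outcome probability triangle. The sketch below assumes outcomes x_1 < x_2 < x_3 with utilities u_1 < u_2 < u_3 and coordinates (p_1, p_3); the symbols u_i and c are introduced here for illustration only, and this is not the example used in the proof of Proposition 5.

```latex
\begin{align*}
U(p) &= p_1 u_1 + (1 - p_1 - p_3)\,u_2 + p_3 u_3, \\
U(p) = c \;&\Longleftrightarrow\;
p_3 = \frac{c - u_2}{u_3 - u_2} + \frac{u_2 - u_1}{u_3 - u_2}\,p_1 .
\end{align*}
```

Under EU, every indifference line therefore has the same positive slope (u_2 − u_1)/(u_3 − u_2), whatever the utility level c. Fanning-out means that the slope varies across the triangle, typically becoming steeper towards stochastically dominating lotteries, so that no single choice of u_1, u_2, u_3 can reproduce the pattern.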

Apart from its theoretical consequences, Proposition 5 bears on the meaning of “rationality” in risky choice. An axiomatic account of non-EU preferences consistent with the present findings can be found in Geiger (2008). It deals with status quo dependent decision making, where status quo risk means the decision maker’s extant, risky economic situation as a special instance of background risk. The account straightforwardly explains various types of observed violations of EU preference, notably “fanning-out” and loss aversion (Starmer 2000; Abdellaoui et al. 2008). Together with the above results, it suggests that observed systematic violations of EU indicate not so much a departure from rationality in risky choice as rationality under constraints, that is, pragmatic rationality as opposed to independence as a requirement of purely theoretical rationality. “Pragmatic rationality”, in turn, means utility maximisation with explicit reference to the decision maker’s economic situation, time constraints on risk exposure, demands, and aspiration level.

7 Conclusions and extensions

A definitional extension of EU theory has been developed on the basis of the equivalences (2) and (16). Once the concept of risk in the presence of a background risk is given a suitable formal representation, (2) and (16) define conditional preferences induced by background risks such as arise in the comparison of compound lotteries in (1). This definitional extension of EU theory is in fact necessary if the independence axiom is given a normative interpretation and applied to conditional preferences. However, it has turned out that EU theory cannot, as a matter of principle, be consistently extended in this way. Apart from definitions such as (2) and (16) and trivial normalisations such as x0 = 0, the proof of this impossibility result involves no assumptions that essentially limit the generality of the result; nor was it obtained under conditions exogenously imposed on preference rankings such as the induced non-EU preferences for temporal risk mentioned in the Introduction. Rather, as Eq. (15) and the equivalence (16) make clear, the conditional non-EU preferences are defined in terms of mixtures of lotteries and arise in comparisons of such mixtures, which constitute the very meaning and significance of the independence principle. Nor can the non-EU consequences be avoided by allowing background-dependent risk preferences to violate stochastic dominance preference: Proposition 5 leaves no room for this, for EU preference logically includes stochastic dominance preference. On the other hand, the impossibility result is compatible with the notion of stochastic dominance preference as “perhaps the most obvious principle of rational choice” and, unlike EU theory, a “cornerstone of the normative theory of choice” (Tversky and Kahneman 1986, p. S253).

We close with a few remarks concerning the range of significance of the preceding results. Dynamic consistency and conditionalisation of risk preferences have both been referred to above as necessary theoretical requirements of rational preference orderings, one of which forces independence, while the other excludes it. To clarify this discrepancy, conditional preferences induced by identical sublotteries which occur in alternative multiple-stage lotteries, but may themselves be compound lotteries, would have to be introduced into models of dynamic choice. This task can arguably be tackled within existing frameworks of rational sequential choice corresponding to Hammond's (1998, Sects. 5 and 6) account of dynamically consistent behaviour. In fact, using reduction of compound lotteries in combination with backward recursion, Hammond provides multiple-stage lotteries with single-stage representations to which the conceptualisations and results of the present analysis directly apply. On the other hand, the concept of dynamic consistency will have to be strongly modified to incorporate the conditionalisation of risks and preferences induced by mixture of lotteries. Thus, the decision maker will face different background risks, with different conditional preference relations arising at different stages of a sequential decision task, even if his underlying unconditional preference relation is “atemporal” and as such remains unchanged. Moreover, he will generally exhibit quite different aspiration levels x0 at different stages, depending on the gains or losses incurred in previous choices made at antecedent stages. If the alternative lotteries involved in a multiple-stage decision problem are finally given reduced single-stage representations with the use of backward recursion, Propositions 4 and 5 still exclude the possibility that ≿ satisfies independence. From the proof of Proposition 5 one infers that the reduced versions of multiple-stage lotteries, too, show the fanning-out of indifference lines typical of violation of independence (Figs. 2 and 3).
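
The reduction of a multiple-stage lottery to a single-stage representation can be sketched as a simple recursion that multiplies probabilities along each path and sums over paths ending in the same outcome. This is only an illustrative sketch of the standard reduction of compound lotteries, not Hammond's formal construction; the tree encoding, the function name `reduce_lottery` and the example lotteries are assumptions made here.

```python
from collections import defaultdict
from fractions import Fraction as F

def reduce_lottery(node):
    """Reduce a compound lottery to a single-stage lottery.

    A node is either a final outcome (a number) or a list of
    (probability, subnode) branches. Returns a dict outcome -> probability.
    """
    if not isinstance(node, list):
        return {node: F(1)}  # degenerate lottery on a final outcome
    reduced = defaultdict(F)  # missing outcomes start at probability 0
    for prob, sub in node:
        # Recurse from the last stage backwards and weight by the branch probability.
        for outcome, sub_prob in reduce_lottery(sub).items():
            reduced[outcome] += prob * sub_prob
    return dict(reduced)

# Hypothetical two-stage lottery: sublottery p with weight 1/3 and a
# common sublottery r with weight 2/3.
p = [(F(1, 4), 0), (F(3, 4), 2)]
r = [(F(1, 2), -1), (F(1, 2), 1)]
compound = [(F(1, 3), p), (F(2, 3), r)]

single_stage = reduce_lottery(compound)
print(single_stage)
```

In this encoding the reduced lottery coincides with the convex combination of the reduced sublotteries, which is the form of mixture to which the independence axiom refers.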

Fig. 3 Convex set Δ and shaded area Δω, r = Δ ∩ Pω, r of lotteries with possible outcomes x1 < x2 = 0 < x3 and background risk r ~ \( \hat{0} \). The dotted parallel line segments in Δω, r indicate strictly increasing preference for increasing ρω, r (p), p2 = constant. The dashed lines indicate the isomorphism (A.6)

Finally, one may wish to extend the preceding analysis to infinite discrete and continuous probability distributions as important risk models. Under additional, sufficiently strong conditions, the equivalence of dynamic consistency and independence holds for non-simple probability distributions as well (Hammond 1998, Sects. 8 and 9). On the other hand, under these conditions, convex sets P* of infinite discrete or continuous probability distributions necessarily contain convex subsets P of simple probability distributions (Hammond 1998, p. 197), to which the above impossibility result applies. More precisely, every weak preference relation ≿* on P* possesses restrictions ≿ on P so that ≿* violates independence if any such ≿ does. Thus, there is in effect no normative justification for the independence axiom even under the stronger conditions under which EU preference orderings exist and are equivalent to dynamically consistent ones on convex sets P* of non-simple probability distributions.
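
The restriction step in the preceding paragraph can be spelled out as a contrapositive. The sketch writes ≿* for the weak preference relation on P* and ≿ for its restriction to a convex subset P, and assumes that the restriction is defined by p ≿ q if and only if p ≿* q for p, q in P; the independence step is written as an implication in the sense of (1).

```latex
\begin{align*}
&\text{Let } p, q, r \in P \subset P^{*} \text{ and } \alpha \in (0,1).
 \text{ Convexity of } P \text{ gives }
 \alpha p + (1-\alpha) r,\ \alpha q + (1-\alpha) r \in P. \\
&\text{If } \succsim^{*} \text{ satisfies independence on } P^{*}, \text{ then} \\
&\qquad p \succsim q
 \;\Longrightarrow\; p \succsim^{*} q
 \;\Longrightarrow\; \alpha p + (1-\alpha) r \succsim^{*} \alpha q + (1-\alpha) r
 \;\Longrightarrow\; \alpha p + (1-\alpha) r \succsim \alpha q + (1-\alpha) r . \\
&\text{Hence } \succsim \text{ satisfies independence on } P.
 \text{ Contrapositively, a violation by } \succsim \text{ on } P
 \text{ is a violation by } \succsim^{*} \text{ on } P^{*}.
\end{align*}
```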