Conditional non-expected utility preferences induced by mixture of lotteries: a note on the normative invalidity of expected utility theory

This research note is concerned with static choices between alternative mixtures of lotteries with one common mixture component and identical mixture weights. It is shown that the common component induces a conditional preference relation on the underlying lottery space with given (unconditional) preference structure. Induced preferences of this type arise in the comparisons with which the independence axiom of expected utility theory is specifically concerned. Given a few obvious properties of the induced preferences, two basic results are obtained: first, the conditionalisation operation is an order-preserving isomorphism, and, secondly, if the conditional preferences satisfy stochastic dominance preference, they necessarily violate the independence axiom. Together, the two results preclude any possibility of postulating independence consistently for static decision making under risk. The independence axiom is thus generally invalid as a normative principle of rational risky choice.


Introduction
Despite its notorious lack of empirical validity, expected utility (EU) theory continues to prevail as the standard normative model of economic rationality in theoretical and applied risk and decision analyses. The significance of the model largely rests on the independence axiom of EU theory, which ensures dynamically consistent choices in multiple-stage decision making under risk. Dynamic consistency of risky choice, in turn, is a basic criterion of economic rationality in the sense that it requires an individual's choices at later stages of a decision process to conform with this individual's preferred course of action planned at earlier stages (Karni and Schmeidler 1991; for a related concept of dynamically consistent behaviour, see Hammond 1998; Hammond and Zank 2014). In recent decision research, dynamic choice theory has indeed been extended beyond EU maximisation from various perspectives, especially those of risk, uncertainty and ambiguity (Machina 1989; Sarin and Wakker 1998; Siniscalchi 2011; Nebout 2014). But the extensions have remained essentially restricted to descriptive models of decision making in the sense that they violate, or "sacrifice", other normative principles of rational choice where necessary to preserve dynamic consistency.
The present analysis goes one critical step beyond the descriptive intent of non-EU models. It refutes the normative interpretation of EU theory. To serve as a universally valid principle of rational decision making, EU maximisation would have to take implications of static choice into account that arise in comparisons between compound lotteries, but are usually ignored in normative interpretations of EU theory. Preferences for compound lotteries will be shown not only to violate the independence axiom under well-defined conditions, but also to exclude any possibility of postulating independence consistently in models of static decision making under risk. Theoretically, this inconsistency arises from conditional preferences induced by alternative mixtures of lotteries with one common mixture component and identical mixture weights. The independence axiom is specifically concerned with convex combinations of lotteries of this type. In practice, the inconsistency occurs in assessments of one risk in the presence of others, commonly called "background risks". An example will be given in terms of conditional preferences for compound lotteries in which the component lotteries (or "sublotteries") are resolved at uncertain times.
In more technical terms, the scope of the present research note can be delineated as follows. We consider preference comparisons between two mixtures of lotteries p and r, and q and r, respectively. The mixture weights of p and q are the same and, hence, so are those of the common component r. Contrary to what the independence axiom of EU theory proposes, we admit choices between the two compound lotteries to depend not only on the component lotteries, but also on the particular nature and numerical values of the mixture weights. A typical example of the impact of the mixture weights on preference arises where the mixture weight assigned to p (or q) is the probability with which p (or q) is resolved first (i.e., prior to the resolution of r) or else persists and is rejected with the complementary probability in the alternative case of premature resolution of r. In this example, the compound lotteries are, by construction, one-stage and, hence, give rise to static choices, but involve uncertainty of the timing of risk resolution. Realistic examples are familiar from applied statistics (multivariate survival-time analysis) and static investment decision making with uncertain time horizon. Assume now that the decision maker first ranks p and q in preference and then makes his choice between the two compound lotteries. Then, the presence of the common component r in either mixture induces a reference risk so that the decision maker is not, in effect, concerned with the advantage of p over q when considering the impact of p and q on his choice. Rather, he must assess the advantage of p in the presence of r over q in the presence of r, for independently of his choice, he confronts r in case p (or q) fails to be resolved first. 
The conditionalisation of risk thus entails a conditionalisation of preference, which we will show to be an order-preserving isomorphism: the conditional preferences satisfy the independence axiom exactly if the unconditional preferences from which they are derived do. Yet it turns out that, given a few obvious properties of the induced preferences, the latter necessarily violate the independence axiom if they satisfy stochastic dominance preference. Stochastic dominance preference, for its part, is widely viewed as an indispensable normative requirement of rational risky choice weaker than independence.
To demarcate the present analysis from previous approaches to conditional risk preferences available in the literature, a few remarks are in order. Although we will represent lotteries as probability distributions (or, equivalently, random variables associated with them), our concept of conditional risk, unlike that used by Pratt (1988), for instance, is different from conditional probability, nor are the conditional preferences we consider of the kind which arise in multi-attribute preference analyses (Keeney and Raiffa 1976; Geiger 2012). Unlike in many previous approaches to the subject, background risk is not treated as an exogenous, non-tradable risk (e.g., Franke et al. 2006; Malevergne and Rey 2010) but as one endogenously induced by convex combination of the lotteries to be compared. Likewise, we may ignore preferences induced in choices involving "temporal risk" and "temporal preference", that is, sequential risk and decision problems in which preference depends on when uncertainty resolves (for review and references, see Machina 1984). There is no interference with the normative view of the independence axiom in the temporal risk case since dynamic consistency, as a criterion of EU preferences, deals with "atemporal" risk preferences (Kreps and Porteus 1979; Machina 1984; Karni and Schmeidler 1991). This non-interference is most obvious in representations of EU preferences as special, atemporal boundary cases of temporal risk preference (Kreps and Porteus 1979; Machina 1984) arising under conditions which also ensure the dynamic consistency of choice (Kreps and Porteus 1978). Indeed, the present analysis, too, applies to preferences depending on when uncertainty resolves, but choices are static, while risks are resolved at uncertain times.
As in our approach, convex mixtures of lotteries have previously been modelled as probability-weighted averages of bivariate lotteries (uncertainty of resolution time and monetary outcome) and applied in static-decision analyses (for examples and review of the literature, see Martellini and Urošević 2006). But it seems that their normative implications for EU theory have not been examined so far.
After sketching out the conceptual framework of EU theory and some of its most basic normative implications in the next section, we develop formal representations of background risk and conditional preference in Sects. 3 and 4. Section 5 presents an application of the formalism. In Sect. 6, the invalidity of the independence axiom as a normative principle of static, rational risky choice is proved. Section 7 concludes, with a few remarks concerning the meaning and significance of the results obtained and their possible extensions to more broadly circumscribed domains of rational risky choice.

Normative implications of EU theory
Let P be the convex set of simple probability distributions, or lotteries, defined on a compact real interval I of lottery outcomes. A degenerate lottery which gives x, x ∈ I, with certainty is denoted by δ_x, δ_x ∈ P. Lottery outcomes are evaluated as gains (x > x_0) or losses (x < x_0) with reference to some neutral point, or "aspiration level" x_0, x_0 ∈ I (Kahneman and Tversky 1979; Diecidue and van de Ven 2008), which is normalised to zero without loss of generality (x_0 = 0, 0 ∈ I). A preference relation ≿ exists which satisfies the familiar axioms of weak order and continuity. If, in addition, the independence axiom

$$p \succsim q \;\Rightarrow\; \alpha p + (1-\alpha) r \succsim \alpha q + (1-\alpha) r, \quad 0 < \alpha \le 1,\ p \in P,\ q \in P,\ r \in P \tag{1}$$

is postulated, the more restricted version of independence (replacement of ≿ by strict preference ≻), the indifference version of (1) (replacement of ≿ by ~) and, eventually, EU theory follow (e.g., Hammond 1998). EU theory implies the principle of stochastic dominance preference. This principle states that p ≻ q if p has first-order stochastic dominance over q, that is, if p ≠ q and F_p(x) ≤ F_q(x), x ∈ I, where F_p and F_q denote cumulative distribution functions. As a postulate weaker than the independence axiom, stochastic dominance preference is a fundamental normative requirement of rational choice, but as such is compatible with violation of the independence axiom (Tversky and Kahneman 1986, p. S253; Machina 1989, pp. 1634-1635).
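The first-order stochastic dominance condition stated above (p ≠ q and F_p(x) ≤ F_q(x) for all x) can be checked mechanically for simple lotteries. The following is a minimal sketch, assuming lotteries are represented as Python dictionaries mapping outcomes to probabilities; the function names `cdf` and `fosd` and the example lotteries are hypothetical and serve only as illustration.

```python
from itertools import accumulate

def cdf(lottery, outcomes):
    """Cumulative distribution function evaluated on a sorted outcome grid."""
    return list(accumulate(lottery.get(x, 0.0) for x in outcomes))

def fosd(p, q, tol=1e-12):
    """True if p first-order stochastically dominates q:
    F_p(x) <= F_q(x) on the common grid, with p != q."""
    outcomes = sorted(set(p) | set(q))
    Fp, Fq = cdf(p, outcomes), cdf(q, outcomes)
    weakly_below = all(a <= b + tol for a, b in zip(Fp, Fq))
    differs = any(abs(a - b) > tol for a, b in zip(Fp, Fq))
    return weakly_below and differs

# Hypothetical lotteries over losses (-1), the aspiration level (0) and gains (2).
p = {-1: 0.2, 0: 0.3, 2: 0.5}
q = {-1: 0.4, 0: 0.3, 2: 0.3}

print(fosd(p, q), fosd(q, p), fosd(p, p))
```

In this example p shifts probability mass from the loss to the gain relative to q, so p dominates q but not conversely, and no lottery dominates itself.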
The convex combination αp + (1 − α)r entering (1) is often viewed as the single-stage representation of a two-stage lottery which, in the first stage, gives an α : (1 − α) chance of receiving a ticket for the second-stage lottery p or r. When viewed as a single-stage lottery in P, however, αp + (1 − α)r gives the outcome x, x ∈ I, with probability αp(x) + (1 − α)r(x). Choices between multiple-stage lotteries can be dynamic or static depending on whether or not they involve contingent decisions to be taken between particular lottery stages. Dynamic consistency of preferences implies that, in a choice between the two-stage lotteries corresponding to αp + (1 − α)r and αq + (1 − α)r, the preferred two-stage lottery gives the preferred second-stage lottery p or q, respectively, when the initial α-lottery gets resolved, with r being rejected. As a normative principle of rational choice, the independence axiom rests on the result that, under a few reasonable restrictions, independence is equivalent to dynamic consistency (Karni and Schmeidler 1991; Hammond 1998). One such reasonable restriction is reduction of compound lotteries, which will be addressed in the concluding section. It means that decision makers are indifferent between multiple-stage lotteries and their single-stage, or "reduced", representations in P (Segal 1990, 1992).
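To make the single-stage reading of αp + (1 − α)r concrete, the sketch below builds the mixture outcome by outcome and checks that an expected-utility functional is linear in the mixture, which is why EU preferences satisfy (1). It illustrates only the standard EU case, not the conditional preferences developed later. The utility function `u`, the example lotteries and all function names are assumptions made for illustration.

```python
import math

def mix(alpha, p, r):
    """Single-stage lottery alpha*p + (1 - alpha)*r (dicts: outcome -> probability)."""
    return {x: alpha * p.get(x, 0.0) + (1 - alpha) * r.get(x, 0.0)
            for x in set(p) | set(r)}

def expected_utility(lottery, u):
    """EU functional: probability-weighted sum of utilities."""
    return sum(prob * u(x) for x, prob in lottery.items())

def u(x):
    # Hypothetical increasing utility function, chosen only for illustration.
    return -math.exp(-0.5 * x)

p = {-1: 0.2, 0: 0.3, 2: 0.5}
q = {-1: 0.4, 0: 0.3, 2: 0.3}
r = {0: 0.5, 1: 0.5}

def linearity_gap(alpha):
    """Difference between EU of the mixture and the mixture of EUs."""
    lhs = expected_utility(mix(alpha, p, r), u)
    rhs = alpha * expected_utility(p, u) + (1 - alpha) * expected_utility(r, u)
    return abs(lhs - rhs)

def ranking_preserved(alpha):
    """Check p >= q in EU terms iff the alpha-mixtures with r are ranked the same way."""
    before = expected_utility(p, u) >= expected_utility(q, u)
    after = (expected_utility(mix(alpha, p, r), u)
             >= expected_utility(mix(alpha, q, r), u))
    return before == after

print([ranking_preserved(a) for a in (0.1, 0.5, 1.0)])
```

Because the EU of a mixture equals the mixture of the EUs, the common component r cancels in the comparison, so the ranking of p and q carries over to the mixtures for every admissible α.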

Background risk in compound lotteries
The independence axiom also raises a static-choice problem, which, however, has been largely ignored in normative interpretations of EU theory. To show that this neglect leaves normative interpretations of the independence axiom inconsistent, we first introduce the concept of background risk into the analysis of (1). A lottery with random outcome variable X_2 is usually called the background risk of another lottery with random outcome variable X_1 if the decision maker faces the risk of X_2 while assessing the risk of X_1. Typical examples are additive and multiplicative risk-background interactions involving outcome variables X and functions χ of the form X = χ(X_1, X_2) = X_1 + X_2 and χ(X_1, X_2) = X_1X_2, respectively (Pratt 1988; Tsetlin and Winkler 2005; Franke et al. 2006; Malevergne and Rey 2010). Here, we admit more generally defined functions χ(X_1, X_2) as well as functions of more than two random variables. Comparison of X and X′, where X′ = χ(X_1′, X_2) for some X_1′, induces a preference ranking between X_1 and X_1′ conditional on the background risk X_2 which can be expressed symbolically as X_1 ≿_{χ,X_2} X_1′,

$$X_1 \succsim_{\chi, X_2} X_1' \;\Leftrightarrow\; \chi(X_1, X_2) \succsim \chi(X_1', X_2), \tag{2}$$

and similarly for the more restrictive relations ≻_{χ,X_2} and ∼_{χ,X_2}. We refer to the random variable X = χ(X_1, X_2) as X_1 in the presence of X_2, or, briefly, X_1 given X_2. In the literature, ≿_{χ,X_2} has almost invariably been treated as an EU preference relation, excluding violation of independence from the outset (for rare exceptions, see Quiggin 2003; Geiger 2008). However, the equivalence (2) suggests that whether or not ≿_{χ,X_2} has the EU property depends not only on ≿ but also on χ and X_2: if χ and X_2 can be shown to exist so that ≿_{χ,X_2} violates the independence axiom, (2) excludes ≿ as an EU preference relation. This is what our results obtained below demonstrate (see remark following Proposition 4 below). Background risks do not only exist where decision makers face one risk while assessing others. They also arise in weaker cases where the decision maker, while assessing some risks, confronts another risk, but only with some finite probability. Examples of this broader notion of background risk can be found where the timing of the resolution of risk is uncertain, for instance, in the management of risky investments with uncertain time horizon or reliability engineering (Sect. 5). We begin to analyse this situation by developing a suitable conceptual framework for risk assessment and choice under constraints that are themselves uncertain.

Formal representation of background risk
Where not otherwise stated, we denote the probability distributions of X_1, X_1′ and X_2 by p, q and r, respectively, throughout this analysis. The ranges of X_1, X_1′ and X_2 are thus subsets of I. Without further explicit reference, we anticipate that, by construction, all multivariate, composite random functions χ, ζ, … considered below, which will typically be of the form ζ(χ(X_1, X_2), …), have ranges in I and, therefore, probability distributions in P. We are especially concerned with functions of X_1, X_1′ and X_2 and other variables giving rise to joint probability distributions of the form αp + (1 − α)r. We aim to establish how the comparison αp + (1 − α)r ≿ αq + (1 − α)r translates into the preference between X_1 and X_1′ given X_2. This task amounts to determining the probability distribution of χ(X_1, X_2), which is derived, but different, from αp + (1 − α)r. More precisely, it is derived from the joint probability distribution of X_1, X_2 and another real random variable, or, more generally, a random vector T which is supposed to range over a finite subset S of an m-dimensional real interval I_T, m ≥ 1. The joint probability distribution of T has finite support S, S ⊂ I_T. For every partition of S into subsets S_1 and S_2 and given T, there clearly exists some α, 0 ≤ α ≤ 1, so that the overall probability that T = t with t ∈ S_1 is α, and analogously for S_2 and 1 − α. Constructing T in this way shows that the α-lottery can be associated with any suitably distributed T. The latter can be any quantitative constraint on risk measurement, assessment and comparison specifying the impact of a given background risk on risk preference. An example of a vector T of random time variables with m = 2 will be presented in the next section; it is clearly distinct from "temporal" risk in dynamic risky choice problems, however.
Let f(x_1, x_2, t) be the joint probability distribution of X_1, X_2 and T with the marginal distributions f_{X_i}(x_i), f_T(t), etc. in the usual notation, and assume a partition of S into subsets S_1 and S_2 as mentioned above. Consider the random variable Y = ψ(X_1, X_2, T) with probability distribution g(y),

$$g(y) = \sum_{\substack{x_1, x_2, t \\ \psi(x_1, x_2, t) = y}} f(x_1, x_2, t),$$

where the sum means summation over all outcome values of X_1, X_2 and T satisfying ψ(x_1, x_2, t) = y for given y. In particular, given

$$\psi(X_1, X_2, T) = \begin{cases} X_1 & \text{if } T \in S_1 \\ X_2 & \text{if } T \in S_2, \end{cases} \tag{3}$$

one has

Proposition 1 For i = 1, 2, assume stochastic independence of X_i and T. Then,

$$g(y) = \alpha f_{X_1}(y) + (1 - \alpha) f_{X_2}(y). \tag{5}$$

Since equal random variables have the same probability distribution, one has Y = X_1 and g(y) = p(y) with probability α, and Y = X_2 and g(y) = r(y) with probability 1 − α. Thus, Eq. (3) and Proposition 1 imply g = αp + (1 − α)r. Note that this construction of Y as a random variable with a compound probability distribution makes no reference to the notion of sequential lottery.
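The relation g = αp + (1 − α)r can be verified by brute-force enumeration for small finite distributions. The sketch below assumes, for simplicity, that X_1, X_2 and T are mutually independent, which is stronger than the pairwise independence of X_i and T required by Proposition 1. The distributions, the support S = {1, 2, 3} and the partition are hypothetical.

```python
from collections import defaultdict
from itertools import product

p = {-1: 0.2, 0: 0.3, 2: 0.5}    # distribution of X_1 (hypothetical)
r = {0: 0.5, 1: 0.5}             # distribution of X_2 (hypothetical)
f_T = {1: 0.1, 2: 0.3, 3: 0.6}   # distribution of T on S = {1, 2, 3}
S1 = {1, 2}                      # S_1; S_2 is the rest of S

def psi(x1, x2, t):
    """Selects X_1 if T falls in S_1 and X_2 otherwise."""
    return x1 if t in S1 else x2

# Distribution g of Y = psi(X_1, X_2, T): sum the joint probabilities
# of all (x1, x2, t) that are mapped to the same value y.
g = defaultdict(float)
for (x1, px), (x2, rx), (t, ft) in product(p.items(), r.items(), f_T.items()):
    g[psi(x1, x2, t)] += px * rx * ft   # joint = product (assumed independence)

alpha = sum(ft for t, ft in f_T.items() if t in S1)   # probability that T is in S_1
mixture = {y: alpha * p.get(y, 0.0) + (1 - alpha) * r.get(y, 0.0)
           for y in set(p) | set(r)}

max_gap = max(abs(g[y] - mixture[y]) for y in mixture)
print(alpha, max_gap)
```

The enumeration never refers to stages or sequencing, which mirrors the remark that Y is a single random variable with a compound distribution.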
The probability distribution of X_1 in the presence of X_2 can now be computed on the basis of Eq. (5). Similarly to the definition of Y, the dependence of X on the α-lottery can generally be expressed as a random function ζ(X, T) determined differently for T-values in S_1 and S_2, respectively. To define ζ(X, T) in a fashion similar to Eq. (3) means to represent the probability distribution of X_1 given X_2 as a convex mixture of lotteries. As such, it is in P if the mixture components are. This is a requirement necessary for ζ and χ to have finite ranges in I, and for their probability distributions to be in P, since the preference relations concerned are defined on P. Now, let ζ_1 and ζ_2 be real functions so that¹

$$\zeta(X, T) = \begin{cases} \zeta_2(X) & \text{if } T \in S_1 \\ \zeta_1(X) & \text{if } T \in S_2. \end{cases} \tag{6}$$

Let z(ζ_1(x), ζ_2(x), t) be the probability distribution of ζ(X, T). The marginal distribution z_{ζ_1ζ_2}(ζ_1(x), ζ_2(x)) is the probability function of X = χ(X_1, X_2) to be determined. Since X_i and T are independent, so are ζ_i(χ(X_1, X_2)) and T (as for this stochastic-independence property of composite definitions of functions of random vectors, see, e.g., Pfeiffer (1990), esp. p. 251 and Theorem 11.3.3). Equations (3) to (6) imply z_T(t) = f_T(t) so that

[…]  (7)

Equation (7) follows immediately from Eq. (6) in a fashion similar to the derivation of Eq. (5) from Eq. (3), with the replacement of the stochastic independence condition for X_i and T by that of ζ_i(χ(X_1, X_2)) and T.
The dependence of z_{ζ_1ζ_2} on p and r needs to be determined next. Put ω = 1 − α in Eq. (7) and define²

[…]  (8)

Consistently with the interpretation of z_{ζ_1ζ_2}(ζ_1(x), ζ_2(x)) as the probability distribution of X_1 given X_2, p|_ω r means p given r ("p in the presence of r"), with r being the background risk of p. The condition r ≠ δ_0 (i.e., r(0) < 1) excludes the trivial case in which r gives zero with certainty, that is, the case of no background risk at all. Let the conditionalisation operation |_ω r: P → P_{ω,r} be a function operating to the left on the function variable p in "p|_ω r", where P_{ω,r} is the range of |_ω r, P_{ω,r} ⊂ P. From Eqs. (7) and (8) follows

$$p|_\omega r = \omega\, p|_1 r + (1 - \omega)\, p|_0 r, \quad r \neq \delta_0,\ 0 \le \omega \le 1. \tag{9}$$

It remains to determine p|_1 r and p|_0 r. To this end, a few more properties of p|_ω r have to be specified. First, consider the boundary case ω = 1. In this special case, one has α = 0 so that the comparison of αp + (1 − α)r and αq + (1 − α)r reduces to the trivial indifference r ~ r for all p, q and r. Hence, p and q in the presence of r are unconstrained by r, meaning

$$p|_1 r = p. \tag{10}$$

Secondly, recall that the aspiration level x_0 has been assumed to be an exogenously fixed parameter. As such, it is not related to probability and does not vary with the background risk (Diecidue and van de Ven 2008). In the normalisation x_0 = 0 adopted above, this implies δ_0|_ω r = δ_0, and conversely, if p|_ω r = δ_0 for some p and every ω, 0 ≤ ω ≤ 1, then p|_1 r = p = δ_0 as a special case for ω = 1 so that altogether

$$p|_\omega r = \delta_0,\ 0 \le \omega \le 1 \;\Leftrightarrow\; p = \delta_0. \tag{11}$$

Thirdly, if p = r, this means that p obtains with certainty. This is equivalent to ω = 0 so that

[…]  (12)

Considering Eqs. (3) and (9), one has

Proposition 2 For given r, r(0) < 1, and 0 ≤ ω ≤ 1, |_ω r: P → P_{ω,r} is a linear function.

The proof of Proposition 2 is outlined in the Appendix together with the proofs of Propositions 3-5 stated below. From Proposition 2, it follows immediately that, if {x_1, …, x_n} is the support of p and p_i = p(x_i), then

$$p|_\omega r = \sum_{i=1}^{n} p_i\, \delta_{x_i}|_\omega r \tag{13}$$

for p as a finite convex combination of the degenerate distributions δ_{x_1}, …, δ_{x_n}.³ Given Eqs. (9) and (10), our next result completes the formal representation of p|_ω r,

Proposition 3 For every r, r(0) < 1, |_0 r has the idempotent property (p|_0 r)|_0 r = p|_0 r, and

[…]  (14)

Because of this idempotence, which is a familiar characteristic of projection operations (e.g., Yanai et al. 2011), the transformation |_0 r: P → P_{0,r} can be understood as a parallel projection which maps P onto P_{0,r} along hypersurfaces p(0) = constant in convex subsets Δ ⊂ P with r ∈ Δ and δ_0 ∈ Δ (Fig. 1).

Fig. 1 Convex set Δ of lotteries with possible outcomes x_1 < x_2 = 0 < x_3, given background risk r. The dashed line is P_{0,r}

Inserting the results (10) and (14) into Eq. (9) gives the desired representation of p in the presence of r,⁴

$$p|_\omega r = \omega p + (1 - \omega)\, p|_0 r, \quad r \neq \delta_0,\ 0 \le \omega \le 1. \tag{15}$$

Recall that q is the probability distribution of X_1′. The comparison X_1 ≿_{χ,X_2} X_1′ implicitly defines a preference ranking of p and q conditional on r. It can be written as p ≿_{ω,r} q, while (2) goes over into

$$p \succsim_{\omega, r} q \;\Leftrightarrow\; p|_\omega r \succsim q|_\omega r, \quad r \neq \delta_0,\ 0 \le \omega \le 1. \tag{16}$$

This definition means that in a choice between αp + (1 − α)r and αq + (1 − α)r, the mixture component r induces a preference ranking ≿_{ω,r} between p and q conditional on the background risk r. For independently of his choice, the decision maker always faces r while assessing p and q, except in case ω = 1. He is not concerned with the advantage of p over q, but the relative advantage of p given r over q given r, or, formally, p|_ω r ≿ q|_ω r.

¹ Observe that the subsequent notation for the T-dependence of ζ is inverted as compared to that of ψ as specified on the right-hand side of Eq. (3) (X_1 → ζ_2(X), X_2 → ζ_1(X)). The change is adopted for technical reasons and is consistent with Eq. (3); it admits a more natural and more coherent notation in the following paragraphs, especially in Eqs. (9) and (15), but otherwise has no theoretical significance.

² Reference to ω rather than α in the following definition is for reasons of convenience. It admits a more compact and consistent formalism, as will be obvious from Eqs. (8) to (15) and from the realistic interpretation of ω in the application example of Sect. 5.

Example: uncertain risk resolution time
In practice, decision makers often cannot be sure when, exactly, the risks they are facing will be resolved. This situation is critical for the management of risky investments with uncertain time horizon, for example (Martellini and Urošević 2006; Blanchet-Scalliet et al. 2008). It also induces time duration of risk as a random variable in many static decision tasks. In fact, "static" in the sense of "non-dynamic" means non-sequential, but not necessarily time-independent and definitely not deterministic in the sense that the date when a static risk resolves is known to the decision maker in advance with certainty. Correspondingly, time duration of risk as a random constraint on static choice is studied in many operational and applied sciences besides financial management. Survival analysis in reliability engineering is a prevalent example (e.g., Marshall and Olkin 2007; Finkelstein 2008). We adopt a few basic concepts of failure risk analysis for illustrative purposes.
Since in comparisons of the form (2) uncertain resolution time may characterise the lotteries X_1, X_1′ and the background risk X_2 alike, the preference χ(X_1, X_2) ≿ χ(X_1′, X_2) also depends on the probability that X_2 persists at least until (i.e., is not resolved prior to) the resolution of X_1 and X_1′. The exception is the limiting case in which the background risk prematurely vanishes with certainty, that is, the decision maker knows for sure that X_2 is resolved first while X_1 and X_1′ persist. In this case, the comparison of X_1 and X_1′ is in effect independent of the background risk, in agreement with Eq. (10). To provide a bivariate survival time model of (2), each X_i entering (2) is assigned a random time variable T_i and the joint probability that X_i = x_i at time T_i = t_i ≥ 0, where t_i ∈ I_{T_i}, I_{T_i} is a real interval, and i = 1, 2. Put T = (T_1, T_2) and I_T = I_{T_1} × I_{T_2} so that ζ(X, T) = ζ(X, T_1, T_2), and assume finite support S of z_{T_1T_2}, S ⊂ I_{T_1} × I_{T_2}. Let S_1 = {(t_1, t_2) | T_1 = t_1 ≤ t_2 = T_2} and S_2 = {(t_1, t_2) | T_1 = t_1 > t_2 = T_2} in Eqs. (3) and (6). The overall probability that T_1 = t_1 ≤ t_2 = T_2 is ∑_{t_1≤t_2} z_{T_1T_2}(t_1, t_2) = α, where "∑_{t_1≤t_2}" means summation over all pairs t_1, t_2 in S_1. Likewise, the sum over all pairs in S_2 is ∑_{t_1>t_2} z_{T_1T_2}(t_1, t_2) = ω, which is to be inserted into Eq. (15). By construction, α can be viewed as the relative persistence of the background risk X_2, while the complementary probability ω measures the relative persistence of p in the presence of r. In other words, p|_ω r implies an ω:(1 − ω) chance of r being resolved first or else rejected, and conversely for p.
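For a finite joint distribution of the resolution times, the two sums defining α and ω reduce to a few lines of code. The sketch below uses a hypothetical joint distribution of (T_1, T_2) on five support points; the numerical values carry no empirical meaning.

```python
# Hypothetical joint distribution z of (T_1, T_2) on a finite support S.
z_T = {
    (1, 1): 0.10,
    (1, 2): 0.30,
    (2, 1): 0.20,
    (2, 2): 0.15,
    (3, 2): 0.25,
}

# alpha: total probability of S_1 = {(t1, t2) : t1 <= t2}, i.e. X_1 is resolved first.
alpha = sum(prob for (t1, t2), prob in z_T.items() if t1 <= t2)

# omega: total probability of S_2 = {(t1, t2) : t1 > t2}, the complementary event.
omega = sum(prob for (t1, t2), prob in z_T.items() if t1 > t2)

print(alpha, omega)
```

Since S_1 and S_2 partition the support, the two values always sum to one, so ω = 1 − α as used in Eq. (15).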
A case in point is the failure of a single component of a technical system that may disrupt the operation of the entire system. Random time to failure, T_1, of the component part will then have to be assessed in the presence of the risk of system failure at uncertain time T_2 due to continuous wear-out or any other kind of deterioration of the overall system over time. One may also reasonably assume that the (economic) consequences X_1 and X_2 respectively incurred by continued operation or failure of the system are different. For instance, repair or replacement of a single component will normally be less expensive than that of the entire system. If each X_i is stochastically independent of (T_1, T_2), T_1 and T_2 are stochastically independent and, in a continuous approximation, exponentially distributed with constant failure rates τ_1 and τ_2, respectively,⁵ one straightforwardly finds

$$\alpha \approx \frac{\tau_1}{\tau_1 + \tau_2} \tag{17}$$

for the overall probability that T_1 ≤ T_2, and ω ≈ τ_2/(τ_1 + τ_2) for the complementary probability that T_1 > T_2 (see, e.g., Pitman 1993, p. 352; Finkelstein 2008, Chap. 2; as for the accuracy of the approximation required to use the exponential probability density distribution within a simple-probability framework, see Pitman 1993, p. 300). In particular, if τ_1 ≪ τ_2, the failure rate associated with X_1 is rather low, and the relative persistence of p given r is high and close to 1. Conversely, if τ_2 ≪ τ_1, one has ω ≪ 1. In less simple situations, the approximations made do not hold, especially the assumption of exponential time-dependence of the failure probability density function with constant failure rate. The computation of ω then requires more complex integrations (e.g., Finkelstein 2008, Chap. 2), or summations in the finite discrete probability case, but the role of ω as a parameter quantifying the relative persistence of risk given a background risk of course remains the same.

⁵ That these conditions involve strong idealisations and as such are rarely met in real technological systems is widely acknowledged in the literature on reliability and survival time analysis (e.g., Pfeiffer 1990, Sec. 8.6). Nevertheless, they often admit useful mathematical models of practical-life distributions, while facilitating mathematical tractability.
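The closed-form values α ≈ τ_1/(τ_1 + τ_2) and ω ≈ τ_2/(τ_1 + τ_2) for independent exponential failure times can be cross-checked numerically. The sketch below compares the closed form with a midpoint-rule evaluation of P(T_1 ≤ T_2) = ∫ τ_1 exp(−τ_1 t) exp(−τ_2 t) dt. The failure rates, integration horizon and step count are arbitrary illustrative choices, and the function names are hypothetical.

```python
import math

def race_probabilities(tau1, tau2):
    """Closed form for independent exponential times with rates tau1 and tau2:
    alpha = P(T1 <= T2), omega = P(T1 > T2)."""
    alpha = tau1 / (tau1 + tau2)
    return alpha, 1.0 - alpha

def alpha_numeric(tau1, tau2, horizon=50.0, steps=50_000):
    """Midpoint-rule approximation of the integral of
    tau1 * exp(-tau1 * t) * exp(-tau2 * t) over [0, horizon]."""
    h = horizon / steps
    return sum(tau1 * math.exp(-(tau1 + tau2) * (i + 0.5) * h) * h
               for i in range(steps))

# Hypothetical rates with tau1 much smaller than tau2: the component rarely fails first.
alpha, omega = race_probabilities(0.2, 1.8)
print(alpha, omega, alpha_numeric(0.2, 1.8))
```

With τ_1 ≪ τ_2 the value of ω is close to 1, in line with the remark that the relative persistence of p given r is then high. For non-exponential lifetimes only the integrand in `alpha_numeric` would need to change.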
The example is instructive not only because it illustrates the significance of the present approach to background risk in a realistic application setting. It also demonstrates that αp + (1 − α)r can be interpreted as a single-stage lottery by definition rather than by reduction of compound lotteries. The ambiguous nature of αp + (1 − α)r is due to the α-lottery, which, in the present account, gives the probability of p or r being obtained by being resolved first. Thus, endogenously induced violation of independence, as analysed in the next section, can well arise in settings in which decision making exhibits a critical dependence on time, but is in no way related to the dynamics of choice.
Proposition 4(iii) means that if ≿_{ω,r} violates the independence axiom, (16) excludes ≿ as an EU preference relation, and conversely. This result mutatis mutandis confirms the remark following (2) that, if χ and X_2 can be shown to exist so that ≿_{χ,X_2} violates the independence axiom, (2) excludes ≿ as an EU preference relation. Proposition 4(iv) states that |_ω r preserves stochastic dominance preference. This result is a trivial consequence of P_{ω,r} ⊂ P but does not necessarily include the stochastic dominance preference of ≿_{ω,r} given that of ≿. On the other hand, ≿_{ω,r} must satisfy stochastic dominance preference in order for ≿_{ω,r} and ≿ to be EU preference relations because of Proposition 4(iii). Our final result clarifies this point.
Propositions 4(iii) and 5 together exclude ≿ as an EU preference relation if ≿_{ω,r} satisfies stochastic dominance preference. This result means that, with stochastic dominance preference holding in the presence of r, the comparison p ≿ q does not generally entail αp + (1 − α)r ≿ αq + (1 − α)r, contrary to (1). But Proposition 5 goes further still. It strictly rules out any possibility of postulating independence consistently. To see this, assume that ≿ satisfies the independence axiom together with the other EU axioms on P and, hence, on P_{ω,r} since this is a convex subset of P. Then ≿_{ω,r}, too, is an EU preference relation on P, by Proposition 4(iii). As such, it satisfies the independence axiom. Now Proposition 5 rules out stochastic dominance preference for ≿_{ω,r} by indirect reasoning. This violation of stochastic dominance preference by ≿_{ω,r} excludes the EU property of ≿_{ω,r} since together with the other EU axioms the independence principle implies stochastic dominance preference.
The assumption of an EU preference relation ≿ thus entails a contradiction: there is no room for joint interpretations of Propositions 4 and 5 other than rejection of the independence principle.
The proof of Proposition 5 involves some technicalities, so an informal remark on the basic argument on which it builds would seem appropriate. Stochastic dominance preference in combination with the axioms of weak order and continuity is well known to ensure the existence of a real-valued utility representation of preference. The representation is unique up to strictly increasing transforms of the utility scale, but its axiomatic basis is still too weak to determine its explicit functional form (see, e.g., Becker and Sarin 1987). In the proof of Proposition 5, one can therefore start from the premise that utility functionals representing ≿ and ≿_{ω,r} do exist. But they do not necessarily provide EU representations since p|_ω r is generally not a convex mixture of p and r (except in special cases p|_0 r = r), nor does p ≿ q ⇒ p|_ω r ≿ q|_ω r generally hold as an analogue or substitute of (1). Now, Proposition 5 states not only that ≿_{ω,r} and, by Proposition 4(ii), ≿ are not necessarily EU preferences, but also that EU preferences are excluded in principle, even if ≿ and ≿_{ω,r} satisfy stochastic dominance preference. On the other hand, the assumption of stochastic dominance preference is strong enough to ensure a non-EU utility representation of the underlying preference relation ≿, which the proof of Proposition 5 widely exploits.
The isopreference structures of ≿_{ω,r} on P_{ω,r} and ≿ on P illustrate these results. They are depicted in Fig. 2 on convex sets Δ_{ω,r} ⊂ P_{ω,r} and Δ ⊂ P, respectively, as low-dimensional examples. The indifference lines of ≿ (solid lines, Fig. 2b) show the characteristic "fanning-out" familiar from violations of independence observed in risky choice experiments (e.g., Starmer 2000).
Apart from its theoretical consequences, Proposition 5 bears on the meaning of "rationality" in risky choice. An axiomatic account of non-EU preferences consistent with the present findings can be found in Geiger (2008). It deals with status quo dependent decision making, where status quo risk means the decision maker's extant, risky economic situation as a special instance of background risk. The account straightforwardly explains various types of observed violations of EU preference, notably "fanning-out" and loss aversion (Starmer 2000; Abdellaoui et al. 2008). Together with the above results, it suggests that observed systematic violations of EU do not so much indicate a departure from rationality of risky choice as exhibit rationality under constraints, that is, pragmatic rationality as opposed to independence as a requirement of purely theoretical rationality. "Pragmatic rationality", in turn, means utility maximisation with explicit reference to the decision maker's economic situation, time constraints on risk exposure, demands, and aspiration level.
Fig. 2 Indifference lines of (a) ≿ ω, r on convex set Δ ω, r ⊂ P ω, r (shaded area, dashed lines) and (b) ≿ on convex set Δ ⊂ P (solid lines). The indifference pattern shown is based on the example used in proof of Proposition 5.

Conclusions and extensions
A definitional extension of EU theory has been developed on the basis of the equivalences (2) and (16). Once the concept of risk in the presence of a background risk is given a suitable formal representation, (2) and (16) define conditional preferences induced by background risks such as arise in the comparison of compound lotteries in (1). This definitional extension of EU theory is in fact necessary if the independence axiom is given a normative interpretation and applied to conditional preferences. However, it has turned out that EU theory cannot, as a matter of principle, be consistently extended in this way. Apart from definitions such as (2) and (16) and trivial normalisations such as x 0 = 0, the proof of this impossibility result involves no assumptions that essentially limit the generality of the result; nor was it obtained under conditions exogenously imposed on preference rankings such as the induced non-EU preferences for temporal risk mentioned in the Introduction. 6 Rather, as Eq. (15) and the equivalence (16) make clear, the conditional non-EU preferences are defined in terms of mixtures of lotteries and arise in comparisons of such mixtures, which constitute the very meaning and significance of the independence principle. Nor can the non-EU consequences be avoided by allowing background-dependent risk preferences to violate stochastic dominance preference, for EU preference logically includes stochastic dominance preference. On the other hand, the impossibility result is compatible with the notion of stochastic dominance preference as "perhaps the most obvious principle of rational choice" and, unlike EU theory, a "cornerstone of the normative theory of choice" (Tversky and Kahneman 1986, p. S253).
We close with a few remarks concerning the range of significance of the preceding results. Dynamic consistency and conditionalisation of risk preferences have both been referred to above as necessary theoretical requirements of rational preference orderings, one of which forces independence, while the other excludes it. To clarify this discrepancy, conditional preferences induced by identical sublotteries which occur in alternative multiple-stage lotteries, but may themselves be compound lotteries, would have to be introduced into models of dynamic choice. This task can arguably be tackled within existing frameworks of rational sequential choice corresponding to Hammond's (1998, Sects. 5 and 6) account of dynamically consistent behaviour. In fact, using reduction of compound lotteries in combination with backward recursion, 7 Hammond provides multiple-stage lotteries with single-stage representations to which the conceptualisations and results of the present analysis directly apply. On the other hand, the concept of dynamic consistency will have to be strongly modified to incorporate the conditionalisation of risks and preferences induced by mixture of lotteries. Thus, the decision maker will face different background risks, with different conditional preference relations arising at different stages of a sequential decision task, even if his underlying unconditional preference relation ≿ is "atemporal" and as such remains unchanged. Moreover, he will generally exhibit quite different aspiration levels x 0 at different stages, depending on the gains or losses incurred in previous choices made at antecedent stages. If the alternative lotteries involved in a multiple-stage decision problem are finally given reduced single-stage representations with the use of backward recursion, Propositions 4 and 5 still exclude the possibility that ≿ satisfies independence.
From the proof of Proposition 5 one infers that the reduced versions of multiple-stage lotteries, too, show the fanning-out of indifference lines typical of violation of independence (Figs. 2 and 3).
Finally, one may wish to extend the preceding analysis to infinite discrete and continuous probability distributions as important risk models. Under additional, sufficiently strong conditions, the equivalence of dynamic consistency and independence holds for non-simple probability distributions as well (Hammond 1998, Sects. 8 and 9). But on the other hand, under these conditions, convex sets P* of infinite discrete or continuous probability distributions necessarily contain convex subsets P of simple probability distributions (Hammond 1998, p. 197), to which the above impossibility result applies. More precisely, every weak preference relation ≿* on P* possesses restrictions ≿ on P so that ≿* violates independence if any such ≿ does. Thus, there is in effect no normative justification for the independence axiom even under the stronger conditions under which EU preference orderings exist and are equivalent to dynamically consistent ones on convex sets P* of non-simple probability distributions.
Fig. 3 Convex set Δ and shaded area Δ ω, r = Δ ∩ P ω, r of lotteries with possible outcomes x 1 < x 2 = 0 < x 3 and background risk r ~ 0. The dotted parallel line segments in Δ ω, r indicate strictly increasing preference for increasing ρ ω, r (p), p 2 = constant. The dashed lines indicate the isomorphism (A.6).
preference strictly increases with (p| ω r) 3 and with ρ ω, r (p) = (p| ω r) 3 /(p| ω r) 1 along parallel straight line segments p 2 = constant < 1 in Δ ω, r . The ratio ρ ω, r (p) is invariant under the transformation p| ω r → (p| ω r)′ of (A.6). For given p 2 < 1, the inverse transformation of (A.6) exists. Hence, under (A.6) parallel straight line segments in Δ ω, r with constant, but different values of p 2 are isomorphic, with respect to preference, to the boundary segment p 2 ′ = 0 and, hence, to each other (Fig. 3). Observe that the real-valued functional U which represents ≿ on P and, hence, on Δ ω, r can be normalised to be unique.
Rewrite U(p| ω r) as U(ρ ω, r (p), p 2 ), considering that ρ ω, r (p) and p 2 uniquely determine p| ω r. Then U(ρ ω, r (p 2 ′), 0) and U(ρ ω, r (p), c) respectively represent the isomorphic preference orderings on the straight line segments p 2 ′ = 0 and p 2 = c, 0 ≤ c < 1, with U(ρ ω, r (p 2 ′), 0) and U(ρ ω, r (p), c) strictly increasing in ρ ω, r . To exhibit this isomorphic property, U(ρ ω, r (p), p 2 ) must be a strictly increasing transform of U(ρ ω, r (p), 0) for given p 2 and take on the general form of Eq. (A.7), where V(p 2 ) > 0 and b is a real constant which can be used to extend Eq. (A.7) to the case p 2 = 1 and then normalise U to U(ρ ω, r (0), 1) = 0. Observe that this extension and normalisation imply V(1) = b = 0 and that U(ρ ω, r (p), 0) is a well-defined expression even if p 2 ≠ 0 since ρ ω, r (p) is independent of p 2 . To see that V(p 2 ) strictly decreases with increasing p 2 , consider the special case p 3 = 0 in which preference and, hence, U(ρ ω, r (p), p 2 ) strictly increase with p 2 , by stochastic dominance preference. Considering r ~ 0 and the normalisation U(ρ ω, r (0), 1) = 0 with b = 0 in Eq. (A.7), one has Eq. (A.8). Equation (A.8) implies U(ρ ω, r (r), 0) = 0, because r 2 < 1 and V(r 2 ) > 0 for r 2 < 1. Now, one has U(ρ ω, r (p), 0) < U(ρ ω, r (r), 0) = 0 for p 3 = 0 < r 3 so that, according to Eq. (A.7), V(p 2 ) must strictly decrease for U(ρ ω, r (p), p 2 ) to increase strictly with p 2 , and, clearly, so must V(p 2 ) for arbitrary p. Since V(p 2 ) and U(ρ ω, r (p), 0) are both strictly monotonic and, hence, non-constant functions, U(ρ ω, r (p), p 2 ) is non-linear in the probabilities, by Eq. (A.7). Since U(ρ ω, r (p), p 2 ) represents ≿ ω, r uniquely on Δ, ≿ ω, r violates the independence axiom on Δ and, hence, on P. As an immediate consequence of Propositions 4(i) and 4(ii), (Δ ω, r , ≿) is a non-EU preference structure, too (Fig. 2). 8
− Step 2 Let ≿ ω, r be given as in Step 1, and assume that ≿ Ω, s satisfies stochastic dominance preference on P for some arbitrary s except s = 0, where 0 < Ω < 1. Observe that 0 ∈ P Ω, s (see Eq. (11)) and that there exists some r' ∈ P Ω, s so that r' ~ 0 but r' ≠ 0. Now construct a convex subset Δ' ⊂ P Ω, s similar to Δ in Step 1, define (Δ Ω, r' )' = Δ' ∩ (P Ω, s ) Ω, r' and, following the proof carried out in Step 1, proceed to show that ≿ Ω, r' violates independence on Δ' and, hence, on P Ω, s . From Proposition 4(ii) one immediately concludes that ((P Ω, s ) Ω, r' , ≿) is a non-EU preference structure. Recall that (P Ω, s ) Ω, r' is a convex subset of P Ω, s so that (P Ω, s , ≿) and, once more by Proposition 4(ii), (P, ≿ Ω, s ) are non-EU preference structures. Hence, ≿ Ω, s violates the independence axiom on P.
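The displays for Eqs. (A.7) and (A.8) referred to in Step 1 are not reproduced in this text. As a hedged reconstruction from the verbal description alone (a strictly increasing transform of U(ρ ω, r (p), 0) with V(p 2 ) > 0 and a real constant b, normalised so that V(1) = b = 0), one affine form consistent with that description would be the following. It is an assumption inferred from the surrounding statements, not a quotation of the original equations.

```latex
% Assumed reconstruction, inferred only from the verbal description in Step 1.
\begin{align*}
U\bigl(\rho_{\omega,r}(p),\, p_2\bigr)
  &= V(p_2)\, U\bigl(\rho_{\omega,r}(p),\, 0\bigr) + b,
  \qquad V(p_2) > 0 \ \text{for } p_2 < 1,
  \tag{A.7, assumed} \\
V(r_2)\, U\bigl(\rho_{\omega,r}(r),\, 0\bigr)
  &= U\bigl(\rho_{\omega,r}(r),\, r_2\bigr) = 0
  \tag{A.8, assumed}
\end{align*}
```

Under this reading with b = 0, the case p 3 = 0 < r 3 gives a negative factor U(ρ ω, r (p), 0), so the product can increase with p 2 only if V(p 2 ) decreases. This matches the monotonicity argument in Step 1, and the product of the two non-constant factors is what Step 1 invokes to conclude non-linearity in the probabilities.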