Making the Anscombe-Aumann approach to ambiguity suitable for descriptive applications

The Anscombe-Aumann (AA) model, originally introduced to give a normative basis to expected utility, is nowadays mostly used for another purpose: to analyze deviations from expected utility due to ambiguity (unknown probabilities). The AA model makes two ancillary assumptions that do not refer to ambiguity: expected utility for risk and backward induction. These assumptions, even if normatively appropriate, fail descriptively. This paper relaxes these ancillary assumptions to avoid the descriptive violations, while maintaining AA's convenient mixture operation. Thus, it becomes possible to test and apply all AA-based ambiguity theories descriptively while avoiding confounds due to violated ancillary assumptions. The resulting tests use only simple stimuli, avoiding noise due to complexity. We demonstrate the latter in a simple experiment where we find that three assumptions about ambiguity, commonly made in AA theories, are violated: reference independence,
Han Bleichrodt and Horst Zank made useful comments. An anonymous referee substantially improved the paper.
Electronic supplementary material: The online version of this article (https://doi.org/10.1007/s11166-018-9273-7) contains supplementary material, which is available to authorized users.
Stefan Trautmann (trautmann@uni-hd.de), Alfred-Weber-Institute for Economics, University of Heidelberg, Bergheimer Str. 58, 69115 Heidelberg, Germany
Peter P. Wakker (Wakker@ese.eur.nl), Econometric Institute, Erasmus University Rotterdam, P.O. Box 1738, Rotterdam, 3000 DR, The Netherlands


Introduction
We first introduce a reduced version of the AA model (rAA) so as to avoid the violations of the ancillary Assumptions (1) and (2). The major violations of EU (Assumption (1)), such as the certainty effect that underlies the Allais paradox, involve degenerate lotteries.
We avoid these violations by only using nondegenerate lotteries for risky decisions. The violations of backward induction are avoided in our experiment, in brief, by replacing all second-stage lotteries in two-stage uncertainty by their certainty equivalents ourselves, rather than relying on subjects doing so. Despite these two modifications, the rAA model still preserves the main advantage of the AA model: a convenient mixture operation on consequences. Further explanation is given later. In the rAA model, two-stage uncertainty no longer occurs. An additional advantage of the latter is that complex stimuli, which can only be used hypothetically for normative purposes (Kreps 1988, p. 101), are avoided, reducing the burden on subjects and the noise in the data.
We demonstrate the feasibility of the rAA approach in a simple experiment. We can then test the substantive Assumptions (3) and (4) without confounds. Unsurprisingly, reference dependence, demonstrated in many decision fields outside of ambiguity, and for decision under ambiguity outside of the AA model (reviewed by Trautmann & van de Kuilen 2013), also holds for ambiguity within the AA model. Losses are treated differently than gains, generating more ambiguity seeking. The well-known disposition effect in investment decisions illustrates the different treatment of gains and losses: people keep stocks (which usually comprise ambiguity) rather than taking the certainty of selling when these stocks have generated losses. But they do not do so when the stocks have generated gains (Barberis & Xiong 2009). Thus Assumptions (3) and (4) are violated.
can accommodate loss aversion, and ambiguity aversion for gains combined with ambiguity seeking for losses. Put differently, we show how the AA model can be extended to cover Tversky & Kahneman's (1992) prospect theory. In many applications of ambiguity (asset markets, insurance, health) the gain-loss distinction is important, and descriptive models assuming reference-independent universal ambiguity aversion will be flawed. Faro (2005, Ch. 3) proposed another ambiguity model with reference dependence.
The smooth model of ambiguity (Klibanoff, Marinacci, & Mukerji 2005) and other utility-driven theories of ambiguity (Chew et al. 2008; Ju & Miao 2012; Nau 2006; Neilson 2010) can also treat losses differently than gains. These models still focus on universal ambiguity aversion. Because they are outcome driven, they cannot model the empirically prevailing phenomenon of ambiguity seeking for unlikely events joint with ambiguity aversion for likely events (reviewed by Trautmann & van de Kuilen 2013).
Our model can readily describe this phenomenon by weighting functions that are insensitive (subadditive for unlikely events, and additive for likely events). Generalizations of other AA ambiguity theories to incorporate reference dependence, some ambiguity seeking, and other descriptive generalizations are a topic for future research. Dobbs (1991) also proposed a recursive utility-driven theory of ambiguity, but particularly argued for different attitudes for gains than for losses, which he demonstrated in an experiment.
He thus is close to our approach.
This paper proceeds as follows. Section 2 presents an experiment demonstrating reference dependence of ambiguity attitudes and violating weak certainty independence.
The experiment is based on our solution to the ancillary-structure problems, formalized in later sections, where the experiment will then serve as a toy example. Yet the experiment fully fits within the traditional AA approach, and can be understood without any knowledge of our rAA model. In particular, the violations of reference independence that we find directly pertain to the traditional (nonreduced) two-stage AA model.
ambiguity-loss aversion. Here loss aversion can be stronger (or weaker) under ambiguity than under risk, providing an additional way to generate ambiguity aversion. The axiomatizations in §§5-7 can be read independently of §§3-4 on the AA models. The provided theorems all serve the main methodological purpose of this paper: to make the AA model suited for descriptive applications. A discussion, with implications for existing ambiguity models, is in §8. Section 9 concludes.

Experimental illustration of the reduced AA model and reference dependence
This section presents a small experiment where we find violations of most of the ambiguity models in the literature that use the AA model. Our experiment will not depend on the ancillary assumptions and, thus, the violations found are substantive. First, to prepare, we present a common example. The unit of payment in the example can be taken to be money or utility. In the experiment that follows after, the unit of payment will be utility and not money, so that the violations directly pertain to the general AA
and 50 black (B) balls. An unknown (ambiguous) urn A contains 100 black and red balls in unknown proportion. One ball will be drawn at random from each urn, and its color will be inspected. $R_k$ denotes the event of a red ball drawn from the known urn, and $B_k$, $R_a$, and $B_a$ are analogous. People usually prefer to receive 100 under $B_k$ (and 0 otherwise) rather than under $B_a$, and they also prefer to receive 100 under $R_k$ rather than under $R_a$. These choices reveal ambiguity aversion for gains.
We next multiply all outcomes by $-1$, turning them into losses. This change of sign can affect decision attitudes. Many people now prefer to lose 100 under $B_a$ rather than under $B_k$, and also to lose 100 under $R_a$ rather than under $R_k$. That is, many people exhibit ambiguity seeking for losses.
The above example illustrates that ambiguity attitudes are different for gains than for losses, making it desirable to separate these. The latter is impossible in virtually all ambiguity models existing today. We tested the above choices in our experiment.
Subjects were N = 45 undergraduate students from Tilburg University. We asked both for preferences with red the winning color and for preferences with black the winning color. This way we avoided suspicion about the experimenter rigging the composition of the unknown urn (Pulford 2009).
We then assumed indifference between the safe and risky prospect with that outcome instead of j in the risky prospect. We used that monetary outcome, which depends on the subject, as the loss outcome for this subject. This way the loss outcome was $-100$ in utility units for each subject (see footnote 4). Details of the experiment are in the web appendix.
We elicited the preferences of Example 1 from our subjects using utility units, with the gain outcome €10 generating utility $+100$, and the loss outcome generating utility $-100$. Combining the bets on the two colors, the number of ambiguity averse choices was larger for gains than for losses (1.49 vs. 1.20, z = 2.015, p < .05, Wilcoxon test, two-sided), showing that ambiguity attitudes are different for gains than for losses. For gains we replicate strong ambiguity aversion (z = 3.773, p < .01, Wilcoxon test, two-sided), but for losses we cannot reject the null of ambiguity neutrality (z = 1.567, p > .10, Wilcoxon test, two-sided; see footnote 5). Our experiment confirms that attitudes towards ambiguity are different for gains than for losses, suggesting violations of most ambiguity models used today. The following sections will formalize this claim.
Footnote 4. How our measurement of utility incorporates loss aversion under risk is discussed in §7.
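The reported means of 1.49 and 1.20 ambiguity averse choices can be recovered from the distribution of subjects given in the paper's footnote on these tests. The following is a minimal sketch of that arithmetic only; the function name is ours, and it does not reproduce the Wilcoxon tests.

```python
# Number of subjects choosing the ambiguous option never, once, or twice,
# as reported in the paper's footnote on the Wilcoxon tests.
gains = (28, 11, 6)
losses = (21, 12, 12)


def mean_averse_choices(counts):
    """Mean number of ambiguity averse choices over the two choice situations."""
    never, once, twice = counts
    n_subjects = never + once + twice
    # Choosing ambiguous k times out of 2 means 2 - k ambiguity averse choices.
    total_averse = 2 * never + 1 * once + 0 * twice
    return total_averse / n_subjects


print(sum(gains), sum(losses))                 # 45 45
print(round(mean_averse_choices(gains), 2))    # 1.49
print(round(mean_averse_choices(losses), 2))   # 1.2
```

Both distributions sum to the N = 45 subjects, and the means match the values in the text.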
Detailed empirical investigations of reference dependence of ambiguity theories in the AA model are a topic for future research. Baillon & Bleichrodt (2012) studied reference dependence of ambiguity in detail using matching probabilities. They found large differences between gains and losses. Although their study was not done in the AA model, it is consistent with our example, showing that ambiguity models need to incorporate reference dependence for descriptive applications. Dobbs (1991) contains an experiment similar in spirit to ours, also confirming reference dependence of ambiguity attitudes.
We chose a different experiment than Dobbs did so as to obtain direct violations of weak certainty independence.

The traditional two-stage AA model
This section presents the usual AA model. Our rAA model is in the next section.
Although the presentation of these two sections is formal, it is still elementary and accessible to empirically oriented readers. Advanced formal results are in §§5 and 6.
We assume a, possibly infinite, set S of states. D denotes a set of (deterministic) outcomes, with generic elements $\alpha, \beta, x_i, y_i$. By $\succcurlyeq$ we denote a preference relation of a decision maker on D, with $\succ$ and $\sim$ the usual strict and symmetric relations. L is the set of (roulette) lotteries. A (roulette) lottery is a probability distribution over D taking finitely many values. The generic notation is $x = (p_1{:}x_1, \ldots, p_m{:}x_m)$, with the obvious interpretation. Outcomes are identified with the corresponding degenerate lotteries $(1{:}\alpha)$.
A (two-stage) act $f = (E_1{:}f_1, \ldots, E_n{:}f_n)$ denotes a function from S to L taking only
$h(s) = f(s)_p g(s)$ for all s; we then also write $h = pf + (1-p)g$.
Footnote 5. Testing is against the null of one ambiguity averse choice in two choice situations. The exact distribution of subjects choosing ambiguous never, once, or twice is (28, 11, 6) for gains, and (21, 12, 12) for losses.
We now turn to the definition of the (two-stage) Anscombe-Aumann (AA) model. It was popularized by Gilboa & Schmeidler (1989) and Schmeidler (1989), and is commonly used in the modern literature on ambiguity. We use the terminology of those two papers as much as possible.
The decision maker also has a preference relation on A, again denoted $\succcurlyeq$. Through constant acts, $\succcurlyeq$ generates a preference relation, also denoted $\succcurlyeq$, on lotteries L. This in turn, through degenerate lotteries, generates a preference relation over outcomes D that we assume to agree with the preference relation defined there before. That is, $\succcurlyeq$ on D has been extended to L and A, which is why we use the same symbol. There exist a best outcome B and a worst outcome W, with $B \succcurlyeq \alpha \succcurlyeq W$ for all outcomes $\alpha$. These best and worst outcomes will simplify utility scalings and relations between different models. They will not be assumed in the general theoretical results in later sections. A certainty equivalent (CE) of a lottery is an outcome that is equivalent to the lottery.
The certainty equivalent condition means that there exists a unique certainty equivalent for each lottery. Uniqueness can always be achieved by collapsing indifference classes of outcomes.
A function V represents $\succcurlyeq$ if $f \succcurlyeq g$ if and only if $V(f) \geq V(g)$. If a representing function exists then $\succcurlyeq$ is a weak order, i.e. $\succcurlyeq$ is complete (for all acts f and g, $f \succcurlyeq g$ or $g \succcurlyeq f$) and transitive. $\succcurlyeq$ is nontrivial if (not $f \sim g$) for some f and g in A.
Monotonicity holds if $f \succcurlyeq g$ whenever $f(s) \succcurlyeq g(s)$ for all s in S. A function u on L is expected utility (EU) if $u((p_1{:}x_1, \ldots, p_m{:}x_m)) = \sum_{j=1}^{m} p_j u(x_j)$ and it represents $\succcurlyeq$ on L. We use the same symbol u for the function defined on D and its expectation defined on L. We sometimes call u on L the risky utility function.
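The EU formula, and the fact that EU is affine in probabilistic mixtures of lotteries, can be sketched in a few lines. This is only an illustration: the square-root utility and the two lotteries are hypothetical choices of ours, and the list-of-pairs encoding of a lottery is an assumption made for the sketch.

```python
import math


def expected_utility(lottery, u):
    """EU of a finite lottery given as [(p_1, x_1), ..., (p_m, x_m)]."""
    total_p = sum(p for p, _ in lottery)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * u(x) for p, x in lottery)


def mix(lottery_x, lottery_z, p):
    """Probabilistic mixture x_p z: lottery x with probability p, else z."""
    return ([(p * q, o) for q, o in lottery_x]
            + [((1 - p) * q, o) for q, o in lottery_z])


def u(x):
    """Hypothetical utility for nonnegative money amounts (an assumption)."""
    return math.sqrt(x)


x = [(0.5, 100.0), (0.5, 0.0)]   # fifty-fifty lottery, EU = 5
z = [(1.0, 16.0)]                # degenerate lottery, EU = 4

lhs = expected_utility(mix(x, z, 0.25), u)
rhs = 0.25 * expected_utility(x, u) + 0.75 * expected_utility(z, u)
print(math.isclose(lhs, rhs))    # True: EU(x_p z) = p EU(x) + (1 - p) EU(z)
```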
DEFINITION 2 The (two-stage) AA model holds if a nontrivial and monotonic weak order $\succcurlyeq$ is given on the set A of acts, with a best outcome B and a worst outcome W, the CE condition satisfied, and with expected utility (u) holding on L.
An act f is one-stage if all lotteries f(s) are degenerate; i.e., f assigns outcomes rather than nondegenerate lotteries to all states (upper panel in Figure 1). Then all relevant uncertainty has been resolved in the first stage. A lottery, identified with the corresponding constant act, is sometimes also called a one-stage lottery (left panel in Figure 1). Now all relevant uncertainty is resolved in the second stage.
The AA assumptions of EU on L and of monotonicity are called ancillary assumptions. They imply that a function representing preferences over acts must be of the
The experiment in §2 concerns the two-stage AA model with: (a) $S = \{R_a, B_a\}$; (b) bets on the ambiguous urn are acts; (c) bets on the known urn are fifty-fifty lotteries; (d) also other lotteries were used; (e) $B = 10$, $W = -20$. We used EU to analyze risky choices. We only used a subpart of the two-stage AA model, which will later be formalized as the rAA model, in two respects. First, all acts and lotteries presented to subjects were one-stage (upper and left panel in Figure 1). We generated all desired utility levels at the second stage using outcomes, i.e. degenerate lotteries.

The reduced AA model
The experiment in §2 suggested violations of the common ambiguity models. We will later show that we in fact tested and falsified Schmeidler's (1989)
Figure 1). This replacement is not done by the decision maker, but by the researcher. All the decision maker does is express preferences over one-stage acts and one-stage lotteries, as in the experiment in §2. As this section will explain, the researcher then, for the purpose of using theories for the two-stage AA model, derives a corresponding two-stage AA model by implementing the ancillary assumptions himself rather than assuming them on the part of the subjects. We thus need not assume that subjects behave as if replacing lotteries by CEs for risk, because we implement the replacement ourselves. We inferred Eq. 4 this way.
Figure 1: The reduced AA model. The CCE mapping (value $y_j$ for each $j$) is defined by the left-panel indifferences. The reduced AA model only concerns the upper (ambiguity) and left (risk) panels, and not the theoretical part. Of the risk panel we only use the subpart with a $\mu$ probability of lottery $R$, to stay away from the upper corner. Only this reduced model is used for empirical work, avoiding the extensive empirical violations in the theoretical part. Through the CCE mapping, the reduced AA model uniquely determines the theoretical part. Thus all theoretical features of the AA model, including its mixture operation, can be used, and all AA decision theories can be tested and applied empirically.
The rAA model preserves the underlying mixing of outcomes of the two-stage AA model, by temporarily returning, for a mix $\gamma = \alpha\,'_p\,\beta = p\alpha +' (1-p)\beta$, to the underlying lotteries. We have added the primes before the mixing probability and after the plus symbol to indicate that this mixture operation is formally different from the (probabilistic) mixture operation in the two-stage AA model, although it will be isomorphic and will generate the same indifference class. Informally, we take any lotteries $x, z$ with $\alpha = CE(x)$, $\beta = CE(z)$, we take $y = x_p z$, and then get $\gamma$ as $CE(y)$. Because of EU on L, it does not matter which $x$ and $z$ we take in this process, and the operation is well defined. Always
$$CE(x)\,'_p\,CE(z) = CE(x_p z). \qquad (5)$$
This holds irrespective of the particular choices $x$ and $z$. We use Eq. 5 as the formal definition of the new mixture operation. The mixture operation is most easily observable from:
$$\alpha\,'_p\,\beta \sim \alpha_p \beta. \qquad (6)$$
That is, we take $x = \alpha$ and $z = \beta$. An example is Eq. 1, which expressed 0 as such a mixture, with weight 0.5, of 100 and a second outcome.
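To make the outcome mixture concrete: if EU holds on lotteries with a known, strictly increasing utility, the mixed outcome is the certainty equivalent of the corresponding two-outcome lottery. The sketch below assumes a hypothetical piecewise-linear utility with a kink at 0; in the paper's experiment the mixture is instead observed through elicited indifferences, so nothing here is the authors' procedure.

```python
def u(x):
    """Hypothetical risky utility: linear, with losses counting double."""
    return x if x >= 0 else 2.0 * x


def u_inv(v):
    """Inverse of u: the sure outcome that has utility level v."""
    return v if v >= 0 else v / 2.0


def ce_mix(alpha, beta, p):
    """Outcome gamma that is equivalent to the lottery (p: alpha, 1-p: beta)."""
    return u_inv(p * u(alpha) + (1 - p) * u(beta))


# Under this assumed utility, a sure 0 is the 0.5-mixture of 10 and -5.
print(ce_mix(10.0, -5.0, 0.5))   # 0.0
```

Because the mixture is defined through utilities, it is affine in u by construction, which is the property that AA-based theorems rely on.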
The second modification [avoiding degenerate lotteries for risky preferences]. In the first modification, we put deterministic outcomes central for the analysis of ambiguity by focusing on one-stage acts. Violations of EU for such acts, due to ambiguity, are our substantive interest. In the second modification considered now, concerning the analysis of risk through one-stage lotteries, we avoid degenerate lotteries, staying away from the upper left box in Figure 1. For such lotteries there are many violations of EU that distort our ancillary assumptions.
For each lottery $x$, we define $x' = R_\mu x$. Under the ancillary AA assumptions, EU holds on L, and then a CE-indifference $\alpha \sim y$ is not affected if we bring in $R$, as in
$$R_\mu \alpha \sim R_\mu y, \text{ i.e., } \alpha' \sim y'. \qquad (7)$$
In general, indifferences are not affected under EU if we add or remove primes from all the lotteries. We call $\alpha$ in Eq. 7 the conditional CE of $y$, denoted $\alpha = CCE(y)$. We used this procedure in Eqs. 1 and 2. The CCE condition means that there exists a unique CCE for each lottery $x \in L$. Given existence, uniqueness can always be achieved by collapsing indifference classes.
The two modifications combined. The rAA model results from combining the two modifications. Every two-stage act f in the two-stage AA model is replaced by CCE(f), defined by replacing every f(s) by CCE(f(s)), and turning every two-stage act into an equivalent one-stage act (arrow in Figure 1). We carried out the first modification, but with primes added to Eq. 6 because of the second modification.
All preferences between two-stage acts in the two-stage AA model can be recovered from their CCE versions; we used this in Eq. 4. Thus, given the ancillary assumptions of AA, all substantive assumptions (concerning the upper panel in Figure 1) can be tested and analyzed using the rAA model.
We call the rAA model derived from the two-stage AA model as just described, the corresponding rAA model. Conversely, from every rAA model the uniquely determined corresponding two-stage AA model can be recovered, mostly by deriving preferences between two-stage acts from their CCE images. We summarize the rAA model formally. Preferences over outcomes agree with those over constant acts, and are represented by u on D. Monotonicity holds. Conditional certainty equivalents, denoted CCE, are defined as in Eq. 7, and are assumed to uniquely exist for every $x \in L$ (the CCE condition). The mixture operation on outcomes is defined through Eq. 5, and can for instance be revealed from indifferences through the following analog of Eq. 6:
We summarize some useful relations between the corresponding reduced and two-stage AA models.
OBSERVATION 4 There is a one-to-one correspondence between two-stage and reduced AA models (based on the maps $f \mapsto CCE(f)$ and $x \mapsto x'$), and the preferences of one model uniquely determine those of the other. The rAA model is a substructure of the corresponding two-stage AA model, and its preferences agree with the restriction of the two-stage AA model preferences.
Thus, whereas two-stage acts may appear as ancillary tools in proofs of mathematical theorems on V in Eq. 3, subjects are never exposed to such complex stimuli. Our recommendation explains, in formal terms, how the full literature on two-stage AA models can be used for descriptive purposes to study the function V in Eq. 3 without paying the descriptive price of the ancillary assumptions 1 (EU for risk) and 2 (backward induction), which relate to the part "EU" in Eq. 3. We have shown that the lower right part in Figure 1 is redundant.

Choquet expected utility
This section provides some well-known results. Proofs can be found in the source papers, or in the didactic Ryan (2009). We present our main theorems for general mixture spaces, which cover both the two-stage and the rAA models. By Observation 18 in the appendix, all results proved in the literature for the two-stage AA model also hold for general mixture spaces.
M denotes a set of consequences, with generic elements $x, y$. In the two-stage AA model, consequences are lotteries, which incorporates outcomes. In the rAA model, consequences are outcomes. M is a mixture space. That is, M is endowed with a mixture operation, assigning to all $x, y \in M$ and $p \in [0, 1]$ an element of M denoted $px + (1-p)y$ or $x_p y$. In the two-stage AA model, mixing is probabilistic. In the rAA model, mixing is defined in Eq. 5. The following conditions define a mixture operation.
An interesting alternative case is the following example, where mixing is not probabilistic.
EXAMPLE 5 $M = \mathbb{R}$ and mixing is the natural mixing of real numbers.
As throughout, S denotes the state space. We now assume that it is endowed with an algebra of subsets, called events. An algebra contains S and $\emptyset$ and is closed under complementation and finite unions and intersections. A (two-stage) act $f = (E_1{:}f_1, \ldots, E_n{:}f_n)$ now takes values in M and the $E_j$'s are events partitioning the state space. The set of acts A is again endowed with pointwise mixing, which satisfies all conditions for mixture operations, implying that A is a mixture space. Preferences are again over the set of acts A and are denoted $\succcurlyeq$, again generating $\succcurlyeq$ over consequences through constant acts. Continuity holds if, whenever $f \succ g$ and $g \succ h$, there are p and q in $(0, 1)$ such that $f_p h \succ g$ and $f_q h \prec g$.
An affine function u on M satisfies $u(x_p y) = pu(x) + (1-p)u(y)$. In the two-stage
independence, and Theorem 7 can obviously be applied to any mixture set other than M, such as the set of acts A. We next turn to two classic results.
Anscombe & Aumann's subjective expected utility. A probability measure P on S maps the events to $[0, 1]$ such that $P(\emptyset) = 0$, $P(S) = 1$, and P is additive ($P(E \cup F) = P(E) + P(F)$ for all disjoint events E and F). Subjective expected utility (SEU) holds if there exists a probability measure P on S and a function u on M, such that $\succcurlyeq$ is represented by the function SEU defined as follows:
$$SEU(f) = \sum_{j=1}^{n} P(E_j)\,u(f_j) \quad \text{for } f = (E_1{:}f_1, \ldots, E_n{:}f_n).$$
In the following theorem, independence on A implies independence on M through constant acts.
THEOREM 8 [Anscombe & Aumann (1963)]. The following two statements are equivalent: (i) Subjective expected utility holds with nonconstant affine u on M.
The probabilities P on S are uniquely determined and u on M is unique up to level and unit.
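Schmeidler's Choquet Expected Utility. A capacity v on S maps events to $[0, 1]$, such that $v(\emptyset) = 0$, $v(S) = 1$, and $E \supset F$ implies $v(E) \geq v(F)$. Unless stated otherwise we use a rank-ordered notation for acts $f = (E_1{:}x_1, \ldots, E_n{:}x_n)$, that is, $x_1 \succcurlyeq \cdots \succcurlyeq x_n$ is implicitly understood.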
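Let v be a capacity on S. Then, for any function $w: S \to \mathbb{R}$, the Choquet integral of w with respect to v is
$$\int_S w\,dv = \int_0^\infty v(\{s : w(s) \geq \tau\})\,d\tau + \int_{-\infty}^0 \big[v(\{s : w(s) \geq \tau\}) - 1\big]\,d\tau.$$
Choquet expected utility holds if there exist a capacity v and a function u on M such that preferences are represented by
$$f \mapsto \int_S u(f(s))\,dv.$$
Two acts f and g in A are comonotonic if for no s and t in S, $f(s) \succ f(t)$ and $g(s) \prec g(t)$. Thus any constant act is comonotonic with any other act. A set of acts is comonotonic if every pair of its elements is comonotonic.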
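For a finite act, the Choquet integral reduces to a rank-dependent weighted sum: rank consequences from best to worst and give each event the increase in capacity it adds to the union of all better-ranked events. A minimal sketch follows; the two-state capacity with value 0.4 on each color is a hypothetical choice of ours, as are the state labels and function names.

```python
import math


def choquet_expected_utility(act, capacity, u):
    """CEU of a finite act given as [(event, consequence), ...].

    Events are frozensets of states that partition the state space;
    `capacity` maps a frozenset of states to a number in [0, 1].
    """
    ranked = sorted(act, key=lambda pair: u(pair[1]), reverse=True)  # best first
    value = 0.0
    cumulative = frozenset()
    for event, consequence in ranked:
        # Decision weight: capacity gained by adding this event to better ones.
        weight = capacity(cumulative | event) - capacity(cumulative)
        value += weight * u(consequence)
        cumulative = cumulative | event
    return value


S = frozenset({"Ra", "Ba"})   # colors drawn from the unknown urn


def v(event):
    """Hypothetical capacity: each single color gets 0.4 (an assumption)."""
    if not event:
        return 0.0
    return 1.0 if event == S else 0.4


bet_on_red = [(frozenset({"Ra"}), 100.0), (frozenset({"Ba"}), 0.0)]
ceu = choquet_expected_utility(bet_on_red, v, lambda x: x)
print(ceu < 50.0)   # True: the ambiguous bet is valued below a fifty-fifty bet
```

With a nonadditive capacity the two single-color weights sum to less than 1, which is how CEU captures ambiguity aversion for gains in the urn example.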

DEFINITION 9 Comonotonic independence holds if
$f \succ g \Rightarrow f_p c \succ g_p c$ for all $0 < p < 1$ and comonotonic acts f, g, and c.
Under comonotonic independence, preference is not affected by mixing with constant acts (consequences) (with some technical details added in Lemma 21). Because constant acts are comonotonic with each other, comonotonic independence on A still implies independence on M.
The capacity v on S is uniquely determined and u on M is unique up to level and unit.
If we apply the above theorem to Example 5, then we obtain a derivation of Choquet expected utility with linear utility that is alternative to Chateauneuf (1991, Theorem 1).
Comonotonic independence implies a condition assumed by most models for ambiguity proposed in the literature.

DEFINITION 11 Weak certainty independence holds if
$f_q x \succcurlyeq g_q x \Rightarrow f_q y \succcurlyeq g_q y$ for all $0 < q < 1$, acts $f, g$, and all consequences $x, y$.
That is, preference between two mixtures involving the same constant act x with the same weight $1 - q$ is not affected if x is replaced by another constant act y. This condition follows from comonotonic independence because both preferences between the mixtures should agree with the unmixed preference between f and g (again, with some technical details added in Lemma 21). Grant & Polak (2011a) demonstrated that the condition can be interpreted as constant absolute uncertainty aversion: adding a constant to all utility levels will not affect preference.
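One way to see how the reflected Ellsberg pattern (ambiguity aversion for gains, ambiguity seeking for losses) conflicts with weak certainty independence is to mix a bet and the reference point with two different constants. The sketch below is our own construction and not necessarily the argument used in the paper: it works in utility units on the two colors of the unknown urn, assumes hypothetical sign-dependent weights of 0.4 on each single color, and uses the fact that mixing is linear in utility because u is affine.

```python
def sign_dependent_value(act, w_gain=0.4, w_loss=0.4):
    """Value of a two-state act (utility on red, utility on black).

    w_gain and w_loss are hypothetical capacities of a single color for
    gains and for losses; the reference point has utility 0.
    """
    hi, lo = max(act), min(act)
    if lo >= 0:                       # both gains: best outcome ranked first
        return w_gain * hi + (1 - w_gain) * lo
    if hi <= 0:                       # both losses: worst outcome ranked first
        return w_loss * lo + (1 - w_loss) * hi
    return w_gain * hi + w_loss * lo  # mixed act


def mix(act, const, q):
    """Utility profile of the mixture f_q x with a constant of utility const."""
    return tuple(q * a + (1 - q) * const for a in act)


f = (100.0, -100.0)   # gain on red, loss on black
g = (0.0, 0.0)        # constant act at the reference point
q = 0.5

prefers_f_with_x = sign_dependent_value(mix(f, 100.0, q)) >= sign_dependent_value(mix(g, 100.0, q))
prefers_f_with_y = sign_dependent_value(mix(f, -100.0, q)) >= sign_dependent_value(mix(g, -100.0, q))
print(prefers_f_with_x, prefers_f_with_y)   # False True
```

Mixing with the constant 100 turns f into a gain bet on red and g into a sure 50, while mixing with the constant of utility -100 turns f into a loss bet on black and g into a sure -50. The preference between the mixtures reverses, which the condition rules out.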

Reference dependence in the AA model
This reasoning does not use any assumption about (utility of) the outcomes $100$, $-100$, other than that they are of different signs (with $u(0) = 0$). For later purposes we show that even weak certainty independence is violated. In the proof of the following observation we essentially use the linear (probabilistic) mixing of outcomes typical of the AA models.
OBSERVATION 12 Example 1 violates comonotonic independence, and even weak certainty independence.
This section presents such a generalization. As in all main results, the analysis will be analogous to Schmeidler's analysis of rank dependence in Choquet expected utility as much as possible. Given this limitation, we stay as close as possible to the analysis of Tversky & Kahneman (1992).
In prospect theory there is a special role for a reference point, denoted $\theta$. In our model it is a consequence that indicates a neutral level of preference. It often is the status quo of the decision maker. In Example 1, the deterministic outcome 0 was the reference point. Under the certainty equivalent condition in the AA model, we can always take a deterministic outcome as reference point. Sugden (2003)
Prospect theory holds if there exist two capacities $v^+$ and $v^-$ and a function U on consequences with $U(\theta) = 0$ such that $\succcurlyeq$ is represented by
We call U in Eq. 12 the (overall) utility function. There is a basic utility u, and a loss aversion parameter $\lambda > 0$, such that
For reasons explained later, we call $\lambda$ the ambiguity-loss aversion parameter. Because
For later purposes we rewrite Eq. 12 as
$$\sum_{j=1}^{n} \pi_j U(x_j),$$
with decision weights $\pi_j$ defined as follows. Assume, for act $(E_1{:}x_1, \ldots, E_n{:}x_n)$, the rank-ordering $x_1 \succcurlyeq \cdots \succcurlyeq x_k \succcurlyeq \theta \succcurlyeq x_{k+1} \succcurlyeq \cdots \succcurlyeq x_n$. We define
for $j \leq k$: $\pi_j = v^+(E_1 \cup \cdots \cup E_j) - v^+(E_1 \cup \cdots \cup E_{j-1})$;
for $j > k$: $\pi_j = v^-(E_j \cup \cdots \cup E_n) - v^-(E_{j+1} \cup \cdots \cup E_n)$.
For gain events, the decision weight depends on cumulative events that yield better consequences. For loss events, the decision weight similarly depends on decumulative events that yield worse consequences. CEU analyzed in the preceding section is the special case of PT where $v^-$ is the dual of $v^+$ and $\lambda$ in Eq. 15 is 1.
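The decision weights described above (cumulative for gains, decumulative for losses) can be sketched for finite acts as follows. This is an illustration under our own assumptions: the overall utility is taken to be the basic utility for gains and the loss aversion parameter times the basic utility for losses, the capacities are hypothetical (0.4 on each single color, for both signs), and the function and state names are ours.

```python
import math


def pt_value(act, v_plus, v_minus, u, lam=1.0):
    """Prospect theory value of a finite act [(event, consequence), ...].

    Events are frozensets partitioning the state space, and the basic
    utility u is 0 at the reference point.
    """
    def overall_utility(x):
        basic = u(x)
        return basic if basic >= 0 else lam * basic

    ranked = sorted(act, key=lambda pair: u(pair[1]), reverse=True)  # best first
    total = 0.0

    # Gains: weight is the increase of v_plus over the union of better events.
    better = frozenset()
    for event, x in ranked:
        if u(x) <= 0:
            break
        total += (v_plus(better | event) - v_plus(better)) * overall_utility(x)
        better = better | event

    # Losses: weight is the increase of v_minus over the union of worse events.
    worse = frozenset()
    for event, x in reversed(ranked):
        if u(x) >= 0:
            break
        total += (v_minus(worse | event) - v_minus(worse)) * overall_utility(x)
        worse = worse | event
    return total


S = frozenset({"Ra", "Ba"})


def v(event):
    """Hypothetical capacity used for both gains and losses (an assumption)."""
    if not event:
        return 0.0
    return 1.0 if event == S else 0.4


def identity(x):
    return x


gain_bet = [(frozenset({"Ra"}), 100.0), (frozenset({"Ba"}), 0.0)]
loss_bet = [(frozenset({"Ra"}), -100.0), (frozenset({"Ba"}), 0.0)]
gain_value = pt_value(gain_bet, v, v, identity)
loss_value = pt_value(loss_bet, v, v, identity)
print(gain_value < 50.0)    # True: below the fifty-fifty gain bet
print(loss_value > -50.0)   # True: above the fifty-fifty loss bet
```

With these assumed weights the same functional yields ambiguity aversion for gains and ambiguity seeking for losses, the pattern of the urn example.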
We next turn to preference conditions that characterize prospect theory. We generalize comonotonicity by adapting a concept of Tversky & Kahneman (1992) to the present context. Two acts f and g are cosigned if they are comonotonic and if there exists no s in S such that $f(s) \succ \theta$ and $g(s) \prec \theta$. Note that, whereas for any act g and any constant act f, f is comonotonic with g, an analogous result need not hold for cosignedness. Only if the constant act is neutral is it cosigned with every other act. This point complicates the proofs in the appendix. A set of acts is cosigned if every pair is cosigned. We next generalize comonotonic independence to allow reference dependence.

DEFINITION 13 Cosigned independence holds if
$f \succ g \Rightarrow f_p c \succ g_p c$ for all $0 < p < 1$ and cosigned acts f, g, and c.
$\succcurlyeq$ is truly mixed if there exists an act f with $f^+ \succ \theta$ and $f^- \prec \theta$. Double matching holds if, for all acts f and g, $f^+ \sim g^+$ and $f^- \sim g^-$ implies $f \sim g$. Now we are ready for the main theorem of this paper.
THEOREM 14 Assume true mixedness. The following two statements are equivalent: (i) Prospect theory holds with U as in Eqs. 13-15.
The capacities are uniquely determined and the global utility function U is unique up to its unit.
We give the proof of the following observation in the main text because it is clarifying.
OBSERVATION 15 Example 1 can be accommodated by prospect theory.
and $v^-(R_k) > v^-(R_a)$. Remember here that large values of $v^-$ correspond with low values of its dual as used in the Choquet integral.
We can take $v^-$ different from $v^+$, letting the former generate ambiguity seeking in agreement with empirical evidence.
A number of new problems have to be resolved in the proof of Theorem 14. In the proof of Schmeidler's Theorem 10, constant acts are comonotonic with all acts, and serve to compare preferences across different comonotonic sets. In the proof of Theorem 14, however, gains are not cosigned with losses, and there is no direct way to compare preferences across different sign-comonotonic sets. We similarly lose the possibility to substitute comonotonic conditional certainty equivalents. A third problem is that the global utility function U is only piecewise linear in risky utility u, with a nonlinearity ("kink") at 0, under prospect theory.
Another, fourth, problem in the proof is that we do not get full-force independence on the mixture set, but we get it only separately for gains and losses. We show that this weakened condition still implies an affine representation (EU for risk
In other words, with $f$, $x^+$, and $x^-$ as in the observation, we find p such that $x^+_p x^- \sim \theta$, and then solve $\lambda$ from $\frac{1}{1+\lambda} = p$ ($\lambda = \frac{1-p}{p}$). The condition in the theorem is intuitive: it shows that, when mixing consequences (lotteries in the AA model), the loss must be weighted $\lambda$ times more than the gain to obtain neutrality. Under ambiguity, however, f combines the preference values of $x^+$ and $x^-$ in an "unweighted" manner (see the unweighted sum of the gain-part and loss-part in Eq. 12), leading to the same neutrality level. Apparently, under ambiguity, losses are weighted $\lambda$ times more than when mixing consequences (risk in the AA model). In the AA model, with consequences referring to lotteries and decision under risk, $\lambda$ indicates how much more losses are overweighted under ambiguity than they are under risk. Thus $\lambda$ purely reflects ambiguity attitude.
As mentioned before, the smooth ambiguity model (Klibanoff, Marinacci, & Mukerji 2005) can accommodate sign dependence of ambiguity attitudes. A kink of their second-order ambiguity-utility transformation function $\varphi$ at 0 will accommodate extra loss aversion due to ambiguity in the same way as our parameter $\lambda$ does.
For a first prediction on values of $\lambda$, we consider an extreme view on loss aversion for the AA models. It entails that all loss aversion will show up under risk, and that there can be expected to be no additional loss aversion due to ambiguity. This interpretation is most natural if loss aversion only reflects extra suffering experienced under losses, rather than an overweighting of losses without them generating disproportional suffering when experienced. That is, this extreme interpretation ascribes loss aversion entirely to the (utility of) consequences. Then it is natural to predict that $\lambda = 1$, with no special role for ambiguity. We display the preference condition axiomatizing this prediction and showing how it can be tested: Neutral ambiguity-loss aversion holds if $\lambda = 1$ in Observation 17.
A less extreme interpretation of ambiguity-loss aversion is as follows: There is loss aversion under risk, which can be measured in whatever is the best way provided in the literature. For monetary prizes with a fixed reference point as considered in this paper, loss aversion will generate a kink of risky utility at that reference point.
In the two-stage AA model, some consequences are outcomes and other consequences are lotteries. Reference dependence as analyzed in this paper takes lotteries as a whole, and their indifference class determines if they are gains or losses. This is analogous to the way in which Schmeidler (1989) modeled rank dependence in his model, which also concerned lotteries as a whole. Another approach can be considered, both for reference dependence and rank dependence, where outcomes within a lottery are perceived as gains or losses, and are weighted in a rank dependent manner. Here, as elsewhere, we followed Schmeidler's approach. For reference dependence, it was recommended by Tversky & Kahneman (1981, p. 456, 2nd para). In the rAA model, subjects are never required to perceive whole lotteries in a reference or rank dependent manner, but we implement it ourselves, and subjects only see the CCEs that we inserted. Hence the above issue is no problem for us. Kreps (1988, p. 101) wrote about the non-descriptive nature of two-stage acts in the AA model: "imaginary objects. . . . makes perfectly good sense in normative applications . . . But this is a very dicy and perhaps completely useless procedure in descriptive applications. . . . what sense does it make . . . because the items concerned don't exist? I think we have to view the theory to follow [the traditional two-stage AA model] as being as close to purely normative as anything that we do in this book." We have shown that the analyst can do the hypothetical reasoning for the decision-maker without him ever having to contemplate the complicated and unintuitive two-stage acts.
Observation 4 demonstrates that we can still do everything that is done in the traditional two-stage AA model.

Discussion
A pragmatic objection can be raised against the rAA model. The mixture operation of outcomes is less easy to implement than in the original AA model. A mixture is no longer obtained by just multiplying probabilities; it requires observing an indifference. But such observations are easy to obtain, as our experiment demonstrated. They concern stimuli that are considerably easier to understand for subjects than two-stage acts.
The objection just raised can be rephrased in a methodological way. The mixture operation in the rAA model is a derived concept. The purpose of behavioral foundations is to give conditions directly in terms of the empirical primitive, being the preference relation. Derived concepts can be used only if the resulting preference conditions can easily be (re)formulated directly in terms of primitives. Our ancillary CE mixture operation uses objective probabilities that are directly available and then certainty equivalents (CCEs) that are easy to observe. The mixture operation can be implemented empirically, as demonstrated in §2. Facilitating experimental testing of an AA model was the primary motivation for us to introduce rAA models. Several studies have suggested that aversion to complex stimuli may contribute to the aversion to ambiguity often found (Charness & Levin 2005;Halevy 2007).
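The CE mixture operation described above can be sketched numerically. The following is a minimal illustration, assuming expected utility for risk with a hypothetical square-root utility on nonnegative money amounts; the function names `u`, `u_inv`, and `ce_mixture` are ours and do not come from the paper. In an experiment the certainty equivalent would be elicited from an observed indifference rather than computed from an assumed utility.

```python
import math


def u(x):
    # Hypothetical utility for money (assumption): square root on x >= 0.
    return math.sqrt(x)


def u_inv(v):
    # Inverse of the hypothetical utility above.
    return v * v


def ce_mixture(x, y, p):
    """Certainty equivalent of the lottery yielding x with probability p
    and y with probability 1 - p, under the assumed utility."""
    return u_inv(p * u(x) + (1 - p) * u(y))


# The p-mixture of two outcomes is itself an outcome (a sure amount).
print(ce_mixture(100.0, 0.0, 0.5))  # 25.0
```

Because the result is again a sure amount, mixtures of this kind never require the subject to face a two-stage act.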
We next use the prevailing descriptive theory of decision under risk today, prospect theory (Tversky & Kahneman 1992), to discuss the degree to which our avoidance of degenerate lotteries justifies the assumed EU. Because we always assign probability /2 to the best outcome B and the worst outcome W, for the preferences that we consider the probability weighting function is only relevant on [ /2, 1 − /2]. The common empirical finding is that probability weighting is approximately linear there, and mostly deviates from linearity at the boundaries of the probability domain (Camerer 1995 p. 637; Starmer 2000; Tversky & Kahneman 1992). This holds both for gains and for losses, implying that sign dependence of probability weighting plays no big role either. 12 Hence the deviations from the linear weighting of probabilities that is typical of EU will be weak.
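The claim about near-linearity away from the boundaries can be illustrated with a small sketch. It assumes the one-parameter weighting function of Tversky & Kahneman (1992) with their estimate 0.61 for gains; the function names are ours, and the interval [0.2, 0.8] corresponds to reserving total probability 0.4 for B and W, the value suggested in the text. This is an illustration under those assumptions, not a test of the paper's data.

```python
def tk_weight(p, gamma=0.61):
    # Tversky & Kahneman (1992) weighting function; gamma = 0.61 is their
    # estimate for gains and is used here purely as an illustrative assumption.
    num = p ** gamma
    return num / (num + (1 - p) ** gamma) ** (1 / gamma)


def slope(a, b, gamma=0.61):
    # Average slope of the weighting function on [a, b].
    return (tk_weight(b, gamma) - tk_weight(a, gamma)) / (b - a)


# Slopes on three interior segments covering roughly [0.2, 0.8] ...
interior = [slope(a, a + 0.2) for a in (0.2, 0.4, 0.6)]
# ... versus the slope right at the lower boundary of the probability domain.
edge = slope(0.0, 0.01)

print("interior slopes:", [round(s, 3) for s in interior])
print("slope near 0:", round(edge, 3))
```

Under these assumptions the interior slopes stay within a narrow band, whereas the slope near probability zero is several times larger, which is the pattern the paragraph relies on.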
The other new component in prospect theory for risk, loss aversion (for risk), can be incorporated into utility u. If the reference point is fixed, as it is in this paper (and as in most theoretical analyses of prospect theory), then loss aversion under risk constitutes no mathematical departure from expected utility, but simply generates a kink of u at zero, as explained before.
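A minimal sketch of this point, assuming a piecewise-linear utility with a hypothetical loss aversion coefficient of 2.25 (the names `LAMBDA`, `u`, `expected_utility`, and `mix` are ours): the kink at zero changes the shape of u, but the evaluation remains linear in probabilities, which is all that expected utility requires.

```python
LAMBDA = 2.25  # hypothetical loss aversion coefficient (assumption)


def u(x, lam=LAMBDA):
    # Utility with a kink at the fixed reference point 0:
    # losses are scaled up by lam, gains are left as they are.
    return x if x >= 0 else lam * x


def expected_utility(lottery):
    # A lottery is a list of (outcome, probability) pairs.
    return sum(p * u(x) for x, p in lottery)


def mix(l1, l2, a):
    # Probabilistic mixture: l1 with probability a, l2 with probability 1 - a.
    return [(x, a * p) for x, p in l1] + [(x, (1 - a) * p) for x, p in l2]


r = [(10.0, 0.5), (-10.0, 0.5)]
s = [(20.0, 0.5), (-5.0, 0.5)]

# Linearity in probabilities holds despite the kink in u.
lhs = expected_utility(mix(r, s, 0.5))
rhs = 0.5 * expected_utility(r) + 0.5 * expected_utility(s)
print(lhs, rhs)
```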
We chose R = B 0.5 W because the probability 0.5 is easy to understand for subjects. Any other choice of R = B p W that is not too close to a degenerate lottery, and easy to understand for subjects, will do. We further choose sufficiently large to keep a safe distance from degenerate acts, such as = 0.4. Our model can be tested by testing invariance of preference under variations of R and .
Some theoretical papers derived mixture operations on outcomes endogenously in a Savagean setup (with acts mapping states to outcomes) using bisymmetry axioms. 13 As we did, they avoided two-stage acts and expected utility for risk, thus also avoiding AA's ancillary assumptions. Unlike us, they also avoided using objective probabilities.
One drawback was that they could only observe mixtures for some fixed mixture weights such as 1/2. Thus, unlike us (Observation 4), they could not use the full richness of the AA model. A second drawback was that they needed to observe several indifferences to obtain one mixture (e.g. by involving many certainty equivalents), losing the tractability that has made the AA approach so popular. Hence such techniques have not been applied in empirical studies. A third drawback is that these approaches cannot be extended to reference dependence because the use of certainty equivalents in bisymmetry axioms cannot be reconciled with cosignedness. The simplest endogenous mixture operation, needing only two indifferences, is in Baillon, Driesen, & Wakker (2012), who reviewed the literature in detail. Ghirardato et al. (2003) did obtain endogenous mixtures for all decision weights, but needed infinitely many indifferences to observe nondyadic mixture weights such as 1/3, which is again problematic for empirical applications.
12 If probability weighting is more or less steep for losses than for gains, then this can be captured by ambiguity-loss aversion.
13 Expected utility derivations include Gul (1992) and Ramsey (1931). Nonexpected utility derivations include Casadesus-Masanell, Klibanoff, & Ozdenoren (2000), Chew (1989), Ghirardato & Marinacci (2001), and Vind (2003).
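The remark on nondyadic weights such as 1/3 can be illustrated with exact arithmetic. The sketch below assumes that an endogenous approach can only produce midpoints (weight 1/2) and that each bisection step costs at least one observed indifference; the function name is ours. A dyadic target is reached after finitely many steps, whereas 1/3 is only approached, never hit.

```python
from fractions import Fraction


def dyadic_approximations(target, steps):
    """Successive midpoints of a bisection on [0, 1] that homes in on target.

    Stops early if a midpoint equals the target exactly, which can only
    happen when the target is a dyadic fraction.
    """
    lo, hi = Fraction(0), Fraction(1)
    midpoints = []
    for _ in range(steps):
        mid = (lo + hi) / 2
        midpoints.append(mid)
        if mid == target:
            break
        if mid < target:
            lo = mid
        else:
            hi = mid
    return midpoints


third = dyadic_approximations(Fraction(1, 3), 20)
quarter = dyadic_approximations(Fraction(1, 4), 20)
print(len(quarter), "steps reach 1/4 exactly")
print("after", len(third), "steps the error for 1/3 is", abs(third[-1] - Fraction(1, 3)))
```

By contrast, a mixture operation based on objective probabilities delivers weight 1/3 directly.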
Wakker (2010) used a Savagean setup avoiding the ancillary AA assumptions as did the aforementioned references, but used an endogenous difference rather than a mixture operation. This approach can be reconciled with cosignedness, leading to an axiomatization of prospect theory (his Theorem 12.3.5). It shares the first two above drawbacks of not offering the convenient mixture operation that made the AA approach so popular, and thus requiring more complex preference observations.
The violation of CEU in Example 1 (see beginning of §6) concerned a violation of the elementary condition of weak certainty independence. Hence, every model implying this condition is violated the same way as CEU is. This concerns most ambiguity models considered in the literature. 14 Further, the above violation involved only binary acts, implying that every model agreeing with CEU on this subdomain is violated too (Ghirardato & Marinacci 2001: biseparable preference; tested by Choi et al. 2007). For all these models, it is desirable to develop reference-dependent generalizations. We finally note that, whereas a violation of weak certainty independence in a traditional AA setup could be due to ancillary-condition violations, especially independence under risk, we found the violation in our rAA setup and hence it must be substantive.

Conclusion
We have made the AA model suited for descriptive applications. Up to now, it could only be used for normative purposes (Kreps 1988 p. 101). We first demonstrated how the two major descriptive problems of the ancillary part of the usual two-stage AA model ( power of the AA model is maintained. In particular, any axiom in the two-stage AA model based on the probability mixture operation has an obvious equivalent in terms of our alternative mixture operation in the one-stage reduced-AA model. This equivalent axiom is by itself also necessary in the two-stage AA model. This way we could test a substantive assumption of most AA ambiguity models today: reference independence, without ancillary assumptions confounding the test. We added a third step for making the AA model descriptive, by providing a reference dependent generalization of Schmeidler's (1989) Choquet expected utility. This generalization amounts to extending the AA model to prospect theory. Our generalization allows for ambiguity aversion for gains together with ambiguity seeking for losses, which is indispensable for descriptive applications of the AA model. An additional advantage of the reduced AA model for experimental applications is that it only uses simple stimuli. Subjects are only required to rank Savage-style acts (maps from states to outcomes) and objective lotteries, but not AA-style two-stage acts.
An obvious question for future research is how to generalize other ambiguity theories to incorporate reference dependence and loss aversion, or other phenomena that are of descriptive interest. We hope that our paper will advance descriptive applications of ambiguity theories based on the AA model, having removed the major obstacles, thus potentially doubling its impact.

Appendix. Proofs
A Preparation
Several results in the ambiguity literature (e.g., Schmeidler 1989) were formulated for the two-stage AA model, and not for general mixture spaces as we use them. These results can routinely be transferred to acts for general mixture spaces. Rather than spell out details for specific theorems, we explain the procedure in general and somewhat informally. In all our results, Theorem 7 (or Proposition 16) gives an affine representation u on M. In a general mixture space M, we can take two consequences B ≻ W, and first focus on the consequences that are between B and W in preference. We replace these consequences by their u values (effectively, collapsing equivalence classes), endowing those with the natural mixture on real numbers. By monotonicity, this reduction maintains all preferentially relevant information. We now have the AA model with two outcomes B, W, and can apply the results from the literature here, leading back to results for the original mixture space for consequences between B and W. We then extend the results to more and more extreme values u(B) and u(W), inductively covering the whole mixture space. We, again informally, summarize the reasoning.
OBSERVATION 18 All cited preference foundations for AA models hold for general mixture spaces.

B Proof of Proposition 16; Cosigned Expected Utility
A nonloss is a consequence that is a gain or is neutral, and a nongain is a consequence that is a loss or is neutral. We first derive a preparatory lemma.
LEMMA 19 Assume that the preference relation <, restricted to consequences, satisfies weak ordering, continuity, and cosigned independence. If x and y are nonlosses, then so are all x p y for 0 ≤ p ≤ 1. If x and y are nongains, then so are all x p y for 0 ≤ p ≤ 1.
Proof. Assume the conditions in the lemma. We consider the case of nonlosses x, y.
Assume, for contradiction, x q y for some q. Continuity readily implies existence of a largest p < q such that x p y and a smallest r > q such that x r y . Define x′ = x p y and y′ = x r y. Then x′ and y′ are neutral but, by continuity, every x′ p′ y′ must be a loss. The set of x′ p′ y′ (0 p′ 1) is cosigned, implying that von Neumann-Morgenstern independence holds here without a cosignedness restriction. x′ x′ 1/3 y′ and independence imply that their 0.5-0.5 mixture is strictly preferred to x′ 1/3 y′ (take c = x′ 1/3 y′ in the definition of cosigned independence), implying that x′ 2/3 y′ x′ 1/3 y′ . In contradiction with this, y′ x′ 2/3 y′ and independence imply that their 0.5-0.5 mixture is strictly preferred to x′ 2/3 y′ , implying x′ 1/3 y′ x′ 2/3 y′ . A contradiction has resulted.
We next turn to the proof of Proposition 16. Necessity of the preference conditions is obvious. We, hence, assume these preference conditions and derive an affine representation. We assume the vNM axioms (the axioms in Theorem 7) for < over consequences with, however, independence weakened to sign-independence: x y , x p z y p z only if either all consequences are nonlosses or they all are nongains. By true mixedness, there exist consequences and with , and we will use these consequences in the following derivation.
Lemma 19 implies that the set of nonlosses is a mixture set (closed under mixing). On this set, all vNM axioms are satisfied, and an affine representing functional u + is obtained. We normalize u + ( ) = 0; u + ( ) = 1. We will similarly obtain an affine u on nongains. To extend the representation and its affinity to mixed consequences, we define an as-if gain preference relation < + over consequences, including losses, as follows. It agrees with < for gains, as we will see, and affinity extends it to losses: x < + y if there exists p > 0 such that p x < p y < . We first show that the choice of p in the definition of < + is immaterial.
LEMMA 20 If x < + y then p x < p y for all p > 0 for which both mixtures are nonlosses.
Proof. Consider p x, p y, r x, and r y, and assume that all are nonlosses. Assume p > r. Then p x is a mixture of r x and , and p y is a mixture of r y and , where both mixtures use the same weights ((1 − p)/(1 − r) and (p − r)/(1 − r)). All consequences involved in these mixtures are nonlosses by Lemma 19. By the affine representation for nonlosses, the preference between p x and p y is the same as between r x and r y.
The above lemma shows that < + indeed agrees with < for nonlosses (take p = 1). To see that it establishes an affine extension for losses, we briefly show that < + satisfies all usual vNM axioms, also on losses. Completeness, transitivity, nontriviality, and independence all readily follow from the definition of < + by taking a mixture weight p in its definition so close to 1 that this same mixture weight p can be used for all consequences concerned in the axioms. This also holds for continuity, where, applying it to < and p f , p g, and p h with p sufficiently close to 1, implies it for < + , f , g, and h. All vNM axioms are satisfied for < + , giving an affine representation, denoted u + of < + and, hence, also of < on all nonlosses.
We similarly define an as-if loss preference relation: x < y if there exists p > 0 such that < p x < p y. We similarly obtain an affine representation, denoted u , of < that agrees with < for all nongains. u + and u both represent < on the set of neutral consequences. We show that this overlap is big enough to ensure that the two representations are identical.
We normalize u ( ) = u + ( ) = 1. Indifferences q for losses , and the affine representations, imply that u + = u for losses . Thus, u + ( ) = u ( ). This and indifferences r imply that u + = u for gains too. Hence u + = u everywhere, and u + = u . Thus both these functions represent < on nonlosses and on nongains.
They also represent preferences between gains and losses properly, assigning positive values to the former and negative values to the latter. Thus we have obtained an affine representation u + = u of <, implying all the vNM conditions for consequences without sign restrictions. We denote u = u + = u . This completes the proof of Proposition 16.

C Proof of Theorem 14
We first show that the implications in the definitions of independence can be reversed. We use the term strong (comonotonic/cosigned) independence to refer to these reversed versions.
LEMMA 21 Assume that < is a continuous weak order. Then the reversed implications also hold.
Proof. Assume the conditions in the lemma, and the implication of the definition considered. Consider three acts f, g, h. If f, g, h are comonotonic (or cosigned), then so is the mixture set of all their mixtures, by Proposition 16. Hence, in each case, independence holds on the mixture set considered without a comonotonicity/cosignedness restriction, and we have the usual axioms that imply expected utility and the reversed implications of Lemma 21.
Necessity of the Preference Conditions in Theorem 14 ((i) implies (ii)).
We assume (i), PT, and briefly indicate how cosigned independence is implied. The other conditions are routine. Consider cosigned f, g, c. We may assume a common partition E 1, ..., E n such that the consequences of the acts depend on these events. Because of cosignedness we can have h 1 < :: for all h equal to f, g, or c, or a mixture of these acts. For example, if for j there exists an h′ from {f, g, c} with h′ j a gain, then all h j s are nonlosses and j k. If h i h j for an h′ from {f, g, c}, then h i < h j for all three acts, and i < j. Thus, we can use the same decision weights (Eqs. 17 and 18) for all three acts and for all their mixtures. It implies that PT(f p c) = p PT(f) + (1 − p) PT(c), with the same equality for g instead of f. This implies cosigned independence.
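The affinity of PT on cosigned acts can be checked numerically in a small sketch. It rests on several assumptions of ours: acts are given directly in utility units (so that mixing acts amounts to mixing utilities, with any loss aversion parameter absorbed into the utilities), the state space has three states, the two capacities are hypothetical functions of event size, and "cosigned" is taken to mean that the acts order the states the same way and agree in sign state by state. The function names are illustrative only.

```python
def choquet(values, capacity):
    """Choquet integral of nonnegative state-contingent values.

    values: dict mapping state -> nonnegative number
    capacity: function mapping a frozenset of states to a weight in [0, 1]
    """
    ranked = sorted(values, key=values.get, reverse=True)  # largest value first
    total, previous, cumulative = 0.0, 0.0, set()
    for state in ranked:
        cumulative.add(state)
        weight = capacity(frozenset(cumulative))
        total += values[state] * (weight - previous)  # decision weight of state
        previous = weight
    return total


def pt(utilities, v_plus, v_minus):
    # Gain part integrated with v_plus, loss part (in absolute value) with v_minus.
    gains = {s: max(x, 0.0) for s, x in utilities.items()}
    losses = {s: max(-x, 0.0) for s, x in utilities.items()}
    return choquet(gains, v_plus) - choquet(losses, v_minus)


def mix(f, c, p):
    # Statewise mixture in utility units.
    return {s: p * f[s] + (1 - p) * c[s] for s in f}


STATES = ("s1", "s2", "s3")
v_plus = lambda event: (len(event) / len(STATES)) ** 2     # hypothetical capacity
v_minus = lambda event: (len(event) / len(STATES)) ** 0.5  # hypothetical capacity

# Same ranking of states and same signs state by state.
f = {"s1": 3.0, "s2": 1.0, "s3": -2.0}
c = {"s1": 2.0, "s2": 0.0, "s3": -4.0}
p = 0.3

lhs = pt(mix(f, c, p), v_plus, v_minus)
rhs = p * pt(f, v_plus, v_minus) + (1 - p) * pt(c, v_plus, v_minus)
print(lhs, rhs)
```

For acts that rank the states differently or disagree in sign, the decision weights change across the mixture and the equality need not hold.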
Sufficiency of the Preference Conditions in Theorem 14 ((ii) implies (i)).
In Proposition 16 we derived expected utility for consequences if only cosigned independence is assumed. In agreement with the definition of prospect theory, we normalize expected utility for consequence such that u( ) = 0 and for some consequence (existing because of true mixedness) such that u( ) = 1. Let a nonloss act be an act g such that g(s) is a nonloss for all s. A nongain act is defined similarly. By Lemma 19 the set of nonloss acts is closed under mixing, and so is the set of nongain acts. By Schmeidler's Theorem 10 there exists a CEU functional CEU + = ∫_S u(g + (s)) dv + on the nonloss acts g + that represents < there.
By true mixedness, there exists a truly mixed act. By monotonicity, we can replace all nonloss consequences of the act by its maximal consequence, and all loss consequences by its minimal consequence, without affecting its true mixedness. The act now only has two consequences and can be written as F with . ( abbreviates good (or gain) and abbreviates bad.) By continuity, we may, and will, assume that F , by either improving (by mixing with ) or worsening (by mixing with ) . F will be used for calibrating the P T functional, and is called the calibration act.
We now define a functional P T + on nonloss acts and a functional P T on nongain acts, and a prospect theory functional P T that is the sum of those two. Next we show that P T represents preference. More precisely, we define where > 0 is such that P T ( F ) = 0. Thus, P T ( F ) = P T + ( F ) + P T ( F ), and = CEU + ( F )/CEU ( F ). We define c as the P T value of the gain part of F ; i.e., This c is minus the P T value of the loss part of F ; i.e., P T ( F ) = c. P T represents preference on all nonloss acts, and also on all nongain acts. Because it also compares nonloss acts properly with nongain acts (this holds in fact for every > 0), it is representing on the union of these, which is the set of all nonmixed acts.
We call an act f proper if P T (f ) = P T (g) for some nonmixed act g with f g. To prove that P T is representing it suffices, by transitivity, to show that all acts are proper, and this is what we will do. That is, we will use the nonmixed acts for calibrating P T relative to preferences. We start with a set of binary acts cosigned with the calibration act: A F is defined as the set of all acts F with < < .
LEMMA 22 All acts in A F are proper.
Proof. In this proof we only consider acts from A F . All these acts are cosigned, implying that we can use cosigned independence for all mixtures. We choose particular nonmixed acts. For any act f we find a nonmixed equivalent g defined as follows. Let x be a consequence such that with g = x F we have P T (g) = P T (f ). By continuity of P T , such an x always exists. Thus g is a nonmixed binary act with the same P T value as f , but it is in A F and is cosigned with f and . We will demonstrate properness on A F by showing that each act is equivalent to a nonmixed equivalent.
Case 1 [acts with P T value zero]: Let P T (f ) = 0. Define a = P T + (f + ) = P T (f ) 0. is a nonmixed equivalent of f . We show that f . Case 1.1: a c (c as in Eq. 21). P T + (f + ) = a c P T + ( F ). By CEU for nonlosses, f + ( F ) a/c . Similarly, f ( F ) a/c . By double matching, f ( a/c ) F ( a/c ) = ( F ) a/c (the last indifference by cosigned independence). By transitivity, f and f is proper.
Case 1.2: a > c. We consider a mix of f with , f p . From the definition of the P T functional we have P T (f p ) = pP T (f ) = 0 and P T + (f p ) + ) = P T ((f p ) ) = pa. We choose p so small that 0 < pa < c. From case 1.1 we have f p . By strong cosigned independence, this implies f . f is proper.
Case 2 [acts with positive P T value]: Let P T (f ) > 0. By continuity and the definition of P T , there exists a consequence between and the maximal consequence in f such that P T ( F ) = P T (f ) > 0. F is a nonmixed equivalent of f . Define a + = P T + (f + ) and a = P T (f ). Then P T ( F ) = a + a .
Case 2.1: a + c (hence a < c). Write b + = a + /c and b = a /c. P T + (f + ) = b + P T + ( F ) and P T (f ) = b P T ( F ). Then it follows from CEU for gains that f + ( F ) b + . For the loss part of f we similarly have f ( F ) b . By double matching, f ( b + ) F ( b ). We now isolate a symmetric component with absolute prospect theory value a for the gain part and the loss part (this was the hardest step to find in this paper): From P T (f ) = c(b + b ) = a + a = P T ( F ) and CEU for nonlosses it follows that f F . By transitivity, f F . f is proper.
Case 2.2: a + > c. We mix f and F with to obtain f # = f p and ( F ) # = ( F ) p . We define a + # = P T ((f p ) + ), which is pa + , and a # = P T ((f p ) ), which is pa . We choose p so small that a # < a + # < c. From prospect theory we have P T (f # ) = P T (( F ) # ), which, by Case 2.1, implies f # ( F ) # . Because f , F , and are cosigned, f p ( F ) p implies f F . Again, f is proper.
Case 3 [Acts with negative P T value]: Let P T (f ) < 0. This case is similar to Case 2.
We have demonstrated that all acts in A F are proper.
We next show that all acts are proper. Consider a general act g, and event E such that g yields nonlosses on E and losses on E c .
Case 1: There exists a matching act f ∈ A F such that P T + (f + ) = P T + (g + ) and P T (f ) = P T (g ). Thus P T (f ) = P T (g), and from CEU for nonlosses and for nongains we have f + g + and f g . From double matching, f g. Because f and g have the same P T value and are equivalent, and f is proper, it follows that g is also proper.
Case 2: There exists no matching act f ∈ A F for g as in Case 1. We mix act g with to obtain an act g # = g p . We choose p so small that we find a matching act f # ∈ A F , i.e. P T + (f + # ) = P T + (g + # ) and P T (f # ) = P T (g # ).
Let g̃ be the nonmixed equivalent of g. Let g̃ # similarly be the nonmixed equivalent of g # . We have P T (g # ) = P T (g # ), and because of Case 1 this implies g̃ # g # . Because g, g̃, and are cosigned, g p g p implies g g. Thus, g is proper. We have proved sufficiency of the preference conditions.
Uniqueness Results Uniqueness of v + (v ) follows from Schmeidler's Theorem 10 applied to nonloss (nongain) acts. It is obvious that the unit of utility can be multiplied by any positive constant. We show that no other change is possible. Restricting attention to nonloss consequences shows, by Schmeidler's theorem, that u when restricted to nonlosses is unique up to a unit, given that the scale u( ) = 0 is fixed. Similarly, restricting attention to nongain consequences shows that u when restricted to nongains is unique up to a unit that, a priori, might be different than for gains. However, the equivalence F shows that the unit of losses is joined with that of gains, and a change of one generates the same change of the other. Hence only one unit of utility is free to choose.

D Remaining proofs
Proof of Observation 12.