Journal of Risk and Uncertainty, Volume 46, Issue 2, pp 113–132

Assessing multiple prior models of behaviour under ambiguity


DOI: 10.1007/s11166-013-9164-x

Cite this article as:
Conte, A. & Hey, J.D. J Risk Uncertain (2013) 46: 113. doi:10.1007/s11166-013-9164-x


The recent spate of theoretical models of behaviour under ambiguity can be partitioned into two sets: those involving multiple priors and those not involving multiple priors. This paper provides an experimental investigation into the first set. Using an appropriate experimental interface we examine the fitted and predictive power of the various theories. We first estimate subject-by-subject, and then estimate and predict using a mixture model over the contending theories. The individual estimates suggest that 24% of our 149 subjects have behaviour consistent with Expected Utility, 56% with the Smooth Model, 11% with Rank Dependent Expected Utility and 9% with the Alpha Model; these figures are close to the mixing proportions obtained from the mixture estimates where the respective posterior probabilities of each of them being of the various types are 25%, 50%, 20% and 5%; and using the predictions 22%, 53%, 22% and 3%. The Smooth model appears the best.


Keywords

Alpha model · Ambiguity · Expected utility · Mixture models · Rank dependent expected utility · Smooth model

JEL Classification

D81 C91 C23 

Ambiguity differs from risk in that, under ambiguity, events are not certain and their probabilities are not known. Over the past few years, and intensively so recently, theorists have proposed many new theories of behaviour under ambiguity. The purpose of this paper is to report on the empirical adequacy of a subset of these new theories. The subset that we concentrate on here arises naturally because of a partition of the literature. All this literature is set in the context of a single-stage decision problem where the decision-maker chooses from a set of alternative decisions, Nature chooses some outcome from a set of possible outcomes, and the decision-maker’s payoff depends on the decision and on the outcome. Ambiguity arises when the probabilities of the various outcomes are not known. Part of the ambiguity literature envisages the decision-maker as realising that he or she does not know the values of the relevant probabilities, or is not prepared to make subjective judgements about the possible probabilities; while another part envisages the decision-maker as being able to list the various possibilities for the various probabilities and, moreover, as being able to attach probabilities to the various possibilities. If you like, these are second-order probabilities, or probabilities of probabilities. It is this second subset that we investigate here.1

This set consists of Expected Utility (EU) theory in which the probabilities are compounded; the Smooth ambiguity Model (SM) of Klibanoff et al. (2005) where the preference functional of the individual is the expected value (over the probabilities of the various possible probabilities) of some function of the expected utility (over each of the possible probabilities); the Rank Dependent utility model (RD), as originally proposed by Quiggin (1982), in which probabilities are weighted by some weighting function; and the Alpha Model (AM), proposed by Ghirardato et al. (2004), in which the preference functional is a weighted average of the worst and the best expected utility.2

We compare the empirical performance of these four theories of decision-making under ambiguity, all of which use a multiple prior approach. In order to provide a fair evaluation of these theories, we use an experimental interface which reproduces exactly a situation of ‘probabilities of probabilities’. Moreover, in order to avoid identification issues concerning the correct estimation of an underlying utility function (all four theories embody such a function), our experimental design involves just two final outcomes for the subject; we normalise on these two and hence do not need any functional form for the utility function. Subjects in the experiment were given a show-up fee of €7.50 and could add to that another €40 if the ambiguity resolved itself in their favour.

Subjects were presented with a total of 49 tasks. Each task started off with a choice between two two-stage lotteries. Each two-stage lottery consisted of a set of one-stage lotteries. A one-stage lottery was composed of a known number of red balls and a known number of blue balls. At the end of the experiment a one-stage lottery was played out: this was done by picking one of the balls at random and determining the colour of the drawn ball. Depending on which colour the subject had designated earlier as their ‘winning colour’, they either got €40 in addition to their show-up fee or nothing. In each of the 49 tasks, one of the two two-stage lotteries was designated by the experimenter as the “unchanging lottery” and the other as the “changing lottery”. After the subject had indicated which of the two they preferred, one of the one-stage lotteries in the changing two-stage lottery was chosen at random and removed from it; subjects were then asked to choose again, and this procedure continued until the changing two-stage lottery was reduced to a one-stage lottery. After all 49 tasks were completed, one of them was chosen at random; one of the stages in playing out that task was chosen at random; the subject’s choice at that stage was recovered from the computer; and the chosen two-stage lottery was played out in the obvious fashion, resulting in a payment of €40 or of nothing to the subject.

The experimental procedure made clear that the winning colour is subject to a second-order probability. Thus we have a direct test of the various theories. We give details of the theories that we are investigating in the next section, restricted to the relevant situation in which the final payoff is one of two possibilities. We give more detail of the experimental procedure in the following section, and we then present our econometric results, first estimating on a subject-by-subject basis before estimating and predicting using a mixture model. Section 6 concludes.

1 Theories under investigation

We consider those theories for which we can explicitly derive a preference functional in our experimental context. Remember that the context is that of probabilities of probabilities. Clearly Expected Utility theory is one of these. We consider also the Smooth Model of Klibanoff et al. (2005), the well-known Rank Dependent Expected Utility model, and the Alpha Model of Ghirardato et al. (2004), which is a generalisation of the Maxmin Expected Utility model of Gilboa and Schmeidler (1989). Before we can specify these models, we need to describe the context of our experiment and introduce some notation.

As we have already noted, our experimental set up was such that the final payoff (over and above the show-up fee) was either €40 or nothing. We normalise the utility function (present in all theories) on these two values and hence put u(€40) = 1 and u(€0) = 0. All the theories we investigate envisage the choice between any two lotteries as being dependent on the difference between the evaluations of the two lotteries. We ignore for the time being the issue of error. We now specify the evaluations of an arbitrary two-stage lottery under each of the theories we are examining.

We start with some notational definitions, first defining a generic one-stage lottery of the type used in the experiment, and which we denote by O(m,M). This has M balls inside it, of which m (0 ≤ m ≤ M) are of the ‘winning colour’ (as chosen by the subject), and M-m of the other (non-winning) colour. Each of these M balls is equally likely to be drawn if this one-stage lottery is to be played out, so that the probability of drawing a winning ball out of the one stage lottery O(m,M) is m/M.

We now define a generic two-stage lottery of the type used in the experiment, which we denote by T(m1,m2,…,mN;M), where the mn are distinct integers. We write this in such a way that m1 < m2 < … < mN. This consists of N one-stage lotteries, each of which has M balls in it. The nth of these has mn winning balls and M-mn non-winning balls in it. So the generic two-stage lottery consists of N one-stage lotteries as follows: O(m1,M), O(m2,M),…, O(mN,M). Each of these N one-stage lotteries is equally likely to be drawn if this two-stage lottery is to be played out, so that the probability of drawing the one-stage lottery O(mn,M) is 1/N. As we have already noted, if O(mn,M) is the one-stage lottery drawn, then the probability of drawing a winning ball from that is mn/M.

We can now explain how each of the theories we are considering evaluates the generic one-stage and the generic two-stage lotteries. Let VPF[O(m,M)] and VPF[T(m1,m2,…,mN;M)] denote these valuations respectively for preference functional PF.

1.1 Expected utility theory (EU)

Since there are m winning balls, each of which leads to utility 1, and (M-m) non-winning balls, each of which leads to utility 0, and all balls are equally likely to be drawn, the expected utility of the generic one-stage lottery is simply given by
$$ {V_{EU }}\left[ {O\left( {m,M} \right)} \right]=\frac{m}{M} $$
Since EU is linear in the probabilities, and each of the N one-stage lotteries by which it is formed are equally likely, it follows that the valuation of the generic two-stage lottery is
$$ {V_{EU }}\left[ {T\left( {{m_1},{m_2},\ldots,{m_N};M} \right)} \right]=\left[ {\frac{{{m_1}}}{M}+\frac{{{m_2}}}{M}+\ldots +\frac{{{m_N}}}{M}} \right]\frac{1}{N} $$
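As a minimal illustration of these two EU valuations, the following Python sketch computes them with exact fractions. The function names are ours and purely illustrative, and the lottery T(1,3,5;10) is a hypothetical example rather than one of the experimental tasks.

```python
from fractions import Fraction


def v_eu_one_stage(m, M):
    """EU value of O(m, M): the probability m/M of drawing a winning ball."""
    return Fraction(m, M)


def v_eu_two_stage(ms, M):
    """EU value of T(m1,...,mN; M): the simple average of the m_n / M."""
    return sum(Fraction(m, M) for m in ms) / len(ms)


# Hypothetical two-stage lottery with 1, 3 and 5 winning balls out of 10.
print(v_eu_two_stage([1, 3, 5], 10))  # 3/10
```

Under EU the two-stage lottery T(1,3,5;10) is therefore valued exactly as the one-stage lottery O(3,10).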

1.2 Smooth model (SM)

This model, proposed by Klibanoff et al. (2005), is quintessentially a multiple prior model.3 We describe its application in the context of the experiment that we have conducted. Essentially, in evaluating any two-stage lottery, this model proceeds by taking the Expected Value (with respect to the lotteries composing a two-stage lottery) of some function of the Expected Utility of each of the one-stage lotteries from which the two-stage lottery is composed. Denoting this function by ϕ(.) and noting that the one-stage lottery O(mn,M) has expected utility mn/M (as in the above) and hence has the same value as in EU, it follows that the valuation by the Smooth Model of the generic two-stage lottery is given by
$$ {V_{SM }}\left[ {T\left( {{m_1},{m_2},\ldots,{m_N};M} \right)} \right]=\left[ {\phi \left( {\frac{{{m_1}}}{M}} \right)+\phi \left( {\frac{{{m_2}}}{M}} \right)+\ldots +\phi \left( {\frac{{{m_N}}}{M}} \right)} \right]\frac{1}{N} $$
It remains to specify the function ϕ(.). Klibanoff et al. suggested the particular form \( \phi (x)=-\frac{{{e^{-sx }}}}{s} \). However, we prefer a specification for which a particular parameter value reduces the Smooth Model to EU; this is the case when ϕ(x) = x for all x. So one way of characterising the ϕ(.) function is to put
$$ \phi (x)=\frac{{1-{e^{-sx }}}}{{1-{e^{-s }}}} $$

Note that when s → 0 the Smooth Model reduces to EU. We need to estimate the parameter s.
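A sketch of the Smooth Model valuation in Python, under the specification of ϕ(.) given above. The function names and the example value s = 1 are illustrative assumptions; `math.expm1` is used only for numerical stability when s is close to 0, and at s = 0 the function returns the limiting value ϕ(x) = x.

```python
import math


def phi(x, s):
    """phi(x) = (1 - exp(-s*x)) / (1 - exp(-s)); equals x in the limit s -> 0."""
    if abs(s) < 1e-12:
        return x
    # expm1(-s*x) / expm1(-s) is algebraically the same ratio.
    return math.expm1(-s * x) / math.expm1(-s)


def v_sm_two_stage(ms, M, s):
    """Smooth Model value of T(m1,...,mN; M): the average of phi(m_n / M)."""
    return sum(phi(m / M, s) for m in ms) / len(ms)


# Two hypothetical lotteries with the same EU value of 0.3.
spread = v_sm_two_stage([1, 5], 10, s=1.0)
degenerate = v_sm_two_stage([3], 10, s=1.0)
print(spread, degenerate)
```

With s > 0 the function ϕ(.) is concave, so the spread-out lottery T(1,5;10) is valued below the degenerate lottery T(3;10) even though EU ranks them equally.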

1.3 Rank dependent expected utility (RD)

This model was originally introduced by Quiggin (1982) and called by him Anticipated Utility. Since then it has been further developed by Segal (1987) and Kahneman and Tversky (1979) and has now become more commonly known as Rank Dependent Expected Utility theory, or, equivalently in our context, as Cumulative Prospect Theory (Tversky and Kahneman 1992).4 It values a one-stage lottery differently from Expected Utility theory since probabilities are transformed. Let f(.) denote the transformation function, where f(0) = 0, f(1) = 1 and f is non-decreasing everywhere. Then the generic one-stage lottery is valued as
$$ {V_{RD }}\left[ {O\left( {m,M} \right)} \right]=f\left( {\frac{m}{M}} \right) $$
In evaluating the generic two-stage lottery, since the one-stage lotteries within it are naturally ranked in order by our notation (note that m1 < m2 < … < mN) it follows that its valuation is5
$$ {V_{RD }}\left[ {T\left( {{m_1},{m_2},\ldots,{m_N};M} \right)} \right]=\sum\limits_{i=1}^N f\left( {\frac{{{m_i}}}{M}} \right)\left[ {f\left( {\frac{N-i+1 }{N}} \right)-f\left( {\frac{N-i }{N}} \right)} \right] $$
We need to specify the function f(.).6 We follow the suggestion of Tversky and Kahneman (1992) and use a form which allows for realistic shapes of the function:
$$ f(p)=\frac{{{p^g}}}{{{{{\left( {{p^g}+{{{\left( {1-p} \right)}}^g}} \right)}}^{1/g }}}} $$

We need to estimate the parameter g. When it takes the value 1, RD reduces to EU.
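The rank-dependent valuation can be sketched in Python as follows. The function names and the example value g = 0.7 are illustrative assumptions, not estimates from the experiment; the code simply transcribes the weighting function and the rank-dependent sum given above.

```python
def f(p, g):
    """Tversky-Kahneman weighting function p^g / (p^g + (1-p)^g)^(1/g)."""
    return p ** g / (p ** g + (1 - p) ** g) ** (1 / g)


def v_rd_one_stage(m, M, g):
    """RD value of O(m, M): the transformed winning probability."""
    return f(m / M, g)


def v_rd_two_stage(ms, M, g):
    """RD value of T(m1,...,mN; M) with the one-stage lotteries ranked worst to best."""
    ranked = sorted(ms)
    N = len(ranked)
    total = 0.0
    for i, m in enumerate(ranked, start=1):
        # Decision weight: f(prob. of the i-th worst or better) - f(prob. of strictly better).
        weight = f((N - i + 1) / N, g) - f((N - i) / N, g)
        total += f(m / M, g) * weight
    return total


print(v_rd_two_stage([1, 3, 5], 10, g=0.7))
print(v_rd_two_stage([1, 3, 5], 10, g=1.0))  # coincides with the EU value when g = 1
```

The decision weights telescope to f(1) − f(0) = 1, so a 'two-stage' lottery made of a single one-stage lottery is valued exactly as that one-stage lottery.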

1.4 Alpha model (AM)

Ghirardato et al. (2004) proposed a generalisation of the Gilboa and Schmeidler (1989) Maxmin Expected Utility model. Here there is no use made of the probabilities attached to the members of the set of possible distributions, though use is made of knowledge of the best and worst members of this set. So this is a multiple prior model, but not one that makes use of actual values of the second-order probabilities. The model’s valuation of a one-stage lottery is exactly as in EU. The worst of the one-stage lotteries within the generic two-stage lottery, in our notation, is O(m1,M), and the best is O(mN,M). The Alpha model values the generic second-order lottery as a weighted average of the value of the worst and the value of the best one-stage lotteries within the two-stage lottery. Hence we have:
$$ {V_{AM }}\left[ {T\left( {{m_1},{m_2},\ldots,{m_N};M} \right)} \right]=a\frac{{{m_1}}}{M}+\left( {1-a} \right)\frac{{{m_N}}}{M} $$

We note a special case: when a = 1 this reduces to Gilboa and Schmeidler’s Maxmin model (Gilboa and Schmeidler 1989). We need to estimate the parameter a. We note that EU is not nested inside AM unless the ‘two-stage’ lottery consists of just a single one-stage lottery.
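A short Python sketch of the Alpha Model valuation follows. The function names and the example lottery T(1,2,9;10) are hypothetical; the point of the example is only that the valuation ignores everything except the worst and the best one-stage lotteries.

```python
def v_am_two_stage(ms, M, a):
    """Alpha Model value of T(m1,...,mN; M): a * worst + (1 - a) * best."""
    return a * min(ms) / M + (1 - a) * max(ms) / M


def v_eu_two_stage(ms, M):
    """EU benchmark: the average winning probability."""
    return sum(m / M for m in ms) / len(ms)


lottery = [1, 2, 9]  # hypothetical two-stage lottery out of M = 10 balls
print(v_am_two_stage(lottery, 10, a=1.0))  # worst case only (Maxmin)
print(v_am_two_stage(lottery, 10, a=0.0))  # best case only (Maxmax)
print(v_am_two_stage(lottery, 10, a=0.5), v_eu_two_stage(lottery, 10))
```

Replacing the middle lottery O(2,10) by O(8,10) changes the EU value but leaves the Alpha Model value unchanged for every a, which illustrates why no single value of a reproduces EU across lotteries.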

2 Our experimental implementation

We have already sketched the main features of the experiment; we now give detail. Subjects completed the experiment individually at screened and separated computer terminals. During the experiment each of the subjects was presented with 49 tasks. Each task started out with two two-stage lotteries being portrayed on the computer screen, and the subject informed as to which of the two was the changing lottery, and hence which was the unchanging lottery. An example of the opening screen of a task is given in Fig. 1. Each two-stage lottery was composed of a number of one-stage lotteries—which are the columns in the two two-stage lotteries in Fig. 1. The subject was then asked to choose a winning colour (from blue and red) for that task.
Fig. 1 The experimental interface

After choosing the winning colour the task started. The following process continued until the changing lottery was reduced to a single one-stage lottery. The subject was asked to select one of the two-stage lotteries as his or her preferred lottery. Then one of the one-stage lotteries from the changing two-stage lottery was eliminated at random—leaving a column gap in the visual presentation. The subject was again asked to state his or her preferred two-stage lottery. After the last choice for a given task (when the changing lottery had been reduced to a single one-stage lottery) a new task was presented, until all 49 tasks had been completed. Different subjects got the 49 tasks in different, randomised, orders.

The natural incentive mechanism was used. Subjects were told that all their choices on each of the pairwise choice problems in each of the 49 tasks would be recorded. Then, at the end of the experiment, one of the 49 tasks would be randomly selected, one of the pairwise choice problems would be randomly selected, and the subject’s preferred choice on that particular problem would be played out. To do this, one of the one-stage lotteries (if there were more than one), or the one-stage lottery (if there was only one), of the chosen two-stage lottery would be picked at random (with all one-stage lotteries within it having the same chance of being picked), and one of the balls in the one-stage lottery picked at random. If the colour of the ball picked was the subject’s chosen ‘winning colour’ on that task, the subject would get paid €40 in addition to the show-up fee.
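The playing-out of a chosen two-stage lottery can be sketched as a small simulation. This is an illustration under our own assumptions (function names, the seeded random generator, and the example lottery are hypothetical); the show-up fee of €7.50 and the prize of €40 are those stated in the text.

```python
import random


def play_out(ms, M, rng):
    """Play out T(m1,...,mN; M): return True if a winning ball is drawn."""
    m = rng.choice(ms)        # each one-stage lottery is picked with probability 1/N
    ball = rng.randrange(M)   # each of the M balls is equally likely
    return ball < m           # label the first m balls as the winning colour


def payment(ms, M, rng, show_up_fee=7.50, prize=40.0):
    """Total payment in euros: show-up fee plus the prize if the subject wins."""
    return show_up_fee + (prize if play_out(ms, M, rng) else 0.0)


rng = random.Random(0)  # seeded only to make the illustration reproducible
trials = 20000
wins = sum(play_out([1, 3, 5], 10, rng) for _ in range(trials))
print(wins / trials)  # close to the compound winning probability of 0.3
```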

We should note that the tasks used in the experiment—their number and the tasks themselves—were selected after extensive pre-experimental simulations. We were anxious to ensure that we would have both enough data, and enough informative data, to be able to distinguish between the different types. Obviously the answer to this question depends upon the noise in subjects’ responses, so we first carried out some pilot studies to get some idea as to the magnitude of this noise. Equipped with this information, we then simulated a large number of experimental questions, selecting those that appeared most discriminatory. The design and results of these simulations are available on request.

The experiment was conducted in, and financed by, the experimental lab of the Max Planck Institute of Economics, directed by Professor Werner Güth, with subjects recruited using ORSEE (Greiner 2004).

3 Related experimental literature

We provide here a brief summary of recent experimental work investigating theories of behaviour under ambiguity. Earlier literature is surveyed in Camerer and Weber (1992) and Camerer (1995). A fuller survey can be found in Hey and Pace (2011), which also gives more bibliographical information.

Halevy (2007) marks the start of new experimental work investigating the recent theories; subsequent contributions are those of Andersen et al. (2009), Ahn et al. (2010), Hey et al. (2010), Hey and Pace (2011) and Abdellaoui et al. (2011). To avoid duplication, we present here only the essential features of these experiments, specifically the following: the kinds of questions asked to the subjects; the way that ambiguity was implemented in the laboratory setting; and information about the theories (or class of theories) under test. Table 1 gives an overview.
Table 1 Recent related experimental work

| | Type of questions | Implementation of ambiguity | Theories being investigated |
|---|---|---|---|
| Halevy (2007) | Certainty equivalents / Reservation prices (using BDM Mechanism) | Ellsberg-type urns | Subjective Expected Utility, Maxmin, Anticipated Utility^{a,c}, Smooth Model^{d} |
| Andersen et al. (2009) | Pairwise choices | Bingo cage and real events | Minimalist non-Expected-Utility model (a special case of the Smooth Model^{d}) |
| Ahn et al. (2010) | Allocations | Probabilities of two of the three states not stated | Two broad classes: smooth and kinked; special cases of more general models |
| Hey et al. (2010) | Pairwise choices | Bingo Blower | Expected Value, Subjective Expected Utility, Choquet Expected Utility, Prospect Theory, Cumulative Prospect Theory, Decision Field Theory, Alpha Model^{b} plus some older theories |
| Hey and Pace (2011) | Allocations | Bingo Blower | Subjective Expected Utility, Choquet Expected Utility, Alpha Model^{b}, Vector Expected Utility, Contraction Model^{e} |
| Abdellaoui et al. (2011) | Certainty equivalents / Reservation prices (using Holt-Laury price lists) | 8-colour Ellsberg-type urns | Rank Dependent Expected Utility |
| This paper | Pairwise choices | Multiple priors | Subjective Expected Utility, Smooth Model, Alpha Model^{b}, Anticipated Utility |

^{a} Referred to as Recursive Nonexpected Utility

^{b} Ghirardato et al. (2004) (including the special cases Maxmin and Maxmax)

^{c} It is not clear what weighting function (Quiggin or Power) is used

^{d} Klibanoff et al. (2005)

^{e} Gajdos et al. (2008)

In these experiments, there were three different types of questions posed to the subjects: (1) reservation price questions; (2) pairwise choice questions; (3) allocation questions. The advantage of the first and last types is that there is more information contained in the answer to each question than in the second type; the disadvantage is that it becomes necessary to estimate a utility function. Because we wanted to reduce the number of parameters that we needed to estimate, we made the decision not to estimate a utility function and thus we restricted the payments to the subjects to one of two values. This implied immediately that we could not ask allocation questions nor ask reservation price questions, and thus were restricted to pairwise choice questions. We compensated by having relatively many of them: 49 tasks and 256 pairwise choice questions in total; this is many more questions than is usually the case.

In contrast, the papers by Ahn et al. (2010) and Hey and Pace (2011) used allocation questions, in which subjects were asked, in each of a series of questions, to allocate a given sum of tokens to various events, with given exchange rates between tokens and money for each of the events. If a particular question was selected to determine the payment to the subject, then the question was played out and the resulting event, combined with the allocation that the subject had made to that event and the pre-specified exchange rate between tokens and money for that question and that event, determined the payment that the subject received for his or her participation (plus any given participation fee). As a result the actual payment might take one of a range of values, and hence a utility function over this range would have to be inferred from the subject’s answers. This was also the case in the experiments of Halevy (2007) and Abdellaoui et al. (2011), though both of these papers asked subjects to state certainty equivalents for given gambles. In the first of these papers, the Becker-DeGroot-Marschak (BDM) method was used to get subjects to state their certainty equivalents; in the second the price-list mechanism was used. Both of these mechanisms have their problems: with BDM, it appears that subjects find it difficult to understand what the mechanism entails; and in the price-list mechanism, it appears that the answers given by subjects are sensitive to the construction of the lists.7 Our method avoids both these problems.

The implementation of ambiguity in the laboratory also varies from paper to paper. Hey et al. (2010) and Hey and Pace (2011) used a Bingo Blower—in which balls of differing colours (which define the events) are blown around inside the Blower in such a way that the balls can be seen but not counted. Andersen et al. (2009) used a Bingo Cage, which is similar to a Bingo Blower in that the balls cannot be counted, but differs from it in that the balls are stationary and not being blown about continuously. Andersen et al. (2009) also used bets on natural events (for example the temperature in Paris at a precise time in the future; see Baillon (2008) for more detail). Halevy (2007) used Ellsberg-type urns, described in much the same way as Ellsberg did:

“Urn 2: The number of red and black balls is unknown, it could be any number between 0 red balls (and 10 black balls) to 10 red balls (and 0 black balls)”

as well as urns which would be better described as two-stage lotteries:

“Urn 3: The number of red and black balls is determined as follows: one ticket is drawn from a bag containing 11 tickets with the numbers 0 to 10 written on them. The number written on the drawn ticket will determine the number of red balls in the third urn. For example, if the ticket drawn is 3, then there will be three red balls and seven black balls.”

Abdellaoui et al. (2011) also used Ellsberg-like urns, but with eight colours:

“The known urn K contained eight balls of different colours: red, blue, yellow, black, green, purple, brown, cyan. The unknown urn contained eight balls with the same eight colours, but the composition was unknown in the sense that some colours might appear several times and others might be absent.”

This has the same feature as the original unknown Ellsberg urn—subjects were not informed about the process of the formation of this urn. It might therefore be the case that they regard this as the “suspicious urn.”

In contrast, and particularly because our experiment was specifically designed to test multiple-prior models of ambiguity, we used two-stage lotteries (like Halevy’s Urns 2 and 3). These are exactly the type of ambiguity referred to in the theories under test. Moreover, we control the probabilities of the various possibilities, and they are therefore immune to subjects’ subjective interpretation and assessment.

Partly because of the different types of questions and the different implementations of ambiguity, the theories under test also differ from paper to paper. But there is a second key difference between the theories under test. Three of the papers do not investigate specific models but rather ‘generic classes’ of models. This is particularly true of Ahn et al. (2010) who investigate special cases of two classes: smooth and kinked. The kinked class essentially consists of those theories that are rank dependent in some sense: obviously Choquet EU (see Schmeidler 1989) is in this set, as is Rank Dependent EU. There is a kink (in that the derivative does not exist) in the indifference curves (plotted in payoff space) implied by the preference function, where the ranking of the outcomes changes. In contrast there is no kink in the members of the smooth set, because ranking is not important. A key member of this latter set is Klibanoff et al.’s Smooth Model. So, in a sense, the Ahn et al. paper investigates two generic classes, though it should be noted that it does not investigate any specific model in either class. Andersen et al. (2009) do a similar investigation of the smooth class, while Abdellaoui et al. do the same for the kinked class. It follows that none of these three papers investigates specific theories. This is in contrast to the other papers in the table.

One key remaining difference between these various papers is their objectives. The Hey et al. and Hey and Pace papers estimate preference functionals and see how well the data fit the various theories. So do Andersen et al. (2009). In contrast Halevy tests between the various theories. Abdellaoui et al., by design, do not compare different theories but instead describe the way that ambiguity and risk enter the decision process. While Ahn et al. adopt a different methodological approach to that of Abdellaoui et al., their interest is similar; they too are interested in how attitude to ambiguity can be characterised and how it enters into the decision-making process. In contrast, given the recent activity of the theorists in producing theories of decision-making under two-stage ambiguity, our objective in this paper is to discover which of these is ‘best’ and hence worth pursuing. ‘Best’ is defined not only with respect to descriptive ability but also predictive ability.

4 Stochastic assumptions

We assume that decisions are made with error. Our specific stochastic assumption is that the difference in the values of the objective function is calculated with error. Specifically we follow precedent and assume that, while the decision on any of the pairwise choice questions in the experiment should be made on the basis of the difference between V(L) and V(R) (where L and R are the two lotteries involved in the decision), it is actually made on the basis of V(L) − V(R) + u where u is N(0, σ²) and is independent across questions. We also add a probabilistic ‘tremble’ (see Moffat and Peters 2001) of magnitude ω. We estimate the parameters of the various models (s in the Smooth Model, g in the Rank Dependent Model and a in the Alpha Model), as well as σ and ω.
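One way to turn this stochastic specification into a choice probability is sketched below in Python. The exact form of the tremble is an assumption on our part: we take it to mean that with probability ω the subject chooses between the two lotteries at random, and otherwise chooses L whenever V(L) − V(R) + u > 0. The function names and the example data are hypothetical.

```python
import math


def std_normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_choose_left(v_left, v_right, sigma, omega):
    """P(choose L) = (1 - omega) * Phi((V(L) - V(R)) / sigma) + omega / 2 (assumed tremble form)."""
    return (1.0 - omega) * std_normal_cdf((v_left - v_right) / sigma) + omega / 2.0


def log_likelihood(choices, sigma, omega):
    """Sum of log choice probabilities over (V(L), V(R), chose_left) triples."""
    total = 0.0
    for v_left, v_right, chose_left in choices:
        p_left = prob_choose_left(v_left, v_right, sigma, omega)
        total += math.log(p_left if chose_left else 1.0 - p_left)
    return total


# Hypothetical valuations and choices for three pairwise choice questions.
data = [(0.40, 0.30, True), (0.25, 0.35, False), (0.50, 0.45, False)]
print(log_likelihood(data, sigma=0.1, omega=0.02))
```

In estimation, the valuations V(L) and V(R) would themselves depend on the model parameter (s, g or a), and the log-likelihood would be maximised jointly over that parameter, σ and ω.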

5 Econometric procedures

We have four models: Expected Utility (EU), the Smooth Model (SM), the Rank Dependent model (RD) and the Alpha Model (AM). Our goal is to try and assign subjects to models/types—to see if, and which of, these models describe the behaviour of our subjects. We start by fitting the various models subject-by-subject. We then use a mixture model, pooling the data from all our 149 subjects, and employ the estimation/prediction methodology proposed by Wilcox (2007): using just over half our data for estimation, and the remaining data for prediction—hence assessing the relative predictive ability of the theories. We provide details in the next three sub-sections.

5.1 Estimation subject-by-subject and model-by-model

We have a total of 149 subjects. Each of them completed 49 tasks, composed of a grand total of 256 pairwise choice questions.8 We summarise the subject-by-subject results in Table 2. In this table we assign subjects to a particular type according to pairwise model testing: likelihood-ratio tests for nested pairs (EU v SM and EU v RD) and BIC tests for non-nested pairs (EU v AM, SM v RD, SM v AM, RD v AM).
Table 2 Individual by individual analysis

[Table body not recovered; labels: Expected utility, Smooth model, Rank dependent, Alpha model]

Subjects are assigned to types by likelihood-ratio and BIC tests
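The two building blocks of the pairwise model tests behind Table 2 can be sketched as follows. This is an illustration only: the 5% significance level, the function names and the log-likelihood values are our assumptions, and how the pairwise outcomes are combined into a single assignment is not reproduced here. The parameter counts follow the text: σ and ω for every model, plus s, g or a for SM, RD and AM.

```python
import math

CHI2_1DF_5PCT = 3.841  # 5% critical value of chi-square with 1 degree of freedom (assumed level)


def lr_rejects_restricted(ll_restricted, ll_general, critical=CHI2_1DF_5PCT):
    """Likelihood-ratio test for a nested pair, e.g. EU (restricted) v SM (general)."""
    return 2.0 * (ll_general - ll_restricted) > critical


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion; the model with the lower BIC is preferred."""
    return n_params * math.log(n_obs) - 2.0 * log_lik


N_OBS = 256  # pairwise choice questions per subject

# Hypothetical maximised log-likelihoods for one subject.
ll = {"EU": -150.0, "SM": -141.0, "RD": -146.0, "AM": -149.5}
n_params = {"EU": 2, "SM": 3, "RD": 3, "AM": 3}

print(lr_rejects_restricted(ll["EU"], ll["SM"]))  # nested pair: EU v SM
print(bic(ll["SM"], n_params["SM"], N_OBS) < bic(ll["RD"], n_params["RD"], N_OBS))  # non-nested: SM v RD
```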

We see that, using this method to classify subjects to models/types, we assign 24% to EU, 56% to SM, 11% to RD and 9% to AM. We note that this assignment of subjects to type is not very different from the assignment we obtain from the mixture estimates, as we shall see later: respectively 26%, 49%, 20% and 5% (see Table 3).
Table 3 Estimation results of mixture model (9)

| Models parameters | Mixing proportions |
|---|---|
| .28998 (.01305) | .26043 (.03766) |
| 1.37699 (.13250) | .49244 (.04389) |
| 1.08495 (.10453) | .19812 (.03409) |
| .08778 (.00253) | .04900 (.02168) |
| −.17995 (.04951) | .24250 (.04073) |
| .02453 (.00232) | |
| 7.99490 (2.93661) | |
| 6.76221 (2.33517) | |
| .11528 (.01091) | |
| .01497 (.00271) | |

Log-likelihood −8926.8476

Number of observations 19668

Number of subjects 149

Number of observations per subject 132

Figure 2 displays histograms of parameter estimates. The panels on the left show histograms of parameter estimates over all subjects in the sample. The panels on the right contain histograms of parameter estimates only from subjects who are classified as being of the type h, with h ∈ {SM, RD, AM}, that is indicated in the corresponding row (see also Table 2). It may be of interest to recall that SM reduces to EU when the parameter s takes the value 0, while RD reduces to EU when g takes the value 1.
Fig. 2 Histograms of parameter estimates obtained by individual by individual and model by model analysis of the data

5.2 Mixture estimation

The individual estimates reported above are very extravagant in terms of numbers of parameters. Moreover, this subject-by-subject approach ignores any smoothness in the distribution of parameters that exists in the population from which this sample was drawn. To avoid this extravagance, and provide a more parsimonious specification, we now introduce a mixture model. In this context, a mixture model9 assumes that a certain proportion of the population from which the subject pool was drawn is of each of the four types under consideration (EU, SM, RD and AM): these proportions in the population are termed the mixing proportions; furthermore, and this is the way a mixture model reduces the extravagance, within each type the approach assumes that each parameter specified in that particular type has a distribution over the population. Estimation involves pooling the data over all subjects and estimating the mixing proportions and the parameters of the distributions (over the population) of the parameters of the various models. So, for example, instead of having 149 different estimates of the smooth parameter (one for each subject) this mixture estimation provides just two: the mean and the variance of the distribution of the smooth parameter over the population. The same is true for all the other parameters. The histograms in Fig. 2 are effectively replaced by the distributions in Fig. 3. Instead of getting a picture of 149 separate individuals, we are getting a picture of the set of individuals in the population from which these 149 individuals were drawn. Our results are more generalisable, and enable us to make predictions on samples different from the one used in this paper, while individual-by-individual analysis does not.
Fig. 3

Density plots of the relevant parameters of the functional in mixture model (9) from parameter estimates in Table 3

We therefore use a mixture model, and add a further methodological component: we follow the estimation-then-prediction approach proposed by Wilcox (2007) to assess the comparative empirical plausibility of the various models. This approach involves splitting the data into two parts, one used for estimation and the other for prediction using the resulting estimates. Wilcox advocates it because predicting well out of sample is a more demanding test than fitting well in sample.

Wilcox (2007) advocates this approach not in a mixture model context but in an individual context. In this individual context the approach requires that each model is estimated subject-by-subject on part of the data, and that the identification of a subject’s type is based on the predictive performance of each model on the remaining data. We have to modify his approach appropriately for a mixture model context. As we have already noted, the mixture model approach pools data over subjects and hence deals with subjects’ heterogeneity in two ways: it allows subjects to be of different types and to differ in terms of the parameters of the functional that they have. The former point is taken care of by assuming that each subject is of one type, and that he or she does not change type throughout the experiment, and by estimating the proportion of the population who are of each type. The latter is dealt with by assuming a distribution of the relevant parameters over the population and by estimating the parameters of that distribution.

The reason we decided to combine the two approaches is twofold. We find the approach proposed by Wilcox (2007) extremely elegant and powerful, but we realised that it does not work well on an individual basis with our data in the context of our experiment. Suppose, for example, that a subject behaves exactly as EU without any mistake but one, and that the wrong choice is included in the sample of choices used for prediction. From the sample used for estimation, we infer that our subject is EU with an error term with zero variance. When we try to calculate the likelihood of the prediction, the only mistaken choice results in a log-likelihood of −∞. Consequently, we might be induced to reject the hypothesis for that subject of being EU in favour of another model, when that subject is in fact EU. The mixture approach assumes that the parameters of the relevant models follow a certain distribution but, at the same time, provides consistent estimates of the proportion of the population who are of each type. In a particular sense, it introduces a parameter smoothness assumption, but uses information about the population (mixing proportions and distribution of the relevant parameters for the functionals included in the mixture) in assigning subjects to types.
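The degenerate case described above can be reproduced numerically. The following is a minimal sketch rather than the authors' code: the function name `choice_log_likelihood`, the value differences and the noise levels are all hypothetical, and the Fechner-plus-tremble form is assumed to match the likelihood specified later in this section.

```python
import math
from statistics import NormalDist

def choice_log_likelihood(y, dv, sigma, omega=0.0):
    """Log-likelihood of choices y (+1 left, -1 right) given value differences dv.

    sigma is the Fechner error standard deviation; omega is a tremble probability.
    """
    total = 0.0
    for y_t, dv_t in zip(y, dv):
        signed = y_t * dv_t
        if sigma == 0.0:
            # Zero error variance: the better lottery is chosen with certainty.
            phi = 1.0 if signed > 0 else (0.5 if signed == 0 else 0.0)
        else:
            phi = NormalDist().cdf(signed / sigma)
        p = (1.0 - omega) * phi + omega / 2.0
        total += math.log(p) if p > 0.0 else -math.inf
    return total

dv = [0.8, -0.5, 1.2, 0.3]   # hypothetical V(L) - V(R) in four problems
y = [1, -1, 1, -1]           # the last choice contradicts the sign of dv

ll_zero_noise = choice_log_likelihood(y, dv, sigma=0.0)               # -inf
ll_with_noise = choice_log_likelihood(y, dv, sigma=0.5)               # finite
ll_with_tremble = choice_log_likelihood(y, dv, sigma=0.0, omega=0.05) # finite
```

A single inconsistent choice drives the zero-variance log-likelihood to minus infinity, whereas any positive error variance or tremble probability keeps it finite.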

In order to implement our approach, we split our observations into two samples: (1) an estimation sample, Ie, that includes the observations from 25 tasks (selected at random), corresponding to 132 binary choices, and we use these to estimate the mixture model; (2) a prediction sample, Ip, that includes the remaining 24 tasks, corresponding to 124 binary choices, and we use these to make predictions. More precisely, we proceed as follows: we use the estimation sample to estimate the mixture model and we calculate the likelihood of the prediction from the prediction sample using the parameter estimates obtained in the first step. By Bayes’ rule, we finally compute the posterior probability for each subject of being of each type from the likelihood of the prediction, and assign subjects to types according to the highest posterior probability of the prediction.
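The split by task can be sketched as follows. This is an illustrative sketch only: the helper name `split_tasks`, the task identifiers 1 to 49 and the seed are assumptions, and the original random draw is not reproduced. The split is made at the level of tasks, so that all binary choices belonging to a task fall into the same sample.

```python
import random

def split_tasks(task_ids, n_estimation, seed=0):
    """Randomly assign n_estimation tasks to estimation, the rest to prediction."""
    rng = random.Random(seed)  # fixed seed (an assumption) makes the split reproducible
    estimation = set(rng.sample(sorted(task_ids), n_estimation))
    prediction = set(task_ids) - estimation
    return estimation, prediction

tasks = range(1, 50)                      # 49 tasks, labelled 1..49 for illustration
est_tasks, pred_tasks = split_tasks(tasks, n_estimation=25)
```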

As we have already noted, we start by assuming that a proportion πk of the population from which the experimental sample is drawn is of type k, with 0 ≤ πk ≤ 1, ∑kπk = 1 and k ∈ (EU, SM, RD, AM). These mixing proportions are estimated along with the other parameters of the mixture model. The likelihood contribution of subject i is
$$ L_i=\sum_{k\in \left( EU,SM,RD,AM \right)}\pi_k\times \widetilde{l}_i^k, \tag{9} $$
where \( \widetilde{l}_i^k \) is the penalised likelihood contribution of individual i under the hypothesis of her being of type k, with k ∈ (EU, SM, RD, AM). We will define what we mean by a “penalised” likelihood contribution after having specified the likelihood contribution of each type. For this purpose, let yit be the binary variable that takes the value 1 if subject i chooses the left-hand lottery in problem t, and the value −1 if subject i chooses the right-hand lottery in problem t. The likelihood contribution of subject i over the problems in the estimation sample, given that subject i is of type EU, is
$$ l_i^{EU}=\prod_{t\in I^e}\left\{ \left( 1-\omega \right)\varPhi \left[ y_{it}\times \frac{V_{EU}(L)-V_{EU}(R)}{\sigma_{EU}} \right]+\frac{\omega }{2} \right\},\qquad y_{it}\in \left\{ 1,-1 \right\} \tag{10} $$
Here, Φ(.) is the unit normal cumulative distribution function; σEU is the standard deviation of the Fechner error term for the EU model; and ω is a tremble probability (the probability that the subject chooses at random between the two lotteries). The likelihood contribution of subject i, given that subject i is of type h ∈ (SM, RD, AM), is
$$ l_i^h=\int_{\underline{b}}^{\overline{b}}\prod_{t\in I^e}\left\{ \left( 1-\omega \right)\varPhi \left[ y_{it}\times \frac{V_h(L)-V_h(R)}{\sigma_h} \right]+\frac{\omega }{2} \right\}g(r)\,dr,\qquad y_{it}\in \left\{ 1,-1 \right\} \tag{11} $$

Here, g(r) is the density of the parameter r. In the case h = SM, r represents the parameter s, with s ~ N(μ, θ²), \( \underline{b}=-\infty \) and \( \overline{b}=+\infty \). In the case¹⁰ h = RD, r represents the parameter g, with \( \ln\left( g-0.279095 \right)\sim N\left( \delta, \epsilon^2 \right) \), \( \underline{b}=0.279095 \) and \( \overline{b}=+\infty \). In the case h = AM, r represents the parameter a, with a ~ Beta(ϑ, φ), \( \underline{b}=0 \) and \( \overline{b}=1 \).¹¹ The σh are the standard deviations of the Fechner error terms attached to the models in h. Φ and ω are as defined for type EU. We assume that ω is independent of type because there is no reason to assume that any type should tremble more than another. The sum over all i in the sample, for t ∈ Ie, of the logarithm of (9) is maximised by maximum simulated likelihood; the integrations in (11) are performed by four sets of Halton sequences.
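The integral in (11) can be approximated by averaging the product of choice probabilities over quasi-random draws of r. The sketch below does this for the SM case, where s is normally distributed. It is a hedged illustration and not the authors' estimation code: `halton`, `choice_prob`, `simulated_likelihood` and `toy_value_diff` are hypothetical names, the toy value difference is a placeholder that does not implement the Smooth Model functional, and only a single one-dimensional Halton sequence is used.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def halton(n, base=2):
    """First n points of the one-dimensional Halton sequence, all inside (0, 1)."""
    points = []
    for i in range(1, n + 1):
        f, x, k = 1.0, 0.0, i
        while k > 0:
            f /= base
            x += f * (k % base)
            k //= base
        points.append(x)
    return points

def choice_prob(y, dv, sigma, omega):
    """Probability of choice y: Fechner error plus tremble."""
    return (1.0 - omega) * STD_NORMAL.cdf(y * dv / sigma) + omega / 2.0

def simulated_likelihood(choices, value_diff, mu, theta, sigma, omega, n_draws=200):
    """Average over Halton draws of s ~ N(mu, theta^2) of the product over problems."""
    total = 0.0
    for u in halton(n_draws):
        s = mu + theta * STD_NORMAL.inv_cdf(u)  # uniform draw mapped to a normal draw
        prod = 1.0
        for t, y in enumerate(choices):
            prod *= choice_prob(y, value_diff(s, t), sigma, omega)
        total += prod
    return total / n_draws

toy_base = [0.6, -0.4, 0.2]               # hypothetical numbers, three problems

def toy_value_diff(s, t):
    """Placeholder for V_h(L) - V_h(R); NOT the Smooth Model functional."""
    return toy_base[t] - 0.5 * s

choices = [1, -1, 1]
lik = simulated_likelihood(choices, toy_value_diff,
                           mu=0.3, theta=0.2, sigma=0.5, omega=0.05)
```

In a full estimation, the logarithm of the mixture of such contributions would be summed over subjects and maximised over the parameters; that step is omitted here.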

Likelihood contributions are penalised depending upon the number of observations used in the estimation and the number of free parameters estimated.¹² Thus, the relationship between \( l_i^k \) and \( \widetilde{l}_i^k \) is
$$ \widetilde{l}_i^k=\frac{l_i^k}{T^{j/2}},\qquad k\in \left( EU,SM,RD,AM \right) \tag{12} $$

Here, T = 132 is the number of choices per individual in the estimation sample and j is the number of free parameters estimated: j equals one for type EU and three for types SM, RD and AM. The penalisation of the likelihood contributions used here corresponds to the Schwarz-Bayesian Information Criterion (BIC).
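Taking logarithms of Eq. (12) makes the correspondence with the BIC explicit. The following worked step is an illustration derived from the definitions above and is not part of the original derivation; the numerical values are rounded.
$$ \ln \widetilde{l}_i^k=\ln l_i^k-\frac{j}{2}\ln T $$
With T = 132, the penalty \( \frac{j}{2}\ln T \) is approximately 2.44 for EU (j = 1) and approximately 7.32 for SM, RD and AM (j = 3). A three-parameter type must therefore raise a subject's log-likelihood by more than ln 132 ≈ 4.88 relative to EU before its penalised contribution exceeds that of EU.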

Table 3 reports parameter estimates of the mixture model derived above. The estimates of the mixing proportions suggest that 26% of the population is EU, 49% is SM, 20% is RD and 5% is AM. Figure 3 shows density plots of the mixture model parameters according to the estimated results in Table 3. We note, in passing, that these mixing proportions are close to the proportions assigned to types that we obtained from the individual estimates, and are also close to the posterior probabilities that we will present in Fig. 6.

5.3 Prediction results

Now that we have obtained the estimation results reported in Table 3, we can proceed with calculating the likelihoods of the choice prediction using Eqs. (9), (10), (11) and (12) but where we replace parameters to be estimated with parameter estimates, which are assembled in the vector \( \varUpsilon =\left( \widehat{\sigma}_{EU},\widehat{\mu},\widehat{\theta},\widehat{\sigma}_{SM},\widehat{\delta},\widehat{\epsilon},\widehat{\sigma}_{RD},\widehat{\vartheta},\widehat{\varphi},\widehat{\sigma}_{AM},\widehat{\pi}_{EU},\widehat{\pi}_{SM},\widehat{\pi}_{RD},\widehat{\pi}_{AM} \right) \). Tasks and choices are those contained in the prediction sample, yit with t ∈ Ip. In Eq. (12) T = 124. Using Bayes’ rule we compute the posterior probabilities of types for each i in our sample as follows:
$$ \begin{aligned} Pr\left[ i\ \mathrm{is\ of\ type}\ k \mid y_{it}\ \mathrm{with}\ t\in I^p,\varUpsilon \right] &=\frac{Pr\left[ \mathrm{type}\ k \right]\times Pr\left[ y_{it}\ \mathrm{with}\ t\in I^p \mid i\ \mathrm{is\ of\ type}\ k \right]}{Pr\left[ y_{it}\ \mathrm{with}\ t\in I^p \right]} \\ &=\frac{\widehat{\pi}_k\times \widetilde{l}_i^k\left( y_{it}\ \mathrm{with}\ t\in I^p,\varUpsilon \right)}{L_i\left( y_{it}\ \mathrm{with}\ t\in I^p,\varUpsilon \right)},\qquad k\in \left( EU,SM,RD,AM \right) \end{aligned} \tag{13} $$
Basically, we re-calculate Eqs. (9), (10), (11) and (12) by replacing parameters with parameter estimates, and, for each subject, by replacing tasks and choices from the estimation sample with tasks and choices from the prediction sample.¹³ We can also calculate the posterior probabilities based on the estimation sample; these are given by Eq. (13) with Ip replaced by Ie; of course here we use the maximised likelihood over the estimation sample. We can use these posterior probabilities on both the estimates and the predictions to assign subjects to types, though, following Wilcox (2007), we prefer assignment on the predictions. Interestingly, however, in this context the proportions assigned to types are not very different whether we use estimates or predictions (see Fig. 6 which we will explain shortly). First, however, we explain the assignment with the help of Fig. 4.
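The posterior computation in Eq. (13) can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: the function name `posterior_probabilities` is hypothetical, the log-likelihood values for the example subject are invented, and the calculation is carried out in logarithms purely for numerical stability. The mixing proportions are the estimates quoted in the text, and T = 124 is the number of choices in the prediction sample.

```python
import math

TYPES = ("EU", "SM", "RD", "AM")

def posterior_probabilities(mixing, log_lik, n_params, n_obs):
    """Posterior type probabilities from mixing proportions and log-likelihoods."""
    # ln(pi_k) + ln(l_k) - (j_k / 2) ln(T): log of pi_k times the penalised likelihood
    log_terms = {
        k: math.log(mixing[k]) + log_lik[k] - 0.5 * n_params[k] * math.log(n_obs)
        for k in TYPES
    }
    m = max(log_terms.values())           # subtract the maximum before exponentiating
    weights = {k: math.exp(v - m) for k, v in log_terms.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

mixing = {"EU": 0.26, "SM": 0.49, "RD": 0.20, "AM": 0.05}       # estimates quoted in the text
n_params = {"EU": 1, "SM": 3, "RD": 3, "AM": 3}
log_lik = {"EU": -70.0, "SM": -62.0, "RD": -66.0, "AM": -75.0}  # hypothetical subject

post = posterior_probabilities(mixing, log_lik, n_params, n_obs=124)
assigned = max(post, key=post.get)        # type with the highest posterior probability
```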
Fig. 4

Distributions of the posterior probabilities of the four types. In each simplex, the posterior probability of the missing type is <.005. For the estimation sample, the three graphs (left to right) involve 61, 44 and 44 subjects, respectively; for the prediction sample, 72, 31 and 46, respectively. To produce the graphs, posterior probabilities have been rounded to the closest .05. EU is Expected Utility; SM is the Smooth Model; RD is Rank Dependent; AM is the Alpha Model

Figure 4 displays six simplexes summarising the results from analysing the posterior probabilities; the first row of three simplexes refers to the estimation sample, and the second row to the prediction sample. This figure is crucial as it shows very clearly the power of our experiment in classifying the subjects by type. Each simplex represents the posterior probabilities of types for each subject in our sample computed following the procedure explained above. Subjects are points in the simplexes: the closer a subject is to the vertex of a simplex, the higher the posterior probability for that subject of the type represented on that vertex. The smallest circles represent single subjects, larger circles represent concentration of subjects in that area of the simplex: the larger the circle, the higher the number of subjects represented by that circle. Since we realised that for all subjects in our sample no more than three posterior probabilities at a time were larger than .005, we grouped subjects in three simplexes that display on each vertex the three types for which the posterior probabilities are larger than .005. The simplexes in the first (estimation) row involve 61, 44 and 44 subjects respectively; those in the second (prediction) row 72, 31 and 46 respectively. We see that most of the subjects are concentrated at the vertices of the simplexes, showing that our mixture assigns them to type with very high posterior probability. This is a really crucial finding: it implies that, except for a handful of subjects for whom there is some confusion as to their type, the vast majority of the subjects can very clearly be assigned to a specific type.

This statement is confirmed by Fig. 5 (which shows the cumulative percentage of subjects assigned to a type with maximum posterior probability less than the probability indicated on the horizontal axis) both for the estimates (the black step function) and the predictions (the grey step function). It shows that the predictive power of our mixture model is quite impressive: all subjects are assigned to a type with posterior probability larger than .5; more than 95% of subjects are assigned to a type with posterior probability .75 or more; almost 90% of subjects in our sample are assigned to a type with posterior probability .90 or more. The prediction is slightly less impressive (as the step function is above that for the estimations) but only marginally so.
Fig. 5

Cumulative percentage of subjects assigned to a type with maximum posterior probability < posterior probability indicated on the horizontal axis
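The step functions in Fig. 5 are built from each subject's highest posterior probability. A minimal sketch of that calculation follows; the function name `share_below` and the eight example values are hypothetical and are not taken from the experimental data.

```python
def share_below(max_posteriors, threshold):
    """Percentage of subjects whose highest posterior probability is below threshold."""
    below = sum(1 for p in max_posteriors if p < threshold)
    return 100.0 * below / len(max_posteriors)

# Hypothetical highest posterior probabilities for eight subjects
max_post = [0.99, 0.97, 0.58, 0.93, 0.81, 1.00, 0.96, 0.88]
curve = [(x, share_below(max_post, x)) for x in (0.5, 0.75, 0.9)]
```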

Figure 6 presents the results of assignment to types according to the posterior probabilities of the estimates (the final column) and of the predictions (the bottom row); it also presents a cross-tabulation (in the cells of the matrix). Using the posterior probabilities from the estimates (predictions), 25% (22%) are assigned to EU, 50% (53%) to SM, 20% (22%) to RD and 5% (3%) to AM. The bulk of the subjects appear down the main diagonal in Fig. 6 (indicating the same assignment on both the estimation and the prediction), though 26.8% are classified differently by the two methods. This ‘instability’ is a feature pointed out by Wilcox (2007) as inevitable unless one has a lot of data: in our case it would seem that 49 tasks or 256 observations are not enough for some subjects and that these subjects are simply unstable and do not have a fixed preference function. In conclusion, it appears that we can assign virtually all subjects to a type with very high probability.
Fig. 6

Cross tabulation of subjects assigned to types according to the highest posterior probabilities of the estimate (rows) and subjects assigned to types according to the highest posterior probability of the prediction (columns). EU is Expected Utility; SM is the Smooth Model; RD is Rank Dependent; AM is the Alpha Model
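A cross-tabulation of this kind, together with the share of subjects classified differently by the two methods, can be computed as sketched below. The function name `cross_tabulate` and the eight example assignments are hypothetical; they do not reproduce the figures reported in Fig. 6.

```python
from collections import Counter

def cross_tabulate(est_types, pred_types):
    """Count (estimation type, prediction type) pairs and the percentage that differ."""
    table = Counter(zip(est_types, pred_types))
    n = len(est_types)
    agree = sum(count for (e, p), count in table.items() if e == p)
    return table, 100.0 * (n - agree) / n

# Hypothetical assignments for eight subjects
est_assign = ["EU", "SM", "SM", "RD", "SM", "AM", "EU", "RD"]
pred_assign = ["EU", "SM", "RD", "RD", "SM", "SM", "EU", "RD"]

table, pct_different = cross_tabulate(est_assign, pred_assign)
```

Diagonal cells such as `table[("SM", "SM")]` count subjects assigned to the same type on both samples; off-diagonal cells count the switches.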

6 Conclusions

We have examined the empirical plausibility (both in estimation and prediction) of four multiple prior models of choice under ambiguity (these being an important subset of all models of behaviour under ambiguity). We carried out an appropriate experiment on 149 subjects, involving 49 tasks and 256 pairwise decision problems, and used the resulting data to assign subjects to models/types. We started with estimation subject-by-subject, and then we turned to a mixture model, pooling all our 149 subjects together. The mixture model enables us to classify subjects to models with a high degree of accuracy, with posterior probabilities of subjects being of one type or another being over 0.90 for around 90% of our subjects. These posterior probabilities suggest that around 22% of our subjects follow Expected Utility (EU) theory, 53% the Smooth Model (SM) of Klibanoff et al., some 22% the Rank Dependent (RD) model and around 3% the Alpha Model (AM) of Ghirardato et al. Interestingly the assignments based on the estimated data do not differ very much from those based on prediction data—unlike in Wilcox (2007).

In a sense the poor showing of the Alpha model is not surprising, as the model does not use all the information given to the subjects; indeed one might not even call it a genuine multiple prior model.¹⁴ EU and RD appear to perform equally well. However, the clear ‘winner’ in terms of the number of subjects with that preference functional is the Smooth model, suggesting that over 50% of subjects do not reduce two-stage gambles to their one-stage equivalents when analysing the decision problems: they do indeed view two-stage trees as different from the reduced one-stage trees. In contrast, the application of EU to this kind of problem implies that the reduction of compound lotteries is valid and hence, as Halevy (2007) points out, that the individual is ambiguity-neutral. So the majority of our subjects were not ambiguity-neutral, even in our context where all probabilities are objective.


1. We should note that we share the doubts of those who wonder whether this is the appropriate characterisation of a situation of ambiguity. Such doubts include the fact that these models complicate an already complicated decision problem: for example, if someone does not know the probabilities, how can he or she attach probabilities to the various possibilities? However, at the end of the day, this is an empirical issue.

2. We note that this model does not use the values of the probabilities of the various possibilities, though it does use the set of possible probabilities.

3. In the general form of the theory, the decision-makers themselves are supposed to specify the set of possible probabilities and the probabilities attached to them; in our experiment these were objectively specified and we assume that the subjective set and the subjective probabilities are the same as the objective ones. It could be argued that this is not the specification envisaged by the authors of the Smooth Model, but that contains our specification as a special case. We also note that it is difficult to test the general version as one needs to be able to elicit not only the set of possible probabilities but also the subjective probabilities attached to them by the decision-maker.

4. Note that, with just two outcomes, the notion of a reference point is irrelevant.

5. Here the usual convention that f(x) = 0 if x < 0 never applies.

6. We also tried the power form f(p) = p^g but this did not appear to represent an improvement.

7. See Andersen et al. (2006).

8. The number of questions in each task depended upon the initial number of one-stage lotteries in the changing task: to be precise if there were N such one-stage lotteries then there would be N decision-problems in that task. N varied across tasks.

9. For further details on the mixture approach in the context of choice under risk, see Conte et al. (2011).

10. In this case the parameter g has to be bounded below to make the function f(.) monotonically increasing over [0,1].

11. In this case the parameter a has to be between 0 and 1.

12. In a perfect world, where there are no other types other than the four included in our mixture and where each subject sticks to his or her type from the first to the last task, we would not need to introduce any penalisation. However, in an imperfect world, richer-in-parameters models are more able to attract “outliers”. For this reason, we decided to penalise richer models in favour of the EU model that has no parameters except for that of the additive error term. Our approach is inspired by Preminger and Wettstein (2005).

13. Obviously, we do not maximise the resulting likelihood at this stage, because we use parameter estimates.

14. This is all the more true for the Maxmin model proposed in Gilboa and Schmeidler (1989), of which the Alpha Model is a generalisation.


Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  1. Department of Economics and Related Studies, University of York, York, UK
  2. Westminster Business School, University of Westminster, London, UK
  3. Strategic Interaction Group, Max Planck Institute of Economics, Jena, Germany
