People’s beliefs are mostly shaped by what they learn from other people. For many important decisions, advice from others, preferably experts, is aggregated into a final judgment. Hence, a rich literature has developed on belief aggregation (Clemen and Winkler 1999; Cooke 1991; Dietrich 2010). This literature has mostly used Savage’s (1954) subjective probabilities to quantify degrees of belief, implemented in his Bayesian (expected utility) model for decision making.

Belief aggregation typically concerns events with unknown probabilities. Such events, commonly called ambiguous, are known to generate non-Bayesian behavior (Ellsberg 1961; Keynes 1921; Knight 1921). Our paper will show that such deviations from Bayesianism are relevant for belief aggregation. We thus contribute to recent literature using ambiguity models rather than Bayesian models to analyze belief aggregation (Baraldi and Zio 2010; Gajdos et al. 2008; Zimper and Ludwig 2009; Teper 2010). These recent papers added decision models to earlier studies that investigated the aggregation of imprecise probabilities in statistics, fuzzy set theory, and artificial intelligence (Nau 2002 and its references). Whereas these recent papers and their predecessors were theoretical, our contribution will be empirical. By recognizing the empirical violations of Bayesianism, we obtain results for belief aggregation that are empirically more valid than those obtained before in Bayesian analyses. We can identify and isolate the relevant factors and their effects more reliably.

This paper will use Abdellaoui et al.’s (2011) source method to analyze ambiguity. This method is based on axiomatized decision models (Gilboa 1987; Gilboa and Schmeidler 1989; Schmeidler 1989; Tversky and Kahneman 1992), and its tradeoff between parsimony and fit suits our purposes well. In particular, we will use Abdellaoui et al.’s indexes of pessimism and insensitivity, and will adapt them to our direct measurements of ambiguity. As explained by these authors, pessimism (ambiguity aversion in our case) is a motivational component, related to a general disliking or liking of ambiguity. Insensitivity (in our case ambiguity-generated, referred to as a-insensitivity) is another, cognitive, component (Kunreuther et al. 2001) prior to any preference and orthogonal to the aversion/seeking component. It reflects a lack of understanding of uncertainty and is needed, besides ambiguity aversion, to explain the ambiguity attitudes that are found empirically. It explains, for instance, that people take uncertainty too much as fifty-fifty, and do not sufficiently discriminate between different levels of likelihood. A-insensitivity is the extension to ambiguity of the well-known inverse-S shaped probability weighting. The two components, ambiguity aversion and a-insensitivity, depend on the source of uncertainty considered, and can, for example, be different when the source of uncertainty concerns domestic stocks or foreign stocks.

For the sake of clarity, our paper will study the simplest possible situations of belief aggregation, where there is only one event to be judged by a decision maker and there are only two agents (we will use this term henceforth) whose judgments are aggregated by the decision maker. We also assume that there is no interaction between the agents themselves, or between the agents and the decision maker, so that no group process is involved.

We will investigate how decision makers aggregate belief judgments for three sources of uncertainty. The first source serves as a control treatment. Here both agents are Bayesian and agree with each other (and everyone else). This is the common case of generally accepted objective probabilities, with no ambiguity involved. We call this source risk.

For the second source of uncertainty, each agent alone fully satisfies Bayesianism, with a precise probability judgment. However, the two agents give different judgments, generating ambiguity for the decision maker aggregating their beliefs (Footnote 1). This source of uncertainty, which is characterized by between-agent ambiguity (heterogeneous beliefs), is called conflict (C-)ambiguity in this paper.

The third source of uncertainty is characterized by within-agent ambiguity and relates to the situation where each agent gives an imprecise probability judgment. This situation is closest to ambiguity as mostly studied in the literature (Footnote 2). In this paper, this third source of uncertainty is called imprecision (I-)ambiguity. To keep our analysis as simple as possible, we assume homogeneous beliefs in the I-ambiguity case; i.e., the two agents agree. Smithson (1999) and Cabantous (2007) found differences between the second and third sources of uncertainty (conflict versus imprecision) in experiments, but Cabantous et al. (2011) found no clear differences. Our paper reconsiders the case using ambiguity theories.

Our experiment concerns loss outcomes. Risk and ambiguity attitudes for losses are subject to debate and, hence, their study is of special interest. Classical economics assumes universal risk aversion, but Kahneman and Tversky’s (1979) prospect theory argued for risk seeking for losses (reviewed by Wakker 2010 p. 264). Most theoretical studies assume universal aversion to ambiguity, but most empirical studies find prevailing ambiguity seeking for losses (Viscusi and Chesson 1999; Wakker 2010 p. 354). Although losses and gains are equally important for applications, academic studies have focused almost exclusively on gains. This paper focuses on risk and ambiguity for losses.

In two experiments, we measure certainty equivalents of risky prospects and find the usual violations of expected utility, with inverse-S shaped probability weighting. This finding underscores the desirability of using nonexpected utility in our descriptive analysis. We then measure matching probabilities (the objective probabilities that make a risky gamble equivalent to a gamble under conflict or imprecision ambiguity).

For I-ambiguity, which is close to the usual form of ambiguity, we find the common amplification of the overweighting of extreme events, reflecting increased insensitivity. For C-ambiguity, if analyzed in the usual way (taking midpoints of probability intervals), we find the opposite, with reduced insensitivity. We do not interpret the latter finding as a violation of common views on ambiguity, but rather as evidence against taking probability midpoints: In C-ambiguity, experts expressing certainty are believed more than experts expressing doubts. This interpretation is supported by direct measurements of belief that were added in the second experiment. Our finding underscores that agents may misleadingly suggest certainty in cases of doubt to disproportionally influence decisions. This makes it extra desirable for principals to ensure that bonuses are incentive-compatible (Zeckhauser and Viscusi 1990).

1 Theory

1.1 Prospects and their evaluation

The preferences of a decision maker concern prospects. A prospect yields outcome x or y, where it is uncertain which of the two will result. We assume x ≤ y ≤ 0 throughout. Hence, outcomes are losses (if negative) or 0. We assume that two agents have given their judgment on the likelihood of the outcomes, and we consider the following three situations, formally referred to as sources (of uncertainty).

  • Risk: \( x_p y \) denotes a prospect yielding x with known objective probability p and y with probability 1−p. Everyone agrees about the probabilities, including the two agents.

  • Imprecision (I) ambiguity: The agents are not able to give a precise probability judgment, and they only indicate a probability interval. We assume that they give the same interval [ℓ,h]. \( x_{[\ell,h]}y \) denotes the resulting prospect. We assume \( 0 \leqslant \ell \leqslant {\text{h}} \leqslant 1 \) throughout.

  • Conflict (C) ambiguity: \( x_{\{\ell,h\}}y \) denotes the prospect with no known probabilities available. Both agents give a precise probability judgment, but the two judgments are different. One agent judges the probability to be ℓ, whereas the other judges it to be h. We again assume \( 0 \leqslant \ell \leqslant {\text{h}} \leqslant 1 \) throughout.

Our study only uses prospects that yield no more than two outcomes, both nonpositive. Virtually all decision models existing today agree on this domain (Footnote 3). They all amount to the following evaluation, which we call binary rank-dependent utility. When choosing between prospects, the one with the highest evaluation is preferred.

$$ {{\text{x}}_{\text{p}}}{\text{y}} \mapsto {\text{w(p)U(x)}} + \left( {{1} - {\text{w}}\left( {\text{p}} \right)} \right){\text{U(y);}} $$
(1)
$$ {{\text{x}}_{{\left[ {\ell, h} \right]}}}{\text{y}} \mapsto {\text{W}}[\ell, {\text{h}}]{\text{U}}\left( {\text{x}} \right) + ({1} - {\text{W}}[\ell, {\text{h}}]){\text{U}}\left( {\text{y}} \right); $$
(2)
$$ {{\text{x}}_{{\left\{ {\ell, h} \right\}}}}{\text{y}} \mapsto {\text{W}}\{ \ell, {\text{h}}\} {\text{U}}\left( {\text{x}} \right) + \left( {{1} - {\text{W}}\{ \ell, {\text{h}}\} } \right){\text{U}}\left( {\text{y}} \right). $$
(3)

Here U is the utility function, assumed continuous and strictly increasing. For simplicity, we assume U(0) = 0. Moreover, w: [0,1] → [0,1] is the probability weighting function, and is assumed continuous and strictly increasing with w(0) = 0 and w(1) = 1. Similarly, W[ℓ,h] and W{ℓ,h} are event weighting functions, taking values between 0 and 1. Note that loss aversion plays no role for pure loss (or gain) prospects, as in our domain, because it only concerns the exchange rate between gain and loss utility.
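As an illustration of the binary rank-dependent evaluation in Eqs. 1–3, the following minimal Python sketch computes the value of a two-outcome loss prospect from the decision weight of its worst outcome. The function name `binary_rdu`, the linear utility, and the numerical values are assumptions made only for this example; they are not taken from the experiments.

```python
from typing import Callable


def binary_rdu(x: float, y: float, weight: float,
               utility: Callable[[float], float]) -> float:
    """Evaluate a two-outcome loss prospect with x <= y <= 0.

    `weight` is the decision weight of the worst outcome x:
    w(p) under risk, W[l,h] under imprecision, W{l,h} under conflict.
    """
    if not (x <= y <= 0):
        raise ValueError("expected outcomes with x <= y <= 0")
    if not (0.0 <= weight <= 1.0):
        raise ValueError("decision weight must lie in [0, 1]")
    return weight * utility(x) + (1.0 - weight) * utility(y)


def linear_utility(x: float) -> float:
    """Hypothetical utility with U(0) = 0, used only for illustration."""
    return x


# Assumed example: a 0.3 weight on losing 1000 versus a sure loss of 250.
risky = binary_rdu(-1000, 0, weight=0.3, utility=linear_utility)
sure = linear_utility(-250)

# The option with the higher evaluation is preferred.
preferred = "sure loss" if sure > risky else "prospect"
```

The same function covers all three sources, because the sources differ only in which weight is plugged in for the worst outcome.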

Given that we focus on one event (under different states of information), we do not need to specify further restrictions on W for the purposes of this paper. It is natural that both Ws are increasing in ℓ and h. Following the conventions of prospect theory, we have chosen formulas where the weighting is first applied to the outcome x farthest from 0, which here is the worst loss. Alternative formulas where the best outcome is weighted first are data equivalent (Wakker 2010 §7.6). Our choice implies that overweighting (W large) enhances risk aversion and pessimism.

In our experiments, we measure certainty equivalents of risky prospects. The certainty equivalent (CE) of a prospect is the sure amount that is equally preferred (indifferent, denoted ~) to the prospect. That is, U(CE) is equal to the above evaluation of the prospect. We compare different sources of uncertainty. For example, we define the matching probability of [ℓ,h] as the probability r such that \( x_{[\ell,h]}y \sim x_r y \) for all x,y, and the matching probability of {ℓ,h} as the probability r such that \( x_{\{\ell,h\}}y \sim x_r y \) for all x,y. A matching probability always exists and is unique. Because an indifference for any one pair x < y implies the same indifference for all such x,y, we can use any such pair to find matching probabilities.

1.2 Properties of probability weighting functions (risk attitude)

Figure 1 depicts some possible properties of weighting functions w(p) for risk. We will later consider similar properties for other functions (matching probabilities). Figure 1a depicts overweighting of losses, implying pessimism and enhancing risk aversion. Figure 1b describes the opposite pattern. Figure 1c shows an inverse-S shape that combines optimism and pessimism, with overweighting of small probabilities and underweighting of large probabilities. All weights are then moved towards 0.5, suggesting a lack of sensitivity and of discriminatory power. It is a move in the direction of taking everything as fifty-fifty.

Fig. 1 Some properties of weighting functions

Although there have not yet been many empirical studies into risk attitudes for losses, the prevailing shape so far has been the one in Fig. 1d (Wakker 2010 §9.5). It is the combination of some optimism and insensitivity. It is similar to the prevailing shape for gains (where the low elevation reflects pessimism rather than optimism), but is closer to linearity. The underweighting of high probabilities of worst outcomes implies, by complementarity, that small probabilities of good outcomes are overweighted. In what follows, the expression that extreme and rare events are overweighted refers to both these phenomena. For the purposes of this paper we need not define the aforementioned properties formally, because we can use graphs to illustrate them. Formal definitions are in Wakker (2010 Chs. 6 and 7).

1.3 Properties of matching probabilities (ambiguity attitude)

We now turn to ambiguity and events E with unknown probabilities. Properties similar to the ones explained for w(p) can be defined for general weighting functions W(E), even though we cannot draw graphs for W (Wakker 2010 Ch. 10). For the special case studied in this paper, graphs can still be devised and used, as will be done in what follows.

We will consider events related to probability pairs

$$ [{\text{p}} - {\text{r}},\,{\text{p}} + {\text{r}}]\,{\text{and}}\,\{ {\text{p}} - {\text{r}},\,{\text{p}} + {\text{r}}\} $$

only for one fixed r (r = 0.1 in the experiments). These probability pairs thus depend only on the midpoint probability p, and we can define

$$ {\text{W}}\left[ {{\text{p}} - {\text{r}},{\text{p}} + {\text{r}}} \right] = {{\text{w}}_{\text{i}}}\left( {\text{p}} \right) $$
(4)

and

$$ {\text{W}}\left\{ {{\text{p}} - {\text{r}},\;{\text{p}} + {\text{r}}} \right\} = {{\text{w}}_{\text{c}}}\left( {\text{p}} \right). $$
(5)

Here \( w_i \) is the imprecision weighting function and \( w_c \) is the conflict weighting function. In the literature, it is common to relate [p − r, p + r] and {p − r, p + r} to a degree of belief equal to the midpoint probability p (Footnote 4), and to interpret r as a measure of ambiguity. We take this approach as our working hypothesis, we test its plausibility, and we will later discuss deviations.

Weighting functions W can conveniently be summarized in terms of w, the weighting function for risk, and matching probabilities m, because, by the evaluations assumed (Eqs. 1–3), \( x_E y \sim x_{m(E)} y \) implies

$$ {\text{W}}\left( {\text{E}} \right) = {\text{w}}\left( {{\text{m}}\left( {\text{E}} \right)} \right). $$
(6)

For later purposes, we rewrite it as

$$ {\text{m}} = {{\text{w}}^{{ - {1}}}}\left( {\text{W}} \right). $$
(7)
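The step from the indifference to Eq. 6 can be spelled out as follows; this short derivation uses only the binary rank-dependent evaluations assumed above. Indifference between the ambiguous prospect on event E and the risky prospect with probability m(E) means that both have the same evaluation:

$$ W(E)\,U(x) + \bigl(1 - W(E)\bigr)U(y) = w\bigl(m(E)\bigr)U(x) + \bigl(1 - w(m(E))\bigr)U(y) $$

$$ \bigl(W(E) - w(m(E))\bigr)\bigl(U(x) - U(y)\bigr) = 0 $$

Because U is strictly increasing, U(x) ≠ U(y) whenever x < y, so W(E) = w(m(E)). Because w is continuous and strictly increasing with w(0) = 0 and w(1) = 1, it has an inverse on [0,1], which gives the rewritten form in Eq. 7.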

Thus, the general attitude towards uncertainty consists of the risk attitude comprised by w(p), and added on top of that, the ambiguity attitude comprised by the matching probability function m. Ambiguity is the difference between uncertainty and risk, and is thus captured by m. Before ambiguity theories became popular, matching probabilities were widely used in expected utility to measure subjective probabilities (Arrow 1951, Footnote 4; Holt 2007 §30.5; Raiffa 1968; Winkler 1972 p. 272). Their usefulness for studying ambiguity has recently been recognized (Budescu et al. 2002; Hollard et al. 2010; Kahn and Sarin 1988; Viscusi and Magat 1992). Given a fixed r, we can define matching probabilities \( m_i(p) \) and \( m_c(p) \) as the matching probabilities for I- and C-ambiguity (details follow below). They now are maps on the unit interval, and graphs can be drawn as before to depict their properties. We have

$$ {\text{W}}\left[ {{\text{p}} - {\text{r}},\;{\text{p}} + {\text{r}}} \right] = {{\text{w}}_{\text{i}}}\left( {\text{p}} \right) = {\text{w}}\left( {{{\text{m}}_{\text{i}}}\left( {\text{p}} \right)} \right); $$
(8)
$$ {\text{W}}\left\{ {{\text{p}} - {\text{r}},\,{\text{p}} + {\text{r}}} \right\} = {{\text{w}}_{\text{c}}}\left( {\text{p}} \right) = {\text{w}}\left( {{{\text{m}}_{\text{c}}}\left( {\text{p}} \right)} \right); $$
(9)
$$ {{\text{m}}_{\text{i}}}\left( {\text{p}} \right) = {{\text{w}}^{{ - {1}}}}{{\text{w}}_{\text{i}}}\left( {\text{p}} \right)\;{\text{and}}\;{{\text{m}}_{\text{c}}}\left( {\text{p}} \right) = {{\text{w}}^{{ - {1}}}}{{\text{w}}_{\text{c}}}\left( {\text{p}} \right). $$
(10)

Pessimism of the matching-probability function m reflects higher pessimism for uncertainty than for risk, i.e., ambiguity aversion. Insensitivity of m similarly reflects higher insensitivity for uncertainty than for risk, which we call a(mbiguity-generated) insensitivity.

Figure 2 depicts some graphs of matching probabilities, using our working hypothesis that we can use midpoint probabilities on the x-axes. The general attitude towards uncertainty (beyond the utility component) is the composition of a curve from Fig. 1 (risk attitude) and one from Fig. 2 (ambiguity attitude). For example, the curve in Fig. 1d, if combined with the one in Fig. 2g, gives a curve like these two but more pronounced. This is consistent with the common empirical finding that attitudes towards uncertainty are like those towards risk, but more pronounced (Abdellaoui et al. 2005; Fellner 1961 p. 684; Gayer 2010; Hogarth and Einhorn 1990; Kahn and Sarin 1988 p. 270; Kahneman and Tversky 1979 p. 281; Kilka and Weber 2001; Machina 1982 p. 292; Weber 1994 p. 237/238). In other words, the effects of ambiguity (matching probabilities) reinforce those of risk.

Fig. 2 Graphs of ambiguity attitudes and indexes

1.4 Indexes of aversion and insensitivity towards ambiguity

This paper quantitatively analyses ambiguity attitudes through matching probabilities, using Abdellaoui et al.’s (2011) indexes of pessimism and insensitivity. We modify these indexes regarding two aspects of our study. First, we deal with losses. Consequently, overweighting captures pessimism, and not optimism as for gains. We therefore multiply the original pessimism index by −1 so that it still corresponds with pessimism. Second, we consider these indexes for matching probabilities (as functions of midpoint probabilities) rather than for regular weighting functions. This means that the risk component w has been removed. Thus, the index of pessimism represents the extra pessimism generated by ambiguity on top of the pessimism for risk. That is, it reflects ambiguity aversion. Similarly, the index of a-insensitivity captures the extra insensitivity generated by ambiguity.

To compute the indexes, we use linear regression to find linear functions

$$ {\text{c}} + {\text{sp}}\left( {{\text{truncated}}\,{\text{at}}\,{\text{values}}\,0\,{\text{and}}\,{1}} \right) $$
(11)

that best fit the (data points observed regarding the) matching probabilities. We emphasize that this regression line should not be interpreted as a statistical estimation. It only serves to recode data using mathematical calculations. Thus, we choose c and s in Eq. 11 to minimize a squared distance without any reference to an underlying statistical model. Similarly, the linear regression should not be interpreted as any commitment to the neo-additive weighting functions of Eq. 11. It can be applied to any weighting function chosen by a researcher, also if not neo-additive, and to any set of data points, so as to obtain indexes of ambiguity attitude.

We call d the dual intercept, i.e. \( {\text{d}} = 1 - {\text{c}} - {\text{s}} \). We define

$$ {\text{c}} - {\text{d}}:index\,of\,ambiguity\,aversion $$
(12)

and

$$ {\text{c}} + {\text{d}}\,\left( { = {1} - {\text{s}}} \right):index\,of\,a - insensitivity. $$
(13)

In Fig. 2, higher rows correspond with higher curves and higher indexes of ambiguity aversion. Writing \( q_a \) for an ambiguous (midpoint) probability q of [q−r, q + r] or of {q−r, q + r}, the matching probabilities of \( 0.5_a \) are 0.46, 0.50, and 0.54 in the central Fig. 2h, e, and b, respectively. These correspond with indifferences \( -1000_{0.5a}0 \sim -1000_{0.46}0 \), \( -1000_{0.5a}0 \sim -1000_{0.50}0 \), and \( -1000_{0.5a}0 \sim -1000_{0.54}0 \), respectively. The indifferent risky prospect becomes more and more unfavorable, in agreement with more and more ambiguity aversion.

Left curves in Fig. 2 correspond with increased a-insensitivity. We consider a pair of ambiguous prospects (\( -1000_{0.1a}0 \), \( -1000_{0.9a}0 \)). Here, and in the pairs considered next, the right prospect is more unfavorable. We consider the pair for the right, central, and left figures in the middle row. These figures give the following pairs of risky prospects indifferent to the two ambiguous prospects. Figure 2f: (\( -1000_{0.02}0 \), \( -1000_{0.98}0 \)); Fig. 2e: (\( -1000_{0.10}0 \), \( -1000_{0.90}0 \)); Fig. 2d: (\( -1000_{0.18}0 \), \( -1000_{0.82}0 \)). In these three pairs, the two risky prospects get closer and closer to each other. The subject discriminates different levels of likelihood less and less, and exhibits more and more a-insensitivity. This effect is related to the cognitive component of discriminating different levels of likelihood under ambiguity. Figure 2e depicts neutrality with respect to both components, and both indexes are 0 there.
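The index calculation of Eqs. 11–13 can be sketched in a few lines of Python. This is only an illustrative recoding under simplifying assumptions: the function name `ambiguity_indexes` is hypothetical, the line is fitted by ordinary least squares without the truncation at 0 and 1 mentioned in Eq. 11, and each panel of Fig. 2 is represented only by the two matching probabilities quoted in the text for midpoints 0.1 and 0.9.

```python
from typing import Sequence, Tuple


def ambiguity_indexes(midpoints: Sequence[float],
                      matching: Sequence[float]) -> Tuple[float, float]:
    """Return (ambiguity aversion index, a-insensitivity index).

    Fits matching ~ c + s * midpoint by least squares (a pure recoding
    of the data, not a statistical model), sets d = 1 - c - s, and
    returns c - d (Eq. 12) and c + d (Eq. 13).
    Assumption: the truncation at 0 and 1 in Eq. 11 is ignored here.
    """
    n = len(midpoints)
    if n < 2 or n != len(matching):
        raise ValueError("need at least two (midpoint, matching) pairs")
    mean_p = sum(midpoints) / n
    mean_m = sum(matching) / n
    sxx = sum((p - mean_p) ** 2 for p in midpoints)
    if sxx == 0:
        raise ValueError("midpoints must not all be equal")
    sxy = sum((p - mean_p) * (m - mean_m)
              for p, m in zip(midpoints, matching))
    s = sxy / sxx            # slope: sensitivity
    c = mean_m - s * mean_p  # intercept at p = 0
    d = 1.0 - c - s          # dual intercept at p = 1
    return c - d, c + d


# Matching probabilities quoted in the text for midpoints 0.1 and 0.9.
fig_2d = ambiguity_indexes([0.1, 0.9], [0.18, 0.82])
fig_2e = ambiguity_indexes([0.1, 0.9], [0.10, 0.90])
fig_2f = ambiguity_indexes([0.1, 0.9], [0.02, 0.98])
```

With these values the aversion index is 0 in all three panels (up to floating-point rounding), while the a-insensitivity index is about 0.2 for Fig. 2d, 0 for Fig. 2e, and −0.2 for Fig. 2f, in line with the ordering described in the text.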

In general, the indexes of Abdellaoui et al. (2011) can be calculated for any function on the unit interval, or set of data points of such a function. They involve the best-fitting regression line on the (0,1) interval, and reflect global elevation and sensitivity. Their interpretations depend on the function for which they are calculated. Abdellaoui et al. (2011) considered the indexes for source functions, transforming additive subjective probabilities into decision weights, and capturing all deviations from expected utility. Ambiguity attitudes, reflecting differences between known and unknown probabilities, could then be derived from differences between source functions for unknown probabilities and those for known probabilities. We consider the indexes for matching probabilities. As Eqs. 6–10 show, the risk component has then been removed and, hence, our indexes directly reflect ambiguity. It is a common working hypothesis in studies of probability intervals that matching probabilities equal to the midpoints of those intervals reflect ambiguity neutrality, as in Fig. 2e.

2 Experiment A: measuring risk attitudes and ambiguity attitudes for imprecise and conflicting sources of information

This first, explorative, experiment examines risk and ambiguity attitudes for I- and C-sources.

Analysis

We measure certainty equivalents of several prospects. We first derive utility and probability weighting for risk. Then we study ambiguity attitudes by analyzing how the two ambiguous sources differ from risk. The latter is done by analyzing how their matching probabilities differ from the corresponding midpoint probabilities. In what follows, we derive matching probabilities from parametric fitting. Appendix A3 reports on the results of an alternative, parameter-free analysis, based on direct comparisons of CEs, which gives results consistent with those reported in the main text. We use power utility,

$$ {\text{u}}\left( {\text{x}} \right) = - {\left( { - {\text{x}}/{1}000} \right)^{\beta }},{\text{ x}} \leqslant 0,\beta > 0, $$
(14)

and Goldstein and Einhorn’s (1987) probability weighting function,

$$ {\text{w}}\left( {\text{p}} \right) = \delta {{\text{p}}^{\gamma }}/(\delta {{\text{p}}^{\gamma }} + {\left( {{1} - {\text{p}}} \right)^{\gamma }}),\delta \geqslant 0,\gamma \geqslant 0, $$
(15)

to fit data for risk. These families are commonly used. Utility u is concave (u′ decreasing as the negative x increases towards 0), enhancing risk aversion, whenever β ≥ 1, and it is convex whenever β ≤ 1.

The larger δ is, the more elevated the probability weighting curve is, generating more pessimism. The larger γ is, the more sensitive the curve is. If we calculate the pessimism and insensitivity indexes for the probability weighting functions for risk considered here (Footnote 5), then δ will be closely related to the pessimism index and γ will be closely related to the insensitivity index.
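The two parametric families of Eqs. 14 and 15 can be written down directly. The Python sketch below is illustrative only: the function names are hypothetical, the parameter values are arbitrary and are not the estimates of Table 3, and the code requires δ > 0 and γ > 0 (slightly stricter than the stated ranges) to avoid a zero denominator.

```python
def power_utility(x: float, beta: float) -> float:
    """Eq. 14: u(x) = -(-x/1000)**beta for losses x <= 0, beta > 0."""
    if x > 0 or beta <= 0:
        raise ValueError("expected x <= 0 and beta > 0")
    return -((abs(x) / 1000.0) ** beta)


def ge_weight(p: float, delta: float, gamma: float) -> float:
    """Eq. 15: Goldstein-Einhorn probability weighting function.

    Assumption: delta > 0 and gamma > 0, so the denominator is nonzero.
    """
    if not (0.0 <= p <= 1.0) or delta <= 0 or gamma <= 0:
        raise ValueError("expected 0 <= p <= 1, delta > 0, gamma > 0")
    numerator = delta * p ** gamma
    return numerator / (numerator + (1.0 - p) ** gamma)


# Arbitrary illustrative parameters (not estimates from the paper).
identity_mid = ge_weight(0.5, delta=1.0, gamma=1.0)  # w(p) = p when both are 1
lowered_mid = ge_weight(0.5, delta=0.8, gamma=0.7)   # equals delta / (1 + delta)
small_p = ge_weight(0.1, delta=1.0, gamma=0.7)       # gamma < 1 overweights small p
```

At p = 0.5 the weight reduces to δ/(1 + δ) whatever γ is, which is one way to see that δ governs elevation while γ governs the slope around the middle of the curve.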

In both experiments, we used a standard nonlinear least squares regression (Levenberg-Marquardt algorithm) to simultaneously obtain the estimates of the utility and probability weighting parameters. Throughout this paper, t-tests are two-sided unless stated otherwise.

Subjects

N = 61 post-graduate students (60 male, median age = 22) in civil engineering at Arts et Métiers ParisTech, Paris, France (Footnote 6), were invited by email. None of them had participated in an experiment on decision making before.

Stimuli

We measured the certainty equivalents of the 20 prospects in Table 1. Throughout this paper, the probability spread r is 0.1. For each source, we considered five different (midpoint) probability levels p, namely 0.1, 0.3, 0.5, 0.7, and 0.9. For example, prospect number A11 (\( -1000_{[0,0.2]}0 \)) was the second one presented to the subjects, as indicated by rank 2 in the table.

Table 1 The 20 prospects whose certainty equivalents were elicited in Experiment A

Incentives

The subjects were given a fixed €10 participation fee. Our use of hypothetical choice rather than real incentives is discussed in §4.

Procedure

The prospects were presented to the subjects on a computer screen in a fixed random order (see the column Rank in Table 1). The sure loss was always displayed on the right-hand side of the screen and the other prospect was shown on the left-hand side. The subjects had to choose between these two options. The following text was given to the subjects to describe the risky prospects, where the agents were called experts: “The two experts have exactly the same best estimate of the risk. Each expert confidently estimates that there is a (100p)% risk of losing €x (otherwise, the loss is €y)” (screenshot A in Fig. 6 in the appendix). We substituted the appropriate numerical values for p. Screenshots of typical choice tasks are available in the appendix.

The explanation for C-ambiguity was as follows: “The two experts do not agree on the risk. They have different best estimates: Expert A confidently estimates that there is a 100(p − 0.1)% risk of losing €1000 (otherwise, the loss is €0). Expert B confidently estimates that there is a 100(p + 0.1)% risk of losing €1000 (otherwise, the loss is €0)” (screenshot B in Fig. 6). We substituted the appropriate numerical values for p + 0.1 and p − 0.1. In addition, we displayed two different pie graphs, one for each expert’s prediction, to visually make clear that the two experts did not have the same estimate of the probability of the loss.

The explanation for I-ambiguity was as follows: “The two experts have exactly the same best estimate of the risk. Each expert confidently estimates that the risk of losing €x ranges from 100(p − 0.1)% to 100(p + 0.1)% (otherwise, the loss is €0).” A dynamic pie was shown on the screen to convey the imprecision of the forecast, with the size of the sectors of the pie chart slowly changing between the two bounds of the interval.

Measuring certainty equivalents

For each of the 20 prospects, the subjects were asked to make approximately 5 binary choices between the prospect and a sure loss in a bisection procedure. In this procedure, the midpoint between the best sure loss less preferred and the worst sure loss more preferred than the prospect is taken as the CE of the prospect. Details are in the appendix.
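A bisection elicitation of this kind can be sketched as follows. The sketch is a hedged illustration, not the procedure documented in the appendix: the function name `bisection_ce`, the rule that each new sure loss is the midpoint of the current bounds, and the simulated subject are all assumptions. Only the use of the expected value as the first sure loss and the final midpoint rule are taken from the text.

```python
from typing import Callable


def bisection_ce(x: float, y: float, first_offer: float,
                 prefers_sure: Callable[[float], bool],
                 steps: int = 5) -> float:
    """Approximate the CE of a loss prospect with outcomes x <= y <= 0.

    `prefers_sure(s)` returns True if the subject chooses the sure loss s
    over the prospect. Assumed update rule: after each choice the next
    sure loss is the midpoint of the remaining interval.
    """
    lower, upper = x, y  # the CE must lie between the two outcomes
    offer = first_offer
    for _ in range(steps):
        if prefers_sure(offer):
            # Sure loss chosen: it is better than the prospect,
            # so the CE is a worse (more negative) amount.
            upper = offer
        else:
            # Prospect chosen: the CE is better than this sure loss.
            lower = offer
        offer = (lower + upper) / 2.0
    # Midpoint between the best sure loss that was less preferred and
    # the worst sure loss that was more preferred than the prospect.
    return (lower + upper) / 2.0


# Simulated subject (hypothetical): indifferent at a sure loss of 340.
TRUE_CE = -340.0
ce = bisection_ce(x=-1000.0, y=0.0,
                  first_offer=-300.0,  # expected value for p = 0.3
                  prefers_sure=lambda s: s > TRUE_CE)
```

With five choices the remaining interval in this simulated run is [−343.75, −300], so the elicited CE of −321.875 lies within about €22 of the simulated subject's true indifference point.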

Checking consistency

At the end of the experiment, the subjects were asked to give their preferences between 6 prospects (A1, A3, A16, A18, A11, and A13) and their expected value a second time (the first time was as the first preference question in bisection).

2.1 Results

2.1.1 Consistency checks

Table 2 gives the consistency rates for the six questions presented twice. The consistency rates vary between 69% and 89%, with an average of 77.32%. In other words, approximately three-quarters of the subjects gave the same answer the second time. This rate agrees with common findings in the field (Abdellaoui 2000; Camerer 1989).

Table 2 Consistency check (Experiment A)

2.1.2 Risk attitudes from parametric fitting

Table 3 displays the results from data fitting for risk.

Table 3 Parameters of the utility and weighting functions that best fit the certainty equivalents for risky choices (Experiment A)

The estimated β exceeds 1 (p < 0.01, \( t_{60} = 3.99 \)), indicating concave utility. The probability weighting function exhibits low elevation (optimism) and the usual inverse-S shape (δ < 1, p < 0.01, \( t_{60} = -6.00 \); γ < 1, p = 0.03, \( t_{60} = -2.24 \)). Details are in the appendix.

2.1.3 Ambiguity attitudes from parametric fitting using matching probabilities

Having estimated the utility function U, weights W(E) can be obtained from indifferences CE ~ \( x_E y \), for E = {ℓ,h} or E = [ℓ,h], using the formula

$$ {\text{W}}\left( {\text{E}} \right) = \frac{{{\text{U}}\left( {\text{CE}} \right) - {\text{U}}\left( {\text{y}} \right)}}{{{\text{U}}\left( {\text{x}} \right) - {\text{U}}\left( {\text{y}} \right)}} .$$
(16)

With all weighting functions available, we obtain matching probabilities through Eq. 10. Figure 3 reports their mean values and 95% confidence intervals. This figure provides a graphical illustration of I- and C-ambiguity attitudes. The figures are similar to Fig. 2a and f, but are closer to linearity.
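The chain from a measured CE to a matching probability can be illustrated with a short Python sketch. It solves the binary rank-dependent evaluation U(CE) = W·U(x) + (1 − W)·U(y) for W and then applies the inverse of the risk weighting function, as in Eq. 10. The function names are hypothetical, and the parameter values are arbitrary illustrations, not the estimates reported in Table 3.

```python
def power_utility(x: float, beta: float) -> float:
    """Power utility for losses (Eq. 14), with U(0) = 0."""
    return -((abs(x) / 1000.0) ** beta)


def ge_weight(p: float, delta: float, gamma: float) -> float:
    """Goldstein-Einhorn weighting function (Eq. 15)."""
    numerator = delta * p ** gamma
    return numerator / (numerator + (1.0 - p) ** gamma)


def ge_weight_inverse(weight: float, delta: float, gamma: float) -> float:
    """Invert Eq. 15 via the odds: w/(1-w) = delta * (p/(1-p))**gamma."""
    if weight <= 0.0:
        return 0.0
    if weight >= 1.0:
        return 1.0
    odds = (weight / (delta * (1.0 - weight))) ** (1.0 / gamma)
    return odds / (1.0 + odds)


def event_weight_from_ce(ce: float, x: float, y: float, beta: float) -> float:
    """Solve U(CE) = W*U(x) + (1 - W)*U(y) for the weight W of outcome x."""
    if not (x < y <= 0 and x <= ce <= y):
        raise ValueError("expected x < y <= 0 and x <= ce <= y")
    u_x, u_y, u_ce = (power_utility(v, beta) for v in (x, y, ce))
    return (u_ce - u_y) / (u_x - u_y)


def matching_probability(ce: float, x: float, y: float,
                         beta: float, delta: float, gamma: float) -> float:
    """m = w^{-1}(W): the risk probability with the same decision weight."""
    return ge_weight_inverse(event_weight_from_ce(ce, x, y, beta), delta, gamma)


# Round-trip check with hypothetical parameters: the CE of a *risky*
# prospect with probability 0.25 should map back to 0.25.
BETA, DELTA, GAMMA = 1.1, 0.8, 0.7
risky_ce = -1000.0 * ge_weight(0.25, DELTA, GAMMA) ** (1.0 / BETA)
recovered = matching_probability(risky_ce, -1000.0, 0.0, BETA, DELTA, GAMMA)
```

The round trip is a useful sanity check on any implementation: feeding in the CE of a risky prospect must return that prospect's own probability, so that any deviation of a matching probability from the midpoint is attributable to ambiguity and not to the risk component.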

Fig. 3 Estimated marginal means of I- (imprecision) and C- (conflict) matching probabilities (Experiment A)

The ambiguity aversion index for I-ambiguity is positive (mean = 0.06, p < 0.01, \( t_{60} = 2.94 \)). We find no ambiguity aversion or seeking for C-ambiguity (mean = 0.00; p = 0.97, \( t_{60} = 0.04 \)). The index of I-ambiguity aversion exceeds that of C-ambiguity aversion (p < 0.01, \( t_{60} = 2.75 \)). The a-insensitivity index for I-ambiguity is positive (mean = 0.13, p < 0.01, \( t_{60} = 3.84 \)) and for C-ambiguity it is negative (mean = −0.05, p = 0.04, \( t_{60} = -2.08 \)). It is, obviously, higher for I-ambiguity than for C-ambiguity (p < 0.01, \( t_{60} = 6.83 \)).

We find the following differences for matching probabilities: \( m_i(0.1) > 0.1 \) (mean = 0.19, p < 0.01, \( t_{60} = 4.43 \)), \( m_i(0.9) < 0.9 \) (mean = 0.86, p < 0.01, \( t_{60} = -2.87 \)), and \( m_c(0.1) < 0.1 \) (mean = 0.06, p < 0.01, \( t_{60} = -4.20 \)). An ANOVA corrected for repeated measures (with the Greenhouse-Geisser correction) with two factors (the five probability levels and the two types of ambiguity) and their interaction confirms the above results. The probability level is significant (p < 0.01, F3.11 = 492.94). The matching probabilities differ across the two types of ambiguity (p < 0.01, F0.1 = 7.57). These differences are influenced by the probability level (p < 0.01, F205.17 = 9.24). Using paired t-tests adjusted with the Bonferroni correction, we find that \( m_i(0.1) > m_c(0.1) \) (p < 0.01, \( t_{60} = -6.71 \)) and \( m_i(0.9) < m_c(0.9) \) (p < 0.01, \( t_{60} = 3.50 \)). Other differences are not significant. To illustrate the size of the differences found, consider a loss of €1000 with a midpoint probability p = 0.10. The average CE is €102 for conflict-ambiguity, €146 for risk, and €221 for imprecision-ambiguity. Such small-probability losses are relevant for insurance, where the different sources generate big differences in insurance premiums.

2.2 Summary and discussion of results of Experiment A

Risk attitudes

Our results for probability weighting under risk agree with the prevailing findings in the literature. We find an inverse-S shape with overweighting of small probabilities and underweighting of moderate and large probabilities. The latter enhances optimism and risk seeking for losses. We find weakly concave utility. Many papers have found that utility for losses is close to linear and preferences are close to risk neutrality (reviewed by Wakker 2010 p. 264). Abdellaoui et al. (2008) also found weakly concave utility in combination with weakly prevailing risk seeking for losses.

Viscusi et al. (2011) showed that observations of other people’s choices do influence own decisions even when the own risks are fully known, so that the choices of the other people are not informative. This effect can play a role in our experiment if the experts’ information is taken as reflecting the experts’ decisions. However, this effect is intrinsic in belief aggregation, and we do not consider it to be a distortion.

Evidence against the Bayesian model

As just explained, the probability weighting functions for risk deviate significantly from the Bayesian identity function w(p) = p, falsifying expected utility. The deviations of the curves in Fig. 3 by themselves could be accommodated by the Bayesian model, by assuming that (Bayesian) subjective probabilities deviate from the midpoints of the intervals. However, these deviations are too pronounced at the extremes, especially for I-ambiguity, to be plausible. Experiment B will provide further evidence.

I-ambiguity aversion and C-ambiguity neutrality

The predictions on ambiguity aversion for losses are divided in the literature. The common assumption, especially in the theoretical literature, is universal ambiguity aversion, for both gains and losses, and some empirical studies have confirmed this. Yet, most empirical studies have found prevailing ambiguity seeking, rather than aversion, for losses (reviewed by Wakker 2010 p. 354). Thus, the case is not very clear for losses. One reason that patterns for losses are less clear is that losses are more difficult for subjects to process than gains, and losses generate more noise (de Lara Resende and Wu 2010 p. 129). The (global) ambiguity aversion indexes for matching probabilities show that there is more pessimism for I-ambiguity than for risk; i.e., we find more I-ambiguity aversion than I-ambiguity seeking. For C-ambiguity on the other hand, the overall index does not deviate from neutrality and there is as much C-ambiguity aversion as C-ambiguity seeking.

A-insensitivity for I- and C-ambiguity

Because insensitivity is a cognitive component, it can be expected to be less affected by the domain (gain or loss) of outcomes. Hence we can expect to find a-insensitivity, commonly observed in the gain domain, in the loss domain as well. We indeed find a-insensitivity for I-ambiguity. In other words, people are more insensitive towards imprecise probabilities (I-ambiguity) than towards known probabilities (risk). Thus, I-ambiguity amplifies insensitivity. This agrees with the common finding that uncertainty amplifies phenomena found under risk. Several studies have observed a-insensitivity in the gain domain (Abdellaoui et al. 2005; Fellner 1961 p. 684; Gayer 2010; Kahn and Sarin 1988 p. 270; Kahneman and Tversky 1975 p. 15 2nd para; Kahneman and Tversky 1979 p. 281 lines −6/−5 and p. 289 l. 5–6; Kilka and Weber 2001; Viscusi 1989; Weber 1994 pp. 237–238). For losses, we are aware of only one study examining a-insensitivity (Abdellaoui et al. 2005); they confirmed it. It leads, for instance, to lower insurance premiums rather than the higher ones predicted by the universal ambiguity aversion often assumed in theoretical studies.

The findings for C-ambiguity regarding insensitivity do not agree with the common finding of a-insensitivity when analyzed in the usual way. We find less, rather than more, insensitivity than under risk. This finding is hard to reconcile with modern views on ambiguity, and suggests that background assumptions are violated. We will explore and discuss this suggestion in more detail in Experiment B.

Explanation for over-sensitivity and less ambiguity aversion in C-ambiguity

In general, there must be a control for degrees of belief to measure ambiguity attitudes. For example, if we find a preference for gambling on an ambiguous rather than on an unambiguous event and want to explain this finding as ambiguity aversion, then the two events must have the same degree of belief/likelihood in some sense. In situations of ambiguity as considered here, comparisons are usually made between events with the same midpoint probabilities, where the latter should provide the required control. We have followed this averaging tradition in our analysis for both I- and C-ambiguity.

For extreme events in C-ambiguity, the above belief/likelihood control may be problematic though. If the first agent assigns probability 1 to some event and the second agent assigns probability 0.8, then the first agent is apparently sure whereas the second is uncertain. It then makes sense to assign more confidence weight (Nau 2002) to the first, sure, agent’s judgment than to the second, insecure, judgment. The perceived likelihood will then exceed the midpoint 0.9. Such a processing of information is perfectly sensible, and leads to the high weight assigned to such events. It can be captured by rational Bayesian decision models, irrespective of any ambiguity. Our finding then implies that agents who express certainty receive extra weight in conflicting-belief aggregation. This is consistent with findings by Budescu and Yu (2006, 2007), Yates et al. (1996), and Keren and Teigen (2001). It generates an effect counter to a-insensitivity, and, if it is not recognized, it may seem that neutrality or even oversensitivity was found, as happened in our experiment.
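
The confidence-weighting argument can be illustrated with a minimal sketch. It assumes that aggregation is a weighted average of the agents' probabilities; the weight of 2 given to the sure agent is purely hypothetical and is not estimated from our data.

```python
def confidence_weighted_belief(judgments, weights):
    """Weighted average of agents' probabilities; weights need not sum to one."""
    if len(judgments) != len(weights):
        raise ValueError("need exactly one weight per judgment")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return sum(p * w for p, w in zip(judgments, weights)) / total


# The sure agent (probability 1) and the uncertain agent (probability 0.8).
agents = [1.0, 0.8]

# Equal weights reproduce the midpoint.
equal = confidence_weighted_belief(agents, [1, 1])

# HYPOTHETICAL weights: the sure agent counts twice as much.
tilted = confidence_weighted_belief(agents, [2, 1])

print(f"equal weights: {equal:.3f}, extra weight on the sure agent: {tilted:.3f}")
```

Any weighting that favors the sure agent moves the aggregate above the midpoint 0.9, which is the direction of the effect described above.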

One clear implication of our research is therefore to caution against the common practice in experiments on probability intervals to control for likelihood perceptions of such intervals through their midpoints. Our findings indeed show that probability intervals and, more generally, sets of multiple priors cannot all be analyzed in the same way. The source of uncertainty matters. In particular, we cannot just equate {ℓ,h} with its convex hull [ℓ,h]. It matters whether agents have the same or different information (Crès et al. 2011; Dietrich 2010).

Gajdos et al.’s (2008) contraction expected utility model, theoretically extended to belief aggregation by Gajdos and Vergnaud (2011), can be an alternative to our source method for analyzing the phenomena found here. Their contractions of a priori given probability intervals can be used to differentiate between different sources of uncertainty. To gain more insight into motivational versus cognitive factors underlying our explanation, we devised another experiment presented in the next section.

3 Experiment B

3.1 Introduction

Because Experiment A raises questions about subjective probabilities and the (cognitive) likelihood perception of extreme events, Experiment B focuses on such events. We now additionally ask for direct judgments of probability. Although such data is not based on revealed preferences, it can shed further light on the cognitive conjectures resulting from Experiment A. These conjectures entail that people do not have a different ambiguity attitude in the C-ambiguity of x{0.8,1}y than in the I-ambiguity of x[0.8,1]y. Instead, they simply have a higher degree of belief that x will happen. It will enhance S-shaped judged probabilities (as functions of midpoint probabilities) for C-ambiguity. If these conjectures are correct, then direct probability judgments can support them.

3.2 Experimental method

In most respects, this experiment is like Experiment A. We focus on the differences in what follows.

Protocol

N = 63 bachelor and master students (36 male, median age = 20.5, 40 Dutch) at Erasmus University, Rotterdam, the Netherlands participated, taken from an email list of students willing to participate in experiments on decision making. They were guaranteed a €15 flat participation fee. The experiment was conducted in six sessions of 10 or 11 subjects (footnote 7).

We first asked the subjects to answer binary choice questions, as in Experiment A, using the same software. We elicited CEs of the 18 prospects displayed in Table 4. The order of the prospects was randomized for each subject.

Table 4 The 18 prospects whose certainty equivalents were elicited in Experiment B, with notation as in Table 1

Unlike in Experiment A, we also asked the subjects to give their judged beliefs for the prospects B13-B18. We randomly selected one prospect to check the stability of the answers. Two subjects gave erratic judged beliefs, showing lack of understanding, and were removed from the sample.

Stimuli

To elicit judged beliefs, we presented the subjects with a figure displaying a prospect (I-ambiguity or C-ambiguity) on the left-hand side of the screen. On the right-hand side, there was an input box where subjects could type their best estimate (on a 0–1 scale) of the probability of losing €1000. As soon as the best estimate was entered, a pie appeared to visually represent the probability. Figure 7 in the appendix displays an example of a screenshot.

Checking consistency

Unlike in Experiment A, in Experiment B we did not measure consistency by asking the subjects to repeat choices between some prospects and their expected value. Instead, we randomly selected two prospects and asked the subjects to go through the whole CE elicitation process again. As explained before, we also repeated the elicitation of one judged belief per subject.

3.3 Results

3.3.1 Consistency checks

We find no difference between the two CEs elicited for consistency checks (p = 0.18, t 121 = −1.36). The repeated judged beliefs did not differ either (p = 0.49, t 60 = 0.69).

3.3.2 Risk attitudes from parametric fitting

We estimated the parameters of the utility function and the probability weighting function (Table 5) using the twelve risky prospects B1–B12 in Table 4. The estimate of β exceeds 1 (p < 0.01, t 60 = 4.36), indicating concave utility. This result is consistent with Experiment A.

Table 5 Parameters of the utility and weighting functions that best fit the certainty equivalents for risky choices (Experiment B)

As in Experiment A, the mean value of δ is smaller than 1 (p = 0.02, t 60 = −2.36), enhancing optimism. Unlike in Experiment A, the mean value of γ is not different from 1 here (t-test: p = 0.24, t 60 = 1.18). Experiment B’s probability weighting function therefore is closer to linear, which is not uncommon for losses.

3.3.3 Ambiguity attitudes from parametric fitting using matching probabilities

We derived matching probabilities as in Experiment A. For I-ambiguity, the a-insensitivity index is positive (mean = 0.12, p < 0.01, t 60 = 3.64) and the ambiguity aversion index is positive but only marginally significant (mean = 0.03, p = 0.08, t 60 = 1.77). I-ambiguity therefore exhibits the same pattern as in Experiment A (i.e., Fig. 4 is similar to Fig. 2a/d). For C-ambiguity, neither index differs from 0 (i.e., Fig. 4 is similar to Fig. 2e). The I-index of a-insensitivity exceeds the C-index (p < 0.01, t 60 = 3.31), but the I-index of ambiguity aversion is not larger (p = 0.28, t 60 = 1.09).

Fig. 4 Estimated marginal means of I- and C-revealed beliefs (Experiment B)

For I-ambiguity, we find overestimation for midpoint probability 0.1 (mean = 0.16, p < 0.01, t 60 = 3.30), and we find underestimation for midpoint probability 0.9 (mean = 0.86, p = 0.01, t 60 = −2.57). The matching probabilities for C-ambiguity do not deviate from the midpoint probabilities 0.1 and 0.9. They are only somewhat higher for midpoint probability 0.5 (mean = 0.54, p < 0.01, t 60 = 2.77). As in Experiment A, C-matching probabilities do not exhibit an inverse-S shape.
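
To illustrate how these means relate to the a-insensitivity index, the sketch below assumes that the index equals one minus the slope of the least-squares line through the matching probabilities as a function of midpoint probabilities, following the approach of Abdellaoui et al. (2011). The helper names are hypothetical. Only the two extreme levels are used because the I-ambiguity mean at 0.5 is not reported in this section; for levels placed symmetrically around 0.5 the fitted slope does not depend on that middle value.

```python
def ols_line(xs, ys):
    """Ordinary least-squares intercept and slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def insensitivity_index(midpoints, matching_probabilities):
    """ASSUMED definition: one minus the slope of the fitted line."""
    _, slope = ols_line(midpoints, matching_probabilities)
    return 1 - slope


# Mean I-matching probabilities reported above for midpoints 0.1 and 0.9.
index_i = insensitivity_index([0.1, 0.9], [0.16, 0.86])
print(f"a-insensitivity index from the reported means: {index_i:.3f}")
```

Under this assumed definition the reported means imply an index of about 0.125, in line with the mean index of 0.12 reported for I-ambiguity.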

An ANOVA corrected for repeated measures and with two factors (the three probability levels and the two types of ambiguity) and their interaction reveals that, in addition to the probability level (p < 0.01, F1.89 = 757.32), the interaction term is significant (p < 0.01, F1.80 = 7.96). Using paired t-tests adjusted with the Bonferroni correction, we find mi(0.1) > mc(0.1) (p < 0.01, t 60 = −4.38) as in Experiment A.
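
For readers who want to reproduce this kind of comparison on their own data, the following is a minimal sketch of a paired t statistic with a Bonferroni-adjusted significance level. The data are invented for illustration, and the assumption that the correction is taken over three pairwise comparisons (one per probability level) is ours.

```python
import math


def paired_t_statistic(first, second):
    """Return (t, degrees of freedom) for paired samples first - second."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long samples with at least two pairs")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean = sum(diffs) / n
    variance = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(variance / n), n - 1


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level under the Bonferroni correction."""
    return alpha / n_comparisons


# HYPOTHETICAL matching probabilities of four subjects at midpoint 0.1.
m_i = [0.20, 0.30, 0.10, 0.25]   # imprecision-ambiguity
m_c = [0.10, 0.10, 0.05, 0.20]   # conflict-ambiguity

t_value, df = paired_t_statistic(m_i, m_c)
alpha_per_test = bonferroni_alpha(0.05, 3)  # ASSUMED: three pairwise comparisons
print(f"t({df}) = {t_value:.2f}, per-test alpha = {alpha_per_test:.4f}")
```

The sign of t depends only on the order of subtraction, so a negative t statistic is compatible with the first-listed quantity being larger when differences are taken the other way round.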

3.3.4 Judged beliefs

Figure 5 displays the mean values of the I- and C-judged beliefs as a function of midpoint probabilities. It shows that the judged beliefs never differ from the midpoint probabilities for I-ambiguity. However, they exceed the midpoint probability at probability 0.9 for C-ambiguity (mean = 0.91, p < 0.01, t 60 = 2.72). By t-tests, only the a-insensitivity index for C-ambiguity is marginally significant, and it is negative (mean = −0.01, p = 0.09, t 60 = 1.75), suggesting over-sensitivity. Apart from this, perceived probabilities seem to agree well with midpoint probabilities.

Fig. 5 Estimated marginal means of I- and C-judged beliefs (Experiment B)

We next compare matching probabilities with judged beliefs. Matching probabilities exhibit more insensitivity. The difference is significant for I-ambiguity (mean difference = 0.11, p < 0.01, t 60 = 3.44) but not for C-ambiguity (mean difference = 0.04, p = 0.12, t 60 = 1.58). An ANOVA with two factors (the elicitation technique, i.e. matching probabilities vs judged beliefs, and the type of ambiguity) and their interaction confirms the results of the t-tests: the main effect of elicitation technique is significant (p < 0.01, F0.1 = 8.39), as are the source (p < 0.01, F1 = 14.46) and the interaction term (p = 0.02, F1 = 5.90). The same analysis on the ambiguity aversion indexes does not give significant results.

3.4 Summary of the results of Experiment B

The results of Experiment B are consistent with those of Experiment A for risk and ambiguity (matching probabilities) attitudes. Again, we find significant violations of expected utility for risk and we find the usual a-insensitivity for I-ambiguity but not for C-ambiguity. For I-ambiguity, the judged beliefs agree with midpoint probabilities, supporting the use of midpoint probabilities as levels of belief. This supports our claim in the discussion of Experiment A, under “evidence against the Bayesian model,” that the weights in Figs. 3 and 4 do not reflect (just) subjective probabilities, but that something else is going on: nonneutral attitudes towards ambiguity, deviating from expected utility.

The judged beliefs for C-ambiguity deviate from the midpoint probabilities. The judged belief at 0.9 exceeded 0.9, and the index of a-insensitivity was (marginally) below the neutral value 0. This extra sensitivity went against the usual a-insensitivity, and these two effects together gave the end result of no a-insensitivity for C-ambiguity. The extra oversensitivity relative to I-ambiguity was not caused by a change in the non-Bayesian component of ambiguity attitude or perception. Instead, it took place at the level of beliefs, i.e. of perceived likelihood. It would have occurred similarly had the decision makers been Bayesian.

4 Discussion

Real incentives and hypothetical choice

It is well known that performance-contingent real incentives are desirable in experiments. They enhance truthful answering and reduce noise (Camerer and Hogarth 1999; Hertwig and Ortmann 2001). However, designing real decision situations in which agent judgments are available and disagree as required for our research, without deceiving these agents or the subjects, is very difficult. We preferred to keep this first empirical study on belief aggregation under ambiguity simple and clear, which is why we chose a flat payment and hypothetical choice (footnote 8). Given the ubiquity of belief aggregation in social behavior, and the necessity to consider ambiguity there, we felt that empirical studies are warranted even if no real incentives are possible.

There are extra reasons for using hypothetical choice when studying losses, which are empirically important. First, implementing real losses by making the subjects lose their own money is ethically questionable and impractical. Second, the common implementation of losses, with prior endowments from which subjects pay back, has a serious drawback of its own: many subjects will integrate the payments and will not perceive any losses. Even if, as some studies have found, these subjects are a minority, say one-third, this minority may still generate large distortions in the experiment, large enough to be responsible for any significant effects found. One-third of the subjects misperceiving the stimuli entails too big a distortion, and is too high a price to pay for implementing real incentives. Another drawback of prior endowments is that they may generate house money effects (Thaler and Johnson 1990).

Directly measuring matching probabilities

We measured matching probabilities m(E) indirectly from decision weights, through \( {\text{m}}\left( {\text{E}} \right) = {{\text{w}}^{{ - {1}}}}\left( {{\text{W}}\left( {\text{E}} \right)} \right) \). Matching probabilities can be inferred directly from equivalences (xq0) ~ (xE0). Substitution of Eqs. 13 then gives m(E) = q. We tried such direct measurements in pilots. Unfortunately, many subjects would routinely take q equal to the midpoint probability, not as an expression of true preference but as an easy heuristic. For this reason, and because the measurement of w and W is useful anyhow, we decided not to measure matching probabilities directly.
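
The indirect route from decision weights to matching probabilities can be sketched as follows. The sketch assumes, for illustration only, a two-parameter weighting function of the Goldstein-Einhorn form w(p) = δp^γ / (δp^γ + (1 − p)^γ), which has a closed-form inverse; the parameter values are hypothetical and the function names are not taken from our estimation code.

```python
def weight(p, gamma, delta):
    """ASSUMED risky weighting function: delta*p^gamma / (delta*p^gamma + (1-p)^gamma)."""
    numerator = delta * p ** gamma
    return numerator / (numerator + (1 - p) ** gamma)


def inverse_weight(w, gamma, delta):
    """Closed-form inverse of weight() for 0 < w < 1."""
    if not 0 < w < 1:
        raise ValueError("decision weight must lie strictly between 0 and 1")
    # From w / (1 - w) = delta * (p / (1 - p)) ** gamma, solve for the odds p / (1 - p).
    odds = (w / ((1 - w) * delta)) ** (1 / gamma)
    return odds / (1 + odds)


def matching_probability(decision_weight, gamma, delta):
    """m(E) = w^{-1}(W(E)): the known probability whose weight equals W(E)."""
    return inverse_weight(decision_weight, gamma, delta)


# HYPOTHETICAL parameters and decision weight, for illustration only.
GAMMA, DELTA = 0.8, 0.9
m = matching_probability(0.25, GAMMA, DELTA)
print(f"W(E) = 0.25 corresponds to a matching probability of {m:.3f}")
```

With any strictly increasing w the same inversion could be done numerically; the closed form is used here only because the assumed family permits it.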

Alternative statistics for measuring ambiguity: avoiding commitment to parametric families

We could have chosen several equivalent ways to report the results from the three sources and the comparisons between them. We first reported absolute results for risk. For uncertainty, we reported differences with risk, i.e. how uncertainty deviates from risk. Those differences are referred to as ambiguity, an important and popular topic in the literature today. Absolute results about decisions under uncertainty can now be derived indirectly, as when taking compositions of the curves in Figs. 1 and 2. For example, we find some overweighting for risk, and some additional overweighting due to ambiguity for midpoint probability 0.1. Together this means that there is considerable overweighting under uncertainty.

To analyze ambiguity, we could also have investigated differences of certainty equivalents of the risky and the uncertain prospects. This analysis is consistent with the one reported here and is in the appendix. Thus, our conclusions do not depend on the particular parametric families that we chose to fit the risk attitudes.

Formal difference with the source method of Abdellaoui et al. (2011)

In Abdellaoui et al.’s (2011) source method, sources refer to different algebras, i.e. different collections of subsets (events) of the state space. We, instead, compared the same (algebra of) events under different informational circumstances. Although our sources are formally different, we can nevertheless readily use the same techniques of comparing ambiguity attitudes by comparing differences in subjective-probability weighting. Abdellaoui et al.’s (2011) analysis of source functions rather than matching probabilities was discussed in §1.

Specifying outcomes when measuring beliefs

We decided to specify outcomes in the elicitation of beliefs, to stay close to the other stimuli in the experiment. As a result, some subjects may have taken these questions as preference questions, which would directly measure matching probabilities. We explicitly asked for judged likelihoods and not for preferences, hoping to trigger enough direct belief perceptions to obtain significant effects, which we did indeed obtain. Any distortion from subjects misperceiving the belief questions as decision questions works against our findings; hence, our significance inferences are not invalidated but are conservative.

The two-stage model

Several authors have studied a two-stage decomposition

$$ {\text{W}}\left( {\text{E}} \right) = \widetilde{\text{w}}\left( {{\text{J}}\left( {\text{E}} \right)} \right) \tag{17} $$

where J denotes directly-judged belief, and \( \widetilde{\text{w}} \) is the function carrying those judgments to decision weights (de Lara Resende and Wu 2010; Fox and Tversky 1998; Kilka and Weber 2001; Tversky and Fox 1995; Viscusi and Evans 2006; Wu and Gonzalez 1999). Earlier allusions to the two-stage decomposition with directly-judged beliefs J are in Fellner (1961 p. 672) and Kahneman and Tversky (1975 pp. 14, 15). The two-stage decomposition differs from our decomposition \( {\text{W}}\left( {\text{E}} \right) = {\text{w}}\left( {{\text{m}}\left( {\text{E}} \right)} \right) \) where m denotes matching probability and w is the risky weighting function. We can study the two-stage model in Experiment B, where we measured judged probabilities. For C-ambiguity, \( \widetilde{\text{w}} \) of Eq. 17 then exhibits the usual inverse-S shape. This illustrates once more that our findings for C-ambiguity are due to beliefs deviating from midpoint probabilities rather than to unusual ambiguity attitudes.
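
The relation between the two decompositions can be written out explicitly. The following is a sketch that assumes both decompositions hold for the same event E and that the risky weighting function w is strictly increasing, hence invertible:

$$ {\text{w}}\left( {{\text{m}}\left( {\text{E}} \right)} \right) = {\text{W}}\left( {\text{E}} \right) = \widetilde{\text{w}}\left( {{\text{J}}\left( {\text{E}} \right)} \right) \quad \Longrightarrow \quad {\text{m}}\left( {\text{E}} \right) = {\text{w}}^{-1}\left( {\widetilde{\text{w}}\left( {{\text{J}}\left( {\text{E}} \right)} \right)} \right) $$

Under these assumptions, the matching probability of E coincides with its judged belief exactly when \( \widetilde{\text{w}} \) and w agree at J(E), so any gap between m(E) and J(E) is carried by the difference between the two transformations.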

5 Conclusion

This paper introduced modern ambiguity models into the empirical study of belief aggregation. These ambiguity models are descriptively more accurate than the common classical Bayesian models, and explain the violations of the latter that we found in belief aggregation. They allow the distinction of cognitive factors (ability to discriminate different levels of likelihood) and motivational factors (aversion to ambiguity), and the analysis of their separate effects. In between-agent uncertainty (conflict; C-ambiguity), extreme beliefs generate extra preference value (source preference) and, as regards cognitive effects, are overweighted in belief aggregation. These phenomena do not occur in within-agent uncertainty (imprecision; I-ambiguity). The latter is closer to traditional ambiguity. For C-ambiguity, the cognitive effects entail an empirical violation of the commonly assumed averaging of beliefs.

An implication of our findings for belief aggregation is that agents may want to overstate their certainty. Hence, it is all the more warranted for principals to implement incentive-compatible bonuses. An implication for modern theories of ambiguity is that identical (convex hulls of) possible priors can be treated differently by the same individual depending on the source of uncertainty. We conclude that modern ambiguity theory has allowed a more refined, and empirically more valid, analysis of belief aggregation in our paper than would have been possible using traditional theories.