We administer the Allais paradox questions to both a representative sample of the Dutch population and to student subjects. Three treatments are implemented: one with the original high hypothetical payoffs, one with low hypothetical payoffs and a third with low real payoffs. Our key findings are: (i) violations in the non-lab sample are systematic and a large bulk of violations is likely to stem from non-familiarity with large payoffs, (ii) we can identify groups of the general population that have much higher than average violation rates; this concerns mainly the lowly educated and unemployed, and (iii) the relative treatment differences in the population at large are accurately predicted by the lab sample, but violation rates in all lab treatments are about 15 percentage points lower than in the corresponding non-lab treatments.
This paper presents evidence on the consistency of risk preferences with expected utility theory in a representative population sample. We find that consistency increases with task familiarity and is linked to several personal characteristics such as education, income and asset holdings. Moreover, we investigate the external validity of a laboratory experiment with a student population that implemented the same choice problems as our household panel study. We find that, in line with studies on other biases, deviations from rationality observed in the lab provide a lower bound for deviations in the population at large.
Recently, several studies have made significant progress in understanding risk preferences in populations, making use of innovative survey methods and field experiments (Harrison and List 2004) including game shows with large stakes (Post et al. 2008; Andersen et al. 2008). From the perspective of these studies, the present paper takes one step back by focussing on consistency of risk preferences with expected utility theory in a representative subject pool—well over 1,400 members of the CentER Panel, a representative sample of the Dutch population. We do this by falling back on the oldest consistency test of all—the Allais paradox (Allais 1953). Our results help to understand the reliability and robustness of investigations into the actual distribution of risk preferences in populations.
Our research strategy is threefold. First, we implement three different treatments in the main experiment with the panel. We analyze the original Allais question with payoffs of millions of Euros that, just as when Allais asked Savage, were purely hypothetical. In our second treatment we scaled the payments down but kept them hypothetical. Our third treatment used the same downscaled payoffs but paid them out for real. This enables us to examine to what extent violations are driven by lack of monetary incentives, on the one hand, and non-familiarity with large sums of money on the other.
Second, we are able to exploit the wide range of background information that is available for our subjects in order to study the roots of violations.Footnote 1 Which personal characteristics are correlated with violations? Are violations a matter of insufficient education or limited experience with financial decision making? Can we identify ‘problem groups’ that are, perhaps, more likely to suffer (in particular late in life) from erroneous financial decision making?
Third, we conduct a laboratory experiment with the usual laboratory subject population (students) employing the same design that we used in the panel experiment. Thus, we are able to examine the external validity of a laboratory experiment in a clear and detailed manner. In particular, we can compare whether and how a lab study can tell us something about the population at large.
Pursuing our threefold research strategy we are, thus, able to present very detailed and comprehensive evidence on the Allais paradox. Our results are useful for several practical issues: (1) Our results point to a number of conditions that make standard theoretical predictions more likely to hold, (2) Our results identify certain parts of the population that, due to inconsistencies, may have difficulties in making sound financial decisions, and (3) Our results contribute to a better understanding of what can be reliably learned from laboratory experiments.
Along the first dimension of our research strategy we find that violations in the original paradox are likely to be driven by very high payoffs with which, in real life, virtually nobody has any practical experience. Violations in the original Allais problem are twice as high as in both downscaled versions. This effect has been observed before with student samples (Conlisk 1989); we show that the pattern extends to the general population and across socioeconomic characteristics. Perhaps this result is not surprising as it simply stresses that economic theory can be expected to work much better in environments with which agents have experience and are, thus, well-adapted. On the other hand, we find no substantial difference between the two downscaled versions. Whether subjects are incentivized or not, violations are much lower in both cases.Footnote 2
Along the second dimension, we are able to identify a whole array of personal characteristics that correlate with inconsistent decision making. Education, occupation, income and asset holdings do all correlate with inconsistent decision making and in each case the direction of effects is as one would guess. The better educated are more consistent and so are those in employment, those who earn more and those who hold financial assets.
Finally, our methodological contribution reveals that the laboratory results are rather useful in predicting behavior in a general population. First, the relative treatment differences are precisely the same for both populations, panel and lab. Second, as demonstrated in a number of other studies (see Gächter et al. (2008) for a survey) the violations of standard theory observed in the lab provide a lower bound for violations observed in the population at large.Footnote 3
The remainder of the paper is organized as follows. In Section 1, we describe the main characteristics of the CentERpanel and introduce the experimental design. In Section 2 we present our results obtained with the panel. We first give a quick overview of the results and then present a more detailed analysis, based on regression results, that also accounts for the effect of sociodemographic characteristics. In Section 3 we introduce our lab results and compare them to those obtained in the panel. Section 4 concludes.
Design and data collection
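We administer the original “Allais questions,” which consist of two pairwise lottery choices. Consider the following two choice problems. First, a subject is asked to choose between lotteries A and A ∗ where A pays € 1 million with certainty and A ∗ pays € 0 with probability 0.01, € 1 million with probability 0.89, and € 5 million with probability 0.10.
Second, a subject is asked to choose between lotteries B and B ∗ where B pays € 0 with probability 0.89 and € 1 million with probability 0.11, and B ∗ pays € 0 with probability 0.90 and € 5 million with probability 0.10.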
Of the four possible answers AB, A ∗ B ∗ , AB ∗ , and A ∗ B only the first two are consistent with expected utility theory (henceforth, EUT) whereas the last two are not.Footnote 4 Many laboratory experiments have shown that violations of EUT are frequent and that a larger share of subjects violating EUT chooses AB ∗ instead of A ∗ B.Footnote 5
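The consistency claim can be checked numerically. The following Python sketch is an illustration only and not part of the experimental software. It uses the original Allais probabilities over the outcomes € 0, € 1 million and € 5 million; the three utility functions and all helper names are assumptions chosen for the example. For any utility function u, the expected utility gap between A and A ∗ equals the gap between B and B ∗ , so a subject who maximizes expected utility and picks A must also pick B.

```python
import math

# Probabilities over the outcomes (0, 1 million, 5 million) euros.
LOTTERIES = {
    "A":  (0.00, 1.00, 0.00),
    "A*": (0.01, 0.89, 0.10),
    "B":  (0.89, 0.11, 0.00),
    "B*": (0.90, 0.00, 0.10),
}
OUTCOMES = (0.0, 1_000_000.0, 5_000_000.0)


def expected_utility(name, u):
    """Expected utility of the named lottery under utility function u."""
    return sum(p * u(x) for p, x in zip(LOTTERIES[name], OUTCOMES))


def eu_gaps(u):
    """Return (EU(A) - EU(A*), EU(B) - EU(B*)) for utility function u."""
    gap_a = expected_utility("A", u) - expected_utility("A*", u)
    gap_b = expected_utility("B", u) - expected_utility("B*", u)
    return gap_a, gap_b


# Hypothetical utility functions, chosen only for illustration.
UTILITIES = {
    "linear": lambda x: x,
    "square root": math.sqrt,
    "log(1 + x)": math.log1p,
}

for label, u in UTILITIES.items():
    gap_a, gap_b = eu_gaps(u)
    print(f"{label}: EU(A)-EU(A*) = {gap_a:.4f}, EU(B)-EU(B*) = {gap_b:.4f}")
```

Because the two gaps always share the same sign, the mixed answers AB ∗ and A ∗ B cannot be rationalized by any single utility function.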
We have six simple treatments using a between-subjects design: three main treatments, each administered in two question orders. To introduce these treatments, consider the following lotteries over three monetary outcomes, ordered from lowest to highest, with probabilities A = (0,1,0), A ∗ = (.01,.89,.10), B = (.89,.11,0), B ∗ = (.90,0,.10). Our three main treatments were as follows:
Treatment HighHyp: Original Allais questions with high hypothetical payoffs of € 0, € 1 million, and € 5 million.
Treatment LowHyp: Allais questions with low hypothetical payoffs of € 0, € 5, and € 25.
Treatment LowReal: Allais questions with low real payoffs of € 0, € 5, and € 25.
Note that the amounts of money we use in these treatments are the same as in Conlisk (1989) with the sole difference that he used dollars instead of euros. For each of the three treatments we had two subtreatments that reversed the order of the two decisions. As we do not find any order effects in the data, we pool the data throughout.
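To make the stakes concrete, the sketch below computes the expected value of each lottery under the three payoff schemes. It is an illustrative Python calculation and not taken from the study materials; the dictionary and function names are assumptions. Exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction as F

# Probabilities over the (lowest, middle, highest) payoff.
PROBS = {
    "A":  (F(0), F(1), F(0)),
    "A*": (F(1, 100), F(89, 100), F(10, 100)),
    "B":  (F(89, 100), F(11, 100), F(0)),
    "B*": (F(90, 100), F(0), F(10, 100)),
}

# Payoffs in euros for each treatment.
PAYOFFS = {
    "HighHyp": (0, 1_000_000, 5_000_000),
    "LowHyp": (0, 5, 25),
    "LowReal": (0, 5, 25),
}


def expected_value(lottery, treatment):
    """Exact expected payoff (in euros) of a lottery in a treatment."""
    return sum(p * x for p, x in zip(PROBS[lottery], PAYOFFS[treatment]))


for treatment in PAYOFFS:
    values = {name: float(expected_value(name, treatment)) for name in PROBS}
    print(treatment, values)
```

In every treatment A ∗ has a higher expected value than A and B ∗ a higher expected value than B, so A ∗ B ∗ is the answer of an expected value maximizer.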
We collected data from a representative sample of the Dutch population. The experiments were conducted by CentERdata—an institute for applied economic and survey research for the social sciences—that is affiliated with Tilburg University in the Netherlands. CentERdata carries out its survey research mainly by using its own panel called CentERpanel. This panel is Internet based and consists of some 2000 households in the Netherlands which form a representative sample of the Dutch population.Footnote 6 One of the advantages of the CentERpanel is that the researcher has access to background information for each panel member such as demographic and financial data. Every weekend, the panel members complete a questionnaire on the Internet from their home.
After logging on to our experiment, panel members were randomly assigned to one of the six different treatments introduced above. After being informed about the nature of the experiment, subjects decided whether or not to participate, as is common with many modules of the panel. For participating subjects, the next screen introduced an example of a pair of lotteries (which were referred to as “Options”). Subjects were told that their task would be to express a preference for one of the two lotteries and, additionally, how the preferred lottery would be executed.Footnote 7 When subjects indicated that they were ready to start the experiment, they were presented with their two Allais questions on two consecutive screens. Only after both Allais questions had been answered were the two preferred lotteries played out (by the computer), and subjects were then informed about the outcomes. In the treatments with real monetary payments, subjects were paid according to the outcomes in both of their preferred lotteries.Footnote 8
In total 1676 members of the CentERpanel logged on to our experiment. Of the subjects logging on, 1426 (85.1%) subjects decided to participate in our experiment while 250 (14.9%) subjects decided not to participate. Table 1 shows descriptive statistics of our sample. The column labeled “Participation” in Table 1 shows descriptive statistics of participating subjects in each of the three main treatments as well as statistics of subjects who chose not to participate in the experiment. The data in Table 1 is grouped according to gender, age, education, occupation and income. (The column labeled “Violation” shows statistics for participating subjects violating or not violating EUT, respectively, which we will analyze further below. It also contains tests on the role of socioeconomic characteristics for EUT violation which will also be discussed later.)
Concentrating on descriptive statistics for participating subjects in Table 1, we note that by and large most variables are similarly distributed across treatments. However, in some of the age and income brackets as well as in the category savings account, there is somewhat more variation. A comparison of the descriptive statistics in the columns describing participating subjects with those of non-participating subjects shows that there are no big differences except for the age categories. Basically, older people appear to be a little more reluctant to participate.
Since this causes concern about sample selection problems, we ran Heckman (1976) selection models for all regressions reported below, using the variable “Ratio” as one of the exclusion variables. The variable “Ratio” measures the proportion of questionnaires completed by panel members in the three months preceding our experiment. This variable can be assumed to affect the participation decision but not the decisions taken in the experiment. For none of the regressions did we find evidence of selection bias.Footnote 9
A summary of the experimental results is given in Table 2. The table shows both the absolute frequency of choices (left part) and the relative frequency of choices (right part). As mentioned in the introduction, we will concentrate our analysis on the incidence of subjects’ EUT violation in all treatments. However, we will also briefly answer the question whether violations, once they occur, are systematic.
Violation of EUT
Note that the right-most column in Table 2 indicates that violations of EUT are observed in all treatments. In fact, we observe 49.5%, 19.6% and 25.6% violations of EUT in treatments HighHyp, LowHyp, and LowReal, respectively. Furthermore, in all treatments we observe that the fraction of EUT-violating AB ∗ answers is higher than the fraction of EUT-violating A ∗ B answers. The Z-statistic proposed in Conlisk (1989) indicates that the first fraction is significantly higher than the latter fraction at p < 0.001 in all treatments. An interesting question we can answer with our data is whether the differences we report here for the aggregate data are “general” in the sense of applying across socioeconomic attributes or whether they are driven by only some of those attributes. The answer is provided in Tables 5, 6, 7 and 8 in Appendix B, which are structured like Table 1 and provide, for all data and for the three treatments separately, the relative frequency of choices for subjects with various socioeconomic attributes. We observe that EUT violations occur across all socioeconomic attributes and that the “Allais” pattern of more AB ∗ violations than A ∗ B violations is significant for most socioeconomic attributes in all treatments (see the column labeled “Sign. of Conlisk’s Z-statistic” in Tables 5–8 in Appendix B). We conclude that, as in earlier studies, violations of EUT are observed and that they are systematic in the sense that AB ∗ is chosen more often than A ∗ B, mostly independent of socioeconomic background characteristics. To facilitate comparison, note that Conlisk (1989), using a student sample for his “Basic Version” (which is comparable to our treatment HighHyp), reports the following relative frequencies of AB, A ∗ B ∗ , AB ∗ , and A ∗ B choices: 7.6%, 41.9%, 43.6%, and 6.8%. Thus, he observes EUT violation in 50.4% of the cases, which compares to 49.5% in our panel treatment HighHyp.
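The idea behind a test of systematic violations can be illustrated with an exact one-sided sign test. The Python sketch below is a stand-in and not Conlisk’s Z-statistic, whose exact form is not reproduced here. It asks how likely it would be to see at least the observed number of AB ∗ answers among all violators if AB ∗ and A ∗ B were equally likely. The counts in the example are hypothetical and are not taken from Table 2.

```python
from math import comb


def one_sided_sign_test(n_ab_star, n_a_star_b):
    """Exact P(X >= n_ab_star) for X ~ Binomial(n, 1/2).

    n is the total number of violators; the null hypothesis is that
    AB* and A*B violations are equally likely.
    """
    n = n_ab_star + n_a_star_b
    tail = sum(comb(n, k) for k in range(n_ab_star, n + 1))
    return tail / 2 ** n


# Hypothetical counts for illustration only (not data from the study).
p_value = one_sided_sign_test(n_ab_star=40, n_a_star_b=10)
print(f"one-sided p-value: {p_value:.6f}")
```

A small p-value indicates that AB ∗ violations outnumber A ∗ B violations by more than chance alone would suggest.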
The effect of high versus small hypothetical payoffs
Next consider the effect of high versus small hypothetical payoffs on the extent of EUT violation. For this purpose we compare the rates of EUT violations in treatments HighHyp and LowHyp. Table 2 shows that the rate of EUT violations drops from 49.4% in treatment HighHyp to 19.6% in treatment LowHyp. The D-statistic proposed in Conlisk (1989) indicates that this difference is highly significant at p < 0.0001 (D = 9.115). Inspecting the relative frequencies of choices in Table 2 shows that moving from HighHyp to LowHyp sharply increases the fraction of choices consistent with expected value maximization (A ∗ B ∗ ) at the expense of all three other possible responses. In particular, many more subjects prefer the payoff-maximizing choice A ∗ over A when (hypothetical) payoffs become small. A possible explanation of this result is that subjects in treatment LowHyp can be expected to be more familiar with the lower amounts of money, leading them to make fewer mistakes.Footnote 10 Again, with our data we can check whether the result regarding the effect of varying the (hypothetical) stake size just shown for the aggregate data also applies when the data is broken down by various socioeconomic characteristics. Column 3 labeled “Significance of Conlisk’s D-statistic HighHyp vs LowHyp” in Table 9 in Appendix B shows that the answer to this question is, with a few exceptions, yes.
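A comparison of violation rates across two treatments can be sketched with a textbook difference-of-proportions statistic. The Python function below is an illustration under stated assumptions: it uses an unpooled variance estimator, which may differ in detail from Conlisk’s D-statistic, and the group sizes in the example are hypothetical because the per-treatment sample sizes are reported in Table 2 rather than in the text. The output therefore does not reproduce the reported D.

```python
from math import sqrt


def diff_of_proportions_z(rate_1, n_1, rate_2, n_2):
    """Unpooled z statistic for the difference between two proportions."""
    var_1 = rate_1 * (1 - rate_1) / n_1
    var_2 = rate_2 * (1 - rate_2) / n_2
    return (rate_1 - rate_2) / sqrt(var_1 + var_2)


# Violation rates as reported in the text; group sizes are HYPOTHETICAL.
z = diff_of_proportions_z(rate_1=0.494, n_1=400, rate_2=0.196, n_2=400)
print(f"z = {z:.3f}")
```

Under the null hypothesis of equal violation rates the statistic is approximately standard normal, so large positive values indicate a higher violation rate in the first treatment.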
The effect of (small) real versus (small) hypothetical payoffs
Finally, consider the effect of (small) real versus (small) hypothetical payoffs on the extent of EUT violation. To analyze this, compare the rates of EUT violation in treatments LowHyp and LowReal. Table 2 shows that the rate of EUT violations is 19.6% in LowHyp whereas it is 25.6% in treatment LowReal. Thus, we see a slight increase in the share of EUT violations when we move from (small) hypothetical to (small) real payoffs. The D-statistic in Conlisk (1989) indicates that this difference is significant (D = − 1.6716, p = 0.047). In contrast, Harrison (1994) and Burke et al. (1996) report that the use of low real instead of low hypothetical payoffs reduces the extent of EUT violation. For a broader overview on how incentives affect behavior in decisions under risk, see Camerer (1995, p. 634f). Note that the result regarding the switch from (small) hypothetical to (small) real payoffs on the extent of EUT violation is usually not significant when one zooms in on socioeconomic characteristics, as shown in column 4 labeled “Significance of Conlisk’s D-statistic LowHyp vs LowReal” in Table 9 in Appendix B.
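The reported D values can be translated into p-values under the assumption that D is approximately standard normal and that the test is one-sided. Both points are assumptions of this sketch rather than statements from the text, although the one-sided reading matches the reported p = 0.047 for D = −1.6716. The helper name is hypothetical.

```python
from math import erfc, sqrt


def one_sided_p(d):
    """Upper-tail standard normal probability P(Z >= |d|)."""
    return 0.5 * erfc(abs(d) / sqrt(2))


# D values as reported in the text for the panel comparisons.
REPORTED_D = {
    "HighHyp vs LowHyp": 9.115,
    "LowHyp vs LowReal": -1.6716,
}

for comparison, d in REPORTED_D.items():
    print(f"{comparison}: D = {d}, one-sided p = {one_sided_p(d):.4g}")
```

Read two-sided, the p-value for the LowHyp versus LowReal comparison would be twice as large and would exceed the 5% level, which fits the description of this effect as only borderline significant.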
Note that our results concerning the extent of EUT violation and the effect of high versus small hypothetical payoffs are not entirely new. We show, however, that they extend to a general population and across socioeconomic characteristics. This should be of interest due to the current discussion about the relationship between results obtained in the lab and those obtained in other settings (see, e.g., Levitt and List 2007).
Let us now turn to the first of the two new and main dimensions of our research strategy by inspecting the role of socioeconomic background variables in subjects’ behavioral responses to the Allais questions. Under the heading “Violation,” Table 1 shows descriptive statistics of the subsamples violating and not violating EUT as well as p-levels of χ 2 tests. (For the latter, see the notes below Table 1.) Regarding gender, Table 1 reveals that women are slightly more likely to violate EUT than men. With respect to age, Table 1 does not suggest a clear effect, although we note that the relative share of the age bracket [35–44] is higher in the panel’s subpopulation not violating EUT. Regarding education levels, subjects with lower secondary education and subjects with a university degree stand out somewhat in the panel: the former because they violate EUT more often and the latter because they violate EUT less often. The most noticeable effect regarding occupation is that those employed on a contractual basis have a higher relative share in the subsample not violating EUT. Finally, with respect to household income, Table 1 does not suggest a clear effect.
Moreover, refer to the rightmost column labeled “p-value, χ 2” in Table 1 that shows p-levels of χ 2 tests for differences between proportions of violating and non-violating subjects in the category listed in column 1.Footnote 11 The χ 2 tests indicate the strongest differences in violation behavior in the categories of education, occupation and household income.
Econometrics and the role of socioeconomic characteristics
To test for across-treatment differences controlling for subjects’ sociodemographic characteristics and to check whether any of these characteristics are correlated with behavior, we ran probit regressions with the variable “Violate” as the dependent variable. “Violate” is equal to 1 if a subject’s answer to the Allais questions violates EUT (i.e., answers A ∗ B or AB ∗ ), and is equal to 0 otherwise (i.e., answers AB or A ∗ B ∗ ). The background variables we include in the regression are the ones shown in Table 1 above. The results are shown in Table 3 which reports marginal effects. Regression (1) includes all data whereas regressions (2) to (4) show results for each of the three treatments separately. Recall from the end of Section 1 that we did not find evidence of a selection bias due to non-response.
Let us first briefly reconsider across-treatment differences. For this purpose, refer to regression (1) in Table 3 which includes all data and controls for background variables. Importantly, note that in regression (1) the omitted treatment dummy is the one for LowHyp. Inspecting the treatment coefficients, we note that the coefficient for HighHyp is positive, large (0.302) and highly statistically significant, whereas the coefficient of LowReal is also positive (0.053) but rather small and only borderline significant.
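The construction of the dependent variable can be sketched as follows. The Python snippet is illustrative only: the function and variable names are assumptions, the responses are hypothetical, and the probit estimation itself is not shown.

```python
VALID_FIRST = {"A", "A*"}
VALID_SECOND = {"B", "B*"}

# Answer pairs that are consistent with expected utility theory.
CONSISTENT = {("A", "B"), ("A*", "B*")}


def violate(first_choice, second_choice):
    """Return 1 if the answer pair violates EUT (AB* or A*B), else 0."""
    if first_choice not in VALID_FIRST or second_choice not in VALID_SECOND:
        raise ValueError(f"unknown answer pair: {first_choice!r}, {second_choice!r}")
    return 0 if (first_choice, second_choice) in CONSISTENT else 1


# Hypothetical responses for illustration only (not data from the study).
responses = [("A", "B*"), ("A*", "B*"), ("A", "B"), ("A*", "B")]
indicators = [violate(first, second) for first, second in responses]
violation_rate = sum(indicators) / len(indicators)
print(indicators, violation_rate)
```

The resulting 0/1 indicator is what a binary-response model such as a probit would take as its dependent variable.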
To analyze the effect of socioeconomic background variables econometrically, we examine regression (1) in Table 3. We make the following observations.
Controlling for other characteristics, gender and age have no significant influence on the extent of EUT violation.Footnote 12
Regarding education, we find a strong tendency for violations to be reduced with further education.Footnote 13 Overall, there is a strong effect of higher education that also shows in the separate specifications for both treatments with low payoffs. In LowHyp everything that improves on primary education goes hand in hand with reduced violations. Only in HighHyp is there no effect of education. This suggests an interesting interaction between education and experience with a decision domain. In the absence of any experience (as in HighHyp), education on its own does little to improve performance. Only when coupled with experience is education associated with greater consistency.
Of the various occupational affiliations listed in Table 3, we find that the unemployed and ‘others’ do much worse than the employed, self-employed and freelancers.Footnote 14 This is more pronounced in treatments with hypothetical payoffs.
Regarding income, we notice that having a higher gross monthly household income (vis-à-vis the control group with the lowest gross monthly household income) goes along with reduced EUT violations.Footnote 15 Interestingly, this is particularly pronounced in the treatment LowReal when actual money is at stake. (One could have conjectured that it would be the other way round as the marginal utility of making some money and, hence, the incentive to think a little harder might be higher for those on low incomes. Alas, it does not work this way.)
Finally, subjects holding assets have significantly lower EUT violations (by about 8%) whereas subjects with a savings account have significantly higher EUT violations (by about 5%). Maybe not surprisingly, subjects holding assets tend to be expected value maximizers (mainly choosing A ∗ B ∗ ) while subjects who only have a savings account display “Allais” behavior tending toward the choice of AB ∗ .Footnote 16
In all, a picture emerges that is reminiscent of recent studies by Benjamin et al. (2006), Burks et al. (2009) and Dohmen et al. (2010) who show that a range of behavioral biases are correlated with (or may even stem from) cognitive limitations and low IQ. We find that violations are more prevalent among those who are lowly educated, unemployed, on low income, and who have no significant asset holdings. This is, of course, particularly worrying as imprudent financial decision making and bad planning for retirement have the worst consequences in that group.
In Appendix C we complement the above analysis by running multinomial logit regressions using all four answers AB, A ∗ B ∗ , AB ∗ , and A ∗ B, and choosing the answer representing expected value maximization, A ∗ B ∗ , as the base outcome. The results (whose interpretation is less straightforward) are shown in Tables 10, 11, 12 and 13.
The lab experiment
As mentioned in the introduction, the third dimension of our research strategy is concerned with the external validity of laboratory experiments that are typically carried out with rather homogeneous subject pools. Of course, the preceding section has shown that there are important sources of heterogeneity in the population at large that simply cannot be detected when the subject pool is restricted to students. The same is, of course, true for any highly selected convenience sample. But what about the questions we analyzed first: the effects of different treatments, the differences between high and low and between real and hypothetical payoffs? Would a lab experiment give us reliable results to analyze such questions (as has been implicitly assumed for a long time in the experimental community, perhaps negligently without much testing)? To shed more light on these issues we conducted an additional lab experiment in the laboratory of Tilburg University using Dutch speaking student subjects drawn from the normal subject pool.
The lab experiment was conducted in the same way as the experiment using the CentERpanel. That is, student subjects did the experiment using a web browser in the lab and using the same screens as the subjects in the panel. However, there were two small exceptions. First, lab subjects received a 10 Euro show-up fee. (Potential participants were informed about this in the invitation E-mail.) But of course, mirroring the panel design again, only subjects assigned to treatments with real payment had the chance to earn additional money during the experiment. This was not announced prior to the experiment. Second, lab subjects were not offered the choice of not participating in the experiment once they had reported to the lab and the experiment was started. This was done in an effort to mimic the normal procedures in lab experiments where by reporting to the lab, a subject usually confirms his or her decision to participate. Note that when we move from the panel to the lab sample, both the subject pool and the environment change. We deliberately accepted these two simultaneous changes as our aim was to contrast the results obtained in the panel with those obtained in a normal lab experiment.Footnote 17
After the experiment we asked subjects to fill in a questionnaire in which we elicited some basic background information. Naturally, the information we collected from lab subjects is very limited and cannot be compared in scope and quality to the background information available from members of CentERpanel. The lab experiments were conducted in December 2006 using 223 subjects in total.
As in the panel experiment we did not observe any order effects of presenting the Allais questions, so we present only pooled data in Table 4 which shows the same information for the lab data that Table 2 showed for the panel. We make the following observations. First, as in the panel experiments, we observe EUT violations in all treatments, although to a much lesser degree.Footnote 18 This mirrors the main result in Gächter et al.’s (2008) meta-study: Violations of orthodox theoretical predictions and biases observed in the lab form a lower bound for violations and biases observed in the population at large. Second, as in the panel, moving from high hypothetical payoffs to low hypothetical payoffs reduces the extent of EUT violation significantly (p < 0.001, D = 4.881). Third, moving from low hypothetical payoffs to low real payoffs increases the extent of EUT violation slightly but insignificantly (p = 0.226, D = − 0.7525). The similarities between the observations in the panel and in the lab are evident.
Figure 1 shows the shares of choices violating EUT in the two subsamples. It appears that the graph indicating the share of EUT violation in the panel can quite accurately be obtained by shifting the graph indicating the share of EUT violation in the lab upwards by about 15 percentage points.Footnote 19 This means that although the share of EUT violations is consistently higher in the panel than in the lab, the comparative statics results of moving from one treatment to another could have been reliably predicted by the lab experiments.
Using a representative sample of the Dutch population we revisit the Allais paradox. Our main results are threefold. First, as in previous lab samples, the violations of EUT are systematic in the population at large and much lower when stakes are low. Second, there is considerable heterogeneity in the population and violations are particularly prevalent among the lowly educated, those poor in income and asset holdings, and the unemployed. Third, comparing the panel results with a laboratory experiment we find that the relative treatment differences are identical in the panel and the lab but violation rates in all lab treatments are about 15 percentage points lower than in the corresponding non-lab treatment.
Our findings appear to imply two general messages. First, laboratory experiments with convenience samples of students might be more useful to study relative effects rather than absolute levels (see also Levitt and List (2007) who make a similar point in the context of social preferences). When it comes to the absolute measurement of behavior, it appears that lab results will draw a too optimistic picture. The population at large, it turns out, is less consistent with EUT than student samples are. Second, our results suggest that the predictive power of EUT in a general population is correlated with socioeconomic characteristics. In particular, parts of the population that are more likely to experience economic hardship are less consistent.
Of course, there exists a large literature on non-expected utility theories such as Kahneman and Tversky’s (1979) prospect theory or Machina’s (1982) fanning-out theory (both of which can explain the Allais paradox) or Viscusi’s (1989) prospective reference theory which predicts the paradox. Earlier laboratory experiments (see Camerer (1995) or Starmer (2000) for surveys) have documented the Allais paradox in student samples. Our paper highlights that, if anything, these studies underestimate the true prevalence of the paradox in general populations and indicates how violations are correlated with observable characteristics.
Several other studies have also used the CentER panel as a subject pool. Let us briefly mention some of these studies. Hey (2002) and Carbone (2005) analyze more complicated and sequential individual decision making tasks and do not find any background variable systematically influencing behavior. Bellemare and Kröger (2007) study a trust game and find “that heterogeneity in behavior is characterized by several asymmetries—men, the young and elderly, and low educated individuals invest relatively less, but reward significantly more investments.” (p. 183) von Gaudecker et al. (2011a) elicit risk preferences and report that older people, women, the relatively uneducated, and those with lower income are more risk averse. For another study on individual risk attitudes using a large and representative German sample, see Dohmen et al. (2011).
For early studies of the Allais paradox see, e.g., MacCrimmon (1968), Slovic and Tversky (1974), Allais and Hagen (1979) and Kahneman and Tversky (1979). For the effect of downscaled payoffs see Conlisk (1989), Starmer and Sugden (1991), Harrison (1994), Burke et al. (1996), Fan (2002), and van de Kuilen and Wakker (2006).
Almost all of the experiments on the Allais paradox conducted so far have used students as their subjects. There are two notable exceptions. List and Haigh (2005) test the Allais paradox both with students and professional traders from the Chicago Board of Trade. They report that both students and professional traders show Allais paradox behavior, but find that traders do so to a smaller extent. Fatas et al. (2007) use students and politicians and report similar results with students being more prone to Allais paradox behavior.
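To see this, note that adding 0.89u(€ 0) − 0.89u(€ 1M) to both sides of the inequality u(A) = u(€ 1M) > 0.01u(€ 0) + 0.89u(€ 1M) + 0.1u(€ 5M) = u(A ∗ ) yields u(B) = 0.89u(€ 0) + 0.11u(€ 1M) > 0.9u(€ 0) + 0.1u(€ 5M) = u(B ∗ ). Hence an expected utility maximizer who prefers A to A ∗ must prefer B to B ∗ , and the reverse implication follows in the same way.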
For more information about the CentERpanel and the way it is administered see http://www.uvt.nl/centerdata/en/whatwedo/thecenterpanel/.
For more details see Appendix A which contains a translation of the screens used in the treatments with low payoffs. Note that the experiment was administered in Dutch.
Note the following about payments in treatment LowReal. CentERdata reimburses the telephone costs for filling in questionnaires by exchanging “CentERpoints” (1 CentERpoint = 0.01 Euro) to panel members’ private bank accounts four times a year. Although lotteries were described in Euro amounts, subjects in the treatments with real monetary earnings were informed that: “In this experiment you can earn real money that will be paid in the form of CentERpoints.”
Conlisk (1989) points out that this effect is in line with (a) Machina’s (1982) fanning out model that predicts Allais behavior for large payoffs and (b) the observation that EUT converges to expected payoff maximization for small payoffs. Notice, however, that this consistency argument is not an explanation—for it leaves open why fanning occurs and is more dramatic in its consequences with high payoffs. Non-familiarity with high payoffs is such an explanation and may, in fact, be adequately captured by fanning out of indifference curves.
Note that for the multinomial categories in the leftmost column in Table 1, the \(\chi^2\)-tests check for the joint hypothesis that the violation rates are identical across all categories.
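As an illustration of such a joint test, the sketch below computes a standard Pearson chi-square statistic for the hypothesis that the violation rate is the same in every category. That the tests in Table 1 take exactly this form is an assumption, and the counts in the usage example are invented for illustration only.

```python
def chi_square_homogeneity(counts):
    """Pearson chi-square statistic for equal violation rates.

    counts: list of (violations, non_violations), one pair per category.
    Returns (statistic, degrees_of_freedom).
    """
    total_violations = sum(v for v, _ in counts)
    total_subjects = sum(v + n for v, n in counts)
    pooled_rate = total_violations / total_subjects
    if pooled_rate in (0.0, 1.0):
        raise ValueError("statistic undefined when no or all answers are violations")

    statistic = 0.0
    for violations, non_violations in counts:
        size = violations + non_violations
        expected_v = size * pooled_rate          # expected violations under H0
        expected_n = size * (1.0 - pooled_rate)  # expected non-violations under H0
        statistic += (violations - expected_v) ** 2 / expected_v
        statistic += (non_violations - expected_n) ** 2 / expected_n
    return statistic, len(counts) - 1


# Hypothetical counts for three categories (not data from the study).
example_counts = [(30, 70), (50, 50), (40, 60)]
stat, dof = chi_square_homogeneity(example_counts)
print(f"chi-square = {stat:.4f} with {dof} degrees of freedom")
```

The statistic is compared with a chi-square distribution whose degrees of freedom equal the number of categories minus one.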
In light of recent findings about sharply declining numeracy skills in the (British) population above 55 (Banks 2006) this is perhaps slightly surprising.
Wald tests indicate, however, that the effects of the education levels below university degree listed in Table 3 are not statistically different.
A Wald test indicates that the effect of these two occupations is not statistically different.
To look more closely at the effect of holding assets or a savings account, we defined three variables: “only assets,” which equals 1 if a subject holds assets but has no savings account (and 0 otherwise); “only savings account,” which equals 1 if a subject has a savings account but holds no assets (and 0 otherwise); and “assets & savings account,” which equals 1 if a subject holds assets and has a savings account (and 0 otherwise). Hence, the reference group consists of those subjects who neither hold assets nor have a savings account. Replacing the variables “assets” and “savings account” in regression (1) in Table 3 with these new variables leaves the other variables of regression (1) almost unchanged (including significance levels). The coefficients of “only assets” and “assets & savings account” are negative (−0.073 and −0.032) but insignificant, whereas the coefficient of “only savings account” is positive (0.055) and significant at the 5% level. So it is not only the financially savvy who hold assets who do comparatively well but also people without any savings, perhaps because, having no financial cushion, they cannot afford to make many mistakes.
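The recoding described in this note can be sketched as follows. The function and key names are hypothetical; only the definitions of the three indicator variables come from the text.

```python
def asset_savings_dummies(holds_assets: bool, has_savings_account: bool) -> dict:
    """Recode two binary characteristics into three mutually exclusive dummies.

    Subjects with neither assets nor a savings account form the reference
    group, for which all three dummies are 0.
    """
    return {
        "only_assets": int(holds_assets and not has_savings_account),
        "only_savings_account": int(has_savings_account and not holds_assets),
        "assets_and_savings_account": int(holds_assets and has_savings_account),
    }


# Enumerate all four possible combinations.
for assets in (False, True):
    for savings in (False, True):
        print(assets, savings, asset_savings_dummies(assets, savings))
```

At most one of the three dummies equals 1 for any subject, which is what makes the group with neither assets nor a savings account the omitted category in the regression.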
von Gaudecker et al. (2011b) offer an analysis of the individual effects of implementation mode and of subject pool selection in a risk preference elicitation study and find that differences in behavior are due to selection and not implementation mode.
Again, we observe that the fraction of EUT-violating AB∗ answers is significantly higher than the fraction of EUT-violating A∗B answers in all lab treatments (p < 0.001, Conlisk’s (1989) Z-statistic).
The difference in the extent of EUT violation between the panel and the lab is significant for all three treatments (HighHyp: p = 0.014, D = −2.1732; LowHyp: p < 0.001, D = −5.2220; LowReal: p < 0.001, D = −4.6935).
Allais, M. (1953). Le comportement de l’homme rationnel devant le risque: Critique des postulats et axiomes de l’école américaine. Econometrica, 21, 503–546.
Allais, M., & Hagen, O. (Eds.) (1979). Expected utility hypotheses and the Allais paradox. Dordrecht: Reidel.
Andersen, S., Harrison, G. W., Lau, M. I., & Rutström, E. E. (2008). Risk aversion in game shows. In J. C. Cox, & G. W. Harrison (Eds.), Risk aversion in experiments (Vol. 12). Greenwich: JAI Press. Research in Experimental Economics.
Banks, J. (2006). Economic choices, capabilities and outcomes at older ages. Fiscal Studies, 27, 281–311.
Bellemare, C., & Kröger, S. (2007). On representative social capital. European Economic Review, 51, 183–202.
Benjamin, D. J., Brown, S. A., & Shapiro, J. M. (2006). Who is “behavioral”? Cognitive ability and anomalous preferences. Working paper.
Burke, M. S., Carter, J. C., Gominiak, R. D., & Ohl, D. F. (1996). An experimental note on the Allais paradox and monetary incentives. Empirical Economics, 21, 617–632.
Burks, S. V., Carpenter, J. P., Götte, L., & Rustichini, A. (2009). Cognitive skills affect economic preferences, strategic behavior, and job attachment. Proceedings of the National Academy of Sciences, 106, 7745–7750.
Camerer, C. (1995). Individual decision making. In J. H. Kagel, & A. E. Roth (Eds.), The handbook of experimental economics (pp. 587–703). Princeton: Princeton University Press.
Carbone, E. (2005). Demographics and behaviour. Experimental Economics, 8, 217–232.
Conlisk, J. (1989). Three variants on the Allais example. American Economic Review, 79, 392–407.
Dohmen, T., Falk, A., Huffman, D., & Sunde, U. (2010). Are risk aversion and impatience related to cognitive ability? American Economic Review, 100, 1238–1260.
Dohmen, T., Falk, A., Huffman, D., Sunde, U., Schupp, J., & Wagner, G. G. (2011). Individual risk attitudes: Measurement, determinants and behavioral consequences. Journal of the European Economic Association, 9, 522–550.
Eckel, C., & Grossman, P. (2000). Volunteers and pseudo-volunteers: The effect of recruitment method on subjects’ behavior in experiments. Experimental Economics, 3, 107–120.
Fan, C.-P. (2002). Allais paradox in the small. Journal of Economic Behavior and Organization, 49, 411–421.
Fatas, E., Neugebauer, T., & Tamborero, P. (2007). How politicians make decisions: A political choice experiment. Journal of Economics, 92, 167–196.
Gächter, S., Huck, S., & Weizsäcker, G. (2008). Socio-demographics and choice in experimental economics. Mimeo.
Harrison, G. W. (1994). Expected utility and the experimentalists. Empirical Economics, 19, 223–253.
Harrison, G. W., & List, J. A. (2004). Field experiments. Journal of Economic Literature, 42, 1013–1059.
Harrison, G. W., Lau, M. I., & Rutström, E. E. (2009). Risk attitudes, randomization to treatment, and self-selection into experiments. Journal of Economic Behavior and Organization, 70, 498–507.
Heckman, J. (1976). The common structure of statistical models of truncation, sample selection, and limited dependent variables and a simple estimator for such models. Annals of Economic and Social Measurement, 5, 475–492.
Hey, J. D. (2002). Experimental economics and the theory of decision making under risk and uncertainty. Geneva Papers on Risk and Insurance Theory, 27, 5–21.
Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47, 263–291.
Levitt, S. D., & List, J. A. (2007). What do laboratory experiments measuring social preferences reveal about the real world? Journal of Economic Perspectives, 21, 153–174.
List, J. A., & Haigh, M. S. (2005). A simple test of expected utility theory using professional traders. Proceedings of the National Academy of Sciences, 102, 945–948.
MacCrimmon, K. R. (1968). Descriptive and normative implications of the decision-theory postulates. In K. Borch, & J. Mossin (Eds.), Risk and uncertainty (Chapter 1). London: Macmillan.
Machina, M. J. (1982). Expected utility analysis without the independence axiom. Econometrica, 50, 277–323.
Post, T., van den Assem, M. J., Baltussen, G., & Thaler, R. H. (2008). Deal or no deal? Decision making under risk in a large-payoff game show. American Economic Review, 98, 38–71.
Slovic, P., & Tversky, A. (1974). Who accepts Savage’s axiom? Behavioral Science, 19, 368–373.
Starmer, C. (2000). Developments in non-expected utility theory: The hunt for a descriptive theory of choice under risk. Journal of Economic Literature, 38, 332–382.
Starmer, C., & Sugden, R. (1991). Does the random-lottery incentive system elicit true preferences? An experimental investigation. American Economic Review, 81, 971–978.
van de Kuilen, G., & Wakker, P. P. (2006). Learning in the Allais paradox. Journal of Risk and Uncertainty, 33, 155–164.
Viscusi, W. K. (1989). Prospective reference theory: Toward an explanation of the paradoxes. Journal of Risk and Uncertainty, 2, 235–264.
von Gaudecker, H.-M., van Soest, A., & Wengström, E. (2011a). Heterogeneity in risky choice behaviour in a broad population. American Economic Review, 101, 664–694.
von Gaudecker, H.-M., van Soest, A., & Wengström, E. (2011b). Experts in experiments: How selection matters for estimated distributions of risk preferences. IZA Discussion Paper No. 5575.
We thank Marcel Das and Marika Puumala of CentERdata (Tilburg University) for their most efficient support in collecting the data. Furthermore, we thank W. Kip Viscusi, anonymous referees, Johannes Binswanger, Oliver Kirchkamp, Tobias Klein, Sabine Kröger, Gijs van de Kuilen, Imran Rasul, Jan van Ours, Stefan Trautmann, Anthony Ziegelmeyer and participants of the 3rd International Meeting on Experimental and Behavioral Economics and the IMPRS Uncertainty Summer School as well as seminar participants at Tilburg University, University of Frankfurt (Main), Humboldt University Berlin, and the University of Amsterdam for helpful comments. We gratefully acknowledge financial help from the UK’s Economic and Social Research Council via ELSE and a grant on ‘Behavioral Mechanism Design’. The second author acknowledges financial help from the Netherlands Organisation for Scientific Research (NWO) through a VIDI grant.
This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.
This paper was originally entitled “Allais for all: revisiting the paradox”.
Appendix A: Instructions (Translation)
The experiment was administered in Dutch. Here we give a translation of the screens presented in treatments LowHyp and [LowReal].
This research is conducted by researchers of Tilburg University and University College London. The questionnaire consists of two choice problems in which you will be asked to make a choice between two situations. Based on your choices and luck you may win an amount of money. Please note: In this experiment all amounts are hypothetical, in reality you cannot win any money. [In LowReal: In this experiment you can earn real money that will be paid in the form of CentERpoints.]
If you do not want to participate as a matter of principle, you can indicate this below. You will then go directly to the end of the questionnaire.
\(\bigcirc \) I continue with the questionnaire.
\(\bigcirc \) No, I do not want to participate in this questionnaire.
You will shortly be presented with two questions. In each, you will be asked to make a choice between two options which provide you with different chances to win something. Please see an example of such a situation below. In the first option you have a chance of 80% to win nothing and a chance of 20% to win 10 Euro. In the second option you have a chance of 20% to win nothing and a chance of 80% to win 20 Euro.
We would like to know whether you prefer Option 1 or Option 2 (in these instructions you don’t have to choose yet). After you have made the choice, the computer will play out the option you chose. The computer generates a random number that is between 1 and 100. The chance distribution of the chosen option then defines how much you win with this number.
For example: in Option 1 above you get nothing if the computer generates a number between 1 and 80 (this is indicated above in Option 1 in brackets), but if the computer generates a number between 81 and 100 you will get 10 Euro. In Option 2 you get nothing if the computer generates a number between 1 and 20, but with a number between 21 and 100 you win 20 Euro. As already mentioned, it concerns hypothetical amounts here, in reality you cannot win any money. [In LowReal: If you win something then this amount will be added to your account of CentERpoints.]
If you are ready to start the experiment, press “Continue.”
Which of the following two options do you prefer?
\(\bigcirc \) Option A
\(\bigcirc \) Option B
Which of the following two options do you prefer?
\(\bigcirc \) Option C
\(\bigcirc \) Option D
You have now made the two decisions. Press “Continue” to see the results of the options you chose.
In the first question (option A or B) you have chosen Option X ([description of the chosen option]). The computer generated the number [random number]. Thus, you have won the [in treatment LowHyp: hypothetical] amount of [...] euro with this option.
In the second question (option C or D) you have chosen Option Y ([description of the chosen option]). The computer generated the number [random number]. Thus, you have won the [in treatment LowHyp: hypothetical] amount of [...] euro with this option.
In total you have won the [in treatment LowHyp: hypothetical] amount of [...] euro in this experiment.
Do you have any comments regarding the questionnaire?
\(\bigcirc \) Yes
\(\bigcirc \) No
Screen 8 [In case the answer to the question on Screen 7 was Yes]:
You can type in your comments below.
This is the end of the questionnaire. Thank you for your participation.
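The way a chosen option is played out in the example screen above (a random number between 1 and 100 mapped to a payoff) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the software used in the study; all names are hypothetical, and only the number ranges and payoffs of the example Options 1 and 2 come from the instructions.

```python
import random

# Each option is a list of (upper bound of the number range, payoff in euros),
# in increasing order of the upper bound.
OPTION_1 = [(80, 0), (100, 10)]  # numbers 1-80 pay nothing, 81-100 pay 10 Euro
OPTION_2 = [(20, 0), (100, 20)]  # numbers 1-20 pay nothing, 21-100 pay 20 Euro


def payoff_for_number(option, number):
    """Return the payoff that the chosen option assigns to a drawn number."""
    if not 1 <= number <= 100:
        raise ValueError("number must lie between 1 and 100")
    for upper_bound, payoff in option:
        if number <= upper_bound:
            return payoff
    raise ValueError("the last upper bound of an option must be 100")


def play(option, rng=random):
    """Draw a number uniformly from 1..100 and resolve the chosen option."""
    number = rng.randint(1, 100)  # both endpoints are included
    return number, payoff_for_number(option, number)


number, payoff = play(OPTION_1)
print(f"The computer generated the number {number}; payoff: {payoff} Euro")
```

Passing a seeded `random.Random` instance as `rng` makes a draw reproducible, which is convenient when checking the mapping from numbers to payoffs.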
Appendix B: Relative frequencies of choices depending on socioeconomic characteristics
In Tables 5–9 we report the relative frequency of choices and the results of additional tests depending on socioeconomic characteristics. We do this for the pooled data (Table 5) and the three treatments separately (Tables 6–8), and for pair-wise across-treatment tests (Table 9). The various tests are described in the notes to the tables.
Appendix C: Results of multinomial logit regressions
In this appendix, we report the results of multinomial logit regressions on all four answers AB, A∗B∗, AB∗, and A∗B, using expected value maximization, A∗B∗, as the base outcome and using the variables listed in column 1 in Table 3 as regressors. We perform multinomial logit regressions for the pooled data and for the three treatments separately. The results are reported in Table 10 (all data) and Tables 11–13 (treatments HighHyp, LowHyp, and LowReal). In these tables we report the relative risks of choosing outcome AB, A∗B or AB∗ over the base outcome A∗B∗. That is, the three columns in Tables 10–13 show, respectively, the ratios P(answer = AB)/P(answer = A∗B∗), P(answer = A∗B)/P(answer = A∗B∗), and P(answer = AB∗)/P(answer = A∗B∗), where P(.) denotes the probability of choosing a given answer. The tables should be read as follows. Refer, for example, to Table 10, which reports the results on the pooled data. Since the omitted treatment is LowHyp, the coefficient 14.116 in the second column of this table means that by moving from treatment LowHyp to treatment HighHyp, the relative risk, P(answer = AB)/P(answer = A∗B∗), of choosing answer AB over answer A∗B∗ is equal to 14.116. Similarly, by moving from the omitted age category 16–24 to age category 25–34, the relative risk, P(answer = AB)/P(answer = A∗B∗), of choosing answer AB over answer A∗B∗ is equal to 0.726.
Inspecting the results of the multinomial logit regressions for the three treatments in Tables 11–13, the most salient feature seems to be that different categories of background characteristics have a significant effect on the relative risk of choosing one answer over the base answer A∗B∗ in the different treatments. For treatment HighHyp we note that most age categories show a significant correlation with the relative risk of choosing answer A∗B over answer A∗B∗ (column labeled “A∗B” in Table 11). For treatment LowHyp we observe that most of the occupation variables are significantly correlated with the relative risk of choosing answer AB∗ over answer A∗B∗ (column labeled “AB∗” in Table 12). Finally, for treatment LowReal we infer that most of the household gross income variables have a significant correlation with the relative risk of choosing answer A∗B over answer A∗B∗ (column labeled “A∗B” in Table 13).
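For readers less familiar with relative risks in a multinomial logit model, the following sketch shows how they relate to the underlying coefficients in the textbook version of the model. All coefficient values are hypothetical and are not estimates from Tables 10–13; how the entries of those tables were computed is not restated here, so the sketch illustrates only the model itself.

```python
import math

BASE = "A*B*"  # base outcome, whose linear index is normalized to zero


def mlogit_probabilities(indices):
    """Choice probabilities given the linear index x'b of each non-base outcome."""
    exp_terms = {outcome: math.exp(value) for outcome, value in indices.items()}
    denominator = 1.0 + sum(exp_terms.values())
    probabilities = {o: term / denominator for o, term in exp_terms.items()}
    probabilities[BASE] = 1.0 / denominator
    return probabilities


def relative_risk(probabilities, outcome):
    """P(answer = outcome) / P(answer = base outcome)."""
    return probabilities[outcome] / probabilities[BASE]


# Hypothetical intercepts and coefficients on a single dummy regressor.
INTERCEPTS = {"AB": -0.5, "A*B": -1.2, "AB*": 0.3}
DUMMY_COEFS = {"AB": 0.8, "A*B": -0.1, "AB*": 0.4}


def linear_indices(dummy):
    """Linear index of each non-base outcome when the dummy is 0 or 1."""
    return {o: INTERCEPTS[o] + DUMMY_COEFS[o] * dummy for o in INTERCEPTS}


rr_off = relative_risk(mlogit_probabilities(linear_indices(0)), "AB")
rr_on = relative_risk(mlogit_probabilities(linear_indices(1)), "AB")

# Switching the dummy on multiplies the relative risk by exp(coefficient).
print(rr_on / rr_off, math.exp(DUMMY_COEFS["AB"]))
```

In this formulation the relative risk of an outcome equals the exponential of its linear index, so a value above 1 for a regressor means that the outcome becomes more likely relative to the base outcome, and a value below 1 means that it becomes less likely.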
Cite this article
Huck, S., Müller, W. Allais for all: Revisiting the paradox in a large representative sample. J Risk Uncertain 44, 261–293 (2012). https://doi.org/10.1007/s11166-012-9142-8
- Expected utility theory
- Allais paradox
- Common consequence effect
- Field experiments
- Representative sample