Experts in experiments

How selection matters for estimated distributions of risk preferences


An ever-increasing number of experiments attempts to elicit risk preferences of a population of interest with the aim of calibrating parameters used in economic models. We are concerned with two types of selection effects, which may affect the external validity of standard experiments: sampling from a narrowly defined population of students (“experimenter-induced selection”) and self-selection due to non-response or incomplete response of participants in a random sample from a broad population. We find that both types of selection lead to a sample of experts: participants perform significantly better than the general population, in the sense of fewer violations of revealed preference conditions. Self-selection within a broad population does not seem to matter for average preferences. In contrast, sampling from a student population leads to lower estimates of average risk aversion and loss aversion parameters. Furthermore, it dramatically reduces the amount of heterogeneity in all parameters.


  1.

    A non-exhaustive list of recent contributions includes Harrison et al. (2002), Bleichrodt et al. (2001), Dohmen et al. (2011), Choi et al. (2007), Andersen et al. (2008), Tanaka et al. (2010), von Gaudecker et al. (2011) and Huck and Müller (2012).

  2.

    Other threats to external validity may occur if the experimental design differs too widely from real-world situations in terms of context, stakes, or similar features. See Harrison and List (2004) and Levitt and List (2007) for reviews of such effects and Falk and Heckman (2009) for critical thoughts.

  3.

    While this is true for most studies, there are clearly arguments for being interested in preference estimates from specific samples of the population. For example, Lazear et al. (2011) have recently argued that in certain settings self-selection on preferences can create market outcomes that are very different from those that would have occurred if the general population participated. In order to understand or predict behaviour in such situations, one would naturally prefer preference estimates for those active in the specific markets over estimates for the population at large.

  4.

    The Online Appendix is available at

  5.

    The main modification is that we include pie charts as a graphical illustration of probabilities (cf. Fig. 1); see von Gaudecker et al. (2011) for details.

  6.

    The payoff configurations are scaled and rounded versions of those used by Holt and Laury (2002).

  7.

    For related papers using data collected through the CentERpanel see, e.g., Donkers et al. (2001) who analysed risk preferences using hypothetical questions, Huck and Müller (2012) who addressed the Allais paradox and Bellemare and Kröger (2007) and Bellemare et al. (2008) for evidence from a trust game and an ultimatum game with real payoffs. For more information about the CentERpanel see

  8.

    In the Online Appendix, we describe several alternative models in Section 5 and present the corresponding estimates in Section 10. The results show that our conclusions about selection effects are robust to the chosen functional forms, to excluding loss aversion, and to extending the model by allowing Kreps and Porteus (1978) preferences towards the timing of uncertainty resolution.

  9.

    One might argue that τ should also be individual-specific, but in practice it appears to be difficult to estimate heterogeneity in τ and ω separately (although both are identified in theory).

  10.

    The former can only happen on a follow-up screen if the respondent has chosen option ‘B’ throughout the first screen.

  11.

    For example, if a subject switched from ‘A’ to ‘B’ on the first screen at a 0.5 probability of the high outcome, the second screen had high outcome probabilities 0.2, 0.3, 0.4 and 0.5. A between-screens inconsistency then arises if the subject chose ‘B’ at probability 0.2 or chose ‘A’ at probability 0.5.

  12.

    Table 7 of the Online Appendix presents the frequencies of inconsistencies per subject instead of per violation opportunity. The pattern is qualitatively the same as that of Table 3.

  13.

    We can only compute bounds that will at best be a 5%-interval (e.g. between 75% as the highest ‘A’-choice on the first screen and 80% as the lowest choice of ‘B’ on the second screen). In many cases, inconsistencies make these bounds grow substantially wider. We computed them as follows: the lowest possible switch point is defined as the highest probability corresponding to an ‘A’ choice that is still lower than the minimum probability with a ‘B’ choice; the upper bound is the minimum probability with a ‘B’ choice that is still higher than the maximum probability where option ‘A’ was chosen. If only choice ‘A’ (‘B’) was observed, both upper and lower bound were set to 100% (0%). We then averaged the upper bounds and the lower bounds across the seven payoff configurations. This leaves us with two preference measures per individual. To save space, we just report results using the midpoint of the two bounds. All results remain qualitatively the same if we use the upper or lower bounds. Figures corresponding to Fig. 3 can be found in the Online Appendix.

  14.

    We do observe that the hypothetical treatment affects loss aversion estimates differently in the laboratory compared to the Internet, which is likely to be caused by the presence of participation fees in the laboratory. The results concerning selection effects are robust to the inclusion of the hypothetical treatment.

  15.

    We also considered removing subjects who took an extremely long time to complete the experiment (probably stopping and continuing at a later point in time). These subjects, however, did not behave in a significantly different way compared to the rest of the sample, so we decided to keep them in the sample. In a different context but with a similar ongoing web-based panel, Viscusi et al. (2008) show that other attributes related to the administration mode were not strongly related to the preference estimates. More specifically, they consider the time subjects spent from invitation to completion, the time the respondents had been in the panel, and whether the respondent retired from the panel within five months after the survey.

  16.

    The combined response rate for steps 2 and 3 in our Internet experiment is 78%. This seems to compare favourably to Harrison et al. (2009), who employed more standard recruitment procedures in mailing out a letter to a random subsample of the Danish population and achieved a response rate of 38% (253 of 664 subjects). However, it should be noted that our response rate is within a preselected sample that has shown a general inclination to fill out survey questionnaires by passing step 1 of the selection process.

  17.

    We estimated models that treated the two components of step 3 (speeding through and dropping out) as separate outcomes. Age is the only variable for which the distinction mattered, which is why we only report the results from the more parsimonious specification in the table.

  18.

    The test works as follows: Let y denote our parameter of interest and w(x) the weight. Our null hypothesis of no difference between the weighted and unweighted observations can then be stated as E[w(x)y] = E[y] or E[z] = 0, with z = (w(x) − 1)y. Since we have a large sample size and few explanatory variables, we neglect the estimation error in w. The null hypothesis can then be tested with a standard t-test on whether the mean of z is zero or not.
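The test described in footnote 18 can be illustrated with a minimal sketch. This is not the authors' code: the function name is hypothetical, the weights are treated as known (as the footnote does) and are assumed to be normalised to have mean one, and the p-value uses a large-sample normal approximation rather than the exact t distribution.

```python
import math
import statistics


def weighted_mean_difference_test(y, w):
    """Test H0: E[z] = 0 with z = (w(x) - 1) * y, treating the weights as known.

    y: parameter values of interest, one per observation.
    w: weights, assumed (for this sketch) to be normalised to mean one.
    Returns the t-statistic and a two-sided p-value.
    """
    if len(y) != len(w):
        raise ValueError("y and w must have the same length")
    # Construct z_i = (w_i - 1) * y_i for every observation.
    z = [(wi - 1.0) * yi for yi, wi in zip(y, w)]
    mean_z = statistics.fmean(z)
    # Standard error of the mean, based on the sample standard deviation.
    se = statistics.stdev(z) / math.sqrt(len(z))
    if se == 0.0:
        raise ValueError("z has no variation; the t-statistic is undefined")
    t_stat = mean_z / se
    # Assumption: with a large sample, the normal distribution approximates
    # the t distribution closely enough for the p-value.
    p_value = 2.0 * (1.0 - statistics.NormalDist().cdf(abs(t_stat)))
    return t_stat, p_value
```

A small p-value would indicate that reweighting changes the mean of the parameter, that is, that selection on the observables entering w(x) matters for the average.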

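The bounds on the switch point described in footnote 13 can be sketched in the same spirit. The function names and the data layout (a mapping from the probability of the high outcome, in percent, to the chosen option) are assumptions made for illustration. The footnote does not say what happens when no ‘A’ choice lies below the lowest ‘B’ choice, or no ‘B’ choice lies above the highest ‘A’ choice; the fallbacks to 0% and 100% below are likewise assumptions.

```python
def switch_point_bounds(choices):
    """Return (lower, upper) bounds on the switch point, in percent.

    choices: dict mapping probability of the high outcome (percent) to
    'A' or 'B', pooled over both screens of one payoff configuration.
    """
    if not choices:
        raise ValueError("at least one choice is required")
    a_probs = [p for p, c in choices.items() if c == "A"]
    b_probs = [p for p, c in choices.items() if c == "B"]
    if not b_probs:  # only 'A' observed: both bounds are 100%
        return 100.0, 100.0
    if not a_probs:  # only 'B' observed: both bounds are 0%
        return 0.0, 0.0
    min_b, max_a = min(b_probs), max(a_probs)
    # Lower bound: highest 'A' probability still below the lowest 'B' choice.
    below = [p for p in a_probs if p < min_b]
    # Upper bound: lowest 'B' probability still above the highest 'A' choice.
    above = [p for p in b_probs if p > max_a]
    lower = max(below) if below else 0.0    # assumption: fallback not in source
    upper = min(above) if above else 100.0  # assumption: fallback not in source
    return float(lower), float(upper)


def midpoint_measure(configurations):
    """Average the bounds across payoff configurations and return the midpoint."""
    bounds = [switch_point_bounds(c) for c in configurations]
    mean_lower = sum(lo for lo, _ in bounds) / len(bounds)
    mean_upper = sum(hi for _, hi in bounds) / len(bounds)
    return (mean_lower + mean_upper) / 2.0
```

With consistent choices such as ‘A’ up to 75% and ‘B’ from 80% onwards, the function returns the 5%-interval mentioned in the footnote; inconsistent choices widen the interval.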

  1. Alessie, R., Hochgürtel, S., van Soest, A. (2006). Non-take-up of tax-favored savings plans: evidence from Dutch employees. Journal of Economic Psychology, 27(4), 483–501.

  2. Andersen, S., Harrison, G.W., Lau, M.I., Rutström, E.E. (2006). Elicitation using multiple price list formats. Experimental Economics, 9(4), 383–405.

  3. Andersen, S., Harrison, G.W., Lau, M.I., Rutström, E.E. (2008). Eliciting risk and time preferences. Econometrica, 76(3), 583–618.

  4. Andersen, S., Harrison, G.W., Lau, M.I., Rutström, E.E. (2010). Preference heterogeneity in experiments: comparing the field and laboratory. Journal of Economic Behavior & Organization, 73(2), 209–224.

  5. Anderson, L., & Mellor, J. (2009). Are risk preferences stable? Comparing an experimental measure with a validated survey-based measure. Journal of Risk and Uncertainty, 39, 137–160.

  6. Bellemare, C., & Kröger, S. (2007). On representative social capital. European Economic Review, 51(1), 183–202.

  7. Bellemare, C., Kröger, S., van Soest, A. (2008). Measuring inequity aversion in a heterogeneous population using experimental decisions and subjective probabilities. Econometrica, 76(4), 815–839.

  8. Benartzi, S., & Thaler, R.H. (1995). Myopic loss aversion and the equity premium puzzle. Quarterly Journal of Economics, 110(1), 73–92.

  9. Binswanger, H.P. (1980). Attitudes towards risk: an experimental measurement in rural India. American Journal of Agricultural Economics, 62, 395–407.

  10. Blavatskyy, P. (2009). Betting on own knowledge: experimental test of overconfidence. Journal of Risk and Uncertainty, 38, 39–49.

  11. Bleichrodt, H., Pinto, J.L., Wakker, P.P. (2001). Making descriptive use of prospect theory to improve the prescriptive use of expected utility. Management Science, 47(11), 1498–1514.

  12. Blundell, R.W., & Stoker, T.M. (2007). Models of aggregate economic relationships that account for heterogeneity. In J.J. Heckman, & E.E. Leamer (Eds.), Handbook of econometrics (Vol. 6, Part 1, chap. 68, pp. 4609–4666). Elsevier.

  13. Browning, M., Hansen, L.P., Heckman, J.J. (1999). Micro data and general equilibrium models. In J.B. Taylor, & M. Woodford (Eds.), Handbook of macroeconomics (Vol. 1, chap. 8, pp. 543–633). Elsevier.

  14. bwGRiD (2007–2011). Member of the German D-Grid initiative, funded by the Ministry of Education and Research (Bundesministerium für Bildung und Forschung) and the Ministry for Science, Research and Arts Baden-Württemberg (Ministerium für Wissenschaft, Forschung und Kunst Baden-Württemberg).

  15. Choi, S., Fisman, R., Gale, D., Kariv, S. (2007). Consistency and heterogeneity of individual behavior under uncertainty. American Economic Review, 97(5), 1921–1938.

  16. Choi, S., Kariv, S., Müller, W., Silverman, D. (2011). Who is (more) rational? NBER Working Paper 16791.

  17. Coble, K., & Lusk, J. (2010). At the nexus of risk and time preferences: an experimental investigation. Journal of Risk and Uncertainty, 41, 67–79.

  18. Dave, C., Eckel, C., Johnson, C., Rojas, C. (2010). Eliciting risk preferences: when is simple better? Journal of Risk and Uncertainty, 41, 219–243.

  19. Dohmen, T., Falk, A., Huffman, D., Sunde, U., Schupp, J., Wagner, G.G. (2005). Individual risk attitudes: New evidence from a large, representative, experimentally-validated survey. IZA Discussion Paper No. 1730.

  20. Dohmen, T., Falk, A., Huffman, D., Sunde, U., Schupp, J., Wagner, G.G. (2011). Individual risk attitudes: measurement, determinants and behavioral consequences. Journal of the European Economic Association, 9(3), 522–550.

  21. Donkers, B., Melenberg, B., van Soest, A. (2001). Estimating risk attitudes using lotteries; a large sample approach. Journal of Risk and Uncertainty, 22(2), 165–195.

  22. Falk, A., & Heckman, J.J. (2009). Lab experiments are a major source of knowledge in the social sciences. Science, 326(5952), 535–538.

  23. Fehr, E., Fischbacher, U., von Rosenbladt, B., Schupp, J., Wagner, G.G. (2003). A nation-wide laboratory: Examining trust and trustworthiness by integrating behavioral experiments into representative surveys. IZA Discussion Paper No. 715.

  24. von Gaudecker, H.M., van Soest, A., Wengström, E. (2008). Selection and mode effects in risk preference elicitation experiments. IZA Discussion Paper No. 3321.

  25. von Gaudecker, H.M., van Soest, A., Wengström, E. (2011). Heterogeneity in risky choice behaviour in a broad population. American Economic Review, 101(2), 664–694.

  26. Güth, W., Schmidt, C., Sutter, M. (2007). Bargaining outside the lab—a newspaper experiment of a three person-ultimatum game. Economic Journal, 117(518), 449–469.

  27. Harless, D.W., & Camerer, C.F. (1994). The predictive utility of generalized expected utility theories. Econometrica, 62(6), 1251–1289.

  28. Harrison, G.W., & List, J.A. (2004). Field experiments. Journal of Economic Literature, 42(4), 1009–1055.

  29. Harrison, G.W., Lau, M.I., Williams, M.B. (2002). Estimating discount rates in Denmark: a field experiment. American Economic Review, 92(5), 1606–1617.

  30. Harrison, G.W., Lau, M.I., Rutström, E.E. (2009). Risk attitudes, randomization to treatment, and self-selection into experiments. Journal of Economic Behavior & Organization, 70(3), 498–507.

  31. Heckman, J.J. (1974). Shadow prices, market wages, and labor supply. Econometrica, 42(4), 679–694.

  32. Heckman, J.J. (1976). The common structure of statistical models of truncation, sample selection and limited dependent variables and a simple estimator for such models. Annals of Economic and Social Measurement, 5(4), 475–492.

  33. Heckman, J.J. (1979). Sample selection bias as a specification error. Econometrica, 47(1), 153–161.

  34. Hey, J.D., & Orme, C. (1994). Investigating generalizations of expected utility theory using experimental data. Econometrica, 62(6), 1291–1326.

  35. Holt, C.A., & Laury, S.K. (2002). Risk aversion and incentive effects. American Economic Review, 92, 1644–1655.

  36. Hoogendoorn, A.W., & Daalmans, J. (2009). Nonresponse in the recruitment of an internet panel based on probability sampling. Survey Research Methods, 3(2), 59–72.

  37. Huck, S., & Müller, W. (2012). Allais for all: revisiting the paradox in a large representative sample. Journal of Risk and Uncertainty, 44, 261–293.

  38. Jacobson, S., & Petrie, R. (2009). Learning from mistakes: what do inconsistent choices over risk tell us? Journal of Risk and Uncertainty, 38, 143–158.

  39. Kahneman, D., & Tversky, A.V. (1979). Prospect theory: an analysis of decision under risk. Econometrica, 47, 263–291.

  40. Köbberling, V., & Wakker, P.P. (2005). An index of loss aversion. Journal of Economic Theory, 122, 119–131.

  41. Kreps, D.M., & Porteus, E.L. (1978). Temporal resolution of uncertainty and dynamic choice theory. Econometrica, 46, 185–200.

  42. Lazear, E., Malmendier, U., Weber, R. (2011). Sorting, prices, and social preferences. American Economic Journal: Applied Economics, 4(1), 136–163.

  43. Levitt, S.D., & List, J.A. (2007). What do laboratory experiments measuring social preferences reveal about the real world? Journal of Economic Perspectives, 21(2), 153–174.

  44. Little, R.J., & Rubin, D.B. (2002). Statistical analysis with missing data (2nd edn). New York: John Wiley & Sons Inc.

  45. Loomes, G. (2005). Modelling the stochastic component of behaviour in experiments: some issues for the interpretation of data. Experimental Economics, 8(4), 301–323.

  46. Loomes, G., Moffatt, P.G., Sugden, R. (2002). A microeconometric test of alternative stochastic theories of risky choice. Journal of Risk and Uncertainty, 24(2), 103–130.

  47. Lucking-Reiley, D. (1999). Using field experiments to test equivalence between auction formats: magic on the internet. American Economic Review, 89(5), 1063–1080.

  48. Madrian, B.C., & Shea, D.F. (2001). The power of suggestion: Inertia in 401(k) participation and savings behavior. Quarterly Journal of Economics, 116(4), 1149–1187.

  49. Revelt, D., & Train, K.E. (2000). Customer-specific taste parameters and mixed logit: Households’ choice of electricity supplier. University of California at Berkeley, Economics Working Paper E00-274.

  50. de Roos, N., & Sarafidis, Y. (2010). Decision making under risk in deal or no deal. Journal of Applied Econometrics, 25(6), 987–1027.

  51. Starmer, C. (2000). Developments in non-expected utility theory: the hunt for a descriptive theory of choice under risk. Journal of Economic Literature, 38(2), 332–382.

  52. Tanaka, T., Camerer, C.F., Nguyen, Q. (2010). Risk and time preferences: linking experimental and household survey data from Vietnam. American Economic Review, 100(1), 557–571.

  53. Tversky, A.V., & Kahneman, D. (1992). Advances in prospect theory: cumulative representation of uncertainty. Journal of Risk and Uncertainty, 5(4), 297–323.

  54. Viscusi, W.K., Huber, J., Bell, J. (2008). The economic value of water quality. Environmental & Resource Economics, 41(2), 169–187.


Financial support from the Dutch Science Foundation (NWO), the Swedish Institute for Banking Research (Bankforskningsinstitutet), the Wallander-Hedelius Foundation and from the European Union under grant HPRN-CT-2002-00235 (RTN-AGE) is gratefully acknowledged. This paper has made use of the DNB Household Survey and other data collected in the CentERpanel. We thank the team of CentERdata, especially Marika Puumala, for their support with the experiments, as well as Morten Lau and Joachim Winter for very helpful comments on the experimental design. The analysis benefitted from comments received at presentations in Mannheim, Copenhagen, Gothenburg, Montreal, Berlin, Aix-en-Provence, Toulouse, at the XIIth FUR conference at LUISS in Rome, the ESA meetings in Nottingham and Tucson, and a Cemmap workshop in London. Daniel Kemptner provided able research assistance. The computation of results has been facilitated by the use of the bwGRiD (2007–2011). This paper updates and extends results from von Gaudecker et al. (2008) and an early working paper version of von Gaudecker et al. (2011).

Author information



Corresponding author

Correspondence to Erik Wengström.

About this article

Cite this article

von Gaudecker, HM., van Soest, A. & Wengström, E. Experts in experiments. J Risk Uncertain 45, 159–190 (2012).

Keywords


  • Risk aversion
  • Loss aversion
  • Internet surveys
  • Laboratory experiments

JEL Classification

  • C90
  • D81