## Abstract

Accurately estimating risk preferences is of critical importance when evaluating data from many economic experiments or strategic interactions. I use a simulation model to conduct power analyses over two lottery batteries designed to classify individual subjects as being best explained by one of a number of alternative specifications of risk preference models. I propose a case in which there are only two possible alternatives for classification and find that the statistical methods used to classify subjects result in type I and type II errors at rates far beyond traditionally acceptable levels. These results suggest that subjects in experiments must make significantly more choices, or that traditional lottery pair batteries need to be substantially redesigned to make accurate inferences about the risk preference models that characterize a subject’s choices.


## Notes

- 1.
Though the method proposed by De Long and Lang (1992) also addresses the issue of type II errors, they do so using a meta-analysis of published literature, whereas I employ a power analysis through simulation to determine the probabilities of type I and type II errors directly.

- 2.
HO use a “Strong utility” stochastic specification, so-called because it implies “strong stochastic transitivity”, whereas the CU model implies “moderate stochastic transitivity”. Differences in stochastic specifications can lead to wholly different inferences drawn from the structural model of risk preferences. Wilcox (2008) provides an in-depth review of the implications of different stochastic specifications and the results of an experiment designed to test these implications.

- 3.
It may be the case that 100 observations are insufficiently large to satisfy the asymptotic properties of ML, but this is not the focus of this paper.

- 4.
The Akaike information criterion is given by \({\mathrm{AIC}} = -2 \log L({\hat{\alpha }}) / T + 2k / T\), where \(L({\hat{\alpha }})\) is the likelihood of the model at its estimated maximum, *k* is the number of parameters for that model, and *T* is the number of observations.

- 5.
As with HO, it may be the case that 80 observations are insufficiently large to satisfy the asymptotic properties of ML for these tests.

- 6.
Typically, when a test indicates the probability of a type I error to be less than 5%, social scientists consider this result “statistically significant,” and when researchers engage in *ex ante* power analysis, they typically aim for a probability of a type II error less than 20% (Cohen 1988; Gelman and Loken 2014). These values are based on convention, and are somewhat arbitrary. Ronald Fisher disagreed with picking the same level of statistical significance for every analysis: “[...] no scientific worker has a fixed level of significance at which from year to year, and in all circumstances, he rejects hypotheses; he rather gives his mind to each particular case in the light of his evidence and his ideas” (Fisher 1956).

- 7.
Consider a choice probability calculated to be 0.90 for option *A*, and therefore 0.10 for option *B*. A random number drawn from a univariate uniform distribution on [0, 1] has a 90% chance of being less than or equal to 0.90, so option *A* would be chosen 90% of the time by the simulated subject.

- 8.
See the Appendix of HN for estimates of typical university students in the United States, and Harrison and Rutström (2008) for additional reviews of studies with human subjects.

- 9.
While we may expect the probability that an RDU subject is correctly classified to vary somewhat with *r* and \(\lambda\), how the probability of correct classification changes with the probability weighting parameters, \(\phi\) and \(\eta\), is of greater interest, as these parameters define how RDU is different from EUT.

- 10.
The *r* parameter of subject 8 given as an example in HN (p. 104) would fall in this range.

- 11.
Recall that a type II error in this analysis is 1 minus the probability of correctly classifying an RDU subject.

- 12.
These are the values estimated for subject 98 in HN (p. 104), whom HN classified as RDU with a Prelec (1998) PWF. HN report in Appendix C that the estimated *r* parameter is 0.3473 for this subject, which is near the \(r = 0.5\) restriction for this simulation, but do not report the estimated \(\lambda\) parameter.

- 13.
Subject 94 from HN was classified as RDU with estimated parameters \(r = 0.4461\), \(\phi = 1.3907\), and \(\eta = 0.6883\), which fall in this range.

- 14.
While the analyses above use the Wald test to perform classification, HO use the likelihood ratio test. Analyses in Online Supplement A using the likelihood ratio test to classify subjects show no qualitative differences in the rates of type I and type II errors. Both types of errors still occur at exceedingly high rates when the likelihood ratio test is used instead of the Wald test.
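The per-observation AIC defined in note 4 is straightforward to compute. The sketch below uses hypothetical maximized log-likelihoods and parameter counts (an EUT model with two parameters against an RDU model with four, over 100 choices); none of these numbers come from the paper.

```python
def aic_per_obs(log_likelihood, k, T):
    """Per-observation AIC from note 4: AIC = (-2 * logL + 2 * k) / T."""
    return (-2.0 * log_likelihood + 2.0 * k) / T

# Hypothetical maximized log-likelihoods for one subject's T = 100 choices:
# an EUT model with k = 2 parameters and an RDU model with k = 4.
aic_eut = aic_per_obs(log_likelihood=-60.0, k=2, T=100)  # (120 + 4) / 100 = 1.24
aic_rdu = aic_per_obs(log_likelihood=-55.0, k=4, T=100)  # (110 + 8) / 100 = 1.18
# The model with the lower AIC is preferred; the 2k term penalizes the
# extra parameters of the more flexible model.
```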
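The simulated-choice mechanism described in note 7 can be sketched in a few lines: a single uniform draw on [0, 1] is compared against the model's choice probability. The 0.90 probability and sample size here are illustrative.

```python
import random

random.seed(7)  # fixed seed so the simulation is reproducible

def simulate_choice(p_a):
    """Return 'A' with probability p_a via a single uniform draw on [0, 1]."""
    return "A" if random.random() <= p_a else "B"

# With a choice probability of 0.90 for option A, a large batch of
# simulated choices should select A roughly 90% of the time.
choices = [simulate_choice(0.90) for _ in range(10_000)]
share_a = choices.count("A") / len(choices)  # close to 0.90
```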
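The likelihood ratio classification mentioned in note 14 reduces to comparing twice the log-likelihood gap between the nested (EUT) and unrestricted (RDU) models against a chi-square critical value. The log-likelihood values below are hypothetical, and the two degrees of freedom reflect the two probability weighting parameters, \(\phi\) and \(\eta\), that EUT restricts.

```python
def classify_lr(loglik_eut, loglik_rdu, crit=5.991):
    """Classify one subject with a likelihood ratio test, EUT nested in RDU.

    The statistic 2 * (logL_RDU - logL_EUT) is compared to the chi-square
    critical value with 2 degrees of freedom (5.991 at the 5% level),
    since EUT restricts the two probability weighting parameters.
    """
    lr_stat = 2.0 * (loglik_rdu - loglik_eut)
    return "RDU" if lr_stat > crit else "EUT"

# Hypothetical maximized log-likelihoods for one simulated subject:
print(classify_lr(loglik_eut=-62.0, loglik_rdu=-58.0))  # LR = 8.0 -> RDU
print(classify_lr(loglik_eut=-62.0, loglik_rdu=-61.0))  # LR = 2.0 -> EUT
```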

## References

Andersen, S., Fountain, J., Harrison, G. W., & Elisabet Rutström, E. (2014). Estimating subjective probabilities. *Journal of Risk and Uncertainty*, *48*(3), 207–229.

Andersen, S., Harrison, G. W., Lau, M. I., & Elisabet Rutström, E. (2008). Eliciting risk and time preferences. *Econometrica*, *76*(3), 583–618.

Bell, D. E. (1982). Regret in decision making under uncertainty. *Operations Research*, *30*(5), 961–981.

Cohen, J. (1988). *Statistical power analysis for the behavioral sciences* (Vol. 2). New York: Academic Press.

De Long, J. B., & Lang, K. (1992). Are all economic hypotheses false? *Journal of Political Economy*, *100*(6), 1257–1272.

Feiveson, A. H. (2002). Power by simulation. *Stata Journal*, *2*(2), 107–124.

Fisher, R. (1956). *Statistical methods and scientific inference* (p. 175). Edinburgh: Oliver & Boyd.

Gelman, A., & Loken, E. (2014). The statistical crisis in science. *American Scientist*, *102*, 460–465.

Harrison, G. W., & Elisabet Rutström, E. (2008). Risk aversion in the laboratory. In J. C. Cox & G. W. Harrison (Eds.), *Research in experimental economics* (Vol. 12, pp. 41–196). Bingley: Emerald Group Publishing Limited.

Harrison, G. W., Martínez-Correa, J., & Todd Swarthout, J. (2015). Reduction of compound lotteries with objective probabilities: Theory and evidence. *Journal of Economic Behavior and Organization*, *119*, 32–55.

Harrison, G. W., & Ng, J. M. (2016). Evaluating the expected welfare gain from insurance. *Journal of Risk and Insurance*, *83*(1), 91–120.

Hey, J. D., & Orme, C. (1994). Investigating generalizations of expected utility theory using experimental data. *Econometrica*, *62*(6), 1291–1326.

Ioannidis, J. P. A. (2005). Why most published research findings are false. *Chance*, *18*(4), 40–47.

Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. *Econometrica*, *47*(2), 263–292.

Loomes, G., & Sugden, R. (1982). Regret theory: An alternative theory of rational choice under uncertainty. *Economic Journal*, *92*(368), 805–824.

Loomes, G., & Sugden, R. (1998). Testing different stochastic specifications of risky choice. *Economica*, *65*, 581–598.

McCloskey, D. N., & Ziliak, S. T. (1996). The standard error of regressions. *Journal of Economic Literature*, *34*, 97–114.

Prelec, D. (1998). The probability weighting function. *Econometrica*, *66*(3), 497–527.

Quiggin, J. (1982). A theory of anticipated utility. *Journal of Economic Behavior & Organization*, *3*, 323–343.

Wilcox, N. T. (2008). Stochastic models for binary discrete choice under risk: A critical primer and econometric comparison. In J. C. Cox & G. W. Harrison (Eds.), *Research in experimental economics* (Vol. 12, pp. 197–292). Bingley: Emerald Group Publishing Limited.

Wilcox, N. T. (2011). ‘Stochastically more risk averse:’ A contextual theory of stochastic discrete choice under risk. *Journal of Econometrics*, *162*(1), 89–104.

Zhang, L., & Ortmann, A. (2013). Exploring the meaning of significance in experimental economics. Working Paper. Australian School of Business, University of New South Wales.


## Additional information


Special thanks to Glenn Harrison, Don Ross, and Andre Hofmeyr for providing comments and feedback on this paper.



## About this article

### Cite this article

Monroe, B.A. The statistical power of individual-level risk preference estimation.
*J Econ Sci Assoc* (2020). https://doi.org/10.1007/s40881-020-00098-x


### Keywords

- Power analysis
- Risk preferences
- Experimental economics
- Expected utility theory
- Rank dependent utility

### JEL Classification

- C12
- C13
- C18
- C52
- C90