## Abstract

It is often of interest to elicit beliefs from populations that may include naïve participants. Unfortunately, elicitation mechanisms are typically assessed by assuming optimal responses to incentives. Using laboratory experiments with a population that potentially includes naïve participants, we compare the performance of two elicitation mechanisms proposed by Karni (Econometrica 77(2):603-606, 2009). These mechanisms, denoted “declarative” and “clock,” are valuable because their incentive compatibility does not require strong assumptions such as risk neutrality or expected utility maximization. We show that, theoretically and empirically, with a sufficient fraction of naïve participants, the clock mechanism elicits beliefs more accurately than the declarative mechanism. The source of this accuracy advantage is twofold: the clock censors naïve responses, and participants are more likely to employ dominant strategies under the clock. Our findings have practical value for anyone interested in eliciting beliefs from representative populations, a goal of increasing importance when conducting large-scale surveys or field experiments.


## Notes

A scoring rule is “proper” if the respondent must report true beliefs to maximize the expected score. Proper scoring rules were first introduced by the meteorological statistician Brier (1950) and later popularized by Savage (1971). Reported beliefs are compared against realized outcomes, so that proper scoring rules provide incentives for accuracy. The quadratic, spherical, and logarithmic scoring rules are examples of proper scoring rules.
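The properness of the quadratic rule can be checked directly: under a true belief *p*, the expected score is maximized only at a report of *p*. A minimal sketch in Python (the belief value and grid of reports are illustrative, and the function name is ours):

```python
# A "proper" scoring rule rewards truthful reports: with the quadratic
# (Brier) rule, score = 1 - (report - outcome)^2, the expected score
# under a true belief p is uniquely maximized by reporting p.

def expected_brier(report, p):
    """Expected quadratic score when the event occurs with probability p."""
    win = 1 - (report - 1) ** 2   # score if the event occurs
    lose = 1 - report ** 2        # score if it does not
    return p * win + (1 - p) * lose

p = 0.7                                    # illustrative true belief
reports = [i / 100 for i in range(101)]    # grid of possible reports
best = max(reports, key=lambda r: expected_brier(r, p))
assert abs(best - p) < 1e-9   # truthful reporting maximizes expected score
```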

On page 604, Karni (2009) introduces the clock mechanism by saying, “An equivalent probability-elicitation auction mechanism is as follows…”

However, there exist practical difficulties when using incentive compatible mechanisms in the field to elicit beliefs regarding, for example, the chance of changing jobs within the next 5 years. As we discuss in the conclusion, future research could address these important issues.

Some subjects may fail to adopt the beliefs with which they are endowed. However, this would occur in both mechanisms, and thus presents no difficulty for our comparative analysis (other than adding noise and making it more difficult to discover differences).

As noted by Kadane and Winkler (1988), the elicited probabilities intertwine with utilities “not just through the explicit or implicit payoffs related to the elicitation process, but also through other stakes the individual may have in the events of interest.” Previous work including Karni (1999) and Jaffray and Karni (1999) proposed elicitation procedures when the no-stakes condition is violated. However, we are not aware of any evidence informing the extent to which this violation matters empirically.

In essence, probabilistic sophistication means that the individual ranks bets with subjective probabilities over outcomes in a similar fashion as she would rank lotteries with an objective probability distribution.

Our analysis suggests people use different strategies, but our development below shows that the clock can have an accuracy advantage even when this is not the case.

If all decisions are optimal (or if all decisions are naïve), the clock mechanism has no advantage from censoring.

Note that this approach does not make any use of censored observations. An alternative is to take a censored belief as the interval between the clock’s stopping point and the clock’s upper limit. However, the transformation from interval estimates to point estimates is arbitrary (e.g., using the mean of the interval as the estimate of the belief), and results would generally be sensitive to this transformation. We do not pursue this approach here.

Karni specified the uniform distribution on [0, 1] for *F*_{ r }(•), but we note that the properties of the mechanisms remain the same for any continuous and strictly increasing distribution *F*_{ r }(•).

Since both lotteries are presented using integers only, we chose to also constrain the decisions to integers for simplicity and transparency. Moreover, words such as “probabilities” or “distributions” were not used during our experiment.

To maintain symmetry with the clock procedure, all decisions were submitted via computers.

The subject physically drew a chip from the appropriate cloth bag. See Section 3.4 for details.

The two dominant strategies are equivalent because the individual ends up with the same bag except when R is the same as the number of white chips in bag A, as the former strategy leads to bag A while the latter leads to bag B. However, in this case the two bags have the same number of white chips, so the chance of winning $10 is identical.
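This equivalence can be verified by enumeration. A small sketch, assuming the rules described in the appendix (bag A holds `a` white chips out of 10, R is drawn uniformly from 1 to 9, and submitting r ≤ R yields bag B); the function name is ours:

```python
# Enumerate outcomes to confirm the two dominant strategies coincide.
# Bag A holds `a` white chips out of 10; R, the number of white chips
# in bag B, is equally likely to be any of 1..9. Submitting r <= R
# means drawing from bag B (win prob R/10), otherwise bag A (a/10).

def win_prob(r, a):
    """Expected chance of drawing a white chip when submitting r."""
    return sum((R if r <= R else a) / 10 for R in range(1, 10)) / 9

a = 2  # endowed belief 0.2
# r = a and r = a + 1 differ only when R == a, where both bags hold
# the same number of white chips, so their win probabilities tie:
assert win_prob(a, a) == win_prob(a + 1, a)
# ...and no other submission does better:
assert all(win_prob(a, a) >= win_prob(r, a) for r in range(1, 10))
```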

The screen displayed the number R for 5 seconds, and subjects were aware that they could drop out at R and obtain bag A.

The dummy bidder in our experiment is implemented by stopping the clock when it reaches the number *R*.

Subjects were not told at the beginning of the experiment that there would be a second round. Upon finishing the first round, the experimenter announced, “That was the end of the experiment. However, we still have some time left; let us do another experiment so you can make more money.”

E-Prime is commercial software for computerized experiment design, commonly used in psychology research: http://www.pstnet.com/eprime.cfm

The quiz was designed to test whether subjects understood how various hypothetical scenarios and decisions are translated into payoffs. The quiz is available upon request.

Hoffrage et al. (2000) showed that natural frequencies are much better than percent chances at facilitating statistical reasoning, for experts and non-experts alike.

The two equivalent dominant strategies are both assigned a deviation of 0. In particular, when the endowed belief is 0.2, decisions {0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9} are converted into deviations {−0.1, 0, 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6}; when the endowed belief is 0.3, they become {−0.2, −0.1, 0, 0, 0.1, 0.2, 0.3, 0.4, 0.5}.
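The conversion can be written as a short routine. A sketch (working in tenths, i.e., integer chip counts, to avoid floating-point issues; the function name is ours):

```python
# Convert a submitted decision into a "deviation from dominant strategy".
# With endowed belief b, both b and b + 0.1 are dominant (the two
# strategies are equivalent), so both map to deviation 0.

def deviation(decision, belief):
    d, b = round(decision * 10), round(belief * 10)  # work in tenths
    if d <= b:
        return (d - b) / 10
    if d == b + 1:            # the second, equivalent dominant strategy
        return 0.0
    return (d - b - 1) / 10

decisions = [i / 10 for i in range(1, 10)]
print([deviation(d, 0.2) for d in decisions])
# -> [-0.1, 0.0, 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
```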

Post-experiment surveys indicate many subjects did not recognize the existence of optimal strategies.

Note the proportion of optimal decisions in our declarative mechanism is consistent with proportions of optimal decisions in the second-price auction in Cooper and Fang (2008, p. 1583).

Censored decisions are excluded from these statistics (see footnote 12).

The one-sided test is significant at the 5% level; we have a clear ordered hypothesis, as illustrated in Section 2.

The distribution of unfiltered declarative data is marginally significantly different from the distribution of the clock data (*p* = 0.063, Chi-squared test).

We assume the two dominant strategies are equally likely to be chosen. Hence, the ratios of subjects who choose decisions 0.2, 0.3, and 0.4 are 1:2:1.

In the declarative mechanism, deviations from dominant strategies in the second round are significantly smaller than they are in the first round (*p* = .04, two-sided Wilcoxon–Mann–Whitney). However, the distribution of unfiltered declarative data is not different from the distribution of the clock data (*p* = .392, Chi-squared test).

Subjects are members of the CentERpanel, consisting of about 2,000 households, who answer questions every weekend. See www.centerdata.nl for more information.

In a laboratory study, Palfrey and Wang (2009) report that probabilities elicited using the linear scoring rule are biased towards 0 and 1 to a greater degree than those elicited with proper scoring rules. Their finding is consistent with theoretical predictions. On the other hand, Sonnemans and Offerman (2001) show that a flat-rate incentive does just as well as the quadratic scoring rule.

## References

Allen, F. (1987). Discovering personal probabilities when utility functions are unknown. *Management Science, 33*(4), 542–544.

Andersen, S., Harrison, G. W., Lau, M. I., & Rutström, E. E. (2008). Eliciting risk and time preferences. *Econometrica, 76*(3), 583–618.

Andersen, S., Fountain, J., Harrison, G. W., & Rutström, E. E. (2010). Estimating subjective probabilities. Working paper 2010–06, Center for the Economic Analysis of Risk, Georgia State University. http://cear.gsu.edu/files/Estimating_Subjective_Probabilities.pdf.

Andreoni, J. (1995). Cooperation in public-goods experiments: kindness or confusion? *American Economic Review, 85*(4), 891–904.

Bellemare, C., Kroger, S., & Van Soest, A. (2008). Measuring inequity aversion in a heterogeneous population using experimental decisions and subjective probabilities. *Econometrica, 76*(4), 815–839.

Brier, G. W. (1950). Verification of forecasts expressed in terms of probability. *Monthly Weather Review, 78*, 1–3.

Charness, G., Karni, E., & Levin, D. (2007). Individual and group decision making under risk: an experimental study of Bayesian updating and violations of first-order stochastic dominance. *Journal of Risk and Uncertainty, 35*(2), 129–148.

Charness, G., Karni, E., & Levin, D. (2010). On the conjunction fallacy in probability judgment: new experimental evidence regarding Linda. *Games and Economic Behavior, 68*(2), 551–556.

Cooper, D. J., & Fang, H. (2008). Understanding overbidding in second price auctions: an experimental study. *The Economic Journal, 118*(532), 1572–1595.

Dohmen, T., Falk, A., Huffman, D., Sunde, U., Schupp, J., & Wagner, G. G. (2009). Individual risk attitudes: measurement, determinants and behavioral consequences. *Journal of the European Economic Association, 9*(3), 522–550.

Garthwaite, P. H., Kadane, J. B., & O’Hagan, A. (2005). Statistical methods for eliciting probability distributions. *Journal of the American Statistical Association, 100*(470), 680–701.

Grether, D. M. (1992). Testing Bayes rule and the representativeness heuristic: some experimental evidence. *Journal of Economic Behavior and Organization, 17*(1), 31–57.

Güth, W., Schmittberger, R., & Schwarze, B. (1982). An experimental analysis of ultimatum bargaining. *Journal of Economic Behavior and Organization, 3*(4), 367–388.

Harstad, R. (2000). Dominant strategy adoption and bidders’ experience with pricing rules. *Experimental Economics, 3*(3), 261–280.

Hoffrage, U., Lindsey, S., Hertwig, R., & Gigerenzer, G. (2000). Communicating statistical information. *Science, 290*(5500), 2261–2262.

Holt, C. A., & Smith, A. M. (2009). An update on Bayesian updating. *Journal of Economic Behavior and Organization, 69*(2), 125–134.

Hossain, T., & Okui, R. (2011). The binarized scoring rule. Working paper: http://ssrn.com/abstract=1592082

Houser, D., & Kurzban, R. (2002). Revisiting kindness and confusion in public goods experiments. *American Economic Review, 92*(4), 1062–1069.

Houser, D., Keane, M., & McCabe, K. (2004). Behavior in a dynamic decision problem: an analysis of experimental evidence using a Bayesian type classification algorithm. *Econometrica, 72*(3), 781–822.

Jaffray, J.-Y., & Karni, E. (1999). Elicitation of subjective probabilities when the initial endowment is unobservable. *Journal of Risk and Uncertainty, 18*(1), 5–20.

Kadane, J. B., & Winkler, R. L. (1988). Separating probability elicitation from utilities. *Journal of the American Statistical Association, 83*(402), 357–363.

Kagel, J., & Levin, D. (1993). Independent private value auctions: bidder behaviour in first-, second-, and third-price auctions with varying numbers of bidders. *The Economic Journal, 103*(419), 868–879.

Kagel, J., & Levin, D. (2009). Implementing efficient multi-object auction institutions: an experimental study of the performance of boundedly rational agents. *Games and Economic Behavior, 66*, 221–237.

Kagel, J., Levin, D., & Harstad, R. (1987). Information impact and allocation rules in auctions with affiliated private values: a laboratory study. *Econometrica, 55*(6), 1275–1304.

Karni, E. (1999). Elicitation of subjective probabilities when preferences are state-dependent. *International Economic Review, 40*(2), 479–486.

Karni, E. (2009). A mechanism for eliciting probabilities. *Econometrica, 77*(2), 603–606.

Köszegi, B., & Rabin, M. (2008). Revealed mistakes and revealed preferences. In A. Caplin & A. Schotter (Eds.), *The foundations of positive and normative economics: A handbook*. New York: Oxford University Press.

Machina, M. J., & Schmeidler, D. (1992). A more robust definition of subjective probability. *Econometrica, 60*(4), 745–780.

Manski, C. (2004). Measuring expectations. *Econometrica, 72*(5), 1329–1376.

McKelvey, R. D., & Page, T. (1990). Public and private information: an experimental study of information pooling. *Econometrica, 58*(6), 1321–1339.

Möbius, M., Niederle, M., Niehaus, P., & Rosenblat, T. (2011). Managing self-confidence: theory and experimental evidence. NBER working paper no. 17104: http://www.nber.org/papers/w17014.

Nyarko, Y., & Schotter, A. (2002). An experimental study of belief learning using elicited beliefs. *Econometrica, 70*(3), 971–1005.

Offerman, T., Sonnemans, J., Van de Kuilen, G., & Wakker, P. P. (2009). A truth-serum for non-Bayesians: correcting proper scoring rules for risk attitudes. *Review of Economic Studies, 76*(4), 1461–1489.

Palfrey, T. R., & Wang, S. W. (2009). On eliciting beliefs in strategic games. *Journal of Economic Behavior & Organization, 71*(2), 98–109.

Roth, A. E., & Malouf, M. W. K. (1979). Game-theoretic models and the role of information in bargaining. *Psychological Review, 86*(6), 574–594.

Roth, A. E., & Murnighan, J. K. (1982). The role of information in bargaining: an experimental study. *Econometrica, 50*(5), 1123–1142.

Roth, A. E., & Schoumaker, F. (1983). Expectations and reputations in bargaining: an experimental study. *American Economic Review, 73*(3), 362–372.

Rutström, E. E. (1998). Home-grown values and incentive compatible auction design. *International Journal of Game Theory, 27*(3), 427–441.

Savage, L. J. (1971). Elicitation of personal probabilities and expectations. *Journal of the American Statistical Association, 66*(336), 783–801.

Schlag, K. H., & van der Weele, J. (2009). Eliciting probabilities, means, medians, variances and covariances without assuming risk-neutrality. Working paper, Universitat Pompeu Fabra, Barcelona.

Sonnemans, J., & Offerman, T. (2001). Is the quadratic scoring rule really incentive compatible? Working paper, CREED, University of Amsterdam.

Winkler, R. L., & Murphy, A. H. (1968). “Good” probability assessors. *Journal of Applied Meteorology, 7*(5), 751–758.

## Acknowledgements

We thank Glenn Harrison, Edi Karni, Eric Danan, Jacob Sagi, Ron Harstad, James Andreoni, Yan Chen, Soo Hong Chew, Nat Wilcox, colleagues at ICES, and seminar participants at ESA international 2009 (Washington DC), Advanced Workshop in Experimental Economics 2009 (Sydney, Australia), SEA 2009 (San Antonio, TX), and FUR 2010 (Newcastle, England) for helpful discussions.


## Appendix

### Instructions for Declarative mechanism with endowed belief of 0.2

Welcome to this experiment! In addition to the $5 for showing up on time, you will be paid in cash based on your decisions in the experiment. Please note that no other participant’s decisions in this experiment will affect your earnings, and vice versa. Please read these instructions carefully. Raise your hand if you have any questions, and the experimenter will come to assist you.

### 2.1 Overview

The procedure is simple. You will first submit a number, and then you will draw a chip from one of two bags. If the chip you draw is white you will earn $10, and if it is black you will earn $1.

### 2.2 Details

**Bag A** has **2** white chips and **8** black chips for a total of **10**. **Bag B** also has **10** chips, some white, some black, but you do not know how many of each. *The number of white chips in Bag B* is on the card in the sealed envelope at your desk. This card was drawn in advance from a deck of 9 cards, labeled from 1 to 9. Please do not open the envelope until you are told to do so.

To determine the bag you’ll draw from, you will first submit a number between 1 and 9. If the number you submit is less than or equal to *the number of white chips in Bag B*, you will draw from **Bag B**, otherwise you will draw from **Bag A**.

**Your payment**: If you draw a white chip you earn $10; a black chip earns you $1.
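Under these rules, the expected earnings of each possible submission can be tabulated. A minimal sketch (not the experiment's software; it assumes each card value 1–9 is equally likely, and the function name is ours):

```python
# Expected earnings for each possible submission in the declarative
# task. Bag A holds 2 white chips out of 10; a white chip pays $10,
# a black chip pays $1; the card (white chips in Bag B) is equally
# likely to be any of 1..9.

WHITE_A = 2

def expected_earnings(n):
    total = 0.0
    for card in range(1, 10):                  # white chips in Bag B
        whites = card if n <= card else WHITE_A
        total += (whites / 10) * 10 + (1 - whites / 10) * 1
    return total / 9

best = max(range(1, 10), key=expected_earnings)
assert best in (2, 3)   # submitting 2 or 3 maximizes expected earnings
```

Submissions of 2 and 3 tie for the maximum, which matches the two equivalent dominant strategies discussed in the notes.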

### Instructions for Clock mechanism with endowed belief of 0.3

Welcome to this experiment! In addition to the $5 for showing up on time, you will be paid in cash based on your decisions in the experiment. Please note that no other participant’s decisions in this experiment will affect your earnings, and vice versa. Please read these instructions carefully. Raise your hand if you have any questions, and the experimenter will come to assist you.

### 3.1 Overview

The procedure is simple. You will first participate in an exercise, and then you will draw a chip from one of two bags. If you draw a white chip you will earn $10, and if it is black you will earn $1.

### 3.2 Details

**Bag A** has **3** white chips and **7** black chips for a total of **10**. **Bag B** also has **10** chips, some white, some black, but you do not know how many of each. *The number of white chips in Bag B* is on the card in the sealed envelope at your desk. This card was drawn in advance from a deck of 9 cards, labeled from 1 to 9. Please do not open the envelope until you are told to do so.

To determine the bag you’ll draw from, you will first participate in an exercise. The computer screen in front of you will start counting from number **1**, and increase by **1** every 5 seconds until it reaches the number in the sealed envelope. You can stop the counting at any point by pressing the space key. If you press the space key before the counting stops, you draw from **Bag B**, otherwise you draw from **Bag A**.

**Your payment**: If you draw a white chip you earn $10; a black chip earns you $1.
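The clock implements the same choice rule as the declarative report: planning to press the space key when the count reaches t leads to Bag B exactly when t ≤ R, which is the rule for submitting t in the declarative mechanism. A small sketch of this equivalence (function names are ours):

```python
# The count runs 1, 2, ..., R and then stops. Planning to press the
# space key at count t therefore reaches Bag B iff t <= R -- the same
# rule as submitting the number t in the declarative mechanism.

def clock_bag(t, R):
    """Bag reached when planning to press at count t (envelope holds R)."""
    for count in range(1, R + 1):
        if count == t:       # pressed before the counting stopped
            return "B"
    return "A"               # clock reached R; subject keeps Bag A

def declarative_bag(n, R):
    return "B" if n <= R else "A"

assert all(clock_bag(t, R) == declarative_bag(t, R)
           for t in range(1, 10) for R in range(1, 10))
```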


## About this article

### Cite this article

Hao, L., Houser, D. Belief elicitation in the presence of naïve respondents: An experimental study.
*J Risk Uncertain* **44**, 161–180 (2012). https://doi.org/10.1007/s11166-011-9133-1
