Abstract
Bayesian optimal experiments that maximize the information gained from collected data are critical to efficiently identify behavioral models. We extend a seminal method for designing Bayesian optimal experiments by introducing two computational improvements that make the procedure tractable: (1) a search algorithm from artificial intelligence that efficiently explores the space of possible design parameters, and (2) a sampling procedure that evaluates each design parameter combination more efficiently. We apply our procedure to a game of imperfect information to evaluate and quantify the computational improvements. We then collect data across five different experimental designs to compare the ability of the optimal experimental design to discriminate among competing behavioral models against the experimental designs chosen by a “wisdom of experts” prediction experiment. We find that the experiment suggested by the optimal design approach requires significantly less data to distinguish among behavioral models (i.e., to test hypotheses) than the experiment suggested by experts. Substantively, we find that reinforcement learning best explains human decision-making in the imperfect information game and that behavior is not adequately described by the Bayesian Nash equilibrium. Our procedure is general and computationally efficient and can be applied to dynamically optimize online experiments.
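To make the first improvement concrete, the sketch below illustrates one search strategy of the kind referenced in the abstract: Gaussian-process optimization with an upper-confidence-bound rule, in the spirit of the GP-bandit methods cited in the references (e.g., Srinivas et al. 2010). The objective function, parameter grid, and use of scikit-learn are illustrative assumptions only and do not reproduce the paper's implementation.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(1)

def information_value(design):
    """Stand-in for the expensive simulation that scores a candidate design;
    the real information objective is not shown here."""
    a, pi = design
    return float(-((a - 2.0) ** 2) - (pi - 0.5) ** 2 + rng.normal(scale=0.01))

# Candidate designs: a grid over two hypothetical design parameters.
grid = np.array([[a, p] for a in np.linspace(0.5, 4.0, 15)
                         for p in np.linspace(0.1, 0.9, 9)])

# Seed with a few random evaluations, then let GP-UCB choose the next design.
X = grid[rng.choice(len(grid), size=5, replace=False)]
y = np.array([information_value(x) for x in X])
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)

for _ in range(20):
    gp.fit(X, y)
    mean, std = gp.predict(grid, return_std=True)
    nxt = grid[np.argmax(mean + 2.0 * std)]   # upper confidence bound
    X = np.vstack([X, nxt])
    y = np.append(y, information_value(nxt))

print("best design found:", X[np.argmax(y)])
```

The point of the surrogate model is that each candidate design is expensive to evaluate, so the search spends its evaluations where the model is either promising (high mean) or poorly explored (high uncertainty).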
Notes
It is important to note that this difference metric is a directed, asymmetric measure. Wang et al. (2010) use a similar metric but propose averaging the KL divergence, which can be expressed as \(I(\theta ) = \sum _{i=1}^{n} p_i I(i;\theta )\).
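As a minimal numerical illustration of a prior-weighted average of directed KL divergences: the priors, model predictions, and the reading of \(I(i;\theta)\) as the divergence of model \(i\)'s predicted outcome distribution from the prior-weighted mixture are assumptions made for this sketch only.

```python
import numpy as np

def kl_divergence(p, q):
    """Directed KL divergence D(p || q) in nats; asymmetric in its arguments."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def average_kl(priors, predictions):
    """Prior-weighted average sum_i p_i * I(i; theta), where I(i; theta) is
    assumed (for illustration) to be the divergence of model i's predictions
    from the prior-weighted mixture over all models."""
    priors = np.asarray(priors, dtype=float)
    predictions = np.asarray(predictions, dtype=float)  # rows: models, cols: outcomes
    mixture = priors @ predictions                       # marginal outcome distribution
    return float(sum(p_i * kl_divergence(pred_i, mixture)
                     for p_i, pred_i in zip(priors, predictions)))

# Two hypothetical models predicting a binary outcome under one candidate design:
priors = [0.5, 0.5]
predictions = [[0.8, 0.2],   # model 1
               [0.3, 0.7]]   # model 2
print(average_kl(priors, predictions))
```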
Following the design by El-Gamal and Palfrey (1996), Player 2 is asked to make a decision even if Player 1 chooses Stop.
Wording as used in El-Gamal and Palfrey (1996). See Appendix for a more detailed description.
We closely follow the implementation of the Stop-Go game by El-Gamal and Palfrey (1996), where Player 2 makes a decision even if Player 1 chooses Stop.
In fact, we could construct the information surface for choosing the optimal design parameters only for a simpler two-player version of our game, and it took approximately 72 h on the following supercomputing cluster: four hundred parallel R v.3.x jobs, distributed across 56-core x86_64 Little Endian Intel(R) Xeon(R) CPUs (E5-2680 v4 @ 2.40GHz; L1d cache: 32K, L1i cache: 32K, L2 cache: 256K, L3 cache: 35840K).
All code is available at http://github.com/shakty/optimal-design.
Non-uniform sampling can be used when the experimenter has a prior over the distribution of model parameters.
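A minimal sketch of what prior-weighted (non-uniform) sampling could look like, assuming a single scalar model parameter on [0, 1] and a hypothetical Beta(2, 5) prior; neither is specified in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical scalar model parameter in [0, 1] (e.g., a learning rate).
# With no prior information we sample uniformly; with an informative prior we
# sample from it instead, so a design's expected information is estimated
# where parameter values are most plausible.
n_draws = 1_000
uniform_draws = rng.uniform(0.0, 1.0, size=n_draws)   # uninformative prior
informed_draws = rng.beta(2.0, 5.0, size=n_draws)      # assumed Beta(2, 5) prior

def expected_information(param_draws, info_fn):
    """Monte Carlo estimate of expected information over sampled parameter values."""
    return float(np.mean([info_fn(theta) for theta in param_draws]))

# Dummy information function used only to show the call pattern.
print(expected_information(informed_draws, lambda theta: theta * (1.0 - theta)))
```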
Note: the optimal experiment we report in Figure 3—\(A\approx 2.0\) and \(\pi \approx 0.5\)—is generated from an information surface comparing three models. The same coordinate is the optimal experiment when we include all four models. See Fig. A.3 in the Appendix for the full information surface.
References
Azevedo, E. M., Deng, A., Olea, J. L. M., Rao, J., & Weyl, E. G. (2019). A/B testing with fat tails. Journal of Political Economy. https://doi.org/10.1086/710607.
Bakshy, E., Dworkin, L., Karrer, B., Kashin, K., Letham, B., Murthy, A., & Singh, S. (2018). AE: A domain-agnostic platform for adaptive experimentation. In Conference on Neural Information Processing Systems (pp. 1–8). http://eytan.github.io/papers/ae_workshop.pdf.
Balietti, S. (2017). nodeGame: Real-time, synchronous, online experiments in the browser. Behavior Research Methods, 1–31. https://doi.org/10.3758/s13428-016-0824-z.
Berman, R. (2018). Beyond the last touch: Attribution in online advertising. Marketing Science, 37(5), 771–792. https://doi.org/10.1287/mksc.2018.1104.
Berman, R., Pekelis, L., Scott, A., & Van den Bulte, C. (2018). p-Hacking and false discovery in A/B testing. Available at SSRN. https://doi.org/10.2139/ssrn.3204791.
Bramoullé, Y., Djebbari, H., & Fortin, B. (2020). Peer effects in networks: A survey. CEPR Discussion Paper No. DP14260. http://ftp.iza.org/dp12947.pdf.
Camerer, C. F. (2011). Behavioral game theory: Experiments in strategic interaction. Princeton University Press. http://psycnet.apa.org/record/2003-06054-000.
Camerer, C. F., & Ho, T.-H. (1999). Experience-weighted attraction learning in normal form games. Econometrica, 67(4), 827–874. https://doi.org/10.1111/1468-0262.00054.
Chapman, J., Snowberg, E., Wang, S., & Camerer, C. (2018). Loss attitudes in the US population: Evidence from dynamically optimized sequential experimentation (DOSE). National Bureau of Economic Research, 1–55. https://doi.org/10.3386/w25072.
Contal, E., Buffoni, D., Robicquet, A., & Vayatis, N. (2013). Parallel Gaussian process optimization with upper confidence bound and pure exploration. In Machine Learning and Knowledge Discovery in Databases (pp. 225–240). Berlin, Heidelberg: Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-40988-2_15.
David, P. A. (1985). Clio and the economics of QWERTY. American Economic Review, 75(2), 332–337.
De Freitas, N., Smola, A. J., & Zoghi, M. (2012). Exponential regret bounds for Gaussian process bandits with deterministic observations. In Proceedings of the 29th International Conference on Machine Learning (pp. 955–962). https://doi.org/10.5555/3042573.3042697.
DellaVigna, S., & Pope, D. (2017). What motivates effort? Evidence and expert forecasts. Review of Economic Studies, 85(2), 1029–1069. https://doi.org/10.1093/restud/rdx033.
Eckles, D., & Kaptein, M. C. (2014). Thompson sampling with the online bootstrap. arXiv, 1–13. arXiv:1410.4009.
El-Gamal, M. A., & Palfrey, T. R. (1996). Economical experiments: Bayesian efficient experimental design. International Journal of Game Theory, 25, 495–517. https://doi.org/10.1007/BF01803953.
Erev, I., & Roth, A. E. (1998). Predicting how people play games: Reinforcement learning in experimental games with unique, mixed strategy equilibria. American Economic Review, 88(4), 848–881.
Erev, I., & Roth, A. E. (2014). Maximization, learning, and economic behavior. Proceedings of the National Academy of Sciences, 111, 10818–10825. https://doi.org/10.1073/pnas.1402846111.
Feltovich, N. (2000). Reinforcement-based versus belief-based learning models in experimental asymmetric-information games. Econometrica, 68(3), 605–641. https://doi.org/10.1111/1468-0262.00125.
Fershtman, C., & Pakes, A. (2012). Dynamic games with asymmetric information: A framework for empirical work. The Quarterly Journal of Economics, 127(4), 1611–1661. https://doi.org/10.1093/qje/qjs025.
Fisher, R. A. (1936). The design of experiments. American Mathematical Monthly, 43(3), 180. https://doi.org/10.2307/2300364.
Foley, M., Forber, P., Smead, R., & Riedl, C. (2018). Conflict and convention in dynamic networks. Journal of the Royal Society Interface, 15(140), 20170835. https://doi.org/10.1098/rsif.2017.0835.
Fudenberg, D., & Tirole, J. (1991). Game theory. Cambridge: MIT Press.
Gale, J., Binmore, K. G., & Samuelson, L. (1995). Learning to be imperfect: The ultimatum game. Games and Economic Behavior, 8(1), 56–90. https://doi.org/10.1016/S0899-8256(05)80017-X.
Gilchrist, D. S., & Sands, E. G. (2016). Something to talk about: Social spillovers in movie consumption. Journal of Political Economy, 124(5), 1339–1382. https://doi.org/10.1086/688177.
Goldman, M., & Rao, J. (2016). Experiments as instruments: Heterogeneous position effects in sponsored search auctions. EAI Endorsed Transactions on Serious Games. https://doi.org/10.4108/eai.8-8-2015.2261043.
Görtler, J., Kehlbeck, R., & Deussen, O. (2019). A visual exploration of Gaussian processes. Distill. https://doi.org/10.23915/distill.00017.
Harsanyi, J. C. (1967). Games with incomplete information played by “Bayesian” players, Part I. The Basic Model. Management Science, 14(3), 159–182. https://doi.org/10.1287/mnsc.1040.0270.
Hertwig, R., & Ortmann, A. (2001). Experimental practices in economics: A methodological challenge for psychologists? Behavioral and Brain Sciences, 24(3), 383–403. https://doi.org/10.1037/e683322011-032.
Hill, T. P. (1995). A statistical derivation of the significant-digit law. Statistical Science, 10(4), 354–363. https://doi.org/10.2307/2246134.
Ho, T.-H., Wang, X., & Camerer, C. F. (2008). Individual differences in EWA learning with partial payoff information. The Economic Journal, 118(525), 37–59. https://doi.org/10.1111/j.1468-0297.2007.02103.x.
Horton, J. J., Rand, D. G., & Zeckhauser, R. J. (2011). The online laboratory: Conducting experiments in a real labor market. Experimental Economics, 14(3), 399–425. https://doi.org/10.1007/s10683-011-9273-9.
Imai, T., & Camerer, C. F. (2018). Estimating time preferences from budget set choices using optimal adaptive design. Working paper. http://taisukeimai.com/files/adaptive_ctb.pdf.
Kachelmeier, S. J., & Towry, K. L. (2005). The limitations of experimental design: A case study involving monetary incentive effects in laboratory markets. Experimental Economics, 8(1), 21–33. https://doi.org/10.1007/s10683-005-0435-5.
Katz, M. L., & Shapiro, C. (1985). Network externalities, competition, and compatibility. American Economic Review, 75(3), 424–440.
Knez, M., & Camerer, C. F. (1994). Creating expectational assets in the laboratory: Coordination in ‘weakest-link’ games. Strategic Management Journal, 15(1 S), 101–119. https://doi.org/10.1002/smj.4250150908.
Kohavi, R., Longbotham, R., Sommerfield, D., & Henne, R. M. (2009). Controlled experiments on the web: Survey and practical guide. Data Mining and Knowledge Discovery, 18(1), 140–181. https://doi.org/10.1007/s10618-008-0114-1.
Kohavi, R., & Thomke, S. (2017). The surprising power of online experiments. Harvard Business Review, 95(5), 2–9.
Kullback, S., & Leibler, R. A. (1951). On information and sufficiency. The Annals of Mathematical Statistics, 22(1), 79–86. https://doi.org/10.1214/aoms/1177729694.
Letham, B., Karrer, B., Ottoni, G., & Bakshy, E. (2017). Constrained Bayesian optimization with noisy experiments. arXiv, 1–20. arXiv:1706.07094.
Mason, W., & Suri, S. (2012). Conducting behavioral research on Amazon’s Mechanical Turk. Behavior Research Methods, 44(1), 1–23. https://doi.org/10.3758/s13428-011-0124-6.
McIntyre, D. P., & Chintakananda, A. (2014). Competing in network markets: Can the winner take all? Business Horizons, 57(1), 117–125. https://doi.org/10.1016/j.bushor.2013.09.005.
Paolacci, G., Chandler, J., & Ipeirotis, P. G. (2010). Running experiments on Amazon Mechanical Turk. Judgment and Decision Making, 5(5), 411–419.
Parker, B. M., Gilmour, S. G., & Schormans, J. (2017). Optimal design of experiments on connected units with application to social networks. Journal of the Royal Statistical Society C, 66(3), 455–480. https://doi.org/10.1111/rssc.12170.
Phan, T. Q., & Airoldi, E. M. (2015). A natural experiment of social network formation and dynamics. Proceedings of the National Academy of Sciences, 112(21), 6595–6600. https://doi.org/10.1073/pnas.1404770112.
Pooseh, S., Bernhardt, N., Guevara, A., Huys, Q. J. M., & Smolka, M. N. (2018). Value-based decision-making battery: A Bayesian adaptive approach to assess impulsive and risky behavior. Behavior Research Methods, 50(1), 236–249. https://doi.org/10.3758/s13428-017-0866-x.
Rasmussen, C. E., & Williams, C. K. I. (2005). Gaussian processes for machine learning. The MIT Press. http://gaussianprocess.org/gpml/.
Rzhetsky, A., Foster, J. G., Foster, I. T., & Evans, J. A. (2015). Choosing experiments to accelerate collective discovery. In Proceedings of the National Academy of Sciences (pp. 1–6). https://doi.org/10.1073/pnas.1509757112.
Salmon, T. C. (2001). An evaluation of econometric models of adaptive learning. Econometrica, 69(6), 1597–1628. https://doi.org/10.1111/1468-0262.00258.
Sarin, R., & Vahid, F. (2001). Predicting how people play games: A simple dynamic model of choice. Games and Economic Behavior, 34(1), 104–122. https://doi.org/10.1006/game.1999.0783.
Schwartz, E. M., Bradlow, E. T., & Fader, P. S. (2017). Customer acquisition via display advertising using multi-armed bandit experiments. Marketing Science, 36(4), 500–522. https://doi.org/10.1287/mksc.2016.1023.
Sobol, I. M. (1998). On quasi-Monte Carlo integrations. Mathematics and Computers in Simulation, 47(2), 103–112. https://doi.org/10.1016/S0378-4754(98)00096-2.
Srinivas, N., Krause, A., Kakade, S. M., & Seeger, M. (2010). Gaussian process optimization in the bandit setting: No regret and experimental design. In Proceedings of the 27th International Conference on International Conference on Machine Learning (pp. 1015–1022). https://doi.org/10.5555/3104322.3104451.
Stefanakis, T. S., Contal, E., Vayatis, N., Dias, F., & Synolakis, C. E. (2014). Can small islands protect nearby coasts from tsunamis? An active experimental design approach. Proceedings of the Royal Society A, 470(2172), 1–20. https://doi.org/10.1098/rspa.2014.0575.
Tauber, E. M. (1972). Why do people shop? Journal of Marketing, 36(4), 46–49. https://doi.org/10.2307/1250426.
Wang, S. W., Filiba, M., & Camerer, C. F. (2010). Dynamically optimized sequential experimentation (DOSE) for estimating economic preference parameters. arXiv, 1–41. http://pdfs.semanticscholar.org/1707/ded4fdc981aedc2a2f6bab077fcf37acb7d5.pdf.
Zhou, S., Valentine, M., & Bernstein, M. S. (2018). In search of the dream team: Temporally constrained multi-armed bandits for identifying effective team structures. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (pp. 1–13). https://doi.org/10.1145/3173574.3173682.
Acknowledgements
The authors acknowledge Mahmoud El-Gamal for helpful correspondence and Stephanie W. Wang for useful comments on the design and implementation. This work was supported in part by the Office of Naval Research (N00014-16-1-3005 and N00014-17-1-2542) and the National Defense Science & Engineering Graduate Fellowship (NDSEG) Program.
Author information
Contributions
All authors contributed to the study conception and design. All authors contributed to analyses and preparation of the manuscript. S.B. ran the online experiments, and all authors contributed to the expert surveying.
Ethics declarations
Conflicts of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Balietti, S., Klein, B. & Riedl, C. Optimal design of experiments to identify latent behavioral types. Exp Econ 24, 772–799 (2021). https://doi.org/10.1007/s10683-020-09680-w