The online laboratory: conducting experiments in a real labor market

Abstract

Online labor markets have great potential as platforms for conducting experiments. They provide immediate access to a large and diverse subject pool, and allow researchers to control the experimental context. Online experiments, we show, can be just as valid—both internally and externally—as laboratory and field experiments, while often requiring far less money and time to design and conduct. To demonstrate their value, we use an online labor market to replicate three classic experiments. The first finds quantitative agreement between levels of cooperation in a prisoner’s dilemma played online and in the physical laboratory. The second shows—consistent with behavior in the traditional laboratory—that online subjects respond to priming by altering their choices. The third demonstrates that when an identical decision is framed differently, individuals reverse their choice, thus replicating a famed Tversky-Kahneman result. Then we conduct a field experiment showing that workers have upward-sloping labor supply curves. Finally, we analyze the challenges to online experiments, proposing methods to cope with the unique threats to validity in an online setting, and examining the conceptual issues surrounding the external validity of online results. We conclude by presenting our views on the potential role that online experiments can play within the social sciences, and then recommend software development priorities and best practices.

References

  1. Andreoni, J. (1990). Impure altruism and donations to public goods: a theory of warm-glow giving. The Economic Journal, 464–477.

  2. Axelrod, R., & Hamilton, W. D. (1981). The evolution of cooperation. Science, 211(4489), 1390.

  3. Bainbridge, W. S. (2007). The scientific research potential of virtual worlds. Science, 317(5837), 472.

  4. Benjamin, D. J., Choi, J. J., & Strickland, A. (2010a). Social identity and preferences. American Economic Review (forthcoming).

  5. Benjamin, D. J., Choi, J. J., Strickland, A., & Fisher, G. (2010b). Religious identity and economic behavior. Cornell University, Mimeo.

  6. Bohnet, I., Greig, F., Herrmann, B., & Zeckhauser, R. (2008). Betrayal aversion: evidence from Brazil, China, Oman, Switzerland, Turkey, and the United States. American Economic Review, 98(1), 294–310.

  7. Brandts, J., & Charness, G. (2000). Hot vs. cold: sequential responses and preference stability in experimental games. Experimental Economics, 2(3), 227–238.

  8. Camerer, C. (2003). Behavioral game theory: experiments in strategic interaction. Princeton: Princeton University Press.

  9. Chandler, D., & Kapelner, A. (2010). Breaking monotony with meaning: motivation in crowdsourcing markets. University of Chicago, Mimeo.

  10. Chen, D., & Horton, J. (2010). The wages of pay cuts: evidence from a field experiment. Harvard University, Mimeo.

  11. Chilton, L. B., Sims, C. T., Goldman, M., Little, G., & Miller, R. C. (2009). Seaweed: a web application for designing economic games. In Proceedings of the ACM SIGKDD workshop on human computation (pp. 34–35). New York: ACM Press.

  12. Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation. Boston: Houghton Mifflin.

  13. Eckel, C. C., & Wilson, R. K. (2006). Internet cautions: experimental games with Internet partners. Experimental Economics, 9(1), 53–66.

  14. Falk, A., & Heckman, J. J. (2009). Lab experiments are a major source of knowledge in the social sciences. Science, 326(5952), 535.

  15. Fehr, E., & Schmidt, K. M. (1999). A theory of fairness, competition, and cooperation. Quarterly Journal of Economics, 114(3), 817–868.

  16. Fehr, E., & Gächter, S. (2000). Cooperation and punishment in public goods experiments. American Economic Review, 90(4), 980–994.

  17. Fischbacher, U. (2007). z-Tree: Zurich toolbox for ready-made economic experiments. Experimental Economics, 10(2), 171–178.

  18. Frei, B. (2009). Paid crowdsourcing: current state & progress toward mainstream business use. Produced by Smartsheet.com.

  19. Gneezy, U., Leonard, K. L., & List, J. A. (2009). Gender differences in competition: evidence from a matrilineal and a patriarchal society. Econometrica, 77(5), 1637–1664.

  20. Harrison, G. W., & List, J. A. (2004). Field experiments. Journal of Economic Literature, 42(4), 1009–1055.

  21. Herrmann, B., & Thöni, C. (2009). Measuring conditional cooperation: a replication study in Russia. Experimental Economics, 12(1), 87–92.

  22. Horton, J. (2010). Online labor markets. In Workshop on Internet and network economics (pp. 515–522).

  23. Horton, J. (2011). The condition of the Turking class: are online employers fair and honest? Economics Letters (forthcoming).

  24. Horton, J., & Chilton, L. (2010). The labor economics of paid crowdsourcing. In Proceedings of the 11th ACM conference on electronic commerce.

  25. Ipeirotis, P. (2010). Demographics of Mechanical Turk. New York University Working Paper.

  26. Kagel, J. H., Roth, A. E., & Hey, J. D. (1995). The handbook of experimental economics. Princeton: Princeton University Press.

  27. Kittur, A., Chi, E. H., & Suh, B. (2008). Crowdsourcing user studies with Mechanical Turk.

  28. Kocher, M. G., & Sutter, M. (2005). The decision maker matters: individual versus group behaviour in experimental beauty-contest games. The Economic Journal, 115(500), 200–223.

  29. Levitt, S. D., & List, J. A. (2009). Field experiments in economics: the past, the present, and the future. European Economic Review, 53(1), 1–18.

  30. Little, G., Chilton, L. B., Goldman, M., & Miller, R. C. (2009). TurKit: tools for iterative tasks on Mechanical Turk. In Proceedings of the ACM SIGKDD workshop on human computation. New York: ACM Press.

  31. Lucking-Reiley, D. (2000). Auctions on the Internet: what’s being auctioned, and how? The Journal of Industrial Economics, 48(3), 227–252.

  32. Mason, W., & Watts, D. J. (2009). Financial incentives and the performance of crowds. In Proceedings of the ACM SIGKDD workshop on human computation (pp. 77–85). New York: ACM Press.

  33. Mason, W., Watts, D. J., & Suri, S. (2010). Conducting behavioral research on Amazon’s Mechanical Turk. SSRN eLibrary.

  34. Pallais, A. (2010). Inefficient hiring in entry-level labor markets.

  35. Paolacci, G., Chandler, J., & Ipeirotis, P. G. (2010). Running experiments on Amazon Mechanical Turk. Judgment and Decision Making, 5.

  36. Resnick, P., Kuwabara, K., Zeckhauser, R., & Friedman, E. (2000). Reputation systems. Communications of the ACM, 43(12), 45–48.

  37. Resnick, P., Zeckhauser, R., Swanson, J., & Lockwood, K. (2006). The value of reputation on eBay: a controlled experiment. Experimental Economics, 9(2), 79–101.

  38. Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66(5), 688–701.

  39. Selten, R. (1967). Die Strategiemethode zur Erforschung des eingeschränkt rationalen Verhaltens im Rahmen eines Oligopolexperiments. Beiträge zur experimentellen Wirtschaftsforschung, 1, 136–168.

  40. Shariff, A. F., & Norenzayan, A. (2007). God is watching you. Psychological Science, 18(9), 803–809.

  41. Sheng, V. S., Provost, F., & Ipeirotis, P. G. (2008). Get another label? Improving data quality and data mining using multiple, noisy labelers. In Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 614–622). New York: ACM Press.

  42. Sorokin, A., & Forsyth, D. (2008). Utility data annotation with Amazon Mechanical Turk. University of Illinois at Urbana-Champaign, Mimeo, 51, 61820.

  43. Suri, S., & Watts, D. (2011). A study of cooperation and contagion in web-based, networked public goods experiments. PLoS ONE (forthcoming).

  44. Tversky, A., & Kahneman, D. (1981). The framing of decisions and the psychology of choice. Science, 211(4481), 453.

  45. von Ahn, L., Blum, M., Hopper, N. J., & Langford, J. (2003). CAPTCHA: using hard AI problems for security. In Lecture notes in computer science (pp. 294–311). Berlin: Springer.

Author information

Corresponding author

Correspondence to Richard J. Zeckhauser.

Additional information

Thanks to Alex Breinin and Xiaoqi Zhu for excellent research assistance. Thanks to Samuel Arbesman, Dana Chandler, Anna Dreber, Rezwan Haque, Justin Keenan, Robin Yerkes Horton, Stephanie Hurder and Michael Manapat for helpful comments, as well as to participants in the Online Experimentation Workshop hosted by Harvard’s Berkman Center for Internet and Society. Thanks to Anna Dreber, Elizabeth Paci and Yochai Benkler for assistance running the physical laboratory replication study, and to Sarah Hirschfeld-Sussman and Mark Edington for their help with surveying the Harvard Decision Science Laboratory subject pool. This research has been supported by the NSF-IGERT program “Multidisciplinary Program in Inequality and Social Policy” at Harvard University (Grant No. 0333403), and DGR gratefully acknowledges financial support from the John Templeton Foundation’s Foundational Questions in Evolutionary Biology Prize Fellowship.

About this article

Cite this article

Horton, J.J., Rand, D.G. & Zeckhauser, R.J. The online laboratory: conducting experiments in a real labor market. Exp Econ 14, 399–425 (2011). https://doi.org/10.1007/s10683-011-9273-9

Keywords

  • Experimentation
  • Online labor markets
  • Prisoner’s dilemma
  • Field experiment
  • Internet

JEL Classification

  • J2
  • C93
  • C91
  • C92
  • C70