Experimental Economics, Volume 14, Issue 3, pp 399–425

The online laboratory: conducting experiments in a real labor market

  • John J. Horton
  • David G. Rand
  • Richard J. Zeckhauser (Email author)


Abstract

Online labor markets have great potential as platforms for conducting experiments. They provide immediate access to a large and diverse subject pool, and allow researchers to control the experimental context. Online experiments, we show, can be just as valid—both internally and externally—as laboratory and field experiments, while often requiring far less money and time to design and conduct. To demonstrate their value, we use an online labor market to replicate three classic experiments. The first finds quantitative agreement between levels of cooperation in a prisoner’s dilemma played online and in the physical laboratory. The second shows—consistent with behavior in the traditional laboratory—that online subjects respond to priming by altering their choices. The third demonstrates that when an identical decision is framed differently, individuals reverse their choice, thus replicating a famed Tversky-Kahneman result. Then we conduct a field experiment showing that workers have upward-sloping labor supply curves. Finally, we analyze the challenges to online experiments, proposing methods to cope with the unique threats to validity in an online setting, and examining the conceptual issues surrounding the external validity of online results. We conclude by presenting our views on the potential role that online experiments can play within the social sciences, and then recommend software development priorities and best practices.


Keywords

Experimentation · Online labor markets · Prisoner’s dilemma · Field experiment · Internet

JEL Classification

J2 · C93 · C91 · C92 · C70


References


  1. Andreoni, J. (1990). Impure altruism and donations to public goods: a theory of warm-glow giving. The Economic Journal, 100(401), 464–477.
  2. Axelrod, R., & Hamilton, W. D. (1981). The evolution of cooperation. Science, 211(4489), 1390–1396.
  3. Bainbridge, W. S. (2007). The scientific research potential of virtual worlds. Science, 317(5837), 472–476.
  4. Benjamin, D. J., Choi, J. J., & Strickland, A. (2010a). Social identity and preferences. American Economic Review (forthcoming).
  5. Benjamin, D. J., Choi, J. J., Strickland, A., & Fisher, G. (2010b). Religious identity and economic behavior. Cornell University, Mimeo.
  6. Bohnet, I., Greig, F., Herrmann, B., & Zeckhauser, R. (2008). Betrayal aversion: evidence from Brazil, China, Oman, Switzerland, Turkey, and the United States. American Economic Review, 98(1), 294–310.
  7. Brandts, J., & Charness, G. (2000). Hot vs. cold: sequential responses and preference stability in experimental games. Experimental Economics, 2(3), 227–238.
  8. Camerer, C. (2003). Behavioral game theory: experiments in strategic interaction. Princeton: Princeton University Press.
  9. Chandler, D., & Kapelner, A. (2010). Breaking monotony with meaning: motivation in crowdsourcing markets. University of Chicago, Mimeo.
  10. Chen, D., & Horton, J. (2010). The wages of pay cuts: evidence from a field experiment. Harvard University, Mimeo.
  11. Chilton, L. B., Sims, C. T., Goldman, M., Little, G., & Miller, R. C. (2009). Seaweed: a web application for designing economic games. In Proceedings of the ACM SIGKDD workshop on human computation (pp. 34–35). New York: ACM Press.
  12. Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation. Boston: Houghton Mifflin.
  13. Eckel, C. C., & Wilson, R. K. (2006). Internet cautions: experimental games with Internet partners. Experimental Economics, 9(1), 53–66.
  14. Falk, A., & Heckman, J. J. (2009). Lab experiments are a major source of knowledge in the social sciences. Science, 326(5952), 535–538.
  15. Fehr, E., & Schmidt, K. M. (1999). A theory of fairness, competition, and cooperation. Quarterly Journal of Economics, 114(3), 817–868.
  16. Fehr, E., & Gächter, S. (2000). Cooperation and punishment in public goods experiments. American Economic Review, 90(4), 980–994.
  17. Fischbacher, U. (2007). z-Tree: Zurich toolbox for ready-made economic experiments. Experimental Economics, 10(2), 171–178.
  18. Frei, B. (2009). Paid crowdsourcing: current state & progress toward mainstream business use. White paper.
  19. Gneezy, U., Leonard, K. L., & List, J. A. (2009). Gender differences in competition: evidence from a matrilineal and a patriarchal society. Econometrica, 77(5), 1637–1664.
  20. Harrison, G. W., & List, J. A. (2004). Field experiments. Journal of Economic Literature, 42(4), 1009–1055.
  21. Herrmann, B., & Thöni, C. (2009). Measuring conditional cooperation: a replication study in Russia. Experimental Economics, 12(1), 87–92.
  22. Horton, J. (2010). Online labor markets. In Workshop on Internet and network economics (pp. 515–522).
  23. Horton, J. (2011). The condition of the Turking class: are online employers fair and honest? Economics Letters (forthcoming).
  24. Horton, J., & Chilton, L. (2010). The labor economics of paid crowdsourcing. In Proceedings of the 11th ACM conference on electronic commerce.
  25. Ipeirotis, P. (2010). Demographics of Mechanical Turk. New York University Working Paper.
  26. Kagel, J. H., Roth, A. E., & Hey, J. D. (1995). The handbook of experimental economics. Princeton: Princeton University Press.
  27. Kittur, A., Chi, E. H., & Suh, B. (2008). Crowdsourcing user studies with Mechanical Turk. In Proceedings of the SIGCHI conference on human factors in computing systems (pp. 453–456). New York: ACM Press.
  28. Kocher, M. G., & Sutter, M. (2005). The decision maker matters: individual versus group behaviour in experimental beauty-contest games. The Economic Journal, 115(500), 200–223.
  29. Levitt, S. D., & List, J. A. (2009). Field experiments in economics: the past, the present, and the future. European Economic Review, 53(1), 1–18.
  30. Little, G., Chilton, L. B., Goldman, M., & Miller, R. C. (2009). TurKit: tools for iterative tasks on Mechanical Turk. In Proceedings of the ACM SIGKDD workshop on human computation. New York: ACM Press.
  31. Lucking-Reiley, D. (2000). Auctions on the Internet: what’s being auctioned, and how? The Journal of Industrial Economics, 48(3), 227–252.
  32. Mason, W., & Watts, D. J. (2009). Financial incentives and the performance of crowds. In Proceedings of the ACM SIGKDD workshop on human computation (pp. 77–85). New York: ACM Press.
  33. Mason, W., Watts, D. J., & Suri, S. (2010). Conducting behavioral research on Amazon’s Mechanical Turk. SSRN eLibrary.
  34. Pallais, A. (2010). Inefficient hiring in entry-level labor markets.
  35. Paolacci, G., Chandler, J., & Ipeirotis, P. G. (2010). Running experiments on Amazon Mechanical Turk. Judgment and Decision Making, 5(5), 411–419.
  36. Resnick, P., Kuwabara, K., Zeckhauser, R., & Friedman, E. (2000). Reputation systems. Communications of the ACM, 43(12), 45–48.
  37. Resnick, P., Zeckhauser, R., Swanson, J., & Lockwood, K. (2006). The value of reputation on eBay: a controlled experiment. Experimental Economics, 9(2), 79–101.
  38. Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66(5), 688–701.
  39. Selten, R. (1967). Die Strategiemethode zur Erforschung des eingeschränkt rationalen Verhaltens im Rahmen eines Oligopolexperiments. Beiträge zur experimentellen Wirtschaftsforschung, 1, 136–168.
  40. Shariff, A. F., & Norenzayan, A. (2007). God is watching you. Psychological Science, 18(9), 803–809.
  41. Sheng, V. S., Provost, F., & Ipeirotis, P. G. (2008). Get another label? Improving data quality and data mining using multiple, noisy labelers. In Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 614–622). New York: ACM Press.
  42. Sorokin, A., & Forsyth, D. (2008). Utility data annotation with Amazon Mechanical Turk. University of Illinois at Urbana-Champaign, Mimeo.
  43. Suri, S., & Watts, D. (2011). A study of cooperation and contagion in web-based, networked public goods experiments. PLoS ONE (forthcoming).
  44. Tversky, A., & Kahneman, D. (1981). The framing of decisions and the psychology of choice. Science, 211(4481), 453–458.
  45. von Ahn, L., Blum, M., Hopper, N. J., & Langford, J. (2003). CAPTCHA: using hard AI problems for security. In Lecture notes in computer science (pp. 294–311). Berlin: Springer.

Copyright information

© Economic Science Association 2011

Authors and Affiliations

  • John J. Horton (1)
  • David G. Rand (1)
  • Richard J. Zeckhauser (1), Email author

  1. Harvard University, Cambridge, USA
