Turking overtime: how participant characteristics and behavior vary over time and day on Amazon Mechanical Turk

Abstract

Online experiments allow researchers to collect datasets at times not typical of laboratory studies. We recruit 2336 participants from Amazon Mechanical Turk to examine whether participant characteristics and behaviors differ depending on whether the experiment is conducted during the day versus at night, and on weekdays versus weekends. Participants make incentivized decisions involving prosociality, punishment, and discounting, and complete a demographic and personality survey. We find no time or day differences in behavior, but do find that participants at night and on weekends are less experienced with online studies; that participants on weekends are less reflective; and that participants at night are less conscientious and more neurotic. These results are largely robust to finer-grained measures of time and day. We also find that those who participated earlier in the course of the study are more experienced, reflective, and agreeable, but less charitable than later participants.

Fig. 1
Fig. 2
Fig. 3

Notes

  1. Each session was closed after 30 participants accepted the HIT or 1 h had elapsed, and participants had a maximum of 1 h to complete the study. The first week (11/15–11/19) had 28 sessions launched every 6 h starting at 00:00 EST; the second week (12/8–12/15) had 56 sessions launched every 3 h starting at 09:00 EST. This difference in granularity is not relevant for our analyses of day versus night, which use 12-h blocks.

  2. Unless otherwise stated, we found qualitatively similar results when the 12-h night was defined as beginning at 7 p.m. or at 9 p.m., or when the weekend was defined as the time between the start of Saturday day and the start of Monday day.

  3. We used a continuous implementation of the PD (as in Capraro et al. 2014) such that each player received a $0.40 endowment and chose how much to transfer to the other person, with any transfer doubled by the experimenters.

  4. In the third-party punishment game, Player 1 chose whether or not to evenly split $0.50 with Player 2. The participant, in the role of Player 3, then chose how much of a $0.10 endowment to spend on punishing Player 1 (with each cent reducing Player 1’s payoff by 3 cents) if Player 1 did not (3P) or did (AP) split the $0.50. Participants in our study played only in the role of the third player (which was our decision of interest). We did not deceive participants, however: a small number of Players 1 and 2 were recruited separately and repeatedly matched with Player 3s (as per Stagnaro et al. 2017).

  5. We used a short version of the discounting task developed by Kirby et al. (1999), in which participants made nine choices between a smaller reward and a larger, delayed reward (e.g. “Would you rather have $25 today or $60 in 14 days”). Log-transformed values are reported in all analyses. One participant was selected at random to have one of their choices implemented. Because the instructions stated “At the end of the study one participant and one question will be selected randomly. The winner will receive the associated bonus according to the choice made”, we had assumed that participants understood “today” to mean “at the end of the study”. On reflection, we realize that this (unintentionally) unclear wording on our part might have been misunderstood by participants.

  6. We report only main effects because preliminary ANOVAs reveal no significant interaction between the night-versus-day dummy and the weekend-versus-weekday dummy. See Table A1 in the Online Appendix for significance levels of all the variables and Figure A1 in the Online Appendix for their distributions.

  7. We also test those seven null results for robustness to demographic and personality controls in stepwise regressions (Table A3 in the Online Appendix). We find that such controls have no effect on the non-significance of the time/day coefficients.

  8. There were no significant interactions for the demographics; Fig. 3 shows means for each condition to allow readers to see the absolute levels (which may be of general interest).

  9. See Table A4 in the Online Appendix for a complete list of the significance levels, and Tables A5 and A6 for regression analyses with dummies for each of the day/time categories as independent variables.

  10. See Figure A4 for a visual representation of cumulative averages over the data collection process, and Tables A7 and A8 for regression results. Our findings are qualitatively similar when using a dummy for either week, the session number, or the total hours passed since the first session as the independent variable.
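
The payoff rules described in Notes 3 and 4, and the example item quoted in Note 5, can be illustrated with a short Python sketch. This is not the authors' code: the function and constant names are hypothetical, amounts are handled in cents to avoid floating-point rounding, and three points are assumptions that the notes do not state. First, a Player 1 who does not split is assumed to keep the full $0.50. Second, payoffs are assumed not to be floored at zero. Third, the discounting item is summarized with the hyperbolic form V = A / (1 + kD), which is commonly used with this kind of task; the notes do not say how the reported values were derived from the nine choices.

```python
PD_ENDOWMENT = 40   # cents; Note 3: each player receives $0.40
PD_MULTIPLIER = 2   # Note 3: any transfer is doubled by the experimenters
PIE = 50            # cents; Note 4: amount Player 1 may split with Player 2
P3_ENDOWMENT = 10   # cents; Note 4: Player 3's punishment budget
PUNISH_RATIO = 3    # Note 4: each cent spent removes 3 cents from Player 1


def pd_payoff(own_transfer, other_transfer):
    """Payoff in cents: what the player keeps plus the doubled incoming transfer."""
    for transfer in (own_transfer, other_transfer):
        if not 0 <= transfer <= PD_ENDOWMENT:
            raise ValueError("transfers must lie between 0 and the endowment")
    return PD_ENDOWMENT - own_transfer + PD_MULTIPLIER * other_transfer


def punishment_payoffs(player1_split, cents_spent):
    """Return (Player 1 payoff, Player 3 payoff) in cents after punishment.

    Assumptions (not stated in the notes): a non-splitting Player 1 keeps
    the whole pie, and payoffs are not floored at zero.
    """
    if not 0 <= cents_spent <= P3_ENDOWMENT:
        raise ValueError("punishment must lie between 0 and the endowment")
    base = PIE // 2 if player1_split else PIE
    return base - PUNISH_RATIO * cents_spent, P3_ENDOWMENT - cents_spent


def indifference_k(sooner, later, delay_days):
    """Discount rate k at which sooner == later / (1 + k * delay_days).

    The hyperbolic model is an assumption made for this illustration.
    """
    return (later / sooner - 1) / delay_days
```

Under these assumptions, two players who each transfer their full endowment earn 80 cents apiece, while a player who transfers nothing to a full contributor earns 120 cents. For the item quoted in Note 5 ($25 today versus $60 in 14 days), the indifference rate is approximately 0.1 per day.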

References

  1. Amir, O., Rand, D. G., & Gal, Y. K. (2012). Economic games on the Internet: The effect of $1 stakes. PLoS ONE, 7(2), e31461.

  2. Arechar, A. A., Molleman, L., & Gächter, S. (2017). Conducting interactive experiments online. Experimental Economics. doi:10.1007/s10683-017-9527-2.

  3. Aviv, A. L., Zelenski, J. M., Rallo, L., & Larsen, R. J. (2002). Who comes when: Personality differences in early and later participation in a university subject pool. Personality and Individual Differences, 33(3), 487–496.

  4. Berinsky, A. J., Huber, G. A., & Lenz, G. S. (2012). Evaluating online labor markets for experimental research: Amazon.com’s Mechanical Turk. Political Analysis, 20(3), 351–368. doi:10.1093/pan/mpr057.

  5. Capraro, V., Jordan, J. J., & Rand, D. G. (2014). Heuristics guide the implementation of social preferences in one-shot Prisoner’s Dilemma experiments. Scientific Reports, 4, 6790.

  6. Casey, L. S., Chandler, J., Levine, A. S., Proctor, A., & Strolovitch, D. Z. (2016). Intertemporal differences among MTurk worker demographics. https://osf.io/preprints/psyarxiv/8352x.

  7. Chandler, J., Paolacci, G., Peer, E., Mueller, P., & Ratliff, K. A. (2015). Using nonnaive participants can reduce effect sizes. Psychological Science, 26(7), 1131–1139.

  8. Deetlefs, J., Chylinski, M., & Ortmann, A. (2015). MTurk ‘Unscrubbed’: Exploring the good, the ‘super’, and the unreliable on Amazon’s Mechanical Turk. http://ssrn.com/abstract=2654056.

  9. Frederick, S. (2005). Cognitive reflection and decision making. The Journal of Economic Perspectives, 19(4), 25–42.

  10. Gosling, S. D., Rentfrow, P. J., & Swann, W. B., Jr. (2003). A very brief measure of the Big-Five personality domains. Journal of Research in Personality, 37(6), 504–528.

  11. Gunia, B. C., Barnes, C. M., & Sah, S. (2014). The morality of larks and owls: Unethical behavior depends on chronotype as well as time of day. Psychological Science, 25(12), 2272–2274.

  12. Horton, J. J., Rand, D. G., & Zeckhauser, R. J. (2011). The online laboratory: Conducting experiments in a real labor market. Experimental Economics, 14(3), 399–425. doi:10.1007/s10683-011-9273-9.

  13. Kirby, K. N., Petry, N. M., & Bickel, W. K. (1999). Heroin addicts have higher discount rates for delayed rewards than non-drug-using controls. Journal of Experimental Psychology: General, 128(1), 78–87.

  14. Kouchaki, M., & Smith, I. H. (2014). The morning morality effect: The influence of time of day on unethical behavior. Psychological Science, 25(1), 95–102. doi:10.1177/0956797613498099.

  15. Paolacci, G., Chandler, J., & Ipeirotis, P. G. (2010). Running experiments on Amazon Mechanical Turk. Judgment and Decision Making, 5(5), 411–419.

  16. Rand, D. G. (2012). The promise of Mechanical Turk: How online labor markets can help theorists run behavioral experiments. Journal of Theoretical Biology, 299, 172–179.

  17. Rand, D. G., Peysakhovich, A., Kraft-Todd, G. T., Newman, G. E., Wurzbacher, O., Nowak, M. A., et al. (2014). Social heuristics shape intuitive cooperation. Nature Communications, 5, 3677.

  18. Shenhav, A., Rand, D. G., & Greene, J. D. (2012). Divine intuition: Cognitive style influences belief in God. Journal of Experimental Psychology: General, 141(3), 423–428.

  19. Stagnaro, M. N., Arechar, A. A., & Rand, D. G. (2017). From good institutions to generous citizens: Top-down incentives to cooperate promote subsequent prosociality but not norm enforcement. Cognition. doi:10.1016/j.cognition.2017.01.017.

  20. Peysakhovich, A., Nowak, M. A., & Rand, D. G. (2014). Humans display a ‘cooperative phenotype’ that is domain general and temporally stable. Nature Communications, 5, 4939.

Acknowledgements

The authors gratefully acknowledge funding from the Templeton World Charity Foundation (Grant No. TWCF0209), the Defense Advanced Research Projects Agency NGS2 program (Grant No. D17AC00005), and the National Institutes of Health (Grant No. P30-AG034420). They also thank Becky Fortgang, the Editor, and two anonymous reviewers for their valuable feedback, and SJ Language Services for copyediting.

Author information

Corresponding authors

Correspondence to Antonio A. Arechar or David G. Rand.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 1495 kb)

About this article

Cite this article

Arechar, A.A., Kraft-Todd, G.T. & Rand, D.G. Turking overtime: how participant characteristics and behavior vary over time and day on Amazon Mechanical Turk. J Econ Sci Assoc 3, 1–11 (2017). https://doi.org/10.1007/s40881-017-0035-0

Keywords

  • Cooperation
  • Honesty
  • Decision-making
  • Time of day
  • MTurk
  • Self-control

JEL Classification

  • C80
  • C90