Behavior Research Methods, Volume 46, Issue 4, pp 1023–1031

Reputation as a sufficient condition for data quality on Amazon Mechanical Turk

Article

Abstract

Data quality is one of the major concerns of using crowdsourcing websites such as Amazon Mechanical Turk (MTurk) to recruit participants for online behavioral studies. We compared two methods for ensuring data quality on MTurk: attention check questions (ACQs) and restricting participation to MTurk workers with high reputation (above 95% approval ratings). In Experiment 1, we found that high-reputation workers rarely failed ACQs and provided higher-quality data than did low-reputation workers; ACQs improved data quality only for low-reputation workers, and only in some cases. Experiment 2 corroborated these findings and also showed that more productive high-reputation workers produce the highest-quality data. We concluded that sampling high-reputation workers can ensure high-quality data without having to resort to using ACQs, which may lead to selection bias if participants who fail ACQs are excluded post-hoc.

Keywords

Online research; Amazon Mechanical Turk; Data quality; Reputation


Copyright information

© Psychonomic Society, Inc. 2013

Authors and Affiliations

  • Eyal Peer (1)
  • Joachim Vosgerau (2)
  • Alessandro Acquisti (3)
  1. Graduate School of Business Administration, Bar-Ilan University, Ramat-Gan, Israel
  2. School of Economics and Management, Tilburg University, Tilburg, The Netherlands
  3. Heinz College, Carnegie Mellon University, Pittsburgh, USA
