Amazon Mechanical Turk in Organizational Psychology: An Evaluation and Practical Recommendations

Cheung, Janelle H.; Burns, Deanna K.; Sinclair, Robert R.; Sliter, Michael

doi:10.1007/s10869-016-9458-5

Original Paper
Published: 29 June 2016

Volume 32, pages 347–361, (2017)
Journal of Business and Psychology

Abstract

Purpose

Amazon Mechanical Turk is an increasingly popular data source in the organizational psychology research community. This paper presents an evaluation of MTurk and provides a set of practical recommendations for researchers using MTurk.

Design/Methodology/Approach

We present an evaluation of methodological concerns related to the use of MTurk and potential threats to validity inferences. Based on our evaluation, we also provide a set of recommendations to strengthen validity inferences using MTurk samples.

Findings

Although MTurk samples can overcome some important validity concerns, there are other limitations researchers must consider in light of their research objectives. Researchers should carefully evaluate the appropriateness and quality of MTurk samples based on the different issues we discuss in our evaluation.

Implications

There is no one-size-fits-all answer to whether MTurk is appropriate for a research study. The answer depends on the research questions and on the data collection and analytic procedures adopted. The quality of the data is defined not by the data source per se, but by the decisions researchers make during study design, data collection, and data analysis.

Originality/Value

The current paper extends the literature by evaluating MTurk more comprehensively than prior reviews. Past review papers focused primarily on internal and external validity, with less attention paid to statistical conclusion validity and construct validity, which are equally important for making accurate inferences about research findings. This paper also provides a set of practical recommendations for addressing validity concerns when using MTurk.

Notes

The ten methodological concerns are not presented in any particular order that indicates the importance or prevalence of each concern.
In our own data collections, we have allowed MTurk participants a maximum of two attempts and have received positive reviews from participants about offering them a second chance.

Author information

Authors and Affiliations

Department of Psychology, Clemson University, 418 Brackett Hall, Clemson, SC, 29634, USA
Janelle H. Cheung, Deanna K. Burns & Robert R. Sinclair
FurstPerson, Chicago, IL, USA
Michael Sliter

Corresponding author

Correspondence to Janelle H. Cheung.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOCX 48 kb)

About this article

Cite this article

Cheung, J.H., Burns, D.K., Sinclair, R.R. et al. Amazon Mechanical Turk in Organizational Psychology: An Evaluation and Practical Recommendations. J Bus Psychol 32, 347–361 (2017). https://doi.org/10.1007/s10869-016-9458-5

Issue Date: August 2017
