Are we measuring the same health constructs? Amazon’s Mechanical Turk versus a community sample

Thompson, Linda M.; Van Liew, Charles; Patrus, Alan; Azzoo, Kassandra I.; Cronan, Terry A.

doi:10.1007/s12144-020-01176-3

Published: 11 November 2020

Current Psychology, Volume 41, pages 6700–6711 (2022)

Linda M. Thompson¹, Charles Van Liew², Alan Patrus¹, Kassandra I. Azzoo¹ & Terry A. Cronan¹ (ORCID: orcid.org/0000-0001-6621-1474)


Abstract

Amazon’s Mechanical Turk (MTurk) platform has become increasingly popular because of its affordability and efficiency. The results of studies comparing MTurk respondents to community respondents have been mixed. The purpose of the present study was to compare an MTurk sample and a community sample to determine whether the psychometric properties of a measure completed in the two different formats were comparable. There were 957 MTurk participants and 837 community participants, with approximately equal numbers of males and females. Participants were asked to read a scenario depicting a family with a sick child, and then to complete a questionnaire that measured their perceived likelihood of hiring a Health Care Advocate (HCA). The results indicated some demographic differences between MTurk and community participants. In the MTurk sample, there was an effect of medical condition: participants reported a greater perceived likelihood of hiring an HCA for a child with leukemia than for a child with cystic fibrosis (p = .008). In the community sample, however, there was an effect of conception difficulty: participants reported a greater perceived likelihood of hiring an HCA for a child who took 2 months to conceive than for a child who took 5 years to conceive (p = .012). Despite some psychometric similarities, the constructs measured in the two samples differed in some respects. Future researchers should continue to evaluate the reliability and validity of paper-and-pencil measures when they are administered online.



Acknowledgments

We would like to acknowledge Kai Givogue for his assistance in transferring the methodology from the community study to the MTurk study and in conducting literature reviews.

Funding

Research reported in this publication was supported in part by the National Institute of General Medical Sciences of the National Institutes of Health under Award Number R25GM058906. The content is solely the responsibility of the authors, and does not necessarily represent the official views of the National Institutes of Health.

Author information

Authors and Affiliations

Department of Psychology, San Diego State University, San Diego, CA, 92182, USA
Linda M. Thompson, Alan Patrus, Kassandra I. Azzoo & Terry A. Cronan
College of Health Solutions, Arizona State University, Tempe, AZ, 85281, USA
Charles Van Liew


Corresponding author

Correspondence to Terry A. Cronan.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Data Sharing and Data Accessibility

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article

Thompson, L.M., Van Liew, C., Patrus, A. et al. Are we measuring the same health constructs? Amazon’s Mechanical Turk versus a community sample. Curr Psychol 41, 6700–6711 (2022). https://doi.org/10.1007/s12144-020-01176-3


Accepted: 05 November 2020
Published: 11 November 2020
Issue Date: October 2022
DOI: https://doi.org/10.1007/s12144-020-01176-3
