Collecting response times using Amazon Mechanical Turk and Adobe Flash

Abstract

Crowdsourcing systems such as Amazon's Mechanical Turk (AMT) allow data to be collected from a large sample of people in a short amount of time, a capability that has garnered considerable interest from behavioral scientists. So far, most experiments conducted on AMT have been survey-type instruments, because many experimental paradigms are difficult to run over the Internet. This study investigated the viability of presenting stimuli and collecting response times by using Adobe Flash to run ActionScript 3 code in conjunction with AMT. First, the timing properties of Adobe Flash were measured with a phototransistor on two desktop computers under several conditions mimicking those likely to be present in AMT-based research; this experiment revealed both strengths and weaknesses of the method's timing capabilities. Next, a flanker task and a lexical decision task implemented in Adobe Flash were administered to participants recruited through AMT, and the expected effects in both tasks were replicated. Power analyses were conducted to estimate the number of participants needed to replicate these effects, and a questionnaire was used to characterize the previously undescribed computer use habits of 100 AMT participants. We conclude that a Flash program in conjunction with AMT can be used successfully to run many experimental paradigms that rely on response times, provided that experimenters understand the limitations of the method.
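For readers unfamiliar with the approach, the sketch below illustrates the general pattern for timing a response in ActionScript 3: timestamp the stimulus with getTimer() once it has been drawn, then subtract that value from the timestamp of the participant's keypress. This is a minimal, hypothetical example; the class name (RTTrial), stimulus text, and key handling are illustrative placeholders, not the code used in the experiments reported here.

    package {
        import flash.display.Sprite;
        import flash.events.Event;
        import flash.events.KeyboardEvent;
        import flash.text.TextField;
        import flash.utils.getTimer;

        // Assumes this class is the document class, so that `stage`
        // is available as soon as the constructor runs.
        public class RTTrial extends Sprite {
            private var stimulus:TextField;
            private var onsetMs:int; // getTimer() value at stimulus onset

            public function RTTrial() {
                stimulus = new TextField();
                stimulus.text = "HHHSHHH"; // placeholder flanker-style display
                addChild(stimulus);
                // Wait one frame so the timestamp is taken after the display
                // list has been rendered, not when the object is created.
                addEventListener(Event.ENTER_FRAME, onFirstFrame);
            }

            private function onFirstFrame(e:Event):void {
                removeEventListener(Event.ENTER_FRAME, onFirstFrame);
                onsetMs = getTimer(); // milliseconds since the player launched
                stage.addEventListener(KeyboardEvent.KEY_DOWN, onKey);
            }

            private function onKey(e:KeyboardEvent):void {
                stage.removeEventListener(KeyboardEvent.KEY_DOWN, onKey);
                var rt:int = getTimer() - onsetMs; // response time in ms
                trace("keyCode=" + e.keyCode + ", RT=" + rt + " ms");
            }
        }
    }

Note that getTimer() has millisecond granularity and that the onset timestamp is quantized to the SWF frame rate, so the script itself cannot know the true latency between the timestamp and the stimulus appearing on screen; characterizing that gap is precisely what the phototransistor measurements in this study were designed to do.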




Author Note

Travis Simcox, Department of Psychology, University of Pittsburgh; The Center for the Neural Basis of Cognition, Pittsburgh; Learning Research and Development Center, University of Pittsburgh. Julie A. Fiez, Department of Psychology, University of Pittsburgh; The Center for Neuroscience, University of Pittsburgh; The Center for the Neural Basis of Cognition, Pittsburgh; Learning Research and Development Center, University of Pittsburgh.

This research was supported by NIH R01 HD060388 and NSF 0815945.


Corresponding author

Correspondence to Travis Simcox.

Electronic supplementary material

Supplemental table (GIF 64 kb)



Cite this article

Simcox, T., Fiez, J.A. Collecting response times using Amazon Mechanical Turk and Adobe Flash. Behav Res 46, 95–111 (2014). https://doi.org/10.3758/s13428-013-0345-y


Keywords

  • Response times
  • Crowdsourcing
  • Amazon Mechanical Turk
  • Adobe Flash
  • ActionScript
  • Stimulus presentation
  • Web experiment
  • Rich media
  • Timing