Crowdsourcing systems such as Amazon’s Mechanical Turk (AMT) allow data to be collected from large samples of people in a short amount of time, a capability that has garnered considerable interest from behavioral scientists. So far, most experiments conducted on AMT have relied on survey-type instruments, because many experimental paradigms are difficult to run over the Internet. This study investigated the viability of presenting stimuli and collecting response times by using Adobe Flash to run ActionScript 3 code in conjunction with AMT. First, the timing properties of Adobe Flash were measured with a phototransistor on two desktop computers under several conditions mimicking those likely to be present in AMT-based research; this experiment revealed both strengths and weaknesses of the method's timing capabilities. Next, a flanker task and a lexical decision task implemented in Adobe Flash were administered to participants recruited through AMT, and the expected effects in both tasks were replicated. Power analyses were conducted to describe the number of participants needed to replicate these effects. Finally, a questionnaire was used to investigate previously undescribed computer use habits of 100 AMT participants. We conclude that a Flash program in conjunction with AMT can be used successfully to run many experimental paradigms that rely on response times, although experimenters must understand the limitations of the method.
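The timing validation described above hinges on how finely a software clock can resolve events. A minimal sketch of one part of that idea (written in Python for illustration; the study itself used ActionScript 3 and external phototransistor hardware) estimates a clock's effective resolution by sampling successive timestamps and recording the smallest nonzero gap:

```python
import time

def clock_resolution(clock=time.perf_counter, samples=10_000):
    """Estimate the smallest observable tick of a clock by
    recording the minimum nonzero gap between successive reads."""
    gaps = []
    prev = clock()
    for _ in range(samples):
        cur = clock()
        if cur != prev:           # clock advanced: record the step size
            gaps.append(cur - prev)
            prev = cur
    return min(gaps) if gaps else float("inf")

res = clock_resolution()
print(f"estimated clock resolution: {res * 1e6:.3f} microseconds")
```

A coarse clock quantizes measured response times to multiples of its tick, which is one reason hardware-based checks (like the phototransistor test) are needed before trusting browser-collected response times.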
Travis Simcox, Department of Psychology, University of Pittsburgh; The Center for the Neural Basis of Cognition, Pittsburgh; Learning Research and Development Center, University of Pittsburgh. Julie A. Fiez, Department of Psychology, University of Pittsburgh; The Center for Neuroscience, University of Pittsburgh; The Center for the Neural Basis of Cognition, Pittsburgh; Learning Research and Development Center, University of Pittsburgh.
This research was supported by NIH R01 HD060388 and NSF 0815945.
Electronic supplementary material is available for this article (GIF, 64 kb).
Cite this article
Simcox, T., Fiez, J.A. Collecting response times using Amazon Mechanical Turk and Adobe Flash. Behav Res 46, 95–111 (2014). https://doi.org/10.3758/s13428-013-0345-y
- Response times
- Amazon Mechanical Turk
- Adobe Flash
- Stimulus presentation
- Web experiment
- Rich media