A thousand studies for the price of one: Accelerating psychological science with Pushkin

  • Joshua K. Hartshorne
  • Joshua R. de Leeuw
  • Noah D. Goodman
  • Mariela Jennings
  • Timothy J. O’Donnell


Half of the world’s population has internet access. In principle, researchers are no longer limited to subjects they can recruit into the laboratory. Any study that can be run on a computer or mobile device can be run with nearly any demographic anywhere in the world, and in large numbers. This has allowed scientists to effectively run hundreds of experiments at once. Despite their transformative power, such studies remain rare for practical reasons: the need for sophisticated software, the difficulty of recruiting so many subjects, and a lack of research paradigms that make effective use of their large amounts of data. We present Pushkin: an open-source platform for designing and conducting massive experiments over the internet. Pushkin allows for a wide range of behavioral paradigms through integration with the intuitive and flexible jsPsych experiment engine. It also addresses the basic technical challenges associated with massive, worldwide studies, including auto-scaling, extensibility, machine-assisted experimental design, multisession studies, and data security.


Keywords: Online studies; Robust and reliable research; Massive online experiments; Citizen science




Copyright information

© Psychonomic Society, Inc. 2019

Authors and Affiliations

  • Joshua K. Hartshorne (1, corresponding author)
  • Joshua R. de Leeuw (2)
  • Noah D. Goodman (3)
  • Mariela Jennings (1)
  • Timothy J. O’Donnell (4)

  1. Department of Psychology, Boston College, Newton, USA
  2. Department of Cognitive Science, Vassar College, Poughkeepsie, USA
  3. Department of Psychology, Stanford University, Stanford, USA
  4. Department of Linguistics, McGill University, Montreal, Canada
