Fast, Cheap, and Unethical? The Interplay of Morality and Methodology in Crowdsourced Survey Research

Review of Philosophy and Psychology

Abstract

Crowdsourcing is an increasingly popular method for researchers in the social and behavioral sciences, including experimental philosophy, to recruit survey respondents. Crowdsourcing platforms, such as Amazon’s Mechanical Turk (MTurk), have been seen as a way to produce high quality survey data both quickly and cheaply. However, in the last few years, a number of authors have claimed that the low pay rates on MTurk are morally unacceptable. In this paper, I explore some of the methodological implications for online experimental philosophy research if, in fact, typical pay practices on MTurk are morally impermissible. I argue that the most straightforward solution to this apparent moral problem—paying survey respondents more and relying only on “high reputation” respondents—will likely increase the number of subjects who have previous experience with survey materials and thus are “non-naïve” with respect to those materials. I then discuss some likely effects that this increase in experimental non-naivete will have on some aspects of the “negative” program in experimental philosophy, focusing in particular on recent debates about philosophical expertise.


Notes

  1. See V. Williamson (2016b, 77–8) for some data documenting the rapid explosion of work in political science that relies on MTurk.

  2. In late May 2016, a search for the terms “Mechanical Turk” or “MTurk” in journals that have published work in experimental philosophy returned at least 127 papers, most of which appeared in 2014 or later. Results from individual journals were as follows: American Philosophical Quarterly (1), Analysis (1), Australasian Journal of Philosophy (3), BMC Medical Ethics (1), Cognition (22), Consciousness and Cognition (13), Episteme (1), Ergo (2), Erkenntnis (1), Ethics (1), Mind and Language (2), Neuroethics (9), Noûs (4), Philosophers’ Imprint (2), Philosophical Psychology (21), Philosophical Studies (8), Philosophy and Phenomenological Research (4), Psychonomic Bulletin and Review (1), Review of Philosophy and Psychology (17), Synthese (13). I say “at least” because 83 total articles in the journal Cognition contained the relevant search terms; a conservative estimate was that 22 of these concerned topics in experimental philosophy (e.g., folk judgments about moral cognition) rather than other topics in psychology.

  3. As John Bohannon writes, “The obvious advantage is the speed and cost.” He then quotes Adam Berinsky: “Generally, we pay $8 for a 15- to 20-min experiment in a lab. We can run the same study on MTurk for 75 cents to a dollar” (2011, 307). See also Mason and Suri (2012, 3).

  4. For one thing, the slogan suggests that low cost and high quality can be achieved simply by sacrificing speed, which is not always the case. I first learned of this saying from Errol Morris’s 1997 documentary Fast, Cheap, and Out of Control (itself named after a 1989 article on interplanetary exploration by Rodney A. Brooks and Anita M. Flynn).

  5. As I discuss further in Section 3, my aim is not to argue that low pay rates on MTurk are immoral. Rather, I will assume that (at least for a sizable minority of MTurk workers) paying less than minimum wage is immoral. My main interest is in exploring the methodological implications that follow from addressing this moral concern.

  6. I focus on MTurk since most previous research has been conducted on this platform, but my discussion can be readily applied to other crowdsourcing platforms.

  7. This figure is based both on time spent working on tasks for which they were paid and “unpaid work,” including “looking for tasks … and researching requesters” (Berg 2016, 557).

  8. https://www.reddit.com/r/mturk/comments/1z4sma/new_to_mturk_heres_what_you_should_know/. I owe this reference to Cima (2014).

  9. Fort et al. (2014) also note the “very low wages (below $2 an hour…)” on MTurk, citing Ipeirotis (2010b) and Ross et al. (2010), although, to be fair, this paper was originally presented at a 2011 conference.

  10. Chandler, Mueller, and Paolacci (2014, 115) claim that “the most prolific 10% [of workers] were responsible for completing 41% of the submitted [tasks]” from a pool of over 16,000 tasks. See also Ipeirotis (2010a) and Fort et al. (2011, 416). On a related point, Stewart et al. (2015) argue that the effective population size of MTurk workers that a typical laboratory can access is only 7,300, despite MTurk having around 500,000 registered workers at the time.

  11. Mason and Suri cite a blog post by Panos Ipeirotis that discusses MTurk as a market for lemons: http://www.behind-the-enemy-lines.com/2010/07/mechanical-turk-low-wages-and-market.html. See also Fort et al. (2011, 418).

  12. Chandler et al. (2014, 117–8) show that, unsurprisingly, the most productive MTurk workers are more likely to report experience with common experimental paradigms. For example, of those in the top 1% based on productivity, 85% reported having seen the trolley problem before and 88% reported prior exposure to the prisoner’s dilemma. (Note, however, that Chandler et al. (2015) found that self-reports of prior participation are, unsurprisingly, an imperfect measure of actual prior participation.) The 2014 study also notes several other differences between MTurk workers and members of traditional participant pools that would make it more likely that the former have prior experience with experimental materials than the latter: MTurk workers may remain members of crowdsourcing websites indefinitely; they are typically not restricted in the kinds of studies in which they may participate; they have greater opportunity to complete slightly different versions of a single experimental paradigm; and they have the ability to discuss research tasks with other MTurk workers on online discussion boards and social media (2014, 113). Further, many workers report having a list of favorite researchers that they monitor for available research tasks, which further increases the likelihood of non-naivete for each individual researcher (ibid., 117, 128). Necka et al. (2016, 9) report that 56.2% of respondents in an MTurk sample claim to look “for studies by a researcher they already know,” compared to only 17.4% and 10.9% of respondents who report such behavior in campus and community samples, respectively.

  13. For instance, one study found that performance on the original version of the “cognitive reflection test” (CRT), which consists of questions that have highly intuitive but incorrect answers, such as “A bat and a ball cost $1.10 in total. The bat costs a dollar more than the ball. How much does the ball cost?”, was positively correlated with the number of previously completed research tasks (Chandler et al. 2014, 120). By contrast, performance on a “new” version of the CRT that contained items that workers were unlikely to have seen before was not so correlated. I discuss these results further in Section 4.
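      A brief worked step (mine, not part of the cited study): writing b for the price of the ball in dollars, the intended answer follows from
      \[
      b + (b + 1.00) = 1.10 \;\Longrightarrow\; 2b = 0.10 \;\Longrightarrow\; b = 0.05,
      \]
      so the ball costs five cents rather than the intuitively tempting ten cents.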

  14. Note, though, that Chandler et al. (2014, 120) “found no evidence that worker experience predicts the tendency of workers to respond randomly to surveys.” A more recent study suggests that MTurk respondents are more likely than university students to provide quick, “good enough” answers rather than carefully considered ones (Hamby and Taylor 2016).

  15. Chandler and Shapiro (2016, 68) write that they “consider payment to be more of an ethical issue than a data-quality issue and suggest that researchers should pay participants at a rate they consider to be fair and in line with the ethical standards of the field.” They direct readers to a subsequent section for “guidance,” where they claim only that “fairness relative to wage rates external to MTurk must be counterbalanced by [American Psychological Association] ethical guidelines that prevent researchers from offering coercively high incentives,” in addition to mentioning the website cited in note 16 (ibid., 72).

  16. See also the website for MTurk worker advocacy, Dynamo, on fair payment practices: http://wiki.wearedynamo.org/index.php?title=Fair_payment.

  17. For example, in a New York Times blog post, economist Nancy Folbre asserts without argument that “A sustainable form of crowdsourcing … will require some assurance of human rights, including access to decent employment, living wages and high-quality public education” (Folbre 2013). This suggests, of course, that crowdsourcing currently does not provide such assurance.

  18. See, e.g., Worstall (2013) and Mason and Suri (2012, 16) for hints of how such arguments might go. These authors put a lot of weight on the idea that workers’ decisions to work on MTurk are completely voluntary. This suggests a defense of MTurk pay practices based on an “agreement view” of justice in pay, according to which “the just wage is whatever wage the employer and the employee agree to without force or fraud” (Moriarty 2017). On this point, V. Williamson notes that one cannot justifiably believe that “the going rate” that workers (apparently) voluntarily accept is fair compensation unless one has good reason to think that “market forces cannot be exploitative of workers” (2016b, 78).

  19. An anonymous referee helpfully notes that two bodies of literature in bioethics are relevant here: (1) work on informed consent (e.g., Nelson et al. (2011)) and (2) the debate about whether paying subjects for participating in clinical research studies amounts to “coercive” or “undue” inducement (e.g., Wilkinson and Moore 1997 and Grady 2005). Fully exploring how this work bears on the morality of payment practices on MTurk is beyond the scope of this paper. However, I’ll note here that many of the features of clinical research that some have argued make payment of subjects morally wrong (such as an unknown or high risk of harm (McNeill 1997), a dependency of subjects on researchers (Grant and Sugarman 2004), or a strong, principled aversion to the study on the part of the subjects (ibid.)) typically do not apply to online survey research.

  20. For example, V. Williamson (2016b, 80) claims that paying MTurk workers less than minimum wage is morally wrong, but she suggests that if someone is paid only $0.10 for completing a single survey or completes many such surveys solely because they enjoy doing so, then this at least opens up the possibility that this low pay rate is morally permissible. Further, she claims that the fact that MTurk relies on “numerous paid ‘regulars’” (many of whom, as I discuss below, rely on MTurk as a significant source of income) implies that “MTurk has substantial ethical implications beyond those that typically govern the treatment of survey participants” (ibid., 79). On the other side, Mason and Suri (2012, 16) claim that the alleged fact that “most workers are not relying on the wages earned on Mechanical Turk for necessities” is part of a “reasonable argument” for the moral permissibility of the low wages on MTurk. See also Berg (2016, 565–6) for the history of using the rhetoric of “pin money” to oppose minimum wage laws in general.

  21. An anonymous referee suggested simply making the surveys themselves less tedious (so that even those who are currently monetarily motivated would take the surveys simply for fun). I suspect that this would not be sufficient motivation for many of the roughly 40% of MTurk workers who take surveys in order to “make ends meet.”

  22. E.g., as Berg (2016, 570) suggests: “For … companies or individuals who use crowdworkers for occasional tasks, they could hire the services of the platform, which would then have their own, screened and trained employees.” This may also address other moral concerns that have been raised about MTurk, and crowdsourcing in general, such as lack of fringe benefits.

  23. See Weinberg (2016) for a defense of the claim that philosophers often rely on intuitions as evidence (what he calls “intuitive practice”). See Mallon (2016) for a discussion of the distinction between “negative” and “positive” programs in experimental philosophy.

  24. Stich and Tobia (2016) focus on this “restricted” interpretation of “experimental philosophy,” according to which it is “the empirical investigation of philosophical intuitions, the factors that affect them, and the psychological and neurological mechanisms that underlie them” (ibid., 5). See also Mallon (2016). Knobe (2016) argues that it is a mistake to think of experimental philosophy in this narrow way (i.e., as either a negative or positive reaction to traditional conceptual analysis) and notes that only a small minority of experimental philosophy studies cataloged in PhilPapers from 2009 to 2013 are devoted to either the “positive” program (10.4%) or “negative” program (1.3%). The experimental non-naivete of MTurk workers would likely also affect studies devoted to Knobe’s alternative conception of experimental philosophy. However, I focus on the expertise response to the “negative” program since, as I discuss below, experimental non-naivete is clearly relevant to debates about philosophical expertise, and as Timothy Williamson notes, the “negative” program has “attracted attention disproportionate to its size” because of its allegedly “radical implications for philosophical methodology” (2016a, 22–3).

  25. The fact that someone is experimentally non-naïve in a broad sense (having taken many surveys on MTurk) does not imply that they have taken many experimental philosophy surveys on MTurk. However, simply having taken many surveys, of any kind, is enough to raise the worries about “performance errors” that I discuss below. Further, there is at least some evidence that many of (at least the most “productive”) MTurk workers have had prior exposure to experimental philosophy materials, such as the trolley problem (see note 12 above). If higher pay on MTurk does lead to a greater number of MTurk workers being “professional survey takers,” then, given the extent to which “Super Turkers” communicate with each other about research tasks and look for tasks offered by the same researcher (again, see note 12), I think that it is likely that this narrower kind of experimental non-naivete will become more common. To be clear, I am not claiming that even this narrower kind of experimental non-naivete is equivalent to “philosophical non-naivete,” in which supposedly pre-theoretical beliefs are actually determined by a philosophical theory and thus are not neutral data that can be used as evidence for philosophical theories. Thanks to an anonymous referee for prompting me to clarify the kind of experimental non-naivete I am concerned with and its relation to philosophical non-naivete.

  26. See Mallon (2016, 413) for this idea. Of course, different kinds of intuitions may be the product of different capacities.

  27. The effects of non-naivete are likely context-sensitive and variable between experimental paradigms. For instance, Chandler et al.’s (2014) work with non-naïve subjects on the CRT, mentioned above and discussed in more detail below, does not seem to display the effects of boredom or “demand characteristics.” But see Hamby and Taylor (2016), mentioned in note 14 above, for results that suggest that non-naïve responses may be problematic for research that aims to discover the psychological capacities that produce our intuitive responses (at least in the sense of “intuitive” that some philosophers endorse).

  28. An anonymous reviewer suggested that empirical studies could be conducted to determine how, if at all, the responses of experimentally non-naïve subjects differ from those of other MTurk workers and subjects selected by other means. I am all for such studies, but the fact that they are necessary in the first place confirms my main point here: that non-naïve responses cannot simply be assumed to be a product of the psychological capacity or capacities whose nature is one of the main subjects of debate in the controversies surrounding the “negative” program in experimental philosophy.

  29. Another motivation to do so comes from the fact that, in July 2015, Amazon increased the commission that it charges when tasks are completed on MTurk (from 10% to 40%, for studies with more than 10 individual participants). Crowdsourcing on MTurk is no longer as cheap as it once was.
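      To illustrate with the figures just cited (an illustrative calculation, not Amazon’s published fee schedule): a study that pays each of more than 10 respondents a reward of $1.00 now costs the researcher, per response,
      \[
      \$1.00 \times (1 + 0.40) = \$1.40 \quad\text{rather than}\quad \$1.00 \times (1 + 0.10) = \$1.10,
      \]
      roughly a 27% increase in total cost per response, before any increase in the reward itself.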

  30. As an anonymous referee pointed out, this kind of “snowballing” method raises other methodological concerns, such as potential bias introduced by the fact that subjects are all likely friends, or friends of friends, of the experimenter. A recent study found that “push strategies” (like using social media), which target people not actively looking to participate in research, recruit participants that are less committed to the survey task, but are more demographically diverse, than those recruited by “pull strategies” (like MTurk) (Antoun et al. 2016). These latter features may also apply to another “push strategy” used by Justin Sytsma and colleagues: gathering responses to test questions in exchange for providing subjects the results of a Big Five personality assessment (via the website philosophicalpersonality.com). (Thanks to an anonymous referee for bringing this platform to my attention.) At first glance, there does not seem to be anything ethically problematic about this and other “push strategies.” At least they do not seem to be affected by any of the moral concerns about low pay or by worries about “undue inducement” of subjects raised in the bioethics literature (mentioned in note 19).

  31. Thanks to an anonymous referee for making this suggestion.

  32. Some support for the claim that philosophers’ higher CRT scores are not caused by their being more reflective comes from Chandler et al. (2014), who did not find a correlation between CRT score and the “need for cognition” scale among their subjects, contrary to the original CRT study (Frederick 2005). This suggests that the CRT might not measure a stable individual difference in cognitive orientation (such as being reflective or enjoying effortful thinking), as is commonly assumed. (The “need for cognition” scale is taken to measure such a stable individual difference.)

  33. In this case it appears that non-naïve subjects’ responses on the CRT are not significantly affected by “extraneous” factors such as boredom.

  34. Livengood et al. (2010, 328 n.10) note that psychological training was negatively correlated with CRT score. This perhaps provides some support for Livengood et al.’s hypothesis since it is plausible that those with psychological training are more likely (or at least not less likely) to have seen questions from the original CRT than those with philosophical training.

  35. Rini reports that only 25–27% of philosophers’ judgments about moral dilemmas were affected by whether the dilemma was phrased in first-person or third-person terms in Tobia et al.’s (2013) study, and only 34% of ethics specialists were subject to order effects in Schwitzgebel and Cushman (2012).

  36. See Schwitzgebel and Cushman (2015) for a more recent study that shows that philosophers’ responses are still subject to framing and order effects even after limiting the target group to philosophers who claim expertise on the types of dilemma in question, to philosophers who report having stable opinions on the survey materials, or to those who were encouraged to give reflective responses.

  37. Again, non-naivete with respect to surveys on MTurk does not imply non-naivete with respect to experimental philosophy surveys (or a subset thereof). (See note 25 above.) Even so, given the extent to which MTurk workers (especially Super Turkers) already discuss specific tasks with other workers and look for tasks offered by the same researchers (see note 12) and the extent to which these practices are likely to increase if pay rates on MTurk increase and the number of Super Turkers grows, there is some reason to think that non-naivete with respect to specific survey materials will increase in response to higher pay rates. Further, even if it doesn’t do so on its own, a population of professional survey takers is an ideal place to cultivate respondents with such specific experimental non-naivete.

  38. An anonymous referee suggested that Rini’s claims could be tested by performing a longitudinal study in which participants are presented with the relevant materials multiple times. That’s exactly what I’m suggesting in the latter part of the above paragraph, and comparing philosophers with non-philosophers, both of whom are familiar with the survey materials, would help to determine whether philosophical skills or training is correlated with a distinctive pattern of response. See also the more general discussion in the next two paragraphs.

  39. This result would provide some support for the “cognitive stimulus hypothesis” concerning why subjects’ responses change on longitudinal surveys. According to this hypothesis, “repeatedly administering attitude questions serves to stimulate respondents to reflect and deliberate more closely on the issues to which the questions pertain. This, in turn, results in stronger and more internally consistent attitudes in the later waves of a panel” (Sturgis et al. 2009, 114).

  40. Crowdsourcing and the “gig economy” more broadly have already been receiving attention in law, history, and even art and design. See, for example, the Spring 2016 special issue of the Comparative Labor Law and Policy Journal on “Crowdsourcing, the Gig-Economy, and the Law” and the Digital Labor conference, held Nov. 14–16, 2014, at the New School for Social Research.

References

  • Antoun, Christopher, Chan Zhang, Frederick G. Conrad, and Michael F. Schober. 2016. Comparisons of online recruitment strategies for convenience samples: Craigslist, Google AdWords, Facebook, and Amazon mechanical Turk. Field Methods 28: 231–246.

  • Berg, Janine. 2016. Income security in the on-demand economy: Findings and policy lessons from a survey of Crowdworkers. Comparative Labor Law and Policy Journal. 37: 543–576.

  • Bohannon, John. 2011. Social science for pennies. Science 334: 307.

  • Brawley, Alice M., and Cynthia L.S. Pury. 2016. Work experiences on MTurk: Job satisfaction, turnover, and information sharing. Computers in Human Behavior. 54: 531–546.

  • Buckwalter, Wesley. 2016. Intuition fail: Philosophical activity and the limits of expertise. Philosophy and Phenomenological Research. 92: 378–410.

  • Buhrmester, Michael, Tracy Kwang, and Samuel D. Gosling. 2011. Amazon’s mechanical Turk: A new source of inexpensive, yet high-quality, data? Perspectives on Psychological Science. 6 (1): 3–5.

  • Chandler, Jesse, Pam Mueller, and Gabriele Paolacci. 2014. Nonnaïveté among Amazon mechanical Turk workers: Consequences and solutions for behavioral researchers. Behavior Research Methods. 46: 112–130.

  • Chandler, Jesse, Gabriele Paolacci, Eyal Peer, Pam Mueller, and Kate A. Ratliff. 2015. Using nonnaive participants can reduce effect sizes. Psychological Science 26: 1131–1139.

  • Chandler, Jesse, and Danielle Shapiro. 2016. Conducting clinical research using crowdsourced convenience samples. Annual Review of Clinical Psychology. 12: 53–81.

  • Cima, Rosie. 2014. Mechanical Turk: The new face of behavioral science? Priceonomics. Oct 15. http://priceonomics.com/mechanical-turk-new-face-of-behavioral-science/.

  • Crump, Matthew J.C., John V. McDonnell, and Todd M. Gureckis. 2013. Evaluating Amazon’s mechanical Turk as a tool for experimental behavioral research. PLoS One 8: e57410.

  • Felstiner, Alek. 2011. Working the crowd: Employment and labor law in the crowdsourcing industry. Berkeley Journal of Employment and Labor Law. 32 (1): 143–203.

  • Folbre, Nancy. 2013. The unregulated work of mechanical Turk. NYTimes Economix Blog. http://economix.blogs.nytimes.com/2013/03/18/the-unregulated-work-of-mechanical-turk/.

  • Fort, Karën, Gilles Adda, and K. Bretonnel Cohen. 2011. Amazon mechanical Turk: Gold mine or coal mine? Computational Linguistics. 37 (2): 413–420.

  • Fort, Karën, Gilles Adda, Benoit Sagot, Joseph Mariani, and Allain Gouillault. 2014. Crowdsourcing for language resource development: Criticisms about Amazon Mechanical Turk overpowering use. In Human language technology: Challenges for computer science and linguistics, ed. Zygmunt Vetulani and Joseph Mariani, 303–314. Cham: Springer.

  • Frederick, Shane. 2005. Cognitive reflection and decision making. Journal of Economic Perspectives. 19: 25–42.

  • Grady, Christine. 2005. Payment of clinical research subjects. The Journal of Clinical Investigation. 115: 1681–1687.

  • Grant, Ruth W., and Jeremy Sugarman. 2004. Ethics in human subjects research. Journal of Medicine and Philosophy. 29: 717–738.

  • Hamby, Tyler, and Wyn Taylor. 2016. Survey satisficing inflates reliability and validity measures: An experimental comparison of college and Amazon mechanical Turk samples. Educational and Psychological Measurement. 76: 912–932.

  • Hauser, David J., and Norbert Schwarz. 2016. Attentive Turkers: MTurk participants perform better on attention checks than do subject pool participants. Behavior Research Methods. 48: 400–407.

  • Ipeirotis, Panagiotis G. 2010a. Analyzing the Amazon mechanical Turk marketplace. XRDS: Crossroads, The ACM Magazine for Students. 17 (2): 16–21.

  • Ipeirotis, Panagiotis G. 2010b. Demographics of mechanical Turk. CeDER Working Papers. New York University. Retrieved from http://hdl.handle.net/2451/29585.

  • Knobe, Joshua. 2016. Experimental philosophy is cognitive science. In A companion to experimental philosophy, ed. Justin Sytsma and Wesley Buckwalter, 37–52. Chichester: Wiley.

  • Litman, Leib, Jonathan Robinson, and Cheskie Rosenzweig. 2015. The relationship between motivation, monetary compensation, and data quality among US- and India-based workers on mechanical Turk. Behavior Research Methods. 47: 519–528.

  • Livengood, Jonathan, Justin Sytsma, Adam Feltz, Richard Scheines, and Edouard Machery. 2010. Philosophical temperament. Philosophical Psychology 23: 313–330.

  • Mallon, Ron. 2016. Experimental philosophy. In The Oxford handbook of philosophical methodology, ed. Herman Cappelen, Tamar Szabo Gendler, and John Hawthorne, 410–443. Oxford: Oxford University Press.

  • Marder, Jenny and Mike Fritz. 2015. The Internet’s hidden science factory. PBS NewsHour. Feb 11. http://www.pbs.org/newshour/updates/inside-amazons-hidden-science-factory/.

  • Mason, Winter, and Siddharth Suri. 2012. Conducting behavioral research on Amazon’s mechanical Turk. Behavior Research Methods. 44: 1–23.

  • Mason, Winter, and Duncan J. Watts. 2010. Financial incentives and the ‘performance of crowds’. ACM SIGKDD Explorations Newsletter. 11 (2): 100–108.

  • McNeill, Paul. 1997. Paying people to participate in research: Why not? Bioethics 11: 390–396.

  • Moriarty, Jeffrey. 2017. Business ethics. In The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta. Fall 2017 edition. https://plato.stanford.edu/archives/fall2017/entries/ethics-business/.

  • Necka, Elizabeth A., Stephanie Cacioppo, Greg J. Norman, and John T. Cacioppo. 2016. Measuring the prevalence of problematic respondent behaviors among MTurk, campus, and community participants. PLoS One 11: e0157732.

  • Nelson, Robert M., Tom Beauchamp, Victoria A. Miller, William Reynolds, Richard F. Ittenbach, and Mary Frances Luce. 2011. The concept of voluntary consent. The American Journal of Bioethics 11 (8): 6–16.

  • Paolacci, Gabriele, and Jesse Chandler. 2014. Inside the Turk: Understanding mechanical Turk as a participant pool. Current Directions in Psychological Science. 23 (3): 184–188.

  • Paolacci, Gabriele, Jesse Chandler, and Panagiotis G. Ipeirotis. 2010. Running experiments on Amazon mechanical Turk. Judgment and Decision Making. 5 (5): 411–419.

  • Peer, Eyal, Joachim Vosgerau, and Alessandro Acquisti. 2014. Reputation as a sufficient condition for data quality on Amazon mechanical Turk. Behavior Research Methods. 46: 1023–1031.

  • Rand, David G., Alexander Peysakhovich, Gordon T. Kraft-Todd, et al. 2014. Social heuristics shape intuitive cooperation. Nature Communications 5: 1–12.

  • Rini, Regina A. 2015. How not to test for philosophical expertise. Synthese 192: 431–452.

  • Ross, Joel, Lilly Irani, M. Six Silberman, Andrew Zaldivar, and Bill Tomlinson. 2010. Who are the Crowdworkers? Shifting demographics in mechanical Turk. CHI 2010: 2863–2872.

  • Schneider, Nathan. 2015. Intellectual piecework. The Chronicle of Higher Education. Feb 16. http://chronicle.com/article/Intellectual-Piecework/190039/.

  • Schwitzgebel, Eric, and Fiery Cushman. 2012. Expertise in moral reasoning? Order effects on moral judgment in professional philosophers and non-philosophers. Mind & Language. 27: 135–153.

  • Schwitzgebel, Eric, and Fiery Cushman. 2015. Philosophers’ biased judgments persist despite training, expertise and reflection. Cognition 141: 127–137.

  • Searles, Kathleen and John Barry Ryan. 2015. Researchers are rushing to Amazon’s mechanical Turk. Should they? May 4. The Washington Post Monkey Cage Blog. https://www.washingtonpost.com/blogs/monkey-cage/wp/2015/05/04/researchers-are-rushing-to-amazons-mechanical-turk-should-they/.

  • Silberman, M. Six, Kristy Milland, Rochelle LaPlante, Joel Ross, and Lilly Irani. 2015. Stop citing Ross et al. 2010, ‘Who are the Crowdworkers?’ Medium. March 16. https://medium.com/@silberman/stop-citing-ross-et-al-2010-who-are-the-crowdworkers-b3b9b1e8d300#.d0aytetdl.

  • Smith, Scott M., Catherine A. Roster, Linda L. Golden, and Gerald S. Albaum. 2016. A multi-group analysis of online survey respondent data quality: Comparing a regular USA consumer panel to MTurk samples. Journal of Business Research. 69: 3139–3148.

  • Stewart, Neil, Christoph Ungemach, Adam J.L. Harris, Daniel M. Bartels, Ben R. Newell, Gabriele Paolacci, and Jesse Chandler. 2015. The average laboratory samples a population of 7,300 Amazon mechanical Turk workers. Judgment and Decision Making. 10 (5): 479–491.

  • Stich, Stephen, and Kevin P. Tobia. 2016. Experimental philosophy and the philosophical tradition. In A companion to experimental philosophy, ed. Justin Sytsma and Wesley Buckwalter, 5–21. Chichester: Wiley.

  • Sturgis, Patrick, Nick Allum, and Ian Brunton-Smith. 2009. Attitudes over time: The psychology of panel conditioning. In Methodology of longitudinal surveys, ed. Peter Lynn, 113–126. Chichester: Wiley.

  • Thomson, Keela S., and Daniel M. Oppenheimer. 2016. Investigating an alternate form of the cognitive reflection test. Judgment and Decision Making. 11: 99–113.

  • Tobia, Kevin, Wesley Buckwalter, and Stephen Stich. 2013. Moral intuitions: Are philosophers experts? Philosophical Psychology. 26: 629–638.

  • Weinberg, Jonathan. 2016. Intuitions. In The Oxford handbook of philosophical methodology, ed. Herman Cappelen, Tamar Szabo Gendler, and John Hawthorne, 287–308. Oxford: Oxford University Press.

  • Wilkinson, Martin, and Andrew Moore. 1997. Inducement in research. Bioethics 11: 373–389.

  • Williams, George. 2010. The ethics of Amazon’s mechanical Turk. ProfHacker Blog: The Chronicle of Higher Education. March 1. http://chronicle.com/blogs/profhacker/the-ethics-of-amazons-mechanical-turk/23010.

  • Williamson, Timothy. 2016a. Philosophical criticisms of experimental philosophy. In A companion to experimental philosophy, ed. Justin Sytsma and Wesley Buckwalter, 22–36. Chichester: Wiley.

  • Williamson, Vanessa. 2016b. On the ethics of crowdsourced research. PS: Political Science & Politics. 49 (1): 77–81.

  • Worstall, Tim. 2013. On the New York Times stupidity over Amazon's mechanical Turk. Forbes Tech. March 19. http://www.forbes.com/sites/timworstall/2013/03/19/on-the-new-york-times-stupidity-over-amazons-mechanical-turk/#e3ecf4c16601.

Acknowledgments

Thanks to Marcus Holmes for inviting me to participate in a panel discussion on the methodological challenges and ethical concerns regarding crowdsourcing, which prompted me to write this paper.

Author information

Correspondence to Matthew C. Haug.

About this article

Cite this article

Haug, M.C. Fast, Cheap, and Unethical? The Interplay of Morality and Methodology in Crowdsourced Survey Research. Review of Philosophy and Psychology 9, 363–379 (2018). https://doi.org/10.1007/s13164-017-0374-z
