Risks and Rewards of Crowdsourcing Marketplaces

  • Chapter in: Handbook of Human Computation

Abstract

Crowdsourcing has become an increasingly popular means of flexibly deploying large amounts of human computational power. The present chapter investigates the role of microtask labor marketplaces in managing human and hybrid human-machine computing. Labor marketplaces offer many advantages that, in combination, allow human intelligence to be allocated across projects rapidly and efficiently and information to be transmitted effectively between market participants. Human computation comes with a set of challenges that are distinct from machine computation, including increased unsystematic error (e.g., mistakes) and systematic error (e.g., cognitive biases), both of which can be exacerbated when motivation is low, incentives are misaligned, and task requirements are poorly communicated. We provide specific guidance about how to ameliorate these issues through task design, workforce selection, data cleaning, and aggregation.
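To make the data-cleaning and aggregation point concrete, the sketch below (an illustration only, not code from the chapter; all names and the 0.75 threshold are hypothetical) shows one common pattern for crowdsourced labels: screen out workers who perform poorly on known-answer check items, then take a majority vote over the remaining workers' labels for each task.

```python
from collections import Counter, defaultdict

def aggregate_labels(responses, gold_answers, min_gold_accuracy=0.75):
    """Filter workers on known-answer ("gold") check items, then majority-vote.

    responses: list of (worker_id, task_id, label) tuples
    gold_answers: dict mapping task_id -> correct label for check items
    """
    # Score each worker on the check items they answered.
    gold_hits, gold_seen = Counter(), Counter()
    for worker, task, label in responses:
        if task in gold_answers:
            gold_seen[worker] += 1
            gold_hits[worker] += int(label == gold_answers[task])

    def keep(worker):
        # Workers who saw no check items are kept (no evidence against them);
        # otherwise require the accuracy threshold on check items.
        seen = gold_seen[worker]
        return seen == 0 or gold_hits[worker] / seen >= min_gold_accuracy

    # Collect labels per real task from retained workers only.
    votes = defaultdict(list)
    for worker, task, label in responses:
        if task not in gold_answers and keep(worker):
            votes[task].append(label)

    # Majority vote per task (ties are broken arbitrarily).
    return {task: Counter(labels).most_common(1)[0][0]
            for task, labels in votes.items()}

# Toy example: w3 fails the check item and is excluded from the vote.
responses = [
    ("w1", "t1", "cat"), ("w2", "t1", "cat"), ("w3", "t1", "dog"),
    ("w1", "g1", "yes"), ("w2", "g1", "yes"), ("w3", "g1", "no"),
]
print(aggregate_labels(responses, {"g1": "yes"}))  # -> {'t1': 'cat'}
```

More sophisticated schemes weight each worker's vote by an estimated reliability rather than applying a hard pass/fail filter, but the filter-then-aggregate structure is the same.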

Author information

Correspondence to Jesse Chandler.

Copyright information

© 2013 Springer Science+Business Media New York

About this chapter

Cite this chapter

Chandler, J., Paolacci, G., Mueller, P. (2013). Risks and Rewards of Crowdsourcing Marketplaces. In: Michelucci, P. (ed) Handbook of Human Computation. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8806-4_30

  • DOI: https://doi.org/10.1007/978-1-4614-8806-4_30

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-8805-7

  • Online ISBN: 978-1-4614-8806-4

  • eBook Packages: Computer Science, Computer Science (R0)
