High-Throughput Crowdsourcing Mechanisms for Complex Tasks

  • Guido Sautter
  • Klemens Böhm
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6984)

Abstract

Crowdsourcing is popular for large-scale data processing endeavors that require human input. However, working with a large community of users raises new challenges. In particular, both possible misjudgment and dishonesty threaten the quality of the results. Common countermeasures are based on redundancy, giving way to a tradeoff between result quality and throughput. Ideally, measures should (1) maintain high throughput and (2) ensure high result quality at the same time. Existing work on crowdsourcing mostly focuses on result quality, paying little attention to throughput or even to that tradeoff. One reason is that the number of tasks (individual atomic units of work) is usually small. A further problem is that the tasks users work on are small as well. In consequence, existing result-improvement mechanisms do not scale to the number or complexity of tasks that arise, for instance, in proofreading and processing of digitized legacy literature. This paper proposes novel result-improvement mechanisms that (1) are independent of the size and complexity of tasks and (2) allow trading result quality for throughput to a significant extent. Both mathematical analyses and extensive simulations show the effectiveness of the proposed mechanisms.
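
The tradeoff between redundancy, result quality and throughput mentioned in the abstract can be illustrated with a small calculation. The sketch below is not the mechanism proposed in the paper; it models plain majority voting under assumptions made up for illustration: a binary task, an odd number k of workers per task, and each worker being correct independently with the same probability p. The function names `majority_accuracy` and `relative_throughput` are hypothetical.

```python
from math import comb


def majority_accuracy(p: float, k: int) -> float:
    """Probability that a strict majority of k workers gives the correct answer.

    Illustrative assumptions: binary task, odd k, and every worker is
    correct independently with the same probability p.
    """
    if k < 1 or k % 2 == 0:
        raise ValueError("k must be a positive odd integer")
    needed = k // 2 + 1  # smallest number of correct votes that forms a majority
    return sum(
        comb(k, i) * p**i * (1 - p) ** (k - i)
        for i in range(needed, k + 1)
    )


def relative_throughput(k: int) -> float:
    """Tasks finished per unit of worker effort, relative to k = 1."""
    return 1.0 / k


if __name__ == "__main__":
    # Show how accuracy rises while throughput falls as redundancy grows.
    for k in (1, 3, 5, 7):
        acc = majority_accuracy(0.8, k)
        print(f"k={k}  accuracy={acc:.4f}  throughput={relative_throughput(k):.3f}")
```

Under these assumptions, each additional pair of redundant workers raises the expected accuracy but divides throughput by a larger factor, which is the tension the abstract describes.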

Keywords

Crowdsourcing, Data Quality, Throughput

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Guido Sautter (1)
  • Klemens Böhm (1)
  1. KIT, Karlsruhe, Germany
