Crowdsourcing Versus the Laboratory: Towards Human-Centered Experiments Using the Crowd

  • Ujwal Gadiraju
  • Sebastian Möller
  • Martin Nöllenburg
  • Dietmar Saupe
  • Sebastian Egger-Lampl
  • Daniel Archambault
  • Brian Fisher
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10264)


Crowdsourcing solutions are increasingly being adopted across a variety of domains. An important consequence of flourishing crowdsourcing markets is that experiments traditionally carried out in laboratories on a much smaller scale can now tap into the immense potential of online labor. Researchers in different fields have shown considerable interest in moving previously constrained lab experiments to the crowd. In this chapter, we reflect on the key factors to consider when transitioning from controlled laboratory experiments to large-scale experiments in the crowd.



We would like to thank Dagstuhl for facilitating the seminar ‘Evaluation in the Crowd: Crowdsourcing and Human-Centred Experiments’, which brought about this collaboration. Part of this work (Sect. 4) was supported by the German Research Foundation (DFG) within project A05 of SFB/Transregio 161. We also thank Andrea Mauri and Christian Keimel for their valuable contributions and feedback during discussions.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Ujwal Gadiraju: Leibniz Universität Hannover, Hannover, Germany
  • Sebastian Möller: TU Berlin, Berlin, Germany
  • Martin Nöllenburg: Algorithms and Complexity Group, TU Wien, Vienna, Austria
  • Dietmar Saupe: University of Konstanz, Konstanz, Germany
  • Sebastian Egger-Lampl: Austrian Institute of Technology, Vienna, Austria
  • Daniel Archambault: Swansea University, Swansea, UK
  • Brian Fisher: Simon Fraser University, Burnaby, Canada
