Crowd Anatomy Beyond the Good and Bad: Behavioral Traces for Crowd Worker Modeling and Pre-selection

  • Ujwal Gadiraju
  • Gianluca Demartini
  • Ricardo Kawase
  • Stefan Dietze

Abstract

The suitability of crowdsourcing to solve a variety of problems has been investigated widely. Yet, there is still a lack of understanding about the distinct behavior and performance of workers within microtasks. In this paper, we first introduce a fine-grained, data-driven worker typology based on different dimensions and derived from the behavioral traces of workers. Next, we propose and evaluate novel models of crowd worker behavior and show the benefits of behavior-based worker pre-selection using machine learning models. We also study the effect of task complexity on worker behavior. Finally, we evaluate our typology-based worker pre-selection method in image transcription and information finding tasks involving crowd workers completing 1,800 HITs. Our proposed method for worker pre-selection leads to higher-quality results than the standard practice of using qualification or pre-screening tests: it increased accuracy by nearly 7% over the baseline in image transcription tasks and by almost 10% in information finding tasks, without a significant difference in task completion time. Our findings have important implications for crowdsourcing systems where a worker’s behavioral type is unknown prior to participation in a task. We highlight the potential of leveraging worker types to identify and aid those workers who require further training to improve their performance. Having proposed a powerful automated mechanism to detect worker types, we reflect on promoting fairness, trust, and transparency in microtask crowdsourcing platforms.
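
The sketch below is a minimal, hypothetical illustration of the behavior-based pre-selection idea described above, not the authors' implementation: it assumes made-up behavioral-trace features (time on task, clicks, scrolls, key presses) and a generic scikit-learn classifier, and admits a worker to a task batch only when the model predicts acceptable work from the worker's observed behavior.

```python
# Hypothetical sketch of behavior-based worker pre-selection.
# Feature names and values are illustrative assumptions, not the paper's typology dimensions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Each row: aggregated behavioral-trace features for one worker on past HITs:
# [mean_time_on_task_s, clicks_per_hit, scroll_events_per_hit, key_presses_per_hit]
X = np.array([
    [95.0, 14, 22, 310],   # deliberate, thorough behavior
    [12.0,  2,  1,   5],   # rushed behavior, likely low quality
    [60.0,  9, 15, 180],
    [20.0,  3,  2,  12],
])
# Labels from past gold-standard checks: 1 = acceptable work, 0 = unacceptable
y = np.array([1, 0, 1, 0])

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)

# Pre-selection decision for a new worker, based on the traces observed so far
new_worker = np.array([[70.0, 11, 18, 200]])
if clf.predict(new_worker)[0] == 1:
    print("Admit worker to the task batch")
else:
    print("Route worker to a qualification or training step")
```

The paper's contribution is a richer, typology-based version of this workflow; the sketch only conveys how behavioral traces could replace explicit qualification tests as a pre-selection signal.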

Keywords

Behavioral traces · Crowdsourcing · Microtasks · Pre-selection · Pre-screening · Workers · Worker typology

Copyright information

© Springer Nature B.V. 2018

Authors and Affiliations

  1. L3S Research Center, Leibniz Universität Hannover, Hannover, Germany
  2. School of ITEE, University of Queensland, Queensland, Australia
  3. mobile.de GmbH/eBay Inc., Berlin, Germany
