GroExpert: A Novel Group-Aware Experts Identification Approach in Crowdsourcing

  • Qianli Xing
  • Weiliang Zhao
  • Jian Yang
  • Jia Wu
  • Qi Wang
  • Mei Wang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11881)


Measuring workers’ abilities is a way to address the long-standing problem of quality control in crowdsourcing. The approaches for measuring worker ability reported in recent work can be classified into two groups: upper bound-based approaches and lower bound-based approaches. Most of these works rely on two assumptions: (1) workers answer a task independently and are not affected by other workers; (2) a worker’s ability for a task is a fixed value. Realistically, however, a worker’s ability should be evaluated relative to the abilities of others within a group. In this work, we propose an approach called GroExpert to identify experts based on their relative values in their working groups, which can be used as a basis for quality estimation in crowdsourcing. The proposed solution employs a fully connected neural network to implement the pairwise ranking method when identifying experts. Both workers’ features and groups’ features are considered in GroExpert. We conduct a set of experiments on three real-world datasets from the Amazon Mechanical Turk platform. The experimental results show that the proposed GroExpert approach outperforms state-of-the-art approaches in worker ability measurement.
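To make the idea of pairwise ranking with a fully connected network concrete, the following is a minimal sketch and not the authors' implementation. It assumes a RankNet-style logistic loss on score differences, a single hidden layer, and hand-picked weights; the feature values, layer sizes, and the names `dense`, `score`, `pairwise_probability`, `pairwise_loss`, and `params` are all hypothetical, since this page does not specify the architecture, the features, or the loss.

```python
import math

def dense(x, weights, biases, activation=None):
    """One fully connected layer: out_j = act(sum_i x_i * w[i][j] + b_j)."""
    out = []
    for j in range(len(biases)):
        z = sum(x[i] * weights[i][j] for i in range(len(x))) + biases[j]
        out.append(max(0.0, z) if activation == "relu" else z)
    return out

def score(worker_features, group_features, params):
    """Score a worker from concatenated worker and group features."""
    x = list(worker_features) + list(group_features)
    hidden = dense(x, params["w1"], params["b1"], activation="relu")
    return dense(hidden, params["w2"], params["b2"])[0]

def pairwise_probability(score_a, score_b):
    """Assumed RankNet-style probability that worker a ranks above worker b."""
    return 1.0 / (1.0 + math.exp(-(score_a - score_b)))

def pairwise_loss(score_a, score_b, a_is_better):
    """Cross-entropy loss on one ordered pair of workers."""
    p = pairwise_probability(score_a, score_b)
    target = 1.0 if a_is_better else 0.0
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

# Hypothetical fixed weights: 3 inputs (2 worker + 1 group feature) -> 2 hidden -> 1 score.
params = {
    "w1": [[1.0, 0.0], [0.0, 1.0], [-0.5, -0.5]],
    "b1": [0.0, 0.0],
    "w2": [[1.0], [1.0]],
    "b2": [0.0],
}

# Two workers from the same group, so they share the group feature vector.
group = [0.6]
score_a = score([0.9, 0.8], group, params)
score_b = score([0.5, 0.4], group, params)

p_a_over_b = pairwise_probability(score_a, score_b)
loss_if_a_better = pairwise_loss(score_a, score_b, a_is_better=True)
loss_if_b_better = pairwise_loss(score_a, score_b, a_is_better=False)
print(p_a_over_b, loss_if_a_better, loss_if_b_better)
```

In this sketch the group features enter the scorer alongside the worker features, so the same worker can receive a different score in a different group. In a trained model the weights would be learned by minimizing the pairwise loss over labeled worker pairs rather than fixed by hand.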


Keywords: Crowdsourcing, Group-aware, Worker ability



This work was supported in part by the MQNS (No. 9201701203), the MQEPS (No. 96804590), the MQRSG (No. 95109718), and in part by the Investigative Analytics Collaborative Research Project between Macquarie University and Data61 CSIRO.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Qianli Xing (1)
  • Weiliang Zhao (1)
  • Jian Yang (1)
  • Jia Wu (1)
  • Qi Wang (1)
  • Mei Wang (2)

  1. Department of Computing, Macquarie University, Sydney, Australia
  2. School of Computer Science and Technology, Donghua University, Shanghai, China
