Incorporating Worker Similarity for Label Aggregation in Crowdsourcing

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11140)


For quality control in crowdsourcing tasks, requesters usually assign each task to multiple workers to obtain redundant answers and then aggregate them into a more reliable answer. Because crowds include non-experts, one problem in label aggregation is how to distinguish experts with higher ability from non-experts with lower ability and strengthen the influence of the experts. Most existing label aggregation approaches tend to strengthen the workers who provide majority answers and regard them as having high ability. We find that the similarity among worker labels can also be effective for this issue, because two experts are more likely to reach consensus than two non-experts. We thus propose a novel probabilistic model that incorporates worker similarity information. Experimental results on a number of real datasets show that our approach can outperform existing models, including a probabilistic model that does not incorporate similarity. We also conduct an empirical study of the influence of worker ability, label sparsity, and redundancy on the performance of label aggregation approaches, and provide a suggestion on the strategy for collecting labels in crowdsourcing.
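
To make the two ingredients named in the abstract concrete, the following is a minimal sketch of majority-vote aggregation and of a simple pairwise worker similarity (the fraction of shared items on which two workers agree). It is not the probabilistic model proposed in the paper; the toy data and the function names `majority_vote` and `pairwise_agreement` are assumptions made purely for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy data: (worker, item, label) triples, not taken from the paper.
labels = [
    ("w1", "q1", "A"), ("w2", "q1", "A"), ("w3", "q1", "B"),
    ("w1", "q2", "B"), ("w2", "q2", "B"), ("w3", "q2", "A"),
    ("w1", "q3", "A"), ("w2", "q3", "A"), ("w3", "q3", "A"),
]

def majority_vote(triples):
    """Return the most frequent label per item (baseline aggregation)."""
    votes = defaultdict(Counter)
    for _worker, item, label in triples:
        votes[item][label] += 1
    # On a tie, Counter.most_common keeps the label that was seen first.
    return {item: counts.most_common(1)[0][0] for item, counts in votes.items()}

def pairwise_agreement(triples):
    """Return, for each worker pair, the agreement rate on shared items."""
    by_worker = defaultdict(dict)
    for worker, item, label in triples:
        by_worker[worker][item] = label
    sims = {}
    for a, b in combinations(sorted(by_worker), 2):
        shared = by_worker[a].keys() & by_worker[b].keys()
        if shared:  # pairs with no shared items get no similarity score
            agree = sum(by_worker[a][i] == by_worker[b][i] for i in shared)
            sims[(a, b)] = agree / len(shared)
    return sims

aggregated = majority_vote(labels)
similarity = pairwise_agreement(labels)
```

In this toy data, workers `w1` and `w2` agree on every shared item while `w3` agrees with each of them on only one of three, which is the kind of signal the abstract suggests could help separate experts from non-experts.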


Keywords: Crowdsourcing · Quality control · Worker similarity



This work was partially supported by JSPS KAKENHI Grant Number 15H01704.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. University of Yamanashi, Kofu, Japan
  2. University of Tsukuba, Tsukuba, Japan
  3. Kyoto University, Kyoto, Japan
  4. RIKEN Center for AIP, Tokyo, Japan
