Journal of Intelligent Information Systems, Volume 47, Issue 3, pp 403–425

A formalized framework for incorporating expert labels in crowdsourcing environment

  • Qingyang Hu
  • Qinming He
  • Hao Huang
  • Kevin Chiew
  • Zhenguang Liu


Crowdsourcing services have proven efficient at collecting large amounts of labeled data for supervised learning tasks. However, the low cost of crowd workers comes with unreliable labels, which poses a new challenge for learning a reliable classifier. Although various methods have been proposed to infer the ground truth or to learn directly from crowd data, there is no guarantee that they work well when the crowd labels are highly biased or noisy. Motivated by this limitation of crowd data, in this paper we propose a novel framework that improves the performance of crowdsourcing learning tasks with some additional expert labels. Specifically, we treat each labeler as a personal classifier and combine all labelers' opinions from a model-combination perspective, summarizing the evidence from crowds and experts naturally via a Bayesian classifier in the intermediate feature space formed by the personal classifiers. We also introduce active learning into our framework and propose an uncertainty sampling algorithm for actively acquiring expert labels. Experiments show that our method significantly improves learning quality compared with methods that use crowd labels alone.
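The paper's own model operates in an intermediate feature space of personal classifiers; as a rough, self-contained illustration of the two ingredients the abstract names (a Bayesian combination of labelers' opinions, and uncertainty sampling to pick which instances to send to an expert), the sketch below treats each labeler's binary vote as a feature, estimates per-labeler confusion tables from a few expert-labeled instances, scores unlabeled instances with a naive-Bayes posterior, and queries the instance whose posterior is closest to 0.5. All names and the data here are illustrative, not the authors' implementation.

```python
import math

def fit_confusion(votes, expert_idx, expert_labels):
    """Per-labeler confusion P(vote v | true class c), Laplace-smoothed,
    estimated on the expert-labeled subset of instances."""
    n_labelers = len(votes[0])
    # conf[j][c][v]: smoothed count of labeler j voting v when truth is c
    conf = [[[1.0, 1.0], [1.0, 1.0]] for _ in range(n_labelers)]
    for i, c in zip(expert_idx, expert_labels):
        for j, v in enumerate(votes[i]):
            conf[j][c][v] += 1.0
    for j in range(n_labelers):
        for c in (0, 1):
            total = sum(conf[j][c])
            conf[j][c] = [x / total for x in conf[j][c]]
    return conf

def posterior_pos(vote_row, conf, prior=0.5):
    """Naive-Bayes posterior P(y = 1 | all labelers' votes)."""
    log_odds = math.log(prior / (1.0 - prior))
    for j, v in enumerate(vote_row):
        log_odds += math.log(conf[j][1][v] / conf[j][0][v])
    return 1.0 / (1.0 + math.exp(-log_odds))

def most_uncertain(votes, conf, exclude, k=1):
    """Indices of the k instances without expert labels whose posterior
    is closest to 0.5 -- the candidates for the next expert queries."""
    cand = [i for i in range(len(votes)) if i not in exclude]
    cand.sort(key=lambda i: abs(posterior_pos(votes[i], conf) - 0.5))
    return cand[:k]

# Toy data: 5 instances, 3 crowd labelers; experts labeled instances 0 and 1.
votes = [[1, 1, 0], [0, 0, 0], [1, 0, 1], [1, 1, 1], [0, 1, 0]]
expert_idx, expert_labels = [0, 1], [1, 0]
conf = fit_confusion(votes, expert_idx, expert_labels)
print(most_uncertain(votes, conf, exclude={0, 1}, k=1))  # -> [2]
```

With these confusion estimates, instance 3 (a unanimous positive vote) gets a confident posterior of 0.8, while the conflicting votes on instance 2 leave its posterior at 0.5, so it is the one routed to the expert next.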


Keywords: Crowdsourcing · Multiple annotator · Classification · Classifier fusion · Active learning



This work is partly supported by the Fundamental Research Funds for the Central Universities under Grant No. 2042015kf0038, and the Scientific Research Foundation for Introduced Talents of Wuhan University.



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Qingyang Hu (1)
  • Qinming He (1)
  • Hao Huang (2)
  • Kevin Chiew (3)
  • Zhenguang Liu (1)
  1. College of Computer Science and Technology, Zhejiang University, Hangzhou, China
  2. State Key Laboratory of Software Engineering, Wuhan University, Wuhan, China
  3. Singapore Branch, Handal Indah Sdn Bhd, Johor Bahru, Singapore
