A formalized framework for incorporating expert labels in crowdsourcing environment

Published in: Journal of Intelligent Information Systems

Abstract

Crowdsourcing services have proven efficient at collecting large amounts of labeled data for supervised learning tasks. However, the low cost of crowd workers often leads to unreliable labels, which poses a new challenge for learning a reliable classifier. Although various methods have been proposed to infer the ground truth or to learn directly from crowd data, there is no guarantee that they work well when the crowd labels are highly biased or noisy. Motivated by this limitation of crowd data, in this paper we propose a novel framework for improving the performance of crowdsourcing learning tasks with some additional expert labels: we treat each labeler as a personal classifier, combine all labelers’ opinions from a model-combination perspective, and summarize the evidence from crowds and experts naturally via a Bayesian classifier in the intermediate feature space formed by the personal classifiers. We also introduce active learning into our framework and propose an uncertainty sampling algorithm for actively obtaining expert labels. Experiments show that our method significantly improves learning quality compared with methods that use crowd labels alone.
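To make the combination idea concrete, the sketch below illustrates one possible reading of the framework under simplifying assumptions of our own: binary labels, the raw crowd-label matrix Z used directly as the intermediate feature space (each worker acting as a fixed "personal classifier"), a Bernoulli naive Bayes model standing in for the Bayesian combiner, and margin-based uncertainty sampling to decide which items are sent to the expert. All names (combine_crowd_and_experts, expert_oracle, and so on) are hypothetical and do not reproduce the authors' implementation.

    # Minimal sketch only; assumes binary labels and a complete crowd-label matrix Z.
    import numpy as np
    from sklearn.naive_bayes import BernoulliNB

    def uncertainty_sampling(model, Z, pool_idx, budget):
        # Pick the `budget` unlabeled items whose posterior P(y=1 | crowd labels)
        # is closest to 0.5, i.e. the items the current combiner is least sure about.
        proba = model.predict_proba(Z[pool_idx])[:, 1]
        order = np.argsort(np.abs(proba - 0.5))
        return pool_idx[order[:budget]]

    def combine_crowd_and_experts(Z, expert_oracle, seed_idx, rounds=3, budget=5):
        # Z[i, j] is worker j's binary label for item i; each column plays the role
        # of a "personal classifier", so Z itself is the intermediate feature space.
        labeled = list(seed_idx)
        labels = [expert_oracle(i) for i in seed_idx]
        model = BernoulliNB()
        for _ in range(rounds):
            model.fit(Z[labeled], labels)          # Bayesian combiner trained on expert-labeled items
            pool = np.setdiff1d(np.arange(Z.shape[0]), labeled)
            if pool.size == 0:
                break
            for i in uncertainty_sampling(model, Z, pool, budget):
                labeled.append(int(i))
                labels.append(expert_oracle(int(i)))   # query the expert for the uncertain items
        model.fit(Z[labeled], labels)
        return model

    # Toy usage: 200 items, 5 noisy workers (30% flip rate), an "expert" that knows the truth.
    rng = np.random.default_rng(0)
    truth = rng.integers(0, 2, size=200)
    flips = rng.random((200, 5)) < 0.3
    Z = np.where(flips, 1 - truth[:, None], truth[:, None])
    seed = [int(np.flatnonzero(truth == 0)[0]), int(np.flatnonzero(truth == 1)[0])]
    model = combine_crowd_and_experts(Z, expert_oracle=lambda i: int(truth[i]), seed_idx=seed)
    print("accuracy of combined model:", (model.predict(Z) == truth).mean())

The design point mirrored in this sketch is that expert labels supervise the combiner rather than being merged with the crowd votes, so a handful of expert queries can recalibrate how much weight each worker's opinion receives.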


Notes

  1. We assume that an expert always gives true labels and use the two terms ‘expert labels’ and ‘ground truth’ interchangeably. As experts can also make mistakes, this assumption is a simplification and may be relaxed in future work.

  2. A task for which true labels are completely infeasible to obtain is beyond the scope of our current work, as it is no longer a traditional supervised learning problem.

  3. To keep the notation simple, we assume at this point that every labeler labels all instances. We discuss this assumption on the number of labels and the case of missing labels in Section 3.6.1.

References

  • Ambati, V., Vogel, S., & Carbonell, J. (2010). Active learning and crowd-sourcing for machine translation. In Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC) (Vol. 7, pp. 2169–2174).

  • Bishop, C.M. (2006). Pattern recognition and machine learning. New York: Springer.

  • Brew, A., Greene, D., & Cunningham, P. (2010). Using crowdsourcing and active learning to track sentiment in online media. In Proceedings of the 19th European Conference on Artificial Intelligence (ECAI) (pp. 145–150).

  • Dagan, I., & Engelson, S.P. (1995). Committee-based sampling for training probabilistic classifiers. In Proceedings of the International Conference on Machine Learning (ICML’95) (pp. 150–157).

  • Dawid, A.P., & Skene, A.M. (1979). Maximum likelihood estimation of observer error-rates using the EM algorithm. Applied Statistics, 28(1), 20–28.

  • Donmez, P., Carbonell, J.G., & Schneider, J. (2009). Efficiently learning the accuracy of labeling sources for selective sampling. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 259–268).

  • Finin, T., Murnane, W., Karandikar, A., Keller, N., Martineau, J., & Dredze, M. (2010). Annotating named entities in twitter data with crowdsourcing. In Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon’s Mechanical Turk (pp. 80–88).

  • Frank, A., & Asuncion, A. (2010). UCI machine learning repository.

  • Hu, Q., He, Q., Huang, H., Chiew, K., & Liu, Z. (2014). Learning from crowds under experts’ supervision. In Proceedings of the 18th Pacific-Asia Conference on Knowledge Discovery and Data Mining (pp. 200–211).

  • Ipeirotis, P.G., Provost, F., & Wang, J. (2010). Quality management on amazon mechanical turk. In Proceedings of the ACM SIGKDD Workshop on Human Computation (pp. 64–67).

  • Jaakkola, T., & Jordan, M. (1997). A variational approach to Bayesian logistic regression models and their extensions. In Proceedings of the 6th International Workshop on Artificial Intelligence and Statistics.

  • Kajino, H., Tsuboi, Y., & Kashima, H. (2012). A convex formulation for learning from crowds. In Proceedings of the 26th AAAI Conference on Artificial Intelligence (pp. 73–79).

  • Kajino, H., Tsuboi, Y., Sato, I., & Kashima, H. (2012). Learning from crowds and experts. In Proceedings of the 4th Human Computation Workshop (HCOMP’12) (pp. 107–113).

  • Karger, D.R., Oh, S., & Shah, D. (2011). Iterative learning for reliable crowdsourcing systems. In Proceedings of Advances in Neural Information Processing Systems (NIPS’11) (pp. 1953–1961).

  • Kuncheva, L.I. (2007). Combining pattern classifiers: Methods and algorithms (Kuncheva, L.I.; 2004) [Book review]. IEEE Transactions on Neural Networks, 18(3), 964.

  • Kuncheva, L.I., Bezdek, J.C., & Duin, R.P.W. (2001). Decision templates for multiple classifier fusion: an experimental comparison. Pattern Recognition, 34(2), 299–314.

  • Lewis, D.D., & Gale, W.A. (1994). A sequential algorithm for training text classifiers. In Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 3–12).

  • Liu, Q., Peng, J., & Ihler, A. (2012). Variational inference for crowdsourcing. In Proceedings of Advances in Neural Information Processing Systems (NIPS’12) (pp. 701–709).

  • McCallum, A.K., & Nigam, K. (1998). Employing EM and pool-based active learning for text classification. In Proceedings of the 15th International Conference on Machine Learning (ICML’98).

  • Merz, C.J. (1999). Using correspondence analysis to combine classifiers. Machine Learning, 36(1), 33–58.

  • Raykar, V.C., & Yu, S. (2011). Ranking annotators for crowdsourced labeling tasks. In Proceedings of Advances in Neural Information Processing Systems (NIPS’11) (pp. 1809–1817).

  • Raykar, V.C., Yu, S., Zhao, L.H., Valadez, G.H., Florin, C., Bogoni, L., & Moy, L. (2010). Learning from crowds. The Journal of Machine Learning Research, 11(4), 1297–1322.

  • Seung, H.S., Opper, M., & Sompolinsky, H. (1992). Query by committee. In Proceedings of the 5th Annual Workshop on Computational Learning Theory (pp. 287–294): ACM.

  • Sheng, V.S., Provost, F., & Ipeirotis, P.G. (2008). Get another label? Improving data quality and data mining using multiple, noisy labelers. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 614–622).

  • Snow, R., O’Connor, B., Jurafsky, D., & Ng, A. (2008). Cheap and fast—but is it good? Evaluating non-expert annotations for natural language tasks. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (pp. 254–263): Association for Computational Linguistics.

  • Tang, W., & Lease, M. (2011). Semi-supervised consensus labeling for crowdsourcing. In Proceedings of the 2nd ACM SIGIR Workshop on Crowdsourcing for Information Retrieval (CIR’11).

  • Tian, Y., & Zhu, J. (2012). Learning from crowds in the presence of schools of thought. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 226–234).

  • Wauthier, F.L., & Jordan, M.I. (2011). Bayesian bias mitigation for crowdsourcing. In Proceedings of Advances in Neural Information Processing Systems (NIPS’11) (pp. 1800–1808).

  • Welinder, P., Branson, S., Belongie, S., & Perona, P. (2010). The multidimensional wisdom of crowds. In Proceedings of Advances in Neural Information Processing Systems (NIPS’10) (pp. 2424–2432).

  • Wolpert, D.H. (1992). Stacked generalization. Neural Networks, 5(2), 241–259.

  • Yan, Y., et al. (2010). Modeling annotator expertise: Learning when everybody knows a bit of something. In Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS’10) (Vol. 9, pp. 932–939).

  • Yan, Y., Rosales, R., Fung, G., & Dy, J. (2011). Active learning from crowds. In Proceedings of the International Conference on Machine Learning (ICML’11) (pp. 1161–1168).

Download references

Acknowledgments

This work was partly supported by the Fundamental Research Funds for the Central Universities under Grant No. 2042015kf0038 and by the Scientific Research Foundation for Introduced Talents of Wuhan University.

Author information

Corresponding author

Correspondence to Qinming He.

Additional information

A conference version of our framework can be found in Hu et al. (2014).

About this article

Cite this article

Hu, Q., He, Q., Huang, H. et al. A formalized framework for incorporating expert labels in crowdsourcing environment. J Intell Inf Syst 47, 403–425 (2016). https://doi.org/10.1007/s10844-015-0371-6
