Abstract
With the rapid development of crowdsourcing learning, inferring (integrating) true labels from multiple noisy label sets, also called label integration, has become a hot research topic, and many label integration methods have been proposed. However, because the uncertainty of crowdsourced labelers varies, inferring true labels from multiple noisy label sets remains challenging. In this paper, we transform the label integration problem into an optimization problem and propose a differential evolution-based weighted majority voting method, DEWMV for short, for label integration. DEWMV uses a purpose-designed differential evolution (DE) algorithm to search for the voting quality of each label and weights each label's vote accordingly. In DEWMV, we define three fitness functions, namely the uncertainty of the integrated label, the uncertainty of the class membership probability, and the hybrid uncertainty, to search for the optimal voting quality of each label. After theoretically analyzing their effectiveness, we choose the hybrid uncertainty as the final fitness function for DEWMV. Experimental results on 14 real-world datasets show that DEWMV outperforms standard majority voting (MV) and all the other state-of-the-art label integration methods in the comparison.
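To make the idea concrete, the following is a minimal Python sketch of weighted majority voting whose per-label weights are searched by a basic DE/rand/1/bin loop. It is not the authors' implementation. The function names (`weighted_vote`, `mean_entropy`, `de_search`), the encoding of missing labels as -1, the weight range [0, 1], and the DE parameters are all assumptions made for illustration. The fitness used here, the mean Shannon entropy of the class membership probabilities, is only a stand-in: the abstract does not define the three fitness functions, and in particular this is not the hybrid uncertainty chosen for DEWMV. Minimizing entropy alone can also reward degenerate weightings that concentrate all weight on a single label.

```python
import numpy as np


def weighted_vote(labels, weights, n_classes):
    """Weighted majority voting.

    labels:  (n_instances, n_workers) int array; -1 marks a missing label (assumed encoding).
    weights: array of the same shape, one voting weight per crowd label.
    Returns (class membership probabilities, integrated labels).
    """
    scores = np.zeros((labels.shape[0], n_classes))
    for c in range(n_classes):
        # Sum the weights of all labels that vote for class c.
        scores[:, c] = np.where(labels == c, weights, 0.0).sum(axis=1)
    totals = scores.sum(axis=1, keepdims=True)
    # Fall back to a uniform distribution when an instance has zero total weight.
    probs = np.where(totals > 0, scores / np.maximum(totals, 1e-12), 1.0 / n_classes)
    return probs, probs.argmax(axis=1)


def mean_entropy(probs):
    """Stand-in fitness (assumption): mean Shannon entropy of the class probabilities."""
    p = np.clip(probs, 1e-12, 1.0)
    return float(-(p * np.log(p)).sum(axis=1).mean())


def de_search(labels, n_classes, pop_size=20, generations=50, f=0.5, cr=0.9, seed=0):
    """Search per-label weights with a plain DE/rand/1/bin loop (minimization)."""
    rng = np.random.default_rng(seed)
    dim = labels.size

    def fitness(flat_weights):
        probs, _ = weighted_vote(labels, flat_weights.reshape(labels.shape), n_classes)
        return mean_entropy(probs)

    pop = rng.uniform(0.0, 1.0, size=(pop_size, dim))
    scores = np.array([fitness(ind) for ind in pop])
    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: combine three distinct individuals other than i.
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, size=3, replace=False)]
            mutant = np.clip(a + f * (b - c), 0.0, 1.0)
            # Binomial crossover; force at least one gene from the mutant.
            cross = rng.random(dim) < cr
            cross[rng.integers(dim)] = True
            trial = np.where(cross, mutant, pop[i])
            # Greedy selection: keep the trial if it is no worse.
            trial_score = fitness(trial)
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = pop[scores.argmin()]
    return best.reshape(labels.shape), float(scores.min())
```

With all weights set to 1, `weighted_vote` reduces to standard majority voting, which gives a convenient baseline for comparing any searched weighting.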
This work was partially supported by NSFC (U1711267).
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Zhang, H., Jiang, L., Xu, W. (2018). Differential Evolution-Based Weighted Majority Voting for Crowdsourcing. In: Geng, X., Kang, BH. (eds) PRICAI 2018: Trends in Artificial Intelligence. PRICAI 2018. Lecture Notes in Computer Science, vol 11013. Springer, Cham. https://doi.org/10.1007/978-3-319-97310-4_26
DOI: https://doi.org/10.1007/978-3-319-97310-4_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-97309-8
Online ISBN: 978-3-319-97310-4
eBook Packages: Computer Science, Computer Science (R0)