Crowd Learning with Candidate Labeling: An EM-Based Solution

  • Conference paper
Advances in Artificial Intelligence (CAEPIA 2018)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11160)

Abstract

Crowdsourcing is widely used in machine learning for data labeling. Traditionally, annotators are asked to provide a single label for each instance; more recent approaches allow annotators who are in doubt to choose a subset of labels (a candidate set), as a way to extract more information from them. In both settings, the reliability of the labelers can be modeled from the collections of labels that they provide. In this paper, we propose an Expectation-Maximization-based method for crowdsourced data with candidate sets. The method iteratively maximizes the likelihood of the parameters that model the reliability of the labelers while estimating the ground truth. The experimental results suggest that the proposed method performs better than the baseline aggregation schemes in terms of estimated accuracy.
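
To make the idea of EM over candidate sets concrete, the following is a minimal sketch under a deliberately simplified model that is an assumption of this example and not necessarily the model used in the paper: every annotator labels every instance, includes the true class in the candidate set with probability `alpha[a]`, and includes each wrong class independently with probability `beta[a]`. The function name `em_candidate_sets` and all parameter names are hypothetical. The posterior over true classes is initialised with candidate (approval) voting, one plausible baseline aggregation scheme.

```python
import numpy as np


def em_candidate_sets(L, n_iter=50, eps=1e-6):
    """Toy EM for candidate-set labels (illustrative model, not the paper's).

    L: binary array, shape (n_annotators, n_instances, n_classes);
       L[a, i, c] = 1 if annotator a put class c in the candidate set of instance i.
    Assumed model: annotator a includes the true class with probability alpha[a]
    and each wrong class, independently, with probability beta[a].
    """
    L = np.asarray(L, dtype=float)
    A, N, K = L.shape
    n_sel = L.sum(axis=2)                      # size of each candidate set, (A, N)

    # Initialise the posterior over true classes with candidate (approval) voting.
    q = L.sum(axis=0) + eps
    q /= q.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # M-step: re-estimate class prior and annotator parameters from q.
        prior = np.clip(q.mean(axis=0), eps, None)
        prior /= prior.sum()
        hit = np.einsum("ic,aic->a", q, L)     # expected inclusions of the true class
        alpha = np.clip(hit / N, eps, 1 - eps)
        beta = np.clip((n_sel.sum(axis=1) - hit) / (N * (K - 1)), eps, 1 - eps)

        # E-step: log-likelihood of every candidate set under each possible true class.
        a_ = alpha[:, None, None]
        b_ = beta[:, None, None]
        wrong_in = n_sel[:, :, None] - L       # wrong classes included if c were true
        wrong_out = (K - 1) - wrong_in         # wrong classes left out if c were true
        log_lik = (L * np.log(a_) + (1 - L) * np.log(1 - a_)
                   + wrong_in * np.log(b_) + wrong_out * np.log(1 - b_))
        log_q = np.log(prior)[None, :] + log_lik.sum(axis=0)
        log_q -= log_q.max(axis=1, keepdims=True)   # stabilise before exponentiating
        q = np.exp(log_q)
        q /= q.sum(axis=1, keepdims=True)

    return q, alpha, beta
```

Calling `q, alpha, beta = em_candidate_sets(L)` returns the estimated posterior over true classes for each instance (`q.argmax(axis=1)` gives a ground-truth estimate) together with the per-annotator reliability parameters. Under this assumed model, an annotator who often includes the true class while keeping candidate sets small ends up with a high `alpha` and a low `beta`, and therefore carries more weight than under plain voting.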

Acknowledgments

IBM and AP are both supported by the Spanish Ministry MINECO through BCAM Severo Ochoa excellence accreditation SEV-2013-0323 and the project TIN2017-82626-R funded by (AEI/FEDER, UE). IBM is also supported by the grant BES-2016-078095. AP is also supported by the Basque Government through the BERC 2014-2017 and the ELKARTEK programs, and by the MINECO through BCAM Severo Ochoa excellence accreditation SVP-2014-068574. JHG is supported by the Basque Government (IT609-13, Elkartek BID3A) and the MINECO (TIN2016-78365-R).

Author information

Corresponding author

Correspondence to Iker Beñaran-Muñoz.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Beñaran-Muñoz, I., Hernández-González, J., Pérez, A. (2018). Crowd Learning with Candidate Labeling: An EM-Based Solution. In: Herrera, F., et al. Advances in Artificial Intelligence. CAEPIA 2018. Lecture Notes in Computer Science, vol 11160. Springer, Cham. https://doi.org/10.1007/978-3-030-00374-6_2

  • DOI: https://doi.org/10.1007/978-3-030-00374-6_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00373-9

  • Online ISBN: 978-3-030-00374-6

  • eBook Packages: Computer Science, Computer Science (R0)
