Adapting Supervised Classification Algorithms to Arbitrary Weak Label Scenarios

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10584)


In many real-world problems, labels are weak: each instance is labelled as belonging to one of several candidate categories, at most one of which is the true one. Recent theoretical contributions have shown that proper losses, or classification-calibrated losses, for weakly labelled scenarios can be constructed by applying a linear transformation to conventional proper or classification-calibrated losses, respectively. However, how to translate these theoretical results into practice has not yet been explored. This paper discusses both the algorithmic design and the potential advantages of this approach, analysing the consistency and convexity issues that arise in practical settings and evaluating the behaviour of such transformations under different types of weak labels.
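The linear-transformation idea in the abstract can be sketched numerically. Assuming a known mixing matrix M, with M[z, c] the probability of observing weak label z given true class c, a "virtual label" matrix Y satisfying M.T @ Y = I turns ordinary cross-entropy into a weak-label loss whose expectation equals the clean loss. This is a minimal illustration under those assumptions; the function names are hypothetical, not taken from the paper.

```python
import numpy as np

def virtual_label_matrix(M):
    """Return Y with M.T @ Y = I, so that the expected weak-label loss
    equals the clean loss: E_{z|c}[weak_cross_entropy(z, q)] = -log q[c]."""
    return np.linalg.pinv(M.T)  # shape: (n_weak_labels, n_classes)

def weak_cross_entropy(z, q, Y, eps=1e-12):
    """Transformed cross-entropy for observed weak label index z
    and predicted class posterior q (a probability vector)."""
    return -float(Y[z] @ np.log(q + eps))

# Example: 2 true classes, 3 possible weak labels.
# Columns of M sum to 1: M[z, c] = P(weak label z | true class c).
M = np.array([[0.8, 0.1],
              [0.1, 0.8],
              [0.1, 0.1]])
Y = virtual_label_matrix(M)
q = np.array([0.7, 0.3])

# Unbiasedness check: averaging the weak loss over z ~ P(z | c)
# recovers the ordinary cross-entropy -log q[c] for each class c.
for c in range(2):
    expected = sum(M[z, c] * weak_cross_entropy(z, q, Y) for z in range(3))
    assert np.isclose(expected, -np.log(q[c] + 1e-12))
```

In the fully supervised special case, M is the identity, Y reduces to the identity, and the transformed loss is exactly the standard cross-entropy.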


Weak labels · Noisy labels · Proper losses



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. University of Bristol, Bristol, UK
  2. Universidad Carlos III de Madrid, Leganés, Spain
