On Binary Reduction of Large-Scale Multiclass Classification Problems

  • Bikash Joshi
  • Massih-Reza Amini
  • Ioannis Partalas
  • Liva Ralaivola
  • Nicolas Usunier
  • Eric Gaussier
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9385)

Abstract

In the context of large-scale problems, traditional multiclass classification approaches have to deal with class imbalance and complexity issues, which make them inoperative in some extreme cases. In this paper, we study a transformation that reduces the initial multiclass classification of examples into a binary classification of pairs of examples and classes. We present generalization error bounds that exhibit the interdependency between the pairs of examples and that recover known results on binary classification with i.i.d. data. We show the efficiency of the resulting algorithm compared to state-of-the-art multiclass classification strategies on two large-scale document collections, especially in the interesting case where the number of classes becomes very large.
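
As a rough illustration of the kind of reduction the abstract describes, the sketch below turns each labeled example into binary-labeled pairs of (example, class) representations, trains a single linear binary classifier on them, and predicts by scoring every class. It is a minimal sketch under stated assumptions and not the paper's construction: the block-wise `joint_features` map, the use of both pair orderings, the perceptron learner, and all function names are hypothetical choices made for this example only.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def joint_features(x: Sequence[float], k: int, num_classes: int) -> Vector:
    """Hypothetical joint representation of (example x, class k):
    x is copied into the k-th block of a zero vector (an assumption)."""
    d = len(x)
    phi = [0.0] * (d * num_classes)
    phi[k * d:(k + 1) * d] = list(x)
    return phi


def reduce_to_binary(
    examples: Sequence[Tuple[Sequence[float], int]], num_classes: int
) -> List[Tuple[Vector, int]]:
    """For each (x, y) and each wrong class k, emit two binary examples:
    (phi(x, y) - phi(x, k), +1) and its negation with label -1."""
    pairs: List[Tuple[Vector, int]] = []
    for x, y in examples:
        phi_true = joint_features(x, y, num_classes)
        for k in range(num_classes):
            if k == y:
                continue
            phi_other = joint_features(x, k, num_classes)
            diff = [a - b for a, b in zip(phi_true, phi_other)]
            pairs.append((diff, +1))
            pairs.append(([-v for v in diff], -1))
    return pairs


def train_perceptron(
    pairs: Sequence[Tuple[Vector, int]], dim: int, epochs: int = 10
) -> Vector:
    """Plain perceptron on the binary pairs (a stand-in for any binary learner)."""
    w = [0.0] * dim
    for _ in range(epochs):
        for z, label in pairs:
            score = sum(wi * zi for wi, zi in zip(w, z))
            if label * score <= 0:  # misclassified or on the boundary
                w = [wi + label * zi for wi, zi in zip(w, z)]
    return w


def predict(w: Vector, x: Sequence[float], num_classes: int) -> int:
    """Score every (x, class) pair with the binary model and return the best class."""
    scores = [
        sum(wi * pi for wi, pi in zip(w, joint_features(x, k, num_classes)))
        for k in range(num_classes)
    ]
    return max(range(num_classes), key=scores.__getitem__)
```

Note that the pairs built from the same example share that example, so they are not independent of one another; this is the interdependency that the generalization bounds mentioned above are said to account for.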

Keywords

Binary Classification · Multiclass Classification · Binary Problem · Proper Cover · Error Correct Output Code
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

Acknowledgments

This work is partially supported by the LabEx PERSYVAL-Lab ANR-11-LABX-0025, and Titan CNRS-Mastodons.

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Bikash Joshi (1)
  • Massih-Reza Amini (1)
  • Ioannis Partalas (2)
  • Liva Ralaivola (3)
  • Nicolas Usunier (4)
  • Eric Gaussier (1)

  1. Grenoble Informatics Laboratory, University of Grenoble Alpes, Saint Martin D’heres, France
  2. R.&D. Department, VISEO, Grenoble, France
  3. Fundamental Informatics Laboratory, Université Aix-Marseille, Marseille, France
  4. Université Technologique de Compiègne, Heudiasyc, Compiègne, France
