Abstract
Multiclass learning is widely solved by reduction to a set of binary problems. Treating the base binary classifiers as black boxes, we analyze the generalization errors of various constructions, including Max-Win, Decision Directed Acyclic Graphs, Adaptive Directed Acyclic Graphs, and the unifying approach based on a coding matrix with Hamming decoding due to Allwein, Schapire, and Singer, using only elementary probabilistic tools. Many of these bounds are new, and some are much simpler than previously known ones. The technique also yields a simple proof that learnability (and polynomial learnability) of the multiclass problem is equivalent to that of the induced pairwise problems.
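The two best-known reductions named above are easy to state concretely. The sketch below is illustrative only (it is not from the paper): `max_win` implements one-vs-one Max-Win voting, and `hamming_decode` implements coding-matrix decoding, where each class is assigned a row of a ±1 matrix and the predicted class is the row closest in Hamming distance to the vector of binary outputs. The matrix `M` and the `pairwise_predict` callback are hypothetical stand-ins for trained binary classifiers.

```python
import numpy as np

def hamming_decode(M, predictions):
    """Coding-matrix decoding: return the class whose codeword (row of M)
    is closest in Hamming distance to the binary classifiers' outputs.

    M: (k, l) matrix with entries in {-1, +1}; row r is the codeword for
       class r, and column j defines binary problem j.
    predictions: length-l vector of {-1, +1} outputs of the l classifiers.
    """
    distances = np.sum(M != np.asarray(predictions), axis=1)
    return int(np.argmin(distances))

def max_win(k, pairwise_predict):
    """Max-Win (one-vs-one voting): query a classifier for each pair
    (i, j), give the winner one vote, and return the class with the most
    votes. pairwise_predict(i, j) must return either i or j.
    """
    votes = [0] * k
    for i in range(k):
        for j in range(i + 1, k):
            votes[pairwise_predict(i, j)] += 1
    return max(range(k), key=lambda r: votes[r])
```

With the one-vs-all coding matrix for three classes, a prediction vector that matches the second row exactly decodes to class 1; this is the special case where Hamming decoding reduces to picking the unique positive output.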
References
Freund, Y., Schapire, R.E.: A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 55, 119–139 (1997)
Vapnik, V.: Statistical Learning Theory. Wiley, Chichester (1998)
Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20, 273–297 (1995)
Friedman, J.H.: Another approach to polychotomous classification. Technical report, Department of Statistics, Stanford University (1996)
Hastie, T., Tibshirani, R.: Classification by pairwise coupling. In: NIPS 1997: Proceedings of the 1997 conference on Advances in neural information processing systems 10, pp. 507–513. MIT Press, Cambridge (1998)
Platt, J., Cristianini, N., Shawe-Taylor, J.: Large margin DAGs for multiclass classification. In: Advances in Neural Information Processing Systems, vol. 12. MIT Press, Cambridge (2000)
Kreßel, U.H.G.: Pairwise classification and support vector machines. In: Advances in kernel methods: support vector learning, pp. 255–268. MIT Press, Cambridge (1999)
Kijsirikul, B., Ussivakul, N., Meknavin, S.: Adaptive directed acyclic graphs for multiclass classification. In: PRICAI 2002, pp. 158–168 (2002)
Dietterich, T.G., Bakiri, G.: Error-correcting output codes: a general method for improving multiclass inductive learning programs. In: Dean, T.L., McKeown, K. (eds.) Proceedings of the Ninth AAAI National Conference on Artificial Intelligence, pp. 572–577. AAAI Press, Menlo Park (1991)
Guruswami, V., Sahai, A.: Multiclass learning, boosting, and error-correcting codes. In: Computational Learning Theory, pp. 145–155 (1999)
Allwein, E.L., Schapire, R.E., Singer, Y.: Reducing multiclass to binary: a unifying approach for margin classifiers. J. Mach. Learn. Res. 1, 113–141 (2001)
Har-Peled, S., Roth, D., Zimak, D.: Constraint classification: A new approach to multiclass classification and ranking. In: NIPS (2003)
Bar-Hillel, A., Weinshall, D.: Learning with equivalence constraints, and the relation to multiclass learning. In: COLT (2003)
Fakcharoenphol, J.: A note on random DDAG. Manuscript (2003)
Schapire, R.E., Freund, Y., Bartlett, P.L., Lee, W.S.: Boosting the margin: a new explanation for the effectiveness of voting methods. Annals of Statistics 26, 1651–1686 (1998)
Paugam-Moisy, H., Elisseeff, A., Guermeur, Y.: Generalization performance of multiclass discriminant models. In: Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks, IJCNN 2000, Neural Computing: New Challenges and Perspectives for the New Millennium, Como, Italy, July 24-27, vol. 4. IEEE, Los Alamitos (2000)
Bennett, K.P., Cristianini, N., Shawe-Taylor, J., Wu, D.: Enlarging the margins in perceptron decision trees. Mach. Learn. 41, 295–313 (2000)
Shawe-Taylor, J., Bartlett, P.L., Williamson, R.C., Anthony, M.: A framework for structural risk minimisation. In: COLT 1996: Proceedings of the ninth annual conference on Computational learning theory, pp. 68–76. ACM Press, New York (1996)
Ben-David, S., Cesa-Bianchi, N., Haussler, D., Long, P.M.: Characterizations of learnability for classes of {0, …, n}-valued functions. J. Comput. Syst. Sci. 50, 74–86 (1995)
Natarajan, B.K.: On learning sets and functions. Mach. Learn. 4, 67–97 (1989)
Vapnik, V.N., Chervonenkis, A.Y.: On the uniform convergence of relative frequencies of events to their probabilities. Theory Probab. Appl. 16, 264–280 (1971)
Blumer, A., Ehrenfeucht, A., Haussler, D., Warmuth, M.K.: Learnability and the Vapnik-Chervonenkis dimension. J. ACM 36, 929–965 (1989)
Phetkaew, T., Kijsirikul, B., Rivepiboon, W.: Reordering adaptive directed acyclic graphs for multiclass support vector machines. In: Proceedings of the Third International Conference on Intelligent Technologies, InTech 2002 (2002)
Klautau, A., Jevtić, N., Orlitsky, A.: On nearest-neighbor error-correcting output codes with application to all-pairs multiclass support vector machines. J. Mach. Learn. Res. 4, 1–15 (2003)
Rifkin, R., Klautau, A.: In defense of one-vs-all classification. J. Mach. Learn. Res. 5, 101–141 (2004)
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
Cite this paper
Fakcharoenphol, J., Kijsirikul, B. (2005). Constructing Multiclass Learners from Binary Learners: A Simple Black-Box Analysis of the Generalization Errors. In: Jain, S., Simon, H.U., Tomita, E. (eds) Algorithmic Learning Theory. ALT 2005. Lecture Notes in Computer Science, vol 3734. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11564089_12
DOI: https://doi.org/10.1007/11564089_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29242-5
Online ISBN: 978-3-540-31696-1
eBook Packages: Computer Science (R0)