Which Is the Best Multiclass SVM Method? An Empirical Study

  • Kai-Bo Duan
  • S. Sathiya Keerthi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3541)


Multiclass SVMs are usually implemented by combining several two-class SVMs. The one-versus-all method with the winner-takes-all strategy and the one-versus-one method implemented by max-wins voting are widely used for this purpose. In this paper we give empirical evidence that these methods are inferior to another one-versus-one method: one that uses Platt’s posterior probabilities together with the pairwise coupling idea of Hastie and Tibshirani. The evidence is particularly strong when the training dataset is sparse.
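The two one-versus-one decision rules contrasted above can be sketched in a few lines of NumPy. The sketch below assumes the pairwise estimates r_ij ≈ P(class i | class i or j) have already been produced, e.g. by Platt scaling of each two-class SVM output; `max_wins` implements max-wins voting, and `pairwise_coupling` implements the iterative Hastie and Tibshirani coupling scheme under the simplifying assumption of equal pair weights. Function names are illustrative, not from the paper.

```python
import numpy as np

def max_wins(R):
    """Max-wins voting: class i gets one vote for each pairwise duel it wins.

    R[i, j] is the pairwise estimate of P(class i | class i or class j),
    with R[j, i] = 1 - R[i, j]; the diagonal is ignored.
    """
    R = np.asarray(R, dtype=float).copy()
    np.fill_diagonal(R, 0.0)
    votes = (R > 0.5).sum(axis=1)
    return int(np.argmax(votes))

def pairwise_coupling(R, n_iter=1000, tol=1e-10):
    """Hastie-Tibshirani pairwise coupling (equal pair counts assumed).

    Iteratively rescales the class-probability vector p so that the induced
    pairwise probabilities mu[i, j] = p[i] / (p[i] + p[j]) match the
    observed estimates R; returns the coupled probability vector p.
    """
    R = np.asarray(R, dtype=float).copy()
    K = R.shape[0]
    np.fill_diagonal(R, 0.0)
    p = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        mu = p[:, None] / (p[:, None] + p[None, :])
        np.fill_diagonal(mu, 0.0)
        p_new = p * (R.sum(axis=1) / mu.sum(axis=1))
        p_new /= p_new.sum()
        if np.abs(p_new - p).max() < tol:
            return p_new
        p = p_new
    return p

# Toy example: pairwise estimates exactly consistent with p = (0.5, 0.3, 0.2)
p_true = np.array([0.5, 0.3, 0.2])
R = p_true[:, None] / (p_true[:, None] + p_true[None, :])
print(max_wins(R))           # class 0 wins both of its pairwise duels
print(pairwise_coupling(R))  # converges to approximately [0.5, 0.3, 0.2]
```

With consistent pairwise estimates, both rules agree; the paper's point is that when the r_ij are noisy (e.g. a sparse training set), the coupled probabilities degrade more gracefully than hard votes.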


Keywords: Support Vector Machine, Multiclass Problem, Kernel Logistic Regression, Pairwise Coupling, Multiclass Method




References

  1. Boser, B., Guyon, I., Vapnik, V.: A training algorithm for optimal margin classifiers. In: Fifth Annual Workshop on Computational Learning Theory, pp. 144–152. ACM, Pittsburgh (1992)
  2. Dietterich, T., Bakiri, G.: Solving multiclass learning problems via error-correcting output codes. Journal of Artificial Intelligence Research 2, 263–286 (1995)
  3. Duan, K.-B., Keerthi, S.S.: Which is the best multiclass SVM method? An empirical study. Technical Report CD-03-12, Control Division, Department of Mechanical Engineering, National University of Singapore (2003)
  4. Hastie, T., Tibshirani, R.: Classification by pairwise coupling. In: Jordan, M.I., Kearns, M.J., Solla, S.A. (eds.) Advances in Neural Information Processing Systems, vol. 10. MIT Press, Cambridge (1998)
  5. Hsu, C.-W., Lin, C.-J.: A comparison of methods for multi-class support vector machines. IEEE Transactions on Neural Networks 13, 415–425 (2002)
  6. Lin, H.-T., Lin, C.-J., Weng, R.C.: A note on Platt’s probabilistic outputs for support vector machines (2003)
  7. Platt, J.: Probabilistic outputs for support vector machines and comparison to regularized likelihood methods. In: Smola, A.J., Bartlett, P., Schölkopf, B., Schuurmans, D. (eds.) Advances in Large Margin Classifiers, pp. 61–74. MIT Press, Cambridge (1999)
  8. Platt, J., Cristianini, N., Shawe-Taylor, J.: Large margin DAGs for multiclass classification. In: Advances in Neural Information Processing Systems, vol. 12, pp. 543–557. MIT Press, Cambridge (2000)
  9. Rifkin, R., Klautau, A.: In defense of one-versus-all classification. Journal of Machine Learning Research 5, 101–141 (2004)
  10. Roth, V.: Probabilistic discriminative kernel classifiers for multi-class problems. In: Radig, B., Florczyk, S. (eds.) DAGM 2001. LNCS, vol. 2191, pp. 246–253. Springer, Heidelberg (2001)
  11. Vapnik, V.: Statistical Learning Theory. Wiley-Interscience, Hoboken (1998)
  12. Wu, T.-F., Lin, C.-J., Weng, R.C.: Probability estimates for multi-class classification by pairwise coupling. Journal of Machine Learning Research 5, 975–1005 (2004)

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Kai-Bo Duan: BioInformatics Research Centre, Nanyang Technological University, Singapore
  • S. Sathiya Keerthi: Yahoo! Research Labs, Pasadena, USA
