An Empirical Comparison of Hierarchical vs. Two-Level Approaches to Multiclass Problems

  • Suju Rajan
  • Joydeep Ghosh
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3077)

Abstract

The Error Correcting Output Codes (ECOC) framework provides a powerful and popular method for solving multiclass problems using a multitude of binary classifiers. We recently introduced the Binary Hierarchical Classifier (BHC) architecture [10], which addresses multiclass classification problems using a set of binary classifiers organized as a hierarchy. Unlike ECOCs, the BHC groups classes according to their natural affinities in order to make each binary problem easier. However, it cannot exploit the powerful error-correcting properties of an ECOC ensemble, which can provide good results even when the individual classifiers are weak. In this paper, we provide an empirical comparison of the two approaches on a variety of datasets, using well-tuned SVMs as the base classifiers. The results show that while neither technique has a clear advantage in terms of classification accuracy, BHCs typically achieve comparable performance using fewer classifiers and have the added advantage of automatically generating a hierarchy of classes. Such hierarchies often provide a valuable tool for extracting domain knowledge and yield better results when a coarser granularity of the output space is acceptable.
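To make the contrast drawn in the abstract concrete, the following is a minimal sketch of ECOC-style decoding: each class is assigned a codeword of binary-classifier targets, and a test point is assigned to the class whose codeword is closest in Hamming distance to the vector of binary predictions. The code matrix, the helper ecoc_decode, and the example outputs are illustrative assumptions for a 4-class problem, not material from the paper.

```python
import numpy as np

# Illustrative 4-class ECOC code matrix with 7 binary dichotomies
# (rows = classes, columns = binary problems, entries in {+1, -1}).
# Any two rows differ in at least 4 positions, so one wrong binary
# decision can still be corrected. This matrix is assumed for
# illustration only, not the coding used in the paper.
CODE_MATRIX = np.array([
    [+1, +1, +1, +1, +1, +1, +1],   # class 0
    [-1, -1, -1, -1, +1, +1, +1],   # class 1
    [-1, -1, +1, +1, -1, -1, +1],   # class 2
    [-1, +1, -1, +1, -1, +1, -1],   # class 3
])

def ecoc_decode(binary_outputs, codes):
    """Return the class whose codeword is nearest (in Hamming distance)
    to the signs of the binary classifier outputs."""
    hard = np.sign(binary_outputs)              # harden soft outputs to +/-1
    distances = np.sum(hard != codes, axis=1)   # Hamming distance to each row
    return int(np.argmin(distances))

# Outputs of the 7 (hypothetical) binary classifiers on one test point:
# the codeword of class 2 with its first bit flipped.
outputs = np.array([+1, -1, +1, +1, -1, -1, +1])
print(ecoc_decode(outputs, CODE_MATRIX))        # -> 2, despite one bit error
```

Because the rows of this illustrative matrix differ in at least four positions, decoding recovers the correct class even when one of the seven binary classifiers errs; this is the error-correcting property of ECOC ensembles that the BHC, which instead builds one binary problem per node of a class hierarchy, gives up in exchange for easier binary problems and fewer classifiers.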

References

  1. Vapnik, V.: The Nature of Statistical Learning Theory. Springer, New York (1995)
  2. Hsu, C., Lin, C.: A Comparison of Methods for Multiclass Support Vector Machines. IEEE Transactions on Neural Networks 13, 415–425 (2002)
  3. Nilsson, N.J.: Learning Machines. McGraw-Hill, New York (1965)
  4. Fürnkranz, J.: Round Robin Classification. Journal of Machine Learning Research 2, 721–747 (2002)
  5. Hastie, T., Tibshirani, R.: Classification by Pairwise Coupling. In: Advances in Neural Information Processing Systems, vol. 10. The MIT Press, Cambridge (1998)
  6. Dietterich, T.G., Bakiri, G.: Solving Multiclass Learning Problems via Error-Correcting Output Codes. Journal of Artificial Intelligence Research 2, 263–286 (1995)
  7. Allwein, E.L., Schapire, R.E., Singer, Y.: Reducing Multiclass to Binary: A Unifying Approach for Margin Classifiers. In: Proc. 17th International Conf. on Machine Learning, pp. 9–16. Morgan Kaufmann, San Francisco (2000)
  8. Crammer, K., Singer, Y.: On the Learnability and Design of Output Codes for Multiclass Problems. In: Computational Learning Theory, pp. 35–46 (2000)
  9. Rifkin, R., Klautau, A.: In Defense of One-Vs-All Classification. Journal of Machine Learning Research 5, 101–141 (2004)
  10. Kumar, S., Ghosh, J., Crawford, M.M.: Hierarchical Fusion of Multiple Classifiers for Hyperspectral Data Analysis. Pattern Analysis and Applications 5(2), Special Issue on Fusion of Multiple Classifiers, 210–220 (2002)
  11. Morgan, T.J., Henneguelle, A., Ham, J., Ghosh, J., Crawford, M.M.: Adaptive Feature Spaces for Land Cover Classification with Limited Ground Truth Data. In: Kittler, J., Roli, F. (eds.) International Journal of Pattern Recognition and Artificial Intelligence (2004) (to appear)
  12. Kumar, S., Ghosh, J.: GAMLS: A Generalized Framework for Associative Modular Learning Systems. In: Application and Science of Computational Intelligence II, SPIE, vol. 3722, pp. 24–35 (1999)
  13. Kittler, J., Ahmadyfard, A., Windridge, D.: Serial Multiple Classifier Systems Exploiting a Coarse to Fine Output Coding. In: Windeatt, T., Roli, F. (eds.) MCS 2003. LNCS, vol. 2709, pp. 96–104. Springer, Heidelberg (2003)
  14. Sejnowski, T.J., Rosenberg, C.R.: Parallel Networks that Learn to Pronounce English Text. Complex Systems 1, 145–168 (1987)
  15. Kong, E.B., Dietterich, T.G.: Error-Correcting Output Coding Corrects Bias and Variance. In: International Conference on Machine Learning, pp. 313–321 (1995)
  16. Bose, R.C., Ray-Chaudhuri, D.K.: On a Class of Error Correcting Binary Group Codes. Information and Control 3, 68–79 (1960)
  17. Tapia, E., Gonzalez, J.C., Garcia-Villalba, J.: Good Error Correcting Output Codes for Adaptive Multiclass Learning. In: Windeatt, T., Roli, F. (eds.) MCS 2003. LNCS, vol. 2709, pp. 156–165. Springer, Heidelberg (2003)
  18. Blake, C.L., Merz, C.J.: UCI Repository of Machine Learning Databases. University of California, Department of Information and Computer Science, Irvine, CA (1998), http://www.ics.uci.edu/~mlearn/MLRepository.html

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Suju Rajan (1)
  • Joydeep Ghosh (1)
  1. Laboratory of Artificial Neural Systems, Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, USA
