Unsupervised Data-Driven Partitioning of Multiclass Problems

  • Hernán C. Ahumada
  • Guillermo L. Grinblat
  • Pablo M. Granitto
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6791)

Abstract

Many classification problems of high technological value are multiclass. In recent years, several improved solutions based on combinations of simple classifiers have been introduced. One interesting family of methods creates a hierarchy of sub-problems by clustering prototypes of each class, but the solution produced by the clustering stage is heavily influenced by label information. In this work we introduce a new strategy for solving multiclass problems that makes more use of spatial information than other methods. Building on our previous work on imbalanced problems, we construct a hierarchy of sub-problems but, in contrast to previous developments, we base it only on spatial information and do not use class labels at any time. We consider different clustering methods (either agglomerative or divisive) for this task. We use an SVM for each sub-problem where one is needed; in several cases the clustering method directly yields a subset containing samples of a single class. Using publicly available datasets, we compare the new method with several previous approaches and find promising results.
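
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: a plain divisive 2-means split stands in for the clustering methods considered in the paper, a nearest-centroid rule stands in for the SVM so that the example needs only the standard library, and splitting stops once a subset holds at most two classes. The function names (two_means, build, predict) are hypothetical. Labels are never passed to the clustering step; they are consulted only to decide whether a subset still needs a classifier.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def two_means(points, iters=20):
    """Plain 2-means on the coordinates only; class labels are never seen here."""
    c0 = points[0]                                   # deterministic start (assumption)
    c1 = max(points, key=lambda p: sq_dist(p, c0))   # farthest point from the first
    centers = [c0, c1]
    for _ in range(iters):
        groups = ([], [])
        for p in points:
            groups[sq_dist(p, centers[0]) > sq_dist(p, centers[1])].append(p)
        if not groups[0] or not groups[1]:
            break                                    # degenerate split: keep old centers
        centers = [mean(g) for g in groups]
    return centers

def fit_nearest_centroid(X, y):
    """Stand-in for the per-sub-problem SVM: one centroid per class."""
    return {c: mean([p for p, t in zip(X, y) if t == c]) for c in set(y)}

def build(X, y):
    """Recursively partition the data; fit a classifier only where classes mix."""
    classes = set(y)
    if len(classes) == 1:
        return {"label": y[0]}                       # pure subset: no classifier needed
    if len(classes) == 2:                            # stopping rule is an assumption
        return {"model": fit_nearest_centroid(X, y)}
    centers = two_means(X)
    side = [sq_dist(p, centers[0]) > sq_dist(p, centers[1]) for p in X]
    if all(side) or not any(side):
        return {"model": fit_nearest_centroid(X, y)}  # could not split further
    children = []
    for s in (False, True):
        Xs = [p for p, t in zip(X, side) if t == s]
        ys = [c for c, t in zip(y, side) if t == s]
        children.append(build(Xs, ys))
    return {"centers": centers, "children": children}

def predict(node, p):
    """Route a point down the hierarchy by nearest cluster center, then classify."""
    while "children" in node:
        c0, c1 = node["centers"]
        node = node["children"][sq_dist(p, c0) > sq_dist(p, c1)]
    if "label" in node:
        return node["label"]
    model = node["model"]
    return min(model, key=lambda c: sq_dist(p, model[c]))

# Toy data: three compact, well-separated classes in the plane.
X = [(0, 0), (0, 1), (1, 0), (1, 1),
     (10, 0), (10, 1), (11, 0), (11, 1),
     (0, 10), (0, 11), (1, 10), (1, 11)]
y = ["a"] * 4 + ["b"] * 4 + ["c"] * 4
tree = build(X, y)
print([predict(tree, p) for p in [(0.2, 0.3), (10.4, 0.6), (0.8, 9.5)]])
```

On this toy data the first split isolates one class entirely, so that branch needs no classifier at all, which is the situation the abstract describes. Replacing fit_nearest_centroid with an SVM trained on the same subset would bring the sketch closer to the setup described above.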

References

  1. Ahumada, H., Grinblat, G., Uzal, L., Granitto, P., Ceccatto, A.: Repmac: A new hybrid approach to highly imbalanced classification problems. In: 8th Int. Conference on Hybrid Intelligent Systems, pp. 386–391 (2008)
  2. Asuncion, A., Newman, D.: UCI machine learning repository (2007), http://www.ics.uci.edu/~mlearn/MLRepository.html
  3. Bahl, L., Jelinek, F., Mercer, R.: A maximum likelihood approach to continuous speech recognition. IEEE T. Pattern Anal. 5(2), 179–190 (1983)
  4. Benabdeslem, K., Bennani, Y.: Dendogram-based SVM for Multi-Class Classification. J. Comput. Inform. Tech. 14(4), 283–289 (2006)
  5. Crammer, K., Singer, Y.: On the algorithmic implementation of multiclass kernel-based vector machines. J. Mach. Learn. Res. 2, 265–292 (2002)
  6. Cristianini, N., Shawe-Taylor, J.: An Introduction to Support Vector Machines. Cambridge University Press, Cambridge (2000)
  7. Fei, B., Liu, J.: Binary tree of SVM: a new fast multiclass training and classification algorithm. IEEE T. Neural Networ. 17(3), 696–704 (2006)
  8. Freund, Y., Schapire, R.: A decision-theoretic generalization of on-line learning and an application to boosting. In: Proc. of COLT, pp. 23–37 (1995)
  9. Granitto, P., Verdes, P., Ceccatto, H.: Large-scale investigation of weed seed identification by machine vision. Comput. Electron. Agr. 47(1), 15–24 (2005)
  10. Hastie, T., Tibshirani, R.: Classification by pairwise coupling. Ann. Stat. 26(2), 451–471 (1998)
  11. Hsu, C.W., Lin, C.J.: A comparison of methods for multiclass support vector machines. IEEE T. Neural Networ. 13(2), 415–425 (2002)
  12. King, B.: Step-wise clustering procedures. J. Am. Stat. Assoc. 62, 86–101 (1967)
  13. Liu, S., Yi, H., Chia, L.T., Rajan, D.: Adaptive hierarchical multi-class SVM classifier for texture-based image classification. In: IEEE Int. Conf. on Multimedia and Expo, pp. 1–4 (2005)
  14. Lorena, A.C., Carvalho, A.C., Gama, J.M.: A review on the combination of binary classifiers in multiclass problems. Artif. Intell. Rev. 30, 19–37 (2008)
  15. MacQueen, J.: Some methods for classification and analysis of multivariate observations. In: Proc. of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297 (1967)
  16. Platt, J., Cristianini, N., Shawe-Taylor, J.: Large margin DAGs for multiclass classification. In: Adv. in Neural Information Processing Systems, vol. 12, pp. 547–553 (2000)
  17. Ramaswamy, S., Tamayo, P., Rifkin, R., et al.: Multiclass cancer diagnosis using tumor gene expression signatures. P. Natl. Acad. Sci. USA 98(26), 15149 (2001)
  18. Rifkin, R., Klautau, A.: In defense of one-vs-all classification. J. Mach. Learn. Res. 5, 101–141 (2004)
  19. Sneath, P.H.A., Sokal, R.R.: Numerical Taxonomy. W.H. Freeman and Company, San Francisco (1973)
  20. Songsiri, P., Kijsirikul, B., Phetkaew, T.: Information-based dichotomization: A method for multiclass Support Vector Machines. In: IEEE Int. Joint Conference on Neural Networks, pp. 3284–3291 (2008)
  21. Weston, J., Watkins, C.: Support vector machines for multi-class pattern recognition. In: 7th European Symposium On Art. Neural Networks, pp. 4–6 (1999)

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Hernán C. Ahumada (1)
  • Guillermo L. Grinblat (1)
  • Pablo M. Granitto (1)
  1. CIFASIS, French Argentine International Center for Information and Systems Sciences, UPCAM (France) / UNR–CONICET (Argentina), Rosario, Argentina
