The Effect of Dimensionality Reduction on Large Scale Hierarchical Classification

  • Aris Kosmopoulos
  • Georgios Paliouras
  • Ion Androutsopoulos
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8685)


Many classification problems involve a hierarchy of classes, which can be exploited to perform hierarchical classification of test objects. The simplest form of hierarchical classification is cascade classification, which greedily traverses the hierarchy from the root to the predicted leaf. To perform cascade classification, a classifier must be trained for each node of the hierarchy. In large-scale problems, the number of features can be prohibitively large for the classifiers at the upper levels of the hierarchy. It is therefore desirable to reduce the dimensionality of the feature space at these levels. In this paper we examine the computational feasibility of the most common dimensionality reduction method, Principal Component Analysis (PCA), for this problem, as well as the computational benefits it provides for cascade classification and its effect on classification accuracy. Our experiments on two benchmark datasets with large hierarchies show that a certain version of PCA can be performed efficiently on such large hierarchies, at the cost of a slight decrease in classifier accuracy. Furthermore, we show that PCA can be applied selectively at the top levels of the hierarchy to limit the loss in accuracy. Finally, the reduced feature space produced by PCA facilitates the use of more costly and possibly more accurate classifiers, such as non-linear SVMs.


Keywords: Hierarchical Classification · Dimensionality Reduction · Principal Component Analysis
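The cascade scheme described in the abstract (one classifier per node, with PCA applied only at the top level) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the two-level hierarchy, the synthetic Gaussian data, and the nearest-centroid classifiers standing in for the paper's SVMs are all invented for the example, and PCA is computed directly via SVD of the centred data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-level hierarchy (invented): root -> {A, B}; A -> {a1, a2}; B -> {b1, b2}.
parent = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}
leaves = list(parent)

# Synthetic high-dimensional data: one Gaussian cluster per leaf class.
X = np.vstack([rng.normal(loc=i, scale=0.3, size=(50, 200))
               for i in range(len(leaves))])
y_leaf = np.repeat(leaves, 50)
y_root = np.array([parent[l] for l in y_leaf])

# PCA via SVD of the centred data, used only for the root-level classifier
# (the "selective" use of PCA at the top of the hierarchy).
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
components = Vt[:20]                      # keep the top 20 principal components

def reduce_dim(x):
    return (x - mean) @ components.T

# Nearest-centroid stand-in for the per-node classifiers.
def fit_centroids(Xs, ys):
    return {c: Xs[ys == c].mean(axis=0) for c in np.unique(ys)}

def predict(centroids, x):
    return min(centroids, key=lambda c: np.linalg.norm(x - centroids[c]))

root_model = fit_centroids(reduce_dim(X), y_root)           # reduced features
node_model = {n: fit_centroids(X[y_root == n], y_leaf[y_root == n])
              for n in ("A", "B")}                          # full features

def cascade_predict(x):
    """Greedily traverse the hierarchy: root -> internal node -> leaf."""
    node = predict(root_model, reduce_dim(x))
    return predict(node_model[node], x)

acc = np.mean([cascade_predict(x) == t for x, t in zip(X, y_leaf)])
```

Note that the leaf-level models here still use the full feature space; only the root classifier sees the 20-dimensional projection, mirroring the selective top-level reduction the abstract proposes.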





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Aris Kosmopoulos
    • 1
    • 2
  • Georgios Paliouras
    • 1
  • Ion Androutsopoulos
    • 2
  1. Institute of Informatics and Telecommunications, National Center for Scientific Research “Demokritos”, Athens, Greece
  2. Department of Informatics, Athens University of Economics and Business, Athens, Greece