Supervised Feature Extraction Using Hilbert-Schmidt Norms

  • P. Daniušis
  • P. Vaitkus
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5788)


We propose a novel supervised feature extraction procedure based on an unbiased estimator of the Hilbert-Schmidt independence criterion (HSIC). The procedure can be applied directly to single-label or multi-label data, and its kernelized version extends to any data type on which a positive definite kernel function has been defined. Computer experiments with various classification data sets show that our approach can be applied more efficiently than alternative methods.
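The abstract does not spell out the estimator or the optimization, so the following is only a rough illustration of the general HSIC-maximization recipe for supervised feature extraction (in the style of related dependence-maximization methods such as MDDM, reference 13), using a linear input kernel and the simpler biased HSIC estimator rather than the unbiased one the paper proposes. All function and variable names here are hypothetical:

```python
import numpy as np

def hsic_projection(X, L, d):
    """Illustrative sketch: find a linear projection W that maximizes a
    (biased) empirical HSIC between the projected inputs X @ W and the
    labels, represented by a label kernel matrix L.

    X: (n, p) data matrix; L: (n, n) label kernel (e.g. Y @ Y.T for
    one-hot labels Y); d: number of extracted features.
    """
    n = X.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    # With a linear input kernel K = (XW)(XW)^T, the biased HSIC estimate
    # is proportional to tr(W^T X^T H L H X W), which is maximized by the
    # top-d eigenvectors of M = X^T H L H X.
    M = X.T @ H @ L @ H @ X
    M = (M + M.T) / 2  # symmetrize to guard against round-off
    vals, vecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    W = vecs[:, np.argsort(vals)[::-1][:d]]  # top-d eigenvectors
    return X @ W, W
```

For multi-label data the same code applies unchanged: Y simply has several nonzero entries per row, so L = Y @ Y.T counts shared labels between examples.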


Keywords: Feature Extraction · Linear Discriminant Analysis · Feature Extraction Method · Reproducing Kernel Hilbert Space · Kernel Trick





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • P. Daniušis 1, 2
  • P. Vaitkus 1
  1. Vilnius University, Vilnius, Lithuania
  2. Vilnius Management Academy, Vilnius, Lithuania
