Evaluation of Feature Selection by Multiclass Kernel Discriminant Analysis

  • Tsuneyoshi Ishii
  • Shigeo Abe
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5998)


In this paper, we propose and evaluate a feature selection criterion based on kernel discriminant analysis (KDA) for multiclass problems, which finds the number of classes minus one eigenvectors. The selection criterion is the value of the objective function of KDA, namely the sum of the eigenvalues associated with those eigenvectors. In addition to the KDA criterion, we propose a new selection criterion that replaces the between-class scatter in KDA with the sum of squared distances between all pairs of classes. To speed up backward feature selection, we introduce block deletion, which deletes many features at a time, and to enhance the generalization ability of the selected features we use cross-validation as a stopping condition.
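The backward procedure described above can be illustrated with a small sketch. The code below uses a linear stand-in for the kernel criterion (the sum of the eigenvalues of S_w⁻¹S_b, computed as its trace, of which at most "number of classes minus one" are non-zero); all function names are ours, and the tolerance-based acceptance test is a simplification standing in for the paper's cross-validation stopping condition.

```python
import numpy as np

def lda_criterion(X, y):
    """Sum of the generalized eigenvalues of S_w^{-1} S_b, i.e. its trace.

    A linear stand-in for the KDA objective: for c classes, at most c - 1
    eigenvalues are non-zero, matching the abstract's c - 1 eigenvectors.
    """
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean).reshape(-1, 1)
        Sb += len(Xc) * (diff @ diff.T)
    Sw += 1e-6 * np.eye(d)  # regularize so the solve is well posed
    return np.trace(np.linalg.solve(Sw, Sb))

def block_backward_selection(X, y, criterion, tol=0.05, min_features=1):
    """Backward selection with block deletion (hypothetical sketch).

    Features whose individual removal barely lowers the criterion are
    grouped into a block and deleted at once; if the block deletion
    lowers the criterion too much, the block is halved and retried.
    """
    features = list(range(X.shape[1]))
    base = criterion(X[:, features], y)
    while len(features) > min_features:
        # rank features by the criterion drop caused by deleting each alone
        drops = sorted(
            (base - criterion(X[:, [g for g in features if g != f]], y), f)
            for f in features
        )
        # candidate block: features whose individual deletion is nearly free
        block = [f for d, f in drops if d <= tol * base]
        block = block[: len(features) - min_features]
        deleted = False
        while block:
            rest = [g for g in features if g not in block]
            if criterion(X[:, rest], y) >= (1 - tol) * base:
                features, base = rest, criterion(X[:, rest], y)
                deleted = True
                break
            block = block[: len(block) // 2]  # halve the block and retry
        if not deleted:
            break
    return sorted(features)
```

On synthetic data with two discriminative features and three pure-noise features, the noise features tend to fall into one block and are deleted in a single step, which is the source of the speed-up over one-at-a-time backward selection.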

By computer experiments using benchmark datasets, we show that the KDA criterion performs comparably to the selection criterion based on the SVM recognition rate estimated by cross-validation, while reducing computational cost. We also show that the KDA criterion terminates feature selection stably when cross-validation is used as a stopping condition.


Keywords: Support Vector Machine, Feature Selection, Recognition Rate, Feature Selection Method, Generalization Ability



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Tsuneyoshi Ishii (1)
  • Shigeo Abe (1)
  1. Graduate School of Engineering, Kobe University, Kobe, Japan
