A Divergence Criterion for Classifier-Independent Feature Selection

  • Naoto Abe
  • Mineichi Kudo
  • Jun Toyama
  • Masaru Shimbo
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1876)


Abstract

Feature selection aims to find the most important feature subset of a given feature set without losing discriminative information. In general, we wish to select a feature subset that is effective for any kind of classifier. Studies with this goal are called Classifier-Independent Feature Selection, and the method of Novovičová et al. is one of them. Their method estimates the class densities with Gaussian mixture models and selects a feature subset using the Kullback-Leibler divergence between the estimated densities, but it gives no indication of how many features should be selected. Kudo and Sklansky (1997) suggested selecting a minimal feature subset for which the degree of performance degradation is guaranteed. In this study, following their suggestion, we try to find a feature subset that is minimal while maintaining a given Kullback-Leibler divergence.
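The idea of keeping a minimal subset that maintains a given divergence can be illustrated with a deliberately simplified sketch. It is not the authors' algorithm: instead of Gaussian mixture models it assumes two classes, each fitted with a single Gaussian with independent features, so that the symmetric Kullback-Leibler divergence is a sum of per-feature terms. The function names and the `retain` fraction are hypothetical and chosen only for illustration.

```python
import random
import statistics


def symmetric_kl_per_feature(class_a, class_b):
    """Per-feature symmetric KL divergence under a diagonal-Gaussian fit.

    Assumption (not from the paper): each class is one Gaussian with
    independent features, so the total divergence is the sum of these terms.
    """
    divergences = []
    for col_a, col_b in zip(zip(*class_a), zip(*class_b)):
        mean_a, mean_b = statistics.fmean(col_a), statistics.fmean(col_b)
        var_a, var_b = statistics.variance(col_a), statistics.variance(col_b)
        gap_sq = (mean_a - mean_b) ** 2
        # KL(a||b) + KL(b||a) for univariate Gaussians; the log terms cancel.
        divergences.append(
            0.5 * ((var_a + gap_sq) / var_b + (var_b + gap_sq) / var_a - 2.0)
        )
    return divergences


def minimal_subset(divergences, retain=0.95):
    """Smallest set of feature indices keeping >= retain * total divergence."""
    total = sum(divergences)
    ranked = sorted(
        range(len(divergences)), key=lambda i: divergences[i], reverse=True
    )
    selected, kept = [], 0.0
    for index in ranked:
        if kept >= retain * total:
            break
        selected.append(index)
        kept += divergences[index]
    return sorted(selected)


# Synthetic data: feature 0 separates the classes, features 1 and 2 are noise.
rng = random.Random(0)
class_a = [(rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
class_b = [(rng.gauss(3, 1), rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]

divergences = symmetric_kl_per_feature(class_a, class_b)
print(minimal_subset(divergences, retain=0.9))
```

Because the per-feature terms are non-negative and additive under this assumption, taking features in decreasing order of divergence until the threshold is met yields a subset of minimal size. With mixture densities the divergence no longer decomposes this way, and a search over subsets would be needed.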


Keywords

Feature Selection; Recognition Rate; Gaussian Mixture Model; Feature Subset; Minimum Description Length
These keywords were added by machine and not by the authors.


References

  1. Yu, B., and Yuan, B.: A More Efficient Branch and Bound Algorithm for Feature Selection. Pattern Recognition 26 (1993) 883–889
  2. Siedlecki, W., and Sklansky, J.: A Note on Genetic Algorithms for Large-Scale Feature Selection. Pattern Recognition Letters 10 (1989) 335–347
  3. Pudil, P., Novovičová, J., and Kittler, J.: Floating Search Methods in Feature Selection. Pattern Recognition Letters 15 (1994) 1119–1125
  4. Holz, H.J., and Loew, M.H.: Relative Feature Importance: A Classifier-Independent Approach to Feature Selection. In: Gelsema, E.S., and Kanal, L.N. (eds.): Pattern Recognition in Practice IV. Elsevier, Amsterdam (1994) 473–487
  5. Kudo, M., and Shimbo, M.: Feature Selection Based on the Structural Indices of Categories. Pattern Recognition 26 (1993) 891–901
  6. Novovičová, J., Pudil, P., and Kittler, J.: Divergence Based Feature Selection for Multimodal Class Densities. IEEE Transactions on Pattern Analysis and Machine Intelligence 18 (1996) 218–223
  7. Boekee, D.E., and Van der Lubbe, J.C.A.: Some Aspects of Error Bounds in Feature Selection. Pattern Recognition 11 (1979) 353–360
  8. Kudo, M., and Sklansky, J.: A Comparative Evaluation of Medium- and Large-Scale Feature Selectors for Pattern Classifiers. In: 1st International Workshop on Statistical Techniques in Pattern Recognition, Prague, Czech Republic (1997) 91–96
  9. Kudo, M., and Sklansky, J.: Comparison of Algorithms that Select Features for Pattern Classifiers. Pattern Recognition 33(1) (2000) 25–41
  10. Kudo, M., and Sklansky, J.: Classifier-Independent Feature Selection for Two-Stage Feature Selection. Advances in Pattern Recognition, Lecture Notes in Computer Science 1451 (1998) 548–554
  11. Dempster, A.P., Laird, N.M., and Rubin, D.B.: Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society 39 (1977) 1–38
  12. Ichimura, N.: Robust Clustering Based on a Maximum Likelihood Method for Estimation of the Suitable Number of Clusters. The Transactions of the Institute of Electronics, Information and Communication Engineers 8 (1995) 1184–1195 (in Japanese)
  13. Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, CA (1993)
  14. Murphy, P.M., and Aha, D.W.: UCI Repository of machine learning databases [Machine-readable data repository]. University of California, Irvine, Department of Information and Computer Science (1996)

Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Naoto Abe (1)
  • Mineichi Kudo (1)
  • Jun Toyama (1)
  • Masaru Shimbo (1)

  1. Division of Systems and Information Engineering, Graduate School of Engineering, Hokkaido University, Sapporo, Japan
