A Divergence Criterion for Classifier-Independent Feature Selection
Feature selection aims to find the most important feature subset of a given feature set without loss of discriminative information. In general, we wish to select a feature subset that is effective for any kind of classifier; such approaches are called classifier-independent feature selection, and the method of Novovičová et al. is one of them. Their method estimates class densities with Gaussian mixture models and selects a feature subset using the Kullback-Leibler divergence between the estimated densities, but it gives no indication of how many features should be selected. Kudo and Sklansky (1997) suggested selecting a minimal feature subset such that the degree of performance degradation is bounded. In this study, following their suggestion, we try to find a feature subset that is minimal while maintaining a given Kullback-Leibler divergence.
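To make the criterion concrete, below is a minimal sketch, not the authors' implementation, of divergence-based backward elimination for a two-class problem. It assumes scikit-learn's `GaussianMixture` for the class densities; since the Kullback-Leibler divergence between two Gaussian mixtures has no closed form, it is approximated here by Monte Carlo sampling. The `kl_threshold` parameter, the greedy backward search, and all function names are illustrative assumptions, not details taken from the paper.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def kl_between_classes(X, y, features, n_components=2, n_samples=2000, seed=0):
    """Monte Carlo estimate of KL(p0 || p1) on the given feature subset."""
    Xs = X[:, features]
    gmm0 = GaussianMixture(n_components=n_components, random_state=seed).fit(Xs[y == 0])
    gmm1 = GaussianMixture(n_components=n_components, random_state=seed).fit(Xs[y == 1])
    samples, _ = gmm0.sample(n_samples)  # draw from the class-0 density p0
    # KL(p0 || p1) = E_{x ~ p0}[log p0(x) - log p1(x)]
    return float(np.mean(gmm0.score_samples(samples) - gmm1.score_samples(samples)))


def minimal_subset(X, y, kl_threshold):
    """Greedy backward elimination (an assumed search strategy): drop one
    feature at a time as long as the divergence between the class densities
    on the remaining subset stays above kl_threshold."""
    features = list(range(X.shape[1]))
    while len(features) > 1:
        # Score the divergence that remains after each candidate removal.
        scored = [(kl_between_classes(X, y, [f for f in features if f != g]), g)
                  for g in features]
        best_kl, drop = max(scored)
        if best_kl < kl_threshold:
            break  # any further removal loses too much divergence
        features.remove(drop)
    return features
```

A call such as `minimal_subset(X, y, kl_threshold=1.0)` would then return a small feature subset whose estimated class divergence still meets the chosen bound; the paper's actual criterion and search procedure may differ.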
Keywords: Feature Selection, Recognition Rate, Gaussian Mixture Model, Feature Subset, Minimum Description Length
- 4. Holz, H.J., Loew, M.H.: Relative Feature Importance: A Classifier-Independent Approach to Feature Selection. In: Gelsema, E.S., Kanal, L.N. (eds.) Pattern Recognition in Practice IV, pp. 473–487. Elsevier, Amsterdam (1994)
- 8. Kudo, M., Sklansky, J.: A Comparative Evaluation of Medium- and Large-Scale Feature Selectors for Pattern Classifiers. In: 1st International Workshop on Statistical Techniques in Pattern Recognition, Prague, Czech Republic, pp. 91–96 (1997)
- 12. Ichimura, N.: Robust Clustering Based on a Maximum Likelihood Method for Estimation of the Suitable Number of Clusters. The Transactions of the Institute of Electronics, Information and Communication Engineers 8 (1995) 1184–1195 (in Japanese)
- 13. Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, CA (1993)
- 14. Murphy, P.M., Aha, D.W.: UCI Repository of Machine Learning Databases [machine-readable data repository]. University of California, Irvine, Department of Information and Computer Science (1996)