Is Feature Selection Still Necessary?
Feature selection is usually motivated by reduced computational cost, economy, and improved problem understanding, but in many cases it can also improve classification accuracy. In this paper we investigate the relationship between the optimal number of features and the training set size. We present a new and simple analysis of the well-studied two-Gaussian setting. For a few special cases we explicitly derive the optimal number of features as a function of the training set size, and show that accuracy declines dramatically when too many features are added. We then show empirically that the Support Vector Machine (SVM), which was designed to work in the presence of a large number of features, produces the same qualitative result on these examples. This suggests that good feature selection remains an important component of accurate classification.
Keywords: Support Vector Machine · Feature Selection · Optimal Number · Polynomial Kernel · Generalization Error
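The peaking phenomenon described in the abstract is easy to reproduce in simulation. The sketch below is not the paper's code: it uses a simple plug-in linear rule (difference of class means) rather than an SVM, and assumes a two-Gaussian setting with feature relevance decaying as 1/i. Test accuracy typically rises as the first few informative features are added, then falls as weakly relevant features contribute mostly estimation noise.

```python
import numpy as np

def peaking_demo(n_train=30, n_test=4000, dim=500, seed=0):
    """Two-Gaussian setting: y = ±1, x ~ N(y * mu, I), mu_i = 1/i.
    Train a plug-in linear rule on the first k features and return
    test accuracy for a range of k (illustrative assumptions, not
    the paper's exact experimental setup)."""
    rng = np.random.default_rng(seed)
    mu = 1.0 / np.arange(1, dim + 1)  # per-feature signal, decaying in i

    def sample(n):
        # balanced labels, Gaussian features centered at y * mu
        y = rng.permutation(np.repeat([-1.0, 1.0], n // 2))
        x = y[:, None] * mu[None, :] + rng.standard_normal((n, dim))
        return x, y

    x_tr, y_tr = sample(n_train)
    x_te, y_te = sample(n_test)

    accs = {}
    for k in [1, 2, 5, 10, 50, dim]:
        # estimated discriminant direction on the first k features
        w = (x_tr[y_tr > 0, :k].mean(axis=0)
             - x_tr[y_tr < 0, :k].mean(axis=0))
        pred = np.sign(x_te[:, :k] @ w)
        accs[k] = float((pred == y_te).mean())
    return accs

accs = peaking_demo()
# with a small training set, accuracy at an intermediate k exceeds
# accuracy when all 500 features are used
```

With only 30 training points, the direction estimated from all 500 features is dominated by noise, so the classifier using a well-chosen prefix of features generalizes better, matching the qualitative conclusion of the analysis.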