A Unified Strategy of Feature Selection
In data mining (DM), feature selection is one of the basic strategies for handling high-dimensional problems. This paper reviews current feature selection methods and proposes a unified strategy that divides the overall procedure into two stages: first, determine the FIF (Feature Important Factor) of each feature according to the DM task; second, select features according to their FIF. For classification problems, we propose a new method for determining FIF based on decision trees and provide practical suggestions for feature selection. Analysis of experiments conducted on UCI datasets shows that this unified strategy is effective and efficient.
Keywords: Feature Selection; Prediction Accuracy; Feature Subset; Feature Selection Method; Independent Component Analysis
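The two-stage strategy described in the abstract (score every feature, then select by score) can be illustrated with a minimal sketch. The abstract does not define how FIF is computed, so the sketch substitutes information gain, the split criterion of ID3-style decision trees, as a stand-in; the function names, the toy data, and the top-k selection rule are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature_index):
    """Entropy reduction obtained by splitting on one discrete feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature_index], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def score_features(rows, labels):
    """Stage 1: assign an importance score to every feature.

    Information gain is an assumed stand-in for the paper's FIF.
    """
    return [information_gain(rows, labels, i) for i in range(len(rows[0]))]


def select_features(scores, k):
    """Stage 2: keep the indices of the k highest-scoring features."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# Toy data, made up for illustration: feature 0 determines the class,
# feature 1 is pure noise, feature 2 is partially informative.
rows = [
    ("a", "x", "p"),
    ("a", "y", "p"),
    ("b", "x", "q"),
    ("b", "y", "p"),
]
labels = ["yes", "yes", "no", "no"]

scores = score_features(rows, labels)
selected = select_features(scores, k=2)
print(scores, selected)
```

Separating scoring from selection means the scoring function can be swapped for a different DM task without touching the selection step, which is the point of treating the two stages independently.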