A Unified Strategy of Feature Selection

  • Peng Liu
  • Naijun Wu
  • Jiaxian Zhu
  • Junjie Yin
  • Wei Zhang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4093)

Abstract

In data mining (DM), feature selection is one of the basic strategies for handling high-dimensional problems. This paper reviews current feature selection methods and proposes a unified strategy that divides the overall feature selection procedure into two stages: first, determine the FIF (Feature Important Factor) of each feature according to the DM task; second, select features according to their FIF. For classification problems, we propose a new method for determining FIF based on decision trees and provide practical suggestions for feature selection. Analysis of experiments conducted on UCI datasets shows that this unified strategy is effective and efficient.
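The two-stage procedure described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's FIF formula, so the score below is a stand-in: the information gain of each categorical feature, which is a split criterion used by decision trees. The function names (`importance_scores`, `select_features`), the cutoff `k`, and the toy dataset are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(column, labels):
    """Entropy reduction obtained by splitting the labels on one categorical feature."""
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def importance_scores(rows, labels, feature_names):
    """Stage 1 (stand-in for FIF, not the paper's formula): one score per feature."""
    return {
        name: information_gain([row[i] for row in rows], labels)
        for i, name in enumerate(feature_names)
    }


def select_features(scores, k):
    """Stage 2: keep the k features with the highest scores."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data, invented for illustration only.
feature_names = ["outlook", "windy", "id_parity"]
rows = [
    ("sunny", "no", "even"),
    ("sunny", "yes", "odd"),
    ("rain", "no", "odd"),
    ("rain", "yes", "even"),
    ("overcast", "no", "even"),
    ("overcast", "yes", "odd"),
]
labels = ["yes", "no", "yes", "no", "yes", "yes"]

scores = importance_scores(rows, labels, feature_names)
selected = select_features(scores, k=2)
print(selected)
```

Separating scoring from selection mirrors the split the abstract describes: the scoring function could be swapped for a task-specific measure without changing the selection step.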

Keywords

Feature Selection; Prediction Accuracy; Feature Subset; Feature Selection Method; Independent Component Analysis
These keywords were added by machine and not by the authors.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Peng Liu (1)
  • Naijun Wu (1)
  • Jiaxian Zhu (1)
  • Junjie Yin (1)
  • Wei Zhang (1)

  1. School of Information Management and Engineering, Shanghai University of Finance and Economics, Shanghai, P.R. China
