A Modified Kolmogorov-Smirnov Correlation Based Filter Algorithm for Feature Selection

  • Pakkurthi Srinivasu
  • P. S. Avadhani
  • Suresh Chandra Satapathy
  • Tummala Pradeep
Part of the Advances in Intelligent and Soft Computing book series (AINSC, volume 132)

Abstract

Feature selection is the technique of selecting a subset of relevant features from which a classification model can be constructed for a particular task. It is a preprocessing step in machine learning that is effective in reducing dimensionality, removing irrelevant data, and increasing learning accuracy. In this paper, a modified Kolmogorov-Smirnov Correlation Based Filter algorithm for feature selection is proposed. It is based on the Kolmogorov-Smirnov statistic and uses class label information while comparing feature pairs. Results obtained from this algorithm are compared with those of two other algorithms, the Correlation Feature Selection algorithm (CFS) and the simple Kolmogorov-Smirnov Correlation Based Filter (KS-CBF), which are capable of removing irrelevant and redundant features. Classification accuracy on the reduced feature set produced by the proposed approach is measured with two standard classifiers, the Decision-Tree classifier and the K-NN classifier.
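The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of a Kolmogorov-Smirnov redundancy filter that uses class labels, not the authors' procedure. It assumes that features arrive already ranked by relevance, that two features count as redundant when their class-conditional empirical distributions are close in every class, and that a fixed cutoff stands in for a proper significance test. The names `ks_statistic`, `classwise_ks`, `ks_filter`, and `threshold` are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: max |F_a(x) - F_b(x)| over all observed x."""
    sa, sb = sorted(a), sorted(b)
    d = 0.0
    for x in sa + sb:
        fa = bisect_right(sa, x) / len(sa)  # empirical CDF of a at x
        fb = bisect_right(sb, x) / len(sb)  # empirical CDF of b at x
        d = max(d, abs(fa - fb))
    return d


def classwise_ks(f, g, labels):
    """Largest KS statistic between two features, computed within each class."""
    worst = 0.0
    for c in set(labels):
        fc = [v for v, y in zip(f, labels) if y == c]
        gc = [v for v, y in zip(g, labels) if y == c]
        worst = max(worst, ks_statistic(fc, gc))
    return worst


def ks_filter(features, labels, ranked_names, threshold=0.5):
    """Keep a feature only if it differs from every feature already kept.

    `ranked_names` is assumed to be sorted from most to least relevant;
    `threshold` is an illustrative cutoff, not a value from the paper.
    """
    selected = []
    for name in ranked_names:
        if all(classwise_ks(features[name], features[k], labels) > threshold
               for k in selected):
            selected.append(name)
    return selected


# Toy data: "b" closely tracks "a" within each class, "c" does not.
labels = [0, 0, 0, 1, 1, 1]
features = {
    "a": [0.10, 0.20, 0.30, 0.70, 0.80, 0.90],
    "b": [0.12, 0.22, 0.28, 0.72, 0.79, 0.91],
    "c": [0.90, 0.80, 0.70, 0.10, 0.20, 0.30],
}
selected = ks_filter(features, labels, ["a", "b", "c"])
print(selected)
```

On this toy data the sketch would drop "b" as redundant with "a" and keep "c". Features are assumed to be on a comparable scale; in practice they would need to be normalized before their distributions are compared.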

Keywords

Feature Selection, Information Gain, Feature Subset, Filter Model, Redundant Feature

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Pakkurthi Srinivasu (1)
  • P. S. Avadhani (2)
  • Suresh Chandra Satapathy (1)
  • Tummala Pradeep (3)

  1. Department of CSE, ANITS, Visakhapatnam, India
  2. Department of CS&SE, Andhra University, Visakhapatnam, India
  3. Department of CSE, BIT, Ranchi, India
