Improved Feature Selection Algorithm Based on SVM and Correlation

  • Zong-Xia Xie
  • Qing-Hua Hu
  • Da-Ren Yu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3971)

Abstract

As a feature selection method, support vector machine recursive feature elimination (SVM-RFE) can remove irrelevant features but does not take redundant features into consideration. This paper shows why the method cannot remove redundant features and presents an improved technique. The correlation coefficient is introduced to measure redundancy within the subset selected by SVM-RFE, and features that are highly correlated with an important feature are removed. Experimental results show that the subsets selected by SVM-RFE do contain several strongly redundant features, with correlation coefficients as high as 0.99. The proposed method reduces the number of features while maintaining classification accuracy.
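The redundancy-removal step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the greedy scheme, the function names `pearson` and `drop_redundant`, the toy data, and the 0.9 threshold are all assumptions made for illustration. The feature ranking is taken as given, as if it had already been produced by SVM-RFE.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Constant column: treated as uncorrelated (an assumption of this sketch).
        return 0.0
    return cov / sqrt(var_x * var_y)


def drop_redundant(columns, ranking, threshold=0.9):
    """Greedy redundancy filter (hypothetical helper).

    columns:   dict mapping feature name -> list of values
    ranking:   feature names ordered from most to least important,
               e.g. as obtained from SVM-RFE
    threshold: absolute correlation at or above which a feature is
               considered redundant (assumed value, not from the paper)
    """
    kept = []
    for name in ranking:
        # Keep a feature only if it is not strongly correlated with
        # any more important feature that has already been kept.
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy data: f2 is almost a scaled copy of f1, f3 is only weakly related.
columns = {
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f2": [2.1, 4.0, 6.2, 7.9, 10.0],
    "f3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
ranking = ["f1", "f2", "f3"]  # assumed importance order

selected = drop_redundant(columns, ranking, threshold=0.9)
print(selected)
```

In this toy example `f2` is dropped because it is almost perfectly correlated with the higher-ranked `f1`, while `f3` is kept. In practice the threshold would need tuning against classification accuracy on the data set at hand.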

Keywords

Support Vector Machine; Feature Selection; Feature Subset; Feature Selection Method; Irrelevant Feature
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Zong-Xia Xie
    • 1
  • Qing-Hua Hu
    • 1
  • Da-Ren Yu
    • 1
  1. Harbin Institute of Technology, Harbin, P.R. China
