Evaluating Feature Selection for SVMs in High Dimensions

  • Roland Nilsson
  • José M. Peña
  • Johan Björkegren
  • Jesper Tegnér
Conference paper

DOI: 10.1007/11871842_72

Part of the Lecture Notes in Computer Science book series (LNCS, volume 4212)
Cite this paper as:
Nilsson R., Peña J.M., Björkegren J., Tegnér J. (2006) Evaluating Feature Selection for SVMs in High Dimensions. In: Fürnkranz J., Scheffer T., Spiliopoulou M. (eds) Machine Learning: ECML 2006. ECML 2006. Lecture Notes in Computer Science, vol 4212. Springer, Berlin, Heidelberg

Abstract

We perform a systematic evaluation of feature selection (FS) methods for support vector machines (SVMs) using simulated high-dimensional data (up to 5000 dimensions). Several findings previously reported at low dimensions do not apply in high dimensions. For example, none of the FS methods investigated improved SVM accuracy, indicating that the SVM built-in regularization is sufficient. These results were also validated using microarray data. Moreover, all FS methods tend to discard many relevant features. This is a problem for applications such as microarray data analysis, where identifying all biologically important features is a major objective.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Roland Nilsson (1)
  • José M. Peña (1)
  • Johan Björkegren (2)
  • Jesper Tegnér (1)

  1. IFM Computational Biology, Linköping University, Linköping, Sweden
  2. Gustav V Research Institute, Karolinska Institute, Stockholm, Sweden
