Knowledge and Information Systems, Volume 12, Issue 1, pp 95–116

Stability of feature selection algorithms: a study on high-dimensional spaces

  • Alexandros Kalousis
  • Julien Prados
  • Melanie Hilario
Regular Paper

DOI: 10.1007/s10115-006-0040-8

Cite this article as:
Kalousis, A., Prados, J. & Hilario, M. Knowl Inf Syst (2007) 12: 95. doi:10.1007/s10115-006-0040-8

Abstract

With the proliferation of extremely high-dimensional data, feature selection algorithms have become indispensable components of the learning process. Strangely, despite extensive work on the stability of learning algorithms, the stability of feature selection algorithms has been relatively neglected. This study is an attempt to fill that gap by quantifying the sensitivity of feature selection algorithms to variations in the training set. We assess the stability of feature selection algorithms based on the stability of the feature preferences that they express in the form of weights-scores, ranks, or a selected feature subset. We examine a number of measures to quantify the stability of feature preferences and propose an empirical way to estimate them. We perform a series of experiments with several feature selection algorithms on a set of proteomics datasets. The experiments allow us to explore the merits of each stability measure and create stability profiles of the feature selection algorithms. Finally, we show how stability profiles can support the choice of a feature selection algorithm.
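To make the idea of subset-level stability concrete, here is a minimal sketch of one way it could be estimated empirically. The abstract does not name the measures or the resampling scheme, so everything here is an assumption made for illustration: the Jaccard (Tanimoto) overlap as the subset similarity, random subsampling of the training set, the toy selector `top_k_by_mean_difference`, and all function and parameter names.

```python
import random
from itertools import combinations
from statistics import mean


def jaccard(a, b):
    """Jaccard (Tanimoto) similarity between two feature subsets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty selections are treated as identical
    return len(a & b) / len(a | b)


def subset_stability(subsets):
    """Average similarity over all pairs of selected feature subsets."""
    return mean(jaccard(a, b) for a, b in combinations(subsets, 2))


def top_k_by_mean_difference(rows, labels, k):
    """Hypothetical toy selector: rank features by |mean(class 1) - mean(class 0)|.

    Assumes binary labels (0/1) and that both classes are present in `rows`.
    """
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        pos = [r[j] for r, y in zip(rows, labels) if y == 1]
        neg = [r[j] for r, y in zip(rows, labels) if y == 0]
        scores.append(abs(mean(pos) - mean(neg)))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


def estimate_stability(rows, labels, k, n_resamples=10, fraction=0.9, seed=0):
    """Select features on random subsamples and average the pairwise overlap."""
    rng = random.Random(seed)
    n = len(rows)
    size = int(fraction * n)
    subsets = []
    for _ in range(n_resamples):
        idx = rng.sample(range(n), size)  # subsample without replacement
        sub_rows = [rows[i] for i in idx]
        sub_labels = [labels[i] for i in idx]
        subsets.append(top_k_by_mean_difference(sub_rows, sub_labels, k))
    return subset_stability(subsets)
```

A value near 1 means the selector returns almost the same subset regardless of which training examples it sees, while a value near 0 means the selections barely overlap. Stability of weights or ranks could be sketched the same way by replacing `jaccard` with a correlation between the weight or rank vectors.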

Keywords

Feature selection; High dimensionality; Feature stability

Copyright information

© Springer-Verlag London Limited 2006

Authors and Affiliations

  • Alexandros Kalousis (1)
  • Julien Prados (1)
  • Melanie Hilario (1)

  1. Computer Science Department, University of Geneva, Geneva, Switzerland
