Abstract
With the proliferation of extremely high-dimensional data, feature selection algorithms have become indispensable components of the learning process. Strangely, despite extensive work on the stability of learning algorithms, the stability of feature selection algorithms has been relatively neglected. This study is an attempt to fill that gap by quantifying the sensitivity of feature selection algorithms to variations in the training set. We assess the stability of feature selection algorithms based on the stability of the feature preferences that they express in the form of weights-scores, ranks, or a selected feature subset. We examine a number of measures to quantify the stability of feature preferences and propose an empirical way to estimate them. We perform a series of experiments with several feature selection algorithms on a set of proteomics datasets. The experiments allow us to explore the merits of each stability measure and create stability profiles of the feature selection algorithms. Finally, we show how stability profiles can support the choice of a feature selection algorithm.
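The empirical estimation described above can be sketched as follows: draw several subsamples of the training set, run the feature selector on each, and average a pairwise similarity over the resulting feature preferences. This is a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not name the measures, so Pearson correlation for weights, Spearman correlation for ranks, and the Tanimoto (Jaccard) index for subsets are illustrative choices here, and `toy_weights`, `stability_profile`, and all parameter names are hypothetical.

```python
import itertools
import numpy as np


def tanimoto(a, b):
    """Similarity of two feature subsets: |a & b| / |a | b|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pearson(w1, w2):
    """Pearson correlation between two weight vectors."""
    return float(np.corrcoef(w1, w2)[0, 1])


def ranks(w):
    """Rank features by weight (0 = highest); ties are broken by index."""
    order = np.argsort(-np.asarray(w, dtype=float), kind="stable")
    r = np.empty(len(order), dtype=float)
    r[order] = np.arange(len(order))
    return r


def spearman(w1, w2):
    """Spearman correlation, computed as Pearson correlation of the ranks."""
    return pearson(ranks(w1), ranks(w2))


def mean_pairwise(items, sim):
    """Average similarity over all unordered pairs of feature preferences."""
    pairs = list(itertools.combinations(items, 2))
    return sum(sim(a, b) for a, b in pairs) / len(pairs)


def toy_weights(X, y):
    """Hypothetical scorer: absolute difference of class means per feature."""
    return np.abs(X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0))


def stability_profile(X, y, n_resamples=10, frac=0.9, k=5, seed=0):
    """Estimate stability of weights, ranks, and top-k subsets by subsampling."""
    rng = np.random.default_rng(seed)
    n = len(y)
    weights = []
    for _ in range(n_resamples):
        # Subsample without replacement; assumes both classes stay represented.
        idx = rng.choice(n, size=int(frac * n), replace=False)
        weights.append(toy_weights(X[idx], y[idx]))
    subsets = [set(np.argsort(-w)[:k].tolist()) for w in weights]
    return {
        "weights": mean_pairwise(weights, pearson),
        "ranks": mean_pairwise(weights, spearman),
        "subsets": mean_pairwise(subsets, tanimoto),
    }
```

Calling `stability_profile(X, y)` on a data matrix `X` and binary labels `y` returns one score per type of feature preference. Values near 1 indicate that the selector expresses nearly the same preferences regardless of which subsample it was trained on.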
Author information
Alexandros Kalousis received the B.Sc. degree in computer science in 1994 and the M.Sc. degree in advanced information systems in 1997, both from the University of Athens, Greece. He received the Ph.D. degree in meta-learning for classification algorithm selection from the University of Geneva, Department of Computer Science, Geneva, in 2002. Since then he has been a Senior Researcher at the same university. His research interests include relational learning with kernels and distances, stability of feature selection algorithms, and feature extraction from spectral data.
Julien Prados is a Ph.D. student at the University of Geneva, Switzerland. He received the B.Sc. and M.Sc. degrees in computer science from the University Joseph Fourier (Grenoble, France) in 1999 and 2001, respectively. After a year of work in industry, he joined the Geneva Artificial Intelligence Laboratory, where he is working on bioinformatics and data mining tools for mass spectrometry data analysis.
Melanie Hilario has a Ph.D. in computer science from the University of Paris VI and currently works at the University of Geneva’s Artificial Intelligence Laboratory. She has initiated and participated in several European research projects on neuro-symbolic integration, meta-learning, and biological text mining. She has served on the program committees of many conferences and workshops in machine learning, data mining, and artificial intelligence. She is currently an Associate Editor of the International Journal on Artificial Intelligence Tools and a member of the Editorial Board of the Intelligent Data Analysis journal.
Cite this article
Kalousis, A., Prados, J. & Hilario, M. Stability of feature selection algorithms: a study on high-dimensional spaces. Knowl Inf Syst 12, 95–116 (2007). https://doi.org/10.1007/s10115-006-0040-8
DOI: https://doi.org/10.1007/s10115-006-0040-8