Abstract
Feature selection is a search for an “optimal” subset of features. Class separability is normally used as one of the basic feature selection criteria. Instead of maximizing class separability, as is common in the literature, this work adopts a criterion that aims to maintain the discriminating power of the data. After examining the pros and cons of two existing algorithms for feature selection, we propose a hybrid algorithm combining probabilistic and complete search that can take advantage of both. It begins by running LVF (probabilistic search) to reduce the number of features, then runs Automatic Branch & Bound (ABB) (complete search). By imposing a limit on the time this algorithm can run, we obtain an approximation algorithm. The empirical study suggests that dividing the time equally between the two phases yields nearly the best performance, and that the hybrid search algorithm substantially outperforms earlier methods in general.
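The two-phase scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' exact procedures: the inconsistency measure stands in for the discriminating-power criterion, LVF is reduced to random subset sampling, ABB to an exhaustive descent with pruning on the consistency bound, and the paper's time budget is approximated here by a fixed number of LVF tries.

```python
import random
from collections import Counter, defaultdict

def inconsistency_rate(data, labels, subset):
    """Fraction of instances that clash: identical values on `subset`
    but different class labels (a stand-in for the paper's criterion
    of maintaining discriminating power)."""
    groups = defaultdict(Counter)
    for row, y in zip(data, labels):
        key = tuple(row[i] for i in sorted(subset))
        groups[key][y] += 1
    clashes = sum(sum(c.values()) - max(c.values()) for c in groups.values())
    return clashes / len(data)

def lvf(data, labels, n_features, max_tries, threshold=0.0, rng=None):
    """Phase 1 -- LVF (probabilistic search): repeatedly sample random
    subsets, keeping the smallest one whose inconsistency stays within
    the threshold."""
    rng = rng or random.Random(0)
    best = set(range(n_features))
    for _ in range(max_tries):
        k = rng.randint(1, len(best))
        cand = set(rng.sample(range(n_features), k))
        if len(cand) <= len(best) and \
                inconsistency_rate(data, labels, cand) <= threshold:
            best = cand
    return best

def abb(data, labels, subset, threshold=0.0):
    """Phase 2 -- ABB (complete search): starting from the LVF result,
    systematically drop features, pruning any branch whose
    inconsistency exceeds the bound."""
    best = set(subset)
    stack = [set(subset)]
    seen = set()
    while stack:
        s = stack.pop()
        for f in sorted(s):
            child = frozenset(s - {f})
            if child in seen or not child:
                continue
            seen.add(child)
            # Expand only legitimate (still consistent) children.
            if inconsistency_rate(data, labels, child) <= threshold:
                if len(child) < len(best):
                    best = set(child)
                stack.append(set(child))
    return best

def hybrid_search(data, labels, n_features, lvf_tries=100):
    """Hybrid search: LVF shrinks the search space, ABB finishes it."""
    return abb(data, labels, lvf(data, labels, n_features, lvf_tries))
```

A toy run: with four instances where feature 0 alone determines the class, the hybrid search returns a small consistent subset that a search over the full feature set would have to work harder to find.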
References
H. Almuallim and T.G. Dietterich. Learning with many irrelevant features. In Proceedings of the Ninth National Conference on Artificial Intelligence, pages 547–552, Anaheim, California, 1991. AAAI Press/The MIT Press, Menlo Park, California.
A.L. Blumer, A. Ehrenfeucht, D. Haussler, and M.K. Warmuth. Occam’s razor. In J.W. Shavlik and T.G. Dietterich, editors, Readings in Machine Learning, pages 201–204. Morgan Kaufmann, 1990.
G. Brassard and P. Bratley. Fundamentals of Algorithms. Prentice Hall, New Jersey, 1996.
M. Dash and H. Liu. Feature selection methods for classifications. Intelligent Data Analysis: An International Journal, 1(3), 1997. http://www-east.elsevier.com/ida/free.htm.
P.A. Devijver and J. Kittler. Pattern Recognition: A Statistical Approach. Prentice Hall International, 1982.
K. Kira and L.A. Rendell. The feature selection problem: Traditional methods and a new algorithm. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 129–134. Menlo Park: AAAI Press/The MIT Press, 1992.
R. Kohavi. Wrappers for performance enhancement and oblivious decision graphs. PhD thesis, Department of Computer Science, Stanford University, Stanford, CA, 1995.
D. Koller and M. Sahami. Toward optimal feature selection. In L. Saitta, editor, Machine Learning: Proceedings of the Thirteenth International Conference (ICML-96), July 3–6, 1996, pages 284–292, Bari, Italy, 1996. San Francisco: Morgan Kaufmann Publishers.
I. Kononenko. Estimating attributes: Analysis and extension of RELIEF. In F. Bergadano and L. De Raedt, editors, Proceedings of the European Conference on Machine Learning, April 6–8, pages 171–182, Catania, Italy, 1994. Berlin: Springer-Verlag.
P. Langley. Selection of relevant features in machine learning. In Proceedings of the AAAI Fall Symposium on Relevance. AAAI Press, 1994.
H. Liu and H. Motoda. Feature Selection for Knowledge Discovery and Data Mining. Boston: Kluwer Academic Publishers, 1998.
H. Liu, H. Motoda, and M. Dash. A monotonic measure for optimal feature selection. In C. Nedellec and C. Rouveirol, editors, Machine Learning: ECML-98, April 21–23, 1998, pages 101–106, Chemnitz, Germany, April 1998. Berlin Heidelberg: Springer-Verlag.
H. Liu and R. Setiono. A probabilistic approach to feature selection—a filter solution. In L. Saitta, editor, Proceedings of International Conference on Machine Learning (ICML-96), July 3–6, 1996, pages 319–327, Bari, Italy, 1996. San Francisco: Morgan Kaufmann Publishers, CA.
C.J. Merz and P.M. Murphy. UCI repository of machine learning databases. http://www.ics.uci.edu/~mlearn/MLRepository.html Irvine, CA: University of California, Department of Information and Computer Science, 1996.
P.M. Narendra and K. Fukunaga. A branch and bound algorithm for feature subset selection. IEEE Transactions on Computers, C-26(9):917–922, September 1977.
J.R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, 1993.
T.W. Rauber. Inductive Pattern Classification Methods-Features-Sensors. PhD thesis, Dept. of Electrical Engineering, Universidade Nova de Lisboa, Lisboa, 1994.
J. C. Schlimmer. Efficiently inducing determinations: a complete and systematic search algorithm that uses optimal pruning. In Proceedings of the Tenth International Conference on Machine Learning, pages 284–290, 1993.
W. Siedlecki and J. Sklansky. On automatic feature selection. International Journal of Pattern Recognition and Artificial Intelligence, 2:197–220, 1988.
S. Watanabe. Pattern Recognition: Human and Mechanical. Wiley Interscience, 1985.
A. Zell et al. Stuttgart neural network simulator (SNNS), user manual, version 4.1. Technical Report 6/95, Institute for Parallel and Distributed High Performance Systems (IPVR), University of Stuttgart, FTP: ftp.informatik.uni-stuttgart.de/pub/SNNS, 1995.
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
Cite this paper
Dash, M., Liu, H. (1998). Hybrid search of feature subsets. In: Lee, HY., Motoda, H. (eds) PRICAI’98: Topics in Artificial Intelligence. PRICAI 1998. Lecture Notes in Computer Science, vol 1531. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0095273
Print ISBN: 978-3-540-65271-7
Online ISBN: 978-3-540-49461-4