Abstract
Pattern classification and knowledge discovery problems require selecting a subset of features to represent the patterns to be classified, because both the performance of the classifier and the cost of classification are sensitive to the choice of features used to construct it. Genetic algorithms offer an attractive approach to finding near-optimal solutions to such optimization problems. This chapter presents an approach to feature subset selection using a genetic algorithm. Our experiments demonstrate the feasibility of this approach in the automated design of neural networks for pattern classification and knowledge discovery.
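The abstract does not give implementation details, so the following is only a minimal sketch of the general idea: each candidate feature subset is encoded as a bit mask, and a genetic algorithm searches over masks using a fitness that trades classification accuracy against the number of features used. Everything specific here is an assumption made for illustration rather than the chapter's method: the leave-one-out 1-nearest-neighbour classifier stands in for the neural network, and `cost_weight`, the population settings, the helper names, and the synthetic data are hypothetical.

```python
import random

def loo_1nn_accuracy(X, y, mask):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    that only sees the features j where mask[j] == 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # no features selected: treat as useless
    correct = 0
    for i in range(len(X)):
        best_d, best_label = float("inf"), None
        for k in range(len(X)):
            if k == i:
                continue
            d = sum((X[i][j] - X[k][j]) ** 2 for j in cols)
            if d < best_d:
                best_d, best_label = d, y[k]
        correct += best_label == y[i]
    return correct / len(X)

def fitness(X, y, mask, cost_weight=0.01):
    # Hypothetical trade-off: accuracy minus a small penalty per feature.
    return loo_1nn_accuracy(X, y, mask) - cost_weight * sum(mask)

def ga_select(X, y, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Search for a good feature mask with a simple generational GA."""
    rng = random.Random(seed)
    n = len(X[0])  # assumes at least two features
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    cache = {}

    def fit(m):
        key = tuple(m)
        if key not in cache:
            cache[key] = fitness(X, y, m)
        return cache[key]

    def tournament():
        # Binary tournament: the fitter of two random individuals wins.
        a, b = rng.sample(pop, 2)
        return a if fit(a) >= fit(b) else b

    best = max(pop, key=fit)
    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best mask forward
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, n)          # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - b if rng.random() < p_mut else b for b in child]  # bit-flip mutation
            children.append(child)
        pop = children
        best = max(pop, key=fit)
    return best, fit(best)

def make_data(n_samples=40, n_noise=5, seed=1):
    """Synthetic data: feature 0 separates the classes, the rest is noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        informative = label * 4.0 + (i % 5) * 0.1
        X.append([informative] + [rng.gauss(0, 1) for _ in range(n_noise)])
        y.append(label)
    return X, y

if __name__ == "__main__":
    X, y = make_data()
    mask, score = ga_select(X, y)
    print("selected mask:", mask, "fitness:", round(score, 3))
```

Because the fitness is evaluated by actually running the classifier on each candidate subset, this is a wrapper-style search. The per-feature penalty is one simple way to express the accuracy-versus-cost trade-off mentioned in the abstract; other formulations are possible.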
Copyright information
© 1998 Springer Science+Business Media New York
About this chapter
Cite this chapter
Yang, J., Honavar, V. (1998). Feature Subset Selection Using a Genetic Algorithm. In: Liu, H., Motoda, H. (eds) Feature Extraction, Construction and Selection. The Springer International Series in Engineering and Computer Science, vol 453. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-5725-8_8
DOI: https://doi.org/10.1007/978-1-4615-5725-8_8
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-7622-4
Online ISBN: 978-1-4615-5725-8
eBook Packages: Springer Book Archive