Feature Subset Selection Using a Genetic Algorithm

  • Chapter
Feature Extraction, Construction and Selection

Part of the book series: The Springer International Series in Engineering and Computer Science (SECS, volume 453)

Abstract

Pattern classification and knowledge discovery problems require selecting a subset of features to represent the patterns to be classified, because both the performance of the classifier and the cost of classification are sensitive to the choice of features used to construct the classifier. Genetic algorithms offer an attractive approach to finding near-optimal solutions to such optimization problems. This chapter presents an approach to feature subset selection using a genetic algorithm. Our experiments demonstrate the feasibility of this approach in the automated design of neural networks for pattern classification and knowledge discovery.
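
As a rough illustration of the idea in the abstract, the sketch below runs a small genetic algorithm over bit masks, where each bit marks a feature as kept or dropped. It is not the chapter's implementation: the function names, the rank-based selection, the one-point crossover, the parameter values, and especially `toy_fitness` (a stand-in that rewards three arbitrarily chosen "relevant" features and charges a small cost per selected feature) are all assumptions made for this example. In practice the fitness would come from training and evaluating a classifier on the selected features.

```python
import random

def genetic_feature_selection(fitness, n_features, pop_size=20, generations=30,
                              crossover_rate=0.6, mutation_rate=0.02, seed=0):
    """Search for a high-fitness feature mask (a list of 0/1 bits)."""
    rng = random.Random(seed)
    # Each individual is a bit list: 1 keeps the feature, 0 drops it.
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)[:]
    # Rank-based selection weights: the best-ranked individual gets pop_size,
    # the worst gets 1 (an assumed scheme, chosen for simplicity).
    weights = list(range(pop_size, 0, -1))
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        if fitness(ranked[0]) > fitness(best):
            best = ranked[0][:]
        next_population = [best[:]]  # elitism: carry the best mask forward
        while len(next_population) < pop_size:
            parent_a, parent_b = rng.choices(ranked, weights=weights, k=2)
            child = parent_a[:]
            if n_features > 1 and rng.random() < crossover_rate:
                cut = rng.randrange(1, n_features)  # one-point crossover
                child = parent_a[:cut] + parent_b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            next_population.append(child)
        population = next_population
    return max(population + [best], key=fitness)

# Hypothetical stand-in for "classifier accuracy minus measurement cost".
RELEVANT = {0, 3, 5}

def toy_fitness(mask):
    hits = sum(mask[i] for i in RELEVANT)  # reward for relevant features
    cost = sum(mask)                       # every selected feature costs something
    return hits - 0.1 * cost

best_mask = genetic_feature_selection(toy_fitness, n_features=8)
print(best_mask, toy_fitness(best_mask))
```

Combining accuracy and cost in a single fitness value mirrors the abstract's point that both depend on the chosen features. The weighting between the two (0.1 here) is arbitrary and would need tuning for a real problem.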

Copyright information

© 1998 Springer Science+Business Media New York

About this chapter

Cite this chapter

Yang, J., Honavar, V. (1998). Feature Subset Selection Using a Genetic Algorithm. In: Liu, H., Motoda, H. (eds) Feature Extraction, Construction and Selection. The Springer International Series in Engineering and Computer Science, vol 453. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-5725-8_8

  • DOI: https://doi.org/10.1007/978-1-4615-5725-8_8

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-7622-4

  • Online ISBN: 978-1-4615-5725-8

  • eBook Packages: Springer Book Archive
