A Comparative Evaluation of Sequential Feature Selection Algorithms

  • David W. Aha
  • Richard L. Bankert
Part of the Lecture Notes in Statistics book series (LNS, volume 112)

Abstract

Several recent machine learning publications demonstrate the utility of feature selection algorithms in supervised learning tasks. Among these algorithms, sequential feature selection methods have received particular attention; the most frequently studied variants are forward and backward sequential selection. Many studies on supervised learning with sequential feature selection report applications of these algorithms but do not consider variants that might be more appropriate for some performance tasks. This paper reports positive empirical results on such variants and argues for their serious consideration in similar learning tasks.
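To make the distinction concrete, the sketch below illustrates the greedy hill-climbing loop of forward sequential selection (FSS): starting from the empty set, it repeatedly adds the single feature whose inclusion most improves an accuracy estimate, stopping when no addition helps. Backward sequential selection (BSS) is the mirror image, starting from the full feature set and greedily removing features. This is a minimal sketch only; the `evaluate` callback and the toy scorer are illustrative placeholders, not the evaluation procedure used in the paper.

```python
# A minimal sketch of forward sequential selection (FSS), assuming a
# caller-supplied `evaluate` function that maps a feature subset to an
# estimated accuracy (e.g., cross-validated classifier accuracy).
# The names and the toy scorer below are illustrative, not the paper's setup.
from typing import Callable, FrozenSet, List


def forward_sequential_selection(
    n_features: int,
    evaluate: Callable[[FrozenSet[int]], float],
) -> List[int]:
    """Greedily add the feature whose inclusion most improves the score;
    stop when no single addition improves it (a hill-climbing search)."""
    selected: set = set()
    best_score = evaluate(frozenset(selected))
    improved = True
    while improved and len(selected) < n_features:
        improved = False
        best_candidate = None
        for f in range(n_features):
            if f in selected:
                continue
            score = evaluate(frozenset(selected | {f}))
            if score > best_score:
                best_score, best_candidate = score, f
        if best_candidate is not None:
            selected.add(best_candidate)
            improved = True
    return sorted(selected)


if __name__ == "__main__":
    # Toy evaluator: features 1 and 3 are relevant; irrelevant extras
    # reduce the score slightly, so the search stops after finding both.
    relevant = {1, 3}

    def toy_accuracy(subset: FrozenSet[int]) -> float:
        return 0.5 + 0.2 * len(subset & relevant) - 0.01 * len(subset - relevant)

    print(forward_sequential_selection(5, toy_accuracy))  # -> [1, 3]
```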

Keywords

Feature Selection · Feature Subset · Queue Size · Feature Selection Algorithm · Naval Research Laboratory

Copyright information

© Springer-Verlag New York, Inc. 1996

Authors and Affiliations

  • David W. Aha (1)
  • Richard L. Bankert (2)
  1. Navy Center for Applied Research in AI, Naval Research Laboratory, USA
  2. Marine Meteorology Division, Naval Research Laboratory, Monterey, CA, USA
