Feature Selection Using the Domain Relationship with Genetic Algorithms
Abstract
Considering the importance of the domain relationship in eliminating noisy features during feature selection, we present an alternative approach to designing a multi-objective fitness function, based on multiple correlation, for the genetic algorithm (GA) that serves as the search tool for this problem. Multiple correlation is a simple statistical technique that uses the multiple correlation coefficient to measure the relationship between a dependent variable and a set of independent variables within the domain space. Simulation studies were conducted on both real-world and controlled data sets to assess the performance of the proposed fitness function, and a comparison between the traditional fitness function and the proposed one is reported. The results show that the proposed fitness function performs more satisfactorily than the traditional one in all cases considered, including different data types, multi-class data and multi-dimensional data.
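To make the idea of a correlation-based fitness concrete, the following is a minimal sketch and not the fitness function used in the paper, whose exact form is not given in this abstract. It computes the multiple correlation coefficient R between a dependent variable and a candidate feature subset via an ordinary least-squares fit, then scores a GA chromosome (a binary feature mask). The names `multiple_correlation`, `fitness` and `size_weight`, and the subset-size penalty itself, are assumptions introduced purely for illustration.

```python
import numpy as np


def multiple_correlation(X, y):
    """Multiple correlation coefficient R between y and the columns of X.

    R is the square root of the coefficient of determination of a
    least-squares linear fit (with intercept) of y on X.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    if X.shape[1] == 0:
        return 0.0
    # Design matrix with an intercept column.
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residual = y - A @ coef
    ss_total = np.sum((y - y.mean()) ** 2)
    if ss_total == 0.0:
        return 0.0
    r_squared = 1.0 - np.sum(residual ** 2) / ss_total
    return float(np.sqrt(max(r_squared, 0.0)))


def fitness(mask, X, y, size_weight=0.1):
    """Illustrative two-objective score for a binary feature mask.

    Rewards a strong relationship between the selected features and y,
    and penalises large subsets. The weighting is hypothetical.
    """
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        return 0.0
    r = multiple_correlation(X[:, mask], y)
    return r - size_weight * mask.sum() / mask.size


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 4))
    # Only the first two features carry signal; the rest are noise.
    y = 2.0 * X[:, 0] - X[:, 1] + 0.01 * rng.normal(size=200)
    for m in ([1, 1, 0, 0], [1, 1, 1, 1], [0, 0, 1, 1]):
        print(m, round(fitness(m, X, y), 4))
```

In a GA, each individual would be such a mask, and selection, crossover and mutation would operate on the bit string while `fitness` ranks the candidates. For a class label as the dependent variable, some numeric encoding of the classes would be needed, which this sketch does not address.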
Keywords
Feature selection, genetic algorithm, fitness function, domain relationship, multiple correlation