On the Scalability of Genetic Algorithms to Very Large-Scale Feature Selection
- Andreas Moser, German Research Center for Artificial Intelligence GmbH
- M. Narasimha Murty, Department of Computer Science and Automation, Indian Institute of Science
Feature Selection is a promising optimisation strategy for Pattern Recognition systems, but as an NP-complete task it is extremely difficult to carry out. Past studies were therefore limited in either the cardinality of the feature space or the number of patterns used to assess feature subset performance.
This study examines the scalability of Distributed Genetic Algorithms (GAs) to very large-scale Feature Selection. The chosen domain of application is an optical character classification system tailored to hand-written digits and involving 768 binary features. Given the size of the investigated problem, this study is a step into new territory in Feature Selection for classification.
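To make the setting concrete, the following is a minimal sketch of a distributed (island-model) GA over 768-bit feature masks. The abstract does not describe the operators, topology, or fitness function used in the study, so everything here is an assumption chosen for illustration: ring migration, binary tournament selection, uniform crossover, bit-flip mutation, and a hypothetical `toy_fitness` that stands in for classifier accuracy.

```python
import random

N_FEATURES = 768  # number of binary features stated in the abstract


def random_mask(rng, n=N_FEATURES):
    """A chromosome: one bit per feature, 1 means the feature is kept."""
    return [rng.randint(0, 1) for _ in range(n)]


def toy_fitness(mask, useful):
    """Hypothetical stand-in for classifier accuracy: reward kept features
    that belong to the 'useful' set, lightly penalise every kept feature."""
    hits = sum(1 for i in useful if mask[i])
    return hits - 0.01 * sum(mask)


def evolve_island(pop, fitness, rng, p_mut=1.0 / N_FEATURES):
    """One generation on one island, with elitism (the best mask survives)."""
    new_pop = [max(pop, key=fitness)[:]]
    while len(new_pop) < len(pop):
        # Binary tournament selection of two parents.
        a = max(rng.sample(pop, 2), key=fitness)
        b = max(rng.sample(pop, 2), key=fitness)
        # Uniform crossover followed by bit-flip mutation.
        child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
        child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
        new_pop.append(child)
    return new_pop


def migrate(islands, fitness):
    """Ring migration: each island's best replaces the next island's worst."""
    bests = [max(pop, key=fitness)[:] for pop in islands]
    for k, pop in enumerate(islands):
        worst = min(range(len(pop)), key=lambda i: fitness(pop[i]))
        pop[worst] = bests[(k - 1) % len(islands)]


def run(n_islands=4, pop_size=10, generations=20, migrate_every=5, seed=0):
    rng = random.Random(seed)
    useful = set(rng.sample(range(N_FEATURES), 100))  # synthetic ground truth
    fit = lambda m: toy_fitness(m, useful)
    islands = [[random_mask(rng) for _ in range(pop_size)]
               for _ in range(n_islands)]
    for g in range(1, generations + 1):
        islands = [evolve_island(pop, fit, rng) for pop in islands]
        if g % migrate_every == 0:
            migrate(islands, fit)
    best = max((m for pop in islands for m in pop), key=fit)
    return best, fit(best)
```

In a real distributed setup each island would run in its own process or on its own machine; the sequential loop above only illustrates the population structure and migration step.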
We present a set of customisations of GAs that allow known concepts to be applied to Feature Selection problems of practical interest. Some limitations of GAs in the domain of Feature Selection are revealed and improvements are suggested. A widely used strategy to accelerate the optimisation process, Training Set Sampling, was observed to fail in this domain of application.
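As a rough illustration of what Training Set Sampling means, the sketch below scores a feature mask against a random subset of the training patterns rather than all of them. The abstract does not name the classifier or the sampling scheme, so the 1-nearest-neighbour classifier with masked Hamming distance and the names `accuracy_1nn` and `sampled_fitness` are assumptions made for this example only.

```python
import random


def hamming_masked(a, b, mask):
    """Hamming distance between binary patterns, counting kept features only."""
    return sum(1 for x, y, m in zip(a, b, mask) if m and x != y)


def accuracy_1nn(mask, train, test):
    """Accuracy of a 1-nearest-neighbour classifier (an assumed stand-in)
    restricted to the features where mask is 1.
    train and test are lists of (pattern, label) pairs."""
    correct = 0
    for pattern, label in test:
        nearest = min(train, key=lambda t: hamming_masked(pattern, t[0], mask))
        correct += nearest[1] == label
    return correct / len(test)


def sampled_fitness(mask, train, test, sample_size, rng):
    """Training Set Sampling: evaluate the mask against a random subset of
    the training patterns, which is cheaper but noisier than the full set."""
    subset = rng.sample(train, min(sample_size, len(train)))
    return accuracy_1nn(mask, subset, test)
```

Because each call draws a different subset, the same mask can receive different scores across evaluations, so the speed-up comes at the cost of a noisier fitness signal.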
Experiments on unseen validation data suggest that Distributed GAs are capable of reducing the problem complexity significantly. The results show that the classification accuracy can be maintained while reducing the feature space cardinality by about 50%. Genetic Algorithms are demonstrated to scale well to very large-scale problems in Feature Selection.
- Book Title: Real-World Applications of Evolutionary Computing
- Book Subtitle: EvoWorkshops 2000: EvoIASP, EvoSCONDI, EvoTel, EvoSTIM, EvoRob, and EvoFlight, Edinburgh, Scotland, UK, April 17, 2000, Proceedings
- Pages: pp 77-86
- Series Title: Lecture Notes in Computer Science
- Publisher: Springer Berlin Heidelberg
- Copyright Holder: Springer-Verlag Berlin Heidelberg
- Editor: Stefano Cagnoni, Department of Computer Engineering, University of Parma
- Author Affiliations: German Research Center for Artificial Intelligence GmbH, 67608, Kaiserslautern, Germany; Department of Computer Science and Automation, Indian Institute of Science, Bangalore, 560 012, India