Abstract
Because genetic algorithms can perform extensive global search, they have the potential to find the optimal set of model parameters for a classification algorithm. However, if test set error is used to calculate fitness, the computational costs can be high and there is a danger of over-fitting to the test set. This paper empirically examines the over-fitting problem in a feature selection context and then proposes techniques for modifying the fitness function to improve speed and accuracy. It is shown that test set sampling can dramatically speed up the evaluation function and hence make the GA approach feasible for large data sets. A technique is then proposed that combines Occam's razor with statistical confidence tests to determine the number of samples used by the evaluation function.
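The central speed-up described in the abstract, scoring a candidate feature subset on a random sample of the test set rather than the whole set, can be illustrated with a small sketch. This is not the authors' implementation: the 1-NN classifier, the toy Gaussian data, and the `sample_size` parameter are assumptions made purely for the example.

```python
import random

def one_nn_error(features, train, test):
    """Error rate of a 1-nearest-neighbour classifier restricted to `features`."""
    def dist(a, b):
        return sum((a[i] - b[i]) ** 2 for i in features)
    errors = 0
    for x, label in test:
        nearest = min(train, key=lambda t: dist(t[0], x))
        if nearest[1] != label:
            errors += 1
    return errors / len(test)

def sampled_fitness(features, train, test, sample_size, rng):
    """GA fitness from a random sample of the test set: cheaper, but noisier."""
    if not features:
        return 0.0
    sample = rng.sample(test, min(sample_size, len(test)))
    return 1.0 - one_nn_error(features, train, sample)

rng = random.Random(0)

# Toy data: features 0 and 1 track the class label; feature 2 is pure noise.
def make_point(label):
    return ([label + rng.gauss(0, 0.3),
             label + rng.gauss(0, 0.3),
             rng.gauss(0, 1.0)], label)

train = [make_point(l) for l in (0, 1) for _ in range(30)]
test = [make_point(l) for l in (0, 1) for _ in range(30)]

# Evaluate one chromosome (the subset {0, 1}) on a 20-point sample
# instead of all 60 test points.
fit = sampled_fitness({0, 1}, train, test, sample_size=20, rng=rng)
```

Each fitness call now costs a fraction of a full test-set pass, at the price of sampling noise; the paper's contribution is deciding how large `sample_size` must be for the comparison between chromosomes to remain statistically sound.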
© 1997 Springer-Verlag
Cite this paper
Glover, R., Sharpe, P. (1997). Efficient GA based techniques for automating the design of classification models. In: Liu, X., Cohen, P., Berthold, M. (eds) Advances in Intelligent Data Analysis. Reasoning about Data. IDA 1997. Lecture Notes in Computer Science, vol 1280. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0052837
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63346-4
Online ISBN: 978-3-540-69520-2