
Efficient GA based techniques for automating the design of classification models

  • Conference paper
Advances in Intelligent Data Analysis Reasoning about Data (IDA 1997)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1280)


Abstract

Because genetic algorithms (GAs) perform an extensive global search, they have the potential to find the optimal set of model parameters for a classification algorithm. However, if test set error is used to calculate fitness, the computational cost can be high and there is a danger of over-fitting to the test set. This paper empirically examines the over-fitting problem in a feature selection context and then proposes techniques for modifying the fitness function to improve speed and accuracy. It is shown that test set sampling can dramatically speed up the evaluation function and hence make the GA approach feasible for large data sets. A technique is then proposed that combines Occam's razor with statistical confidence tests to determine the number of samples used by the evaluation function.
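The idea of evaluating fitness on a sample of the test set can be illustrated with a minimal sketch. This is not the authors' implementation, which the abstract does not describe. The function names, the 1-nearest-neighbour classifier, the per-feature penalty (a simple stand-in for Occam's razor), and the Hoeffding bound (a stand-in for the unspecified statistical confidence test) are all assumptions made for illustration.

```python
import math
import random


def hoeffding_half_width(n, delta=0.05):
    """Half-width of a two-sided (1 - delta) Hoeffding interval for the
    mean of n values in [0, 1]. Assumed here as one possible confidence test."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def nearest_neighbour_label(x, train, mask):
    """Label of the training example closest to x, using only the
    features switched on in mask (hypothetical classifier choice)."""
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi, keep in zip(a, b, mask) if keep)
    return min(train, key=lambda example: dist(x, example[0]))[1]


def sampled_fitness(mask, train, test, sample_size, rng, penalty=0.01):
    """Fitness of a feature mask, estimated on a random sample of the test
    set instead of the whole set. Returns (fitness, confidence half-width)."""
    sample = rng.sample(test, min(sample_size, len(test)))
    errors = sum(
        1 for x, y in sample if nearest_neighbour_label(x, train, mask) != y
    )
    error_rate = errors / len(sample)
    # Occam-style assumption: each selected feature costs a small penalty.
    fitness = -(error_rate + penalty * sum(mask))
    return fitness, hoeffding_half_width(len(sample))


def make_toy_data(n, rng):
    """Two features: the first separates the classes, the second is noise."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        informative = label + rng.uniform(-0.2, 0.2)
        noise = rng.uniform(-5.0, 5.0)
        data.append(((informative, noise), label))
    return data


if __name__ == "__main__":
    rng = random.Random(0)
    train = make_toy_data(200, rng)
    test = make_toy_data(500, rng)
    for mask in [(1, 0), (0, 1), (1, 1)]:
        fit, half_width = sampled_fitness(mask, train, test, 50, rng)
        print(mask, round(fit, 3), round(half_width, 3))
```

In a GA, `sampled_fitness` would be called once per chromosome per generation, so its cost scales with `sample_size` rather than with the full test set. The half-width gives a rough indication of how far the sampled error may be from the full test set error, which is the kind of information a confidence-based rule could use to decide whether more samples are needed.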




Editor information

Xiaohui Liu, Paul Cohen, Michael Berthold


Copyright information

© 1997 Springer-Verlag

About this paper

Cite this paper

Glover, R., Sharpe, P. (1997). Efficient GA based techniques for automating the design of classification models. In: Liu, X., Cohen, P., Berthold, M. (eds) Advances in Intelligent Data Analysis Reasoning about Data. IDA 1997. Lecture Notes in Computer Science, vol 1280. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0052837


  • DOI: https://doi.org/10.1007/BFb0052837


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-63346-4

  • Online ISBN: 978-3-540-69520-2

  • eBook Packages: Springer Book Archive
