Some Probabilistic Modelling Ideas for Boolean Classification in Genetic Programming

  • Jorge Muruzábal
  • Carlos Cotta-Porras
  • Amelia Fernández
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1802)


We discuss the problem of Boolean classification via Genetic Programming. When predictors are numeric, the standard approach classifies each case according to the sign of the value returned by the evaluated function. We consider an alternative approach in which the magnitude of that value also plays a role in prediction and evaluation. Specifically, the original, unconstrained value is transformed into a probability, which is then used to elicit the classification. This idea stems from the well-known logistic regression paradigm and can be seen as an attempt to extract all the information available in each individual function. We investigate the empirical behaviour of these variants and discuss a third evaluation measure likewise based on probabilistic ideas. To put these ideas in perspective, we present comparative results obtained by alternative methods, namely recursive splitting and logistic regression.
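The contrast between sign-based and probability-based treatment of an evolved function's output can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the logistic sigmoid as the transformation (suggested by the reference to logistic regression) and uses a mean negative log-likelihood as one possible probabilistic evaluation measure. The function names and the toy outputs are hypothetical, and the measures actually used in the paper are not specified in this abstract.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function mapping any real z into (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def classify_by_sign(z):
    """Standard approach: predict class 1 when the raw output is positive."""
    return 1 if z > 0 else 0

def classify_by_probability(z, threshold=0.5):
    """Alternative: map the raw output to a probability, then threshold it."""
    return 1 if sigmoid(z) > threshold else 0

def accuracy(outputs, labels):
    """Fraction of cases whose sign-based prediction matches the label."""
    hits = sum(classify_by_sign(z) == y for z, y in zip(outputs, labels))
    return hits / len(labels)

def mean_log_loss(outputs, labels, eps=1e-12):
    """Assumed probabilistic measure: mean negative Bernoulli log-likelihood."""
    total = 0.0
    for z, y in zip(outputs, labels):
        # Clip the probability so that log() never receives exactly 0.
        p = min(max(sigmoid(z), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)

# Hypothetical raw outputs of two evolved functions on four labelled cases.
labels = [1, 1, 0, 0]
confident = [4.0, 3.0, -5.0, -2.0]   # correct signs, large magnitudes
hesitant = [0.1, 0.2, -0.1, -0.3]    # correct signs, small magnitudes

for name, outputs in (("confident", confident), ("hesitant", hesitant)):
    print(name, accuracy(outputs, labels), round(mean_log_loss(outputs, labels), 4))
```

In this toy example both functions classify every case correctly by sign, so accuracy cannot tell them apart, while the log-likelihood measure favours the function whose output magnitudes express more confidence in the correct class. With a threshold of 0.5 the probability-based rule yields the same labels as the sign rule here; the magnitude matters for evaluation, or when a different threshold is chosen.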





Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Jorge Muruzábal (1)
  • Carlos Cotta-Porras (2)
  • Amelia Fernández (2)
  1. University Rey Juan Carlos, Móstoles, Spain
  2. University of Málaga, Málaga, Spain
