
Computational Statistics, Volume 16, Issue 3, pp 341–359

Combining statistical and reinforcement learning in rule-based classification

  • Jorge Muruzábal

Summary

BYPASS is a rule-based learning algorithm designed loosely after John Holland’s Classifier System (CS) idea. In essence, CSs are incremental data processors that attempt to (self-)organize a population of rules under the guidance of a certain reinforcement policy. BYPASS uses a reinforcement scheme based on predictive scoring and handles conditions similar to those in standard classification trees. These conditions, however, are allowed to overlap, which makes it possible to base individual predictions on several rules (much as in committees of trees). Moreover, conditions are endowed with simple Bayesian predictive distributions that evolve as data are processed. The paper presents the details of the algorithm and discusses empirical results suggesting that statistical and reinforcement-based learning blend together in interesting and useful ways.
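To make the statistical side of this summary concrete, the following is a minimal sketch, written in Python rather than the author’s own implementation, of overlapping tree-style conditions carrying Dirichlet-multinomial (Bayesian) predictive distributions that are averaged across matching rules. The Rule class, the interval representation of conditions, and the plain averaging in predict are hypothetical simplifications; BYPASS’s reinforcement scheme (predictive scoring) and its rule-discovery machinery are omitted.

    import numpy as np

    class Rule:
        """One rule: an axis-parallel condition (as in classification trees)
        plus a Dirichlet posterior over class labels."""
        def __init__(self, lower, upper, n_classes, alpha=1.0):
            self.lower = np.asarray(lower, dtype=float)  # per-feature lower bounds
            self.upper = np.asarray(upper, dtype=float)  # per-feature upper bounds
            self.counts = np.full(n_classes, alpha)      # Dirichlet pseudo-counts

        def matches(self, x):
            # Conditions may overlap, so several rules can match the same x.
            return bool(np.all((self.lower <= x) & (x <= self.upper)))

        def predictive(self):
            # Posterior predictive class probabilities (Dirichlet-multinomial).
            return self.counts / self.counts.sum()

        def update(self, y):
            # Bayesian updating: add the observed class to the pseudo-counts.
            self.counts[y] += 1.0

    def predict(rules, x):
        # Base the prediction on every matching rule, committee-style.
        matching = [r for r in rules if r.matches(x)]
        if not matching:
            return None  # uncovered input; a full system would create a rule
        return np.mean([r.predictive() for r in matching], axis=0)

    # Usage: two overlapping rules on a 2-D feature space with 3 classes.
    rules = [Rule([0, 0], [5, 5], 3), Rule([3, 3], [10, 10], 3)]
    rules[0].update(1)                 # rule 0 sees an example of class 1
    rules[1].update(2)                 # rule 1 sees an example of class 2
    print(predict(rules, np.array([4.0, 4.0])))  # both rules match and vote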

Keywords

Evolutionary Computation, Probabilistic Prediction, Self-Organization, Bayesian Learning, Regularity Detection

Notes

Acknowledgement

Support from CICYT grants HID98-0379-C02-01 and TIC98-0272-C02-01 is gratefully acknowledged.

References

  1. Breiman, L. (1996). Bagging Predictors. Machine Learning 24, 123–140.
  2. Breiman, L., Friedman, J. H., Olshen, R. A. and Stone, C. J. (1984). Classification and Regression Trees. Pacific Grove, CA: Wadsworth.
  3. Butz, M. V., Goldberg, D. E. and Stolzmann, W. (2000). The Anticipatory Classifier System and Genetic Generalization. Technical Report 2000032, Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign.
  4. Chipman, H., George, E. and McCulloch, R. (1998). Bayesian CART Model Search. Journal of the American Statistical Association 93, 935–960.
  5. De Jong, K. A., Spears, W. M. and Gordon, D. F. (1993). Using Genetic Algorithms for Concept Learning. Machine Learning 13, 161–188.
  6. Denison, D. G. T., Adams, N. M., Holmes, C. C. and Hand, D. J. (2000). Bayesian Partition Modelling. Preprint available at http://stats.ma.ic.ac.uk/~dgtd/tech.html
  7. Freitas, A. A. (Ed.) (1999). Data Mining with Evolutionary Algorithms: Research Directions. Proceedings of the AAAI-99 & GECCO-99 Workshop on Data Mining with Evolutionary Algorithms. Technical Report WS-99-06. AAAI Press.
  8. Frey, P. W. and Slate, D. J. (1991). Letter Recognition Using Holland-style Adaptive Classifiers. Machine Learning 6, 161–182.
  9. Goldberg, D. E. (1989). Genetic Algorithms in Search, Optimization, and Machine Learning. Reading, MA: Addison-Wesley.
  10. Hand, D. J. (1997). Construction and Assessment of Classification Rules. New York: John Wiley.
  11. Holland, J. H., Holyoak, K. J., Nisbett, R. E. and Thagard, P. R. (1986). Induction: Processes of Inference, Learning and Discovery. Cambridge, MA: MIT Press.
  12. Lanzi, P. L., Stolzmann, W. and Wilson, S. W. (Eds.) (2000). Learning Classifier Systems. Berlin: Springer-Verlag.
  13. Michie, D., Spiegelhalter, D. J. and Taylor, C. C. (1994). Machine Learning, Neural and Statistical Classification. New York: Ellis Horwood.
  14. Muruzábal, J. (1995). Fuzzy and Probabilistic Reasoning in Simple Learning Classifier Systems. In Proceedings of the 2nd IEEE International Conference on Evolutionary Computation (Ed. D. B. Fogel), 262–266. Piscataway, NJ: IEEE Press.
  15. Muruzábal, J. (1999). Mining the Space of Generality with Uncertainty-Concerned, Cooperative Classifiers. In Proceedings of the Genetic and Evolutionary Computation Conference, GECCO-99 (Eds. W. Banzhaf, J. Daida, A. E. Eiben, M. H. Garzon, V. Honavar, M. Jakiela and R. E. Smith), 449–457. San Francisco, CA: Morgan Kaufmann.
  16. Muruzábal, J. and Muñoz, A. (1994). Diffuse Pattern Learning with Fuzzy ARTMAP and PASS. Springer-Verlag Lecture Notes in Computer Science, Vol. 866, 376–385.
  17. Paass, G. and Kindermann, J. (1998). Bayesian Classification Trees with Overlapping Leaves Applied to Credit-Scoring. Springer-Verlag Lecture Notes in Computer Science, Vol. 1394, 234–245.
  18. Tierney, L. (1990). LISP-STAT: An Object-Oriented Environment for Statistical Computing and Dynamic Graphics. New York: John Wiley.
  19. Tukey, J. W. (1985). Discussion of G. J. Hahn’s “More Intelligent Statistical Software and Statistical Expert Systems: Future Directions”. The American Statistician 39.
  20. Venables, W. N. and Ripley, B. D. (1997). Modern Applied Statistics with S-PLUS. New York: Springer-Verlag.
  21. Whitehead, B. A. and Choate, T. D. (1996). Cooperative-Competitive Genetic Evolution of Radial Basis Function Centers and Widths for Time Series Prediction. IEEE Transactions on Neural Networks 7(4), 869–880.
  22. Wilson, S. W. (1987). Classifier Systems and the Animat Problem. Machine Learning 2, 199–228.
  23. Wilson, S. W. (1995). Classifier Fitness Based on Accuracy. Evolutionary Computation 3(2), 149–175.

Copyright information

© Physica-Verlag 2001

Authors and Affiliations

  1. Statistics and Decision Sciences Group, University Rey Juan Carlos, Móstoles, Madrid, Spain
