Gradient-Based Learning Updates Improve XCS Performance in Multistep Problems

  • Martin V. Butz
  • David E. Goldberg
  • Pier Luca Lanzi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3103)

Abstract

This paper introduces a gradient-based reward prediction update mechanism to the XCS classifier system, of the kind applied in neural-network-type learning and function approximation mechanisms. A strong relation of XCS to tabular reinforcement learning and, more importantly, to neural-based reinforcement learning techniques is drawn. The resulting gradient-based XCS system learns more stably and reliably in previously investigated hard multistep problems. While the investigations are limited to the binary XCS classifier system, the applied gradient-based update mechanism also appears suitable for the real-valued XCS and other learning classifier systems.
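
The abstract does not spell out the update rule, so the following is only a minimal sketch of what a gradient-based reward prediction update could look like. It assumes a fitness-weighted form, in which each classifier's prediction correction is scaled by its share of the total fitness in the current action set, together with a Q-learning-style target. The names `Classifier`, `gradient_update`, and `q_target`, as well as the parameter values, are illustrative assumptions and do not come from the text above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Classifier:
    prediction: float  # reward prediction of the classifier
    fitness: float     # accuracy-based fitness of the classifier


def q_target(reward: float, next_predictions: List[float], gamma: float = 0.71) -> float:
    """Q-learning-style target: immediate reward plus discounted best next prediction.

    An empty list of next predictions is treated as the end of a problem.
    """
    best_next = max(next_predictions) if next_predictions else 0.0
    return reward + gamma * best_next


def gradient_update(action_set: List[Classifier], target: float, beta: float = 0.2) -> None:
    """Move each prediction toward the target, weighted by its fitness share.

    Assumed form: p <- p + beta * (target - p) * F / sum of F over the action set.
    """
    total_fitness = sum(cl.fitness for cl in action_set)
    if total_fitness <= 0.0:
        return  # no meaningful fitness share to weight by; skip the update
    for cl in action_set:
        share = cl.fitness / total_fitness
        cl.prediction += beta * (target - cl.prediction) * share


# Usage: two classifiers in one action set, the first one fitter than the second.
action_set = [Classifier(prediction=0.0, fitness=0.75),
              Classifier(prediction=0.0, fitness=0.25)]
gradient_update(action_set, target=q_target(1000.0, []), beta=0.2)
```

Under this assumption, a classifier that contributes little to the fitness-weighted system prediction also receives a proportionally smaller correction, which is the sense in which the update resembles gradient descent on the prediction error of a function approximator.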

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Martin V. Butz (1)
  • David E. Goldberg (1)
  • Pier Luca Lanzi (1)

  1. Illinois Genetic Algorithms Laboratory (IlliGAL), University of Illinois at Urbana-Champaign, Urbana
