Controlling a Simulated Khepera with an XCS Classifier System with Memory

  • Andrew Webb
  • Emma Hart
  • Peter Ross
  • Alistair Lawson
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2801)

Abstract

Autonomous agents commonly suffer from perceptual aliasing, in which differing situations are perceived as identical by the robot's sensors yet require different courses of action. One technique for addressing this problem is to use additional internal states within a reinforcement learning system, in particular a learning classifier system. Previous research has shown that adding internal memory states can allow an animat within a cellular world to successfully navigate complex mazes. However, the technique has not previously been applied to robotic environments in which sensory data is noisy and somewhat unpredictable. We present results of using XCS with additional internal memory in the simulated Khepera environment, and show that control rules can be evolved to allow the robot to navigate a variety of problems.

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Andrew Webb
    • 1
  • Emma Hart
    • 1
  • Peter Ross
    • 1
  • Alistair Lawson
    • 1
  1. Napier University, Edinburgh, Scotland, UK
