2-D Pole Balancing with Recurrent Evolutionary Networks
The success of evolutionary methods on standard control learning tasks has created a need for new benchmarks. The classic pole balancing problem is no longer difficult enough to serve as a viable yardstick for measuring the learning efficiency of these systems. In this paper we present a more difficult version of the classic problem, in which the cart and pole can move in a plane. We demonstrate a neuroevolution system (Enforced Sub-Populations, or ESP) that can solve this harder problem without velocity information.
Keywords: Inverted Pendulum; Recurrent Network; Velocity Information; Float Point Number; Recurrent Connection