Part of the book series: Undergraduate Topics in Computer Science ((UTICS))

Abstract

Reinforcement learning is the challenging task of autonomously learning skills without the help of a teacher, based solely on feedback from the environment in response to actions. Although it is still an active area of research, impressive results have already been demonstrated on robots. Reinforcement learning enables robots to learn motor skills as well as simple cognitive behavior. We use a simple robot with only two degrees of freedom to demonstrate the strengths of the value iteration and Q-learning algorithms, as well as their limitations.
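As a rough illustration of the kind of learning the abstract describes, the following minimal sketch runs tabular Q-learning on a toy robot with two joints. It is not the chapter's implementation. The 2 x 2 grid of joint positions, the reward rule (the body advances only when the lowered arm is pulled back), the random exploration scheme, and the values of `alpha` and `gamma` are all assumptions chosen for this example.

```python
import random

N = 2  # positions per joint (assumed 2 x 2 grid of arm states)
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # (dx, dy): move one joint by one step


def step(state, action):
    """Hypothetical crawler model: x is the horizontal joint, y == 0 means the arm touches the ground."""
    x, y = state
    dx, dy = action
    nx, ny = x + dx, y + dy
    if not (0 <= nx < N and 0 <= ny < N):
        return state, 0.0  # joint limit reached: nothing moves, no reward
    # Pulling the grounded arm back (dx = -1) moves the body forward (+1);
    # pushing it forward on the ground moves the body backward (-1).
    reward = float(-dx) if y == 0 else 0.0
    return (nx, ny), reward


def q_learning(steps=20000, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning with purely random exploration (an assumption of this sketch)."""
    rng = random.Random(seed)
    q = {((x, y), a): 0.0 for x in range(N) for y in range(N) for a in ACTIONS}
    state = (0, 0)
    for _ in range(steps):
        action = rng.choice(ACTIONS)
        nxt, reward = step(state, action)
        best_next = max(q[(nxt, a)] for a in ACTIONS)
        # Standard Q-learning update toward the one-step target.
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = nxt
    return q


def greedy_return(q, start=(0, 0), steps=40):
    """Total reward collected by following the greedy policy derived from q."""
    state, total = start, 0.0
    for _ in range(steps):
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        state, reward = step(state, action)
        total += reward
    return total


if __name__ == "__main__":
    print(greedy_return(q_learning()))
```

In this toy model the learned greedy policy should settle into a four-step cycle (lift, reach forward, lower, pull back), earning one unit of reward per cycle without any teacher specifying that sequence.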

Notes

  1. The arm movement space consisting of arcs is rendered as a right-angled grid.

  2. Further information and related sources about crawling robots are available through www.hs-weingarten.de/~ertel/kibuch.

Author information

Corresponding author

Correspondence to Wolfgang Ertel.

Copyright information

© 2011 Springer-Verlag London Limited

About this chapter

Cite this chapter

Ertel, W. (2011). Reinforcement Learning. In: Introduction to Artificial Intelligence. Undergraduate Topics in Computer Science. Springer, London. https://doi.org/10.1007/978-0-85729-299-5_10

  • DOI: https://doi.org/10.1007/978-0-85729-299-5_10

  • Publisher Name: Springer, London

  • Print ISBN: 978-0-85729-298-8

  • Online ISBN: 978-0-85729-299-5

  • eBook Packages: Computer Science, Computer Science (R0)
