Reinforcement Learning

Ertel, Wolfgang

doi:10.1007/978-0-85729-299-5_10

Part of the book series: Undergraduate Topics in Computer Science (UTICS)

Abstract

Reinforcement learning is the challenging task of autonomously learning skills without the help of a teacher, based solely on feedback from the environment in response to actions. Although it is still an active area of research, impressive results have already been demonstrated on robots. Reinforcement learning enables robots to learn motor skills as well as simple cognitive behavior. We use a simple robot with only two degrees of freedom to demonstrate the strengths of the value iteration and Q-learning algorithms, as well as their limitations.

Notes

1. The arm movement space, which consists of arcs, is rendered as a right-angled grid.
2. Further information and related sources about crawling robots are available through www.hs-weingarten.de/~ertel/kibuch.

Author information

Authors and Affiliations

FB Elektrotechnik und Informatik, Hochschule Ravensburg-Weingarten, University of Applied Sciences, Weingarten, Germany
Prof. Dr. Wolfgang Ertel

Corresponding author

Correspondence to Wolfgang Ertel.

About this chapter

Cite this chapter

Ertel, W. (2011). Reinforcement Learning. In: Introduction to Artificial Intelligence. Undergraduate Topics in Computer Science. Springer, London. https://doi.org/10.1007/978-0-85729-299-5_10

DOI: https://doi.org/10.1007/978-0-85729-299-5_10
Publisher Name: Springer, London
Print ISBN: 978-0-85729-298-8
Online ISBN: 978-0-85729-299-5
eBook Packages: Computer Science (R0)
