Abstract
The use of robots in society could be expanded by using reinforcement learning (RL) to allow robots to learn and adapt to new situations on-line. RL is a paradigm for learning sequential decision-making tasks, usually formulated as a Markov Decision Process (MDP). For an RL algorithm to be practical for robotic control, it must learn from very few samples while taking actions continually in real-time. In addition, it must learn efficiently in the face of noise, sensor and actuator delays, and continuous state features. In this paper, we present the TEXPLORE ROS code release, which contains TEXPLORE, the first algorithm to address all of these challenges together. We demonstrate TEXPLORE learning to control the velocity of an autonomous vehicle in real-time. TEXPLORE has been released as an open-source ROS repository, enabling learning on a variety of robot tasks.
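To make the RL/MDP loop described above concrete, the sketch below shows a generic tabular Q-learning agent acting in a toy MDP. This is not the TEXPLORE algorithm itself (TEXPLORE is model-based and learns random-forest models of the domain); the class and the toy two-state task here are purely illustrative assumptions, showing only the basic paradigm of choosing actions, receiving rewards, and updating value estimates.

```python
import random

random.seed(0)  # deterministic run for the toy example below

class QLearningAgent:
    """Minimal tabular Q-learning agent (illustration only, not TEXPLORE)."""

    def __init__(self, actions, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.q = {}              # (state, action) -> estimated value
        self.actions = actions
        self.alpha = alpha       # learning rate
        self.gamma = gamma       # discount factor
        self.epsilon = epsilon   # exploration probability

    def choose(self, state):
        # epsilon-greedy action selection
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q.get((state, a), 0.0))

    def update(self, state, action, reward, next_state):
        # one-step temporal-difference (Q-learning) update
        best_next = max(self.q.get((next_state, a), 0.0) for a in self.actions)
        old = self.q.get((state, action), 0.0)
        self.q[(state, action)] = old + self.alpha * (
            reward + self.gamma * best_next - old)

# Toy single-state MDP: action 1 yields reward 1, action 0 yields 0.
agent = QLearningAgent(actions=[0, 1])
for _ in range(200):
    a = agent.choose(0)
    r = 1.0 if a == 1 else 0.0
    agent.update(0, a, r, 0)
```

After a few hundred interactions the agent's value estimate for the rewarding action dominates, illustrating learning purely from sampled experience. The sample-efficiency and real-time constraints stressed in the abstract are exactly what such a naive tabular learner lacks on a physical robot.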
© 2014 Springer-Verlag Berlin Heidelberg
Cite this paper
Hester, T., Stone, P. (2014). The Open-Source TEXPLORE Code Release for Reinforcement Learning on Robots. In: Behnke, S., Veloso, M., Visser, A., Xiong, R. (eds) RoboCup 2013: Robot World Cup XVII. RoboCup 2013. Lecture Notes in Computer Science(), vol 8371. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44468-9_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-44467-2
Online ISBN: 978-3-662-44468-9
eBook Packages: Computer Science (R0)