Abstract
We introduce the first online kernelized version of SARSA(λ) that permits sparsification for any λ with 0 ≤ λ ≤ 1. This is made possible by a novel kernelization of the eligibility trace, which is maintained separately from the kernelized value function. The separation is crucial: it preserves the functional structure of the eligibility trace under the sparse kernel projection techniques that are essential for memory efficiency and capacity control. The result is a simple and practical Kernel-SARSA(λ) algorithm for general 0 ≤ λ ≤ 1 that is memory-efficient compared to standard SARSA(λ) with various basis functions on a range of domains, including a real robotics task running on a Willow Garage PR2 robot.
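To make the idea of a separately maintained, kernelized eligibility trace concrete, here is a minimal sketch in Python. It is not the authors' algorithm: the value function and the trace are both expansions over a shared sample dictionary with their own coefficient vectors, and dictionary growth is limited by a simple kernel-novelty test rather than the paper's sparse projection. The kernel, thresholds, and class name are all illustrative assumptions.

```python
import math

def rbf(x, y, sigma=1.0):
    # Gaussian kernel on concatenated (state, action) feature tuples
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / (2 * sigma ** 2))

class KernelSARSALambda:
    """Sketch of kernelized SARSA(lambda) with a separate eligibility trace.

    Q(s,a) = sum_i beta_i * k((s,a), (s_i,a_i)) over a shared dictionary;
    the trace keeps its own coefficient vector e over the same dictionary,
    so it can be decayed independently of the value weights.
    """

    def __init__(self, alpha=0.1, gamma=0.95, lam=0.7, nu=0.05):
        self.alpha, self.gamma, self.lam = alpha, gamma, lam
        self.nu = nu      # novelty threshold controlling dictionary growth
        self.dict = []    # dictionary of (state, action) samples
        self.beta = []    # value-function coefficients
        self.e = []       # eligibility-trace coefficients

    def q(self, sa):
        return sum(b * rbf(sa, d) for b, d in zip(self.beta, self.dict))

    def _maybe_add(self, sa):
        # crude sparsification: add sa only if no dictionary point is close
        # (stand-in for the paper's projection; purely illustrative)
        if all(rbf(sa, d) < 1.0 - self.nu for d in self.dict):
            self.dict.append(sa)
            self.beta.append(0.0)
            self.e.append(0.0)
            return len(self.dict) - 1
        # otherwise attribute trace credit to the most similar existing point
        return max(range(len(self.dict)), key=lambda i: rbf(sa, self.dict[i]))

    def update(self, s, a, r, s2, a2):
        sa, sa2 = s + a, s2 + a2  # tuples concatenated into feature vectors
        delta = r + self.gamma * self.q(sa2) - self.q(sa)
        i = self._maybe_add(sa)
        # decay the trace coefficients, then mark the current sample
        self.e = [self.gamma * self.lam * ei for ei in self.e]
        self.e[i] += 1.0
        # TD update applied to the value coefficients through the trace
        self.beta = [b + self.alpha * delta * ei
                     for b, ei in zip(self.beta, self.e)]
        return delta
```

Because the trace has its own coefficient vector, decaying it by γλ never touches the value weights, which is the separation the abstract argues is needed for sparsification to work for general λ.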
Keywords
- Function Approximation
- Reinforcement Learning
- Markov Decision Process
- Reproducing Kernel Hilbert Space
- Robot Navigation
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Robards, M., Sunehag, P., Sanner, S., Marthi, B. (2011). Sparse Kernel-SARSA(λ) with an Eligibility Trace. In: Gunopulos, D., Hofmann, T., Malerba, D., Vazirgiannis, M. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2011. Lecture Notes in Computer Science(), vol 6913. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23808-6_1
DOI: https://doi.org/10.1007/978-3-642-23808-6_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23807-9
Online ISBN: 978-3-642-23808-6
eBook Packages: Computer Science, Computer Science (R0)