Value Function Approximation through Sparse Bayesian Modeling

  • Conference paper
Recent Advances in Reinforcement Learning (EWRL 2011)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7188)

Included in the conference series: European Workshop on Reinforcement Learning (EWRL)

Abstract

In this study, we present a sparse Bayesian framework for value function approximation. The proposed method is based on the online construction of a dictionary of states collected as the agent explores the environment. A linear regression model is fitted to the observed partial discounted returns of these dictionary states, employing the Relevance Vector Machine (RVM) to exploit the enhanced modeling capability offered by its embedded sparsity properties. To speed up the optimization procedure and allow the method to handle large-scale problems, an incremental strategy is adopted. Experiments conducted on both simulated and real environments yielded promising results in comparison with another Bayesian approach that uses Gaussian processes.
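
The paper itself details the model; as a rough sketch of the idea described above, the Python fragment below fits a sparse Bayesian (RVM) linear regression to mock partial discounted returns observed at dictionary states, with one Gaussian-kernel basis function centred on each state. It uses the classic batch evidence re-estimation updates of Tipping's RVM rather than the incremental strategy the paper adopts for large-scale problems; all names, the kernel choice, and the hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def rbf_kernel(X, Y, lengthscale=0.3):
    """Gaussian (RBF) kernel matrix between two sets of states."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * lengthscale ** 2))

def rvm_fit_returns(S, y, lengthscale=0.3, n_iters=100, prune_at=1e9):
    """Batch RVM regression of observed partial discounted returns y onto
    kernel basis functions centred on the dictionary states S. Returns the
    surviving ("relevant") states and the posterior mean of their weights."""
    Phi = rbf_kernel(S, S, lengthscale)       # one basis per dictionary state
    alpha = np.ones(Phi.shape[1])             # per-weight prior precisions
    beta = 1.0 / max(np.var(y), 1e-6)         # noise precision
    keep = np.arange(Phi.shape[1])            # indices of surviving bases
    mu = np.zeros(Phi.shape[1])
    for _ in range(n_iters):
        A = np.diag(alpha)
        Sigma = np.linalg.inv(beta * Phi.T @ Phi + A)   # posterior covariance
        mu = beta * Sigma @ Phi.T @ y                   # posterior mean weights
        gamma = 1.0 - alpha * np.diag(Sigma)            # "well-determinedness"
        alpha = gamma / (mu ** 2 + 1e-12)               # evidence re-estimation
        resid = y - Phi @ mu
        beta = (len(y) - gamma.sum()) / (resid @ resid + 1e-12)
        mask = alpha < prune_at                         # prune irrelevant bases
        if not mask.any():                              # degenerate: all pruned
            break
        Phi, alpha, mu, keep = Phi[:, mask], alpha[mask], mu[mask], keep[mask]
    return S[keep], mu

def value(s, relevant_states, mu, lengthscale=0.3):
    """Approximate V(s) as a sparse kernel expansion over the relevant states."""
    s = np.atleast_2d(np.asarray(s, dtype=float))
    return rbf_kernel(s, relevant_states, lengthscale) @ mu

# Toy usage: noisy returns of a smooth value function over 1-D states.
rng = np.random.default_rng(0)
S = rng.uniform(-1.0, 1.0, size=(60, 1))                # dictionary states
y = np.sin(2.0 * np.pi * S[:, 0]) + 0.1 * rng.standard_normal(60)
relevant, mu = rvm_fit_returns(S, y)
print(f"{len(relevant)} of {len(S)} dictionary states retained")
print("V(0.25) =", value([0.25], relevant, mu)[0])
```

Pruning basis functions whose precisions diverge is what yields the sparse state dictionary; only the retained "relevant" states are needed to evaluate the value function at a new state.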



Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tziortziotis, N., Blekas, K. (2012). Value Function Approximation through Sparse Bayesian Modeling. In: Sanner, S., Hutter, M. (eds) Recent Advances in Reinforcement Learning. EWRL 2011. Lecture Notes in Computer Science (LNAI), vol. 7188. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29946-9_15

  • DOI: https://doi.org/10.1007/978-3-642-29946-9_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-29945-2

  • Online ISBN: 978-3-642-29946-9

  • eBook Packages: Computer Science, Computer Science (R0)
