Value Function Approximation through Sparse Bayesian Modeling
In this study we present a sparse Bayesian framework for value function approximation. The proposed method is based on the online construction of a dictionary of states collected while the agent explores the environment. A linear regression model is fitted to the observed partial discounted returns of these dictionary states, where we employ the Relevance Vector Machine (RVM) and exploit the enhanced modeling capability afforded by its embedded sparsity properties. To speed up the optimization procedure and allow the method to handle large-scale problems, an incremental strategy is adopted. Experiments conducted on both simulated and real environments yielded promising results in comparison with another Bayesian approach based on Gaussian processes.
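To make the regression setup concrete, the following is a minimal sketch of the core idea: kernel basis functions centred on a dictionary of visited states, with observed partial discounted returns as regression targets, fitted by a sparse Bayesian linear model. It is not the paper's implementation; the trajectory data, reward signal, kernel width, and dictionary (here a simple subsample rather than an online novelty criterion) are all illustrative assumptions, and scikit-learn's `ARDRegression` is used as a stand-in for the RVM, since both prune irrelevant basis functions through sparsity-inducing priors.

```python
import numpy as np
from sklearn.linear_model import ARDRegression          # sparse Bayesian regression; stand-in for the RVM
from sklearn.metrics.pairwise import rbf_kernel

def partial_discounted_returns(rewards, gamma=0.95):
    """Monte Carlo targets: the discounted return observed from each time step onward."""
    G = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        G[t] = running
    return G

# Hypothetical exploration trajectory: 1-D states with a toy reward signal.
rng = np.random.default_rng(0)
states = rng.uniform(-1.0, 1.0, size=(200, 1))
rewards = -np.abs(states[:, 0])
targets = partial_discounted_returns(rewards)

# Dictionary of states serving as kernel centres (subsampled here for illustration;
# the paper builds the dictionary online during exploration).
dictionary = states[::10]
Phi = rbf_kernel(states, dictionary, gamma=5.0)          # design matrix of basis functions

model = ARDRegression()                                  # ARD prior drives most weights to zero
model.fit(Phi, targets)

# Approximate the value function at new states.
V_hat = model.predict(rbf_kernel(states[:5], dictionary, gamma=5.0))
print(V_hat)
```

After fitting, only a few dictionary states retain non-negligible weights, which is the sparsity property the abstract credits for the method's scalability.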
Keywords: Value function approximation · Sparse Bayesian modeling · Relevance Vector Machine · Incremental learning