Abstract
The Kernel-based Least-Squares Policy Iteration (KLSPI) algorithm provides a general reinforcement learning solution for large-scale Markov decision problems. In KLSPI, the Radial Basis Function (RBF) kernel is usually used to approximate the optimal value function with high precision. However, the success of KLSPI depends strongly on selecting a proper kernel width for the RBF kernel. In previous research, the kernel width was usually set manually or calculated in advance from the sample distribution, which requires prior knowledge or model information. In this paper, an adaptive kernel-width selection method is proposed for the KLSPI algorithm. First, a sparsification procedure with neighborhood analysis based on the ℓ2-ball of radius ε is adopted, which yields a reduced kernel dictionary without presetting the kernel width. Second, a gradient descent method based on the Bellman Residual Error (BRE) is proposed to find a kernel width that minimizes the sum of the BRE. The experimental results show that the proposed method helps KLSPI approximate the true value function more accurately and, as a result, obtain a better control policy.
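The abstract names the two steps but does not give their update rules, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes a Gaussian RBF of the form exp(-||x - c||^2 / (2σ^2)), a greedy ε-ball rule that keeps a sample only if it lies farther than ε from every dictionary element, and a BRE gradient taken with the value-function weights held fixed. The function names (build_dictionary, rbf_features, bre_and_grad) and all parameters are hypothetical.

```python
import numpy as np

def build_dictionary(samples, eps):
    """Greedy eps-ball sparsification (assumed rule): keep a sample only if
    its Euclidean distance to every kept element exceeds eps.
    No kernel width is needed for this step."""
    dictionary = []
    for x in samples:
        if all(np.linalg.norm(x - d) > eps for d in dictionary):
            dictionary.append(x)
    return np.array(dictionary)

def rbf_features(X, centers, sigma):
    """Gaussian RBF features and squared distances to each dictionary element."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2)), d2

def bre_and_grad(sigma, S, R, S_next, centers, w, gamma):
    """Sum of squared Bellman residuals and its derivative w.r.t. sigma,
    with the weight vector w held fixed (an assumption of this sketch)."""
    phi, d2 = rbf_features(S, centers, sigma)
    phi_n, d2_n = rbf_features(S_next, centers, sigma)
    # Residual per sample: r + gamma * V(s') - V(s)
    delta = R + gamma * (phi_n @ w) - phi @ w
    # d(phi)/d(sigma) = phi * d2 / sigma^3 for the Gaussian form above
    dphi = phi * d2 / sigma ** 3
    dphi_n = phi_n * d2_n / sigma ** 3
    ddelta = gamma * (dphi_n @ w) - dphi @ w
    return float((delta ** 2).sum()), float(2.0 * (delta * ddelta).sum())
```

One descent step on the width would then be sigma - lr * grad for a hypothetical step size lr. In a full policy-iteration loop the weights would presumably be re-estimated between width updates, which this sketch leaves out.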
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wu, J., Xu, X., Zuo, L., Li, Z., Wang, J. (2011). Adaptive Kernel-Width Selection for Kernel-Based Least-Squares Policy Iteration Algorithm. In: Liu, D., Zhang, H., Polycarpou, M., Alippi, C., He, H. (eds) Advances in Neural Networks – ISNN 2011. ISNN 2011. Lecture Notes in Computer Science, vol 6676. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21090-7_70
DOI: https://doi.org/10.1007/978-3-642-21090-7_70
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21089-1
Online ISBN: 978-3-642-21090-7
eBook Packages: Computer Science, Computer Science (R0)