Adaptive Kernel-Width Selection for Kernel-Based Least-Squares Policy Iteration Algorithm

Wu, Jun; Xu, Xin; Zuo, Lei; Li, Zhaobin; Wang, Jian

doi:10.1007/978-3-642-21090-7_70

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6676)

Included in the following conference series:

International Symposium on Neural Networks

Abstract

The Kernel-based Least-Squares Policy Iteration (KLSPI) algorithm provides a general reinforcement learning solution for large-scale Markov decision problems. In KLSPI, the Radial Basis Function (RBF) kernel is usually used to approximate the optimal value function with high precision. However, the successful application of KLSPI depends heavily on selecting a proper kernel-width for the RBF kernel function. In previous research, the kernel-width was usually set manually or calculated in advance from the sample distribution, which requires prior knowledge or model information. In this paper, an adaptive kernel-width selection method is proposed for the KLSPI algorithm. First, a sparsification procedure with neighborhood analysis based on the ℓ₂-ball of radius ε is adopted, which yields a reduced kernel dictionary without presetting the kernel-width. Second, a gradient descent method based on the Bellman Residual Error (BRE) is proposed to find a kernel-width that minimizes the sum of the BRE. The experimental results show that the proposed method helps KLSPI approximate the true value function more accurately and, ultimately, obtain a better control policy.
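
The two steps named in the abstract can be illustrated with a rough sketch. It is not the authors' implementation, and the chapter's actual update rule is not reproduced here. Several things below are assumptions made only for illustration: the greedy ε-ball rule in `sparsify_l2_ball`, the Gaussian form of the RBF features, the use of state-only features with an LSTD-style weight solve (KLSPI itself works on state-action pairs inside a policy-iteration loop), the finite-difference gradient, and the step-halving rule. All function and parameter names are hypothetical.

```python
import numpy as np

def sparsify_l2_ball(samples, eps):
    """Greedy dictionary: keep a sample only if no kept element lies
    within Euclidean distance eps of it (assumed rule, width-free)."""
    dictionary = []
    for x in samples:
        if all(np.linalg.norm(x - d) > eps for d in dictionary):
            dictionary.append(x)
    return np.array(dictionary)

def rbf_features(X, centers, sigma):
    """Gaussian RBF features exp(-||x - c||^2 / (2 sigma^2))."""
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def bre_sum(sigma, X, X_next, rewards, centers, gamma=0.9, reg=1e-6):
    """Sum of squared Bellman residuals after an LSTD-style weight solve."""
    phi = rbf_features(X, centers, sigma)
    phi_next = rbf_features(X_next, centers, sigma)
    # Small ridge term (assumption) keeps the linear system well posed.
    A = phi.T @ (phi - gamma * phi_next) + reg * np.eye(len(centers))
    w = np.linalg.solve(A, phi.T @ rewards)
    residual = rewards + gamma * (phi_next @ w) - phi @ w
    return float(residual @ residual)

def tune_width(sigma, X, X_next, rewards, centers,
               gamma=0.9, lr=0.1, steps=50, h=1e-4):
    """Gradient descent on sigma using a central finite-difference gradient.
    A step is accepted only if it lowers the BRE sum; otherwise lr is halved."""
    loss = bre_sum(sigma, X, X_next, rewards, centers, gamma)
    for _ in range(steps):
        up = bre_sum(sigma + h, X, X_next, rewards, centers, gamma)
        down = bre_sum(sigma - h, X, X_next, rewards, centers, gamma)
        candidate = sigma - lr * (up - down) / (2.0 * h)
        # Reject non-finite or near-zero widths so sigma - h stays positive.
        if not np.isfinite(candidate) or candidate <= 2.0 * h:
            lr *= 0.5
            continue
        cand_loss = bre_sum(candidate, X, X_next, rewards, centers, gamma)
        if cand_loss < loss:
            sigma, loss = candidate, cand_loss
        else:
            lr *= 0.5
    return sigma, loss
```

Because a step is only accepted when it reduces the BRE sum, the returned loss is never larger than the starting one. The ε-ball rule does not depend on the kernel-width, so the dictionary can be built once before any width tuning begins.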

Author information

Authors and Affiliations

Institute of Automation, National University of Defense Technology, Changsha, 410073, P.R. China
Jun Wu, Xin Xu, Lei Zuo, Zhaobin Li & Jian Wang

Editor information

Editors and Affiliations

Institute of Automation, Key Laboratory of Complex Systems and Intelligence Science, Chinese Academy of Sciences, 100190, Beijing, China
Derong Liu
College of Information Science and Engineering, Northeastern University, 110004, Shenyang, Liaoning, China
Huaguang Zhang
Department of Electrical and Computer Engineering, University of Cyprus, 75 Kallipoleos Avenue, 1678, Nicosia, Cyprus
Marios Polycarpou
Dipartimento di Elettronica, Politecnico di Milano, Piazza L. da Vinci 32, 20133, Milano, Italy
Cesare Alippi
Department of Electrical, Computer and Biomedical Engineering, University of Rhode Island, 02881, Kingston, RI, USA
Haibo He

About this paper

Cite this paper

Wu, J., Xu, X., Zuo, L., Li, Z., Wang, J. (2011). Adaptive Kernel-Width Selection for Kernel-Based Least-Squares Policy Iteration Algorithm. In: Liu, D., Zhang, H., Polycarpou, M., Alippi, C., He, H. (eds) Advances in Neural Networks – ISNN 2011. ISNN 2011. Lecture Notes in Computer Science, vol 6676. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21090-7_70

DOI: https://doi.org/10.1007/978-3-642-21090-7_70
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21089-1
Online ISBN: 978-3-642-21090-7
eBook Packages: Computer Science, Computer Science (R0)
