Kernel-Based Reinforcement Learning

  • Guanghua Hu
  • Yuqin Qiu
  • Liming Xiang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4113)

Abstract

We consider the problem of approximating the cost-to-go functions in reinforcement learning. By mapping the state implicitly into a feature space, we perform a simple algorithm in the feature space, which corresponds to a complex algorithm in the original state space. Two kernel-based reinforcement learning algorithms, the ε-insensitive kernel-based reinforcement learning (ε-KRL) and the least squares kernel-based reinforcement learning (LS-KRL), are proposed. An example shows that the proposed methods can deal effectively with the reinforcement learning problem without having to explore many states.
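The abstract does not spell out either algorithm, so the following is only a minimal sketch of the general idea it describes: fitting a cost-to-go function by least squares in a kernel-induced feature space. It is plain kernel ridge regression, not the authors' ε-KRL or LS-KRL. The toy chain, the Gaussian kernel, the width and ridge values, and every function name are assumptions made for illustration.

```python
import math

def gaussian_kernel(x, y, width=2.0):
    """Gaussian (RBF) kernel: an inner product in an implicit feature space."""
    return math.exp(-((x - y) ** 2) / (2.0 * width ** 2))

def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_cost_to_go(states, targets, ridge=1e-6, width=2.0):
    """Regularised least-squares fit: solve (K + ridge * I) alpha = targets."""
    n = len(states)
    gram = [
        [gaussian_kernel(states[i], states[j], width) + (ridge if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    alpha = solve_linear_system(gram, targets)

    def predict(state):
        # The approximation is a kernel expansion over the visited states only.
        return sum(a * gaussian_kernel(state, s, width)
                   for a, s in zip(alpha, states))

    return predict

# Hypothetical toy problem: a chain of states 0..10 where each step towards
# state 0 costs 1, so the true cost-to-go of state s is s.
visited = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]   # only these states are explored
observed_costs = list(visited)               # cost-to-go observed at each one
cost_to_go = fit_cost_to_go(visited, observed_costs)

for state in (1.0, 5.0, 9.0):                # states that were never visited
    print(state, round(cost_to_go(state), 3))
```

The fitted function is defined for every state even though only six were sampled, which is the sense in which a kernel expansion can reduce the number of states that must be explored. In a real reinforcement learning setting the targets would come from sampled trajectories rather than from a known closed form.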

Keywords

Feature Space; Reinforcement Learning; Reinforcement Learning Method; Greedy Policy; Eligibility Trace
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Guanghua Hu (1)
  • Yuqin Qiu (1)
  • Liming Xiang (2)
  1. School of Mathematics and Statistics, Yunnan University, Kunming, Yunnan, P.R. China
  2. Department of Management Sciences, City University of Hong Kong, Kowloon, Hong Kong