Probabilistic Inference for Fast Learning in Control

  • Carl Edward Rasmussen
  • Marc Peter Deisenroth
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5323)

Abstract

We provide a novel framework for very fast model-based reinforcement learning in continuous state and action spaces. The framework requires probabilistic models that explicitly characterize their levels of confidence. Within this framework, we use flexible, non-parametric models to describe the world based on previously collected experience. We demonstrate learning on the cart-pole problem in a setting where we provide very limited prior knowledge about the task. Learning progresses rapidly, and a good policy is found after only a handful of iterations.
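
To make the overall recipe concrete, below is a minimal sketch of a generic model-based learning loop of the kind the abstract describes: collect a little experience, fit Gaussian process models of the system dynamics (flexible, non-parametric models whose predictive variances express their confidence), improve a policy against the learned model, then run the improved policy on the system and repeat. Everything specific in the sketch (the simplified cart-pole simulator, the linear policy, the random-search policy improvement, and all constants) is an illustrative assumption, not the authors' method.

```python
# Minimal sketch (assumptions throughout): a generic model-based RL loop with
# Gaussian-process dynamics models, illustrating the kind of iteration the
# abstract describes. The simplified cart-pole simulator, the linear policy,
# the random-search policy improvement, and all constants are made up for
# illustration; this is not the authors' implementation.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

def cartpole_step(state, action, dt=0.05):
    """Crude stand-in for the real cart-pole dynamics (assumed, illustrative)."""
    x, x_dot, theta, theta_dot = state
    force = float(np.clip(action, -10.0, 10.0))
    theta_acc = 9.81 * np.sin(theta) + force * np.cos(theta)
    return np.array([x + dt * x_dot,
                     x_dot + dt * force,
                     theta + dt * theta_dot,
                     theta_dot + dt * theta_acc])

def rollout(policy_params, horizon=25):
    """Run a linear state-feedback policy on the 'real' system, record transitions."""
    state = np.array([0.0, 0.0, np.pi, 0.0])          # pole hanging down
    transitions, cost = [], 0.0
    for _ in range(horizon):
        action = float(policy_params @ state)
        nxt = cartpole_step(state, action)
        transitions.append((np.append(state, action), nxt - state))
        cost += 1.0 - np.cos(nxt[2])                   # saturating cost on pole angle
        state = nxt
    return transitions, cost

# 1. A small amount of initial experience from random policies.
data = []
for _ in range(3):
    trans, _ = rollout(rng.normal(size=4))
    data.extend(trans)

policy = rng.normal(size=4)
for iteration in range(5):
    # 2. Fit one GP per state dimension, mapping (state, action) to the state change.
    #    The GPs' predictive variances express the model's confidence in each prediction.
    X = np.array([s for s, _ in data])
    Y = np.array([d for _, d in data])
    gps = [GaussianProcessRegressor(RBF() + WhiteKernel(), normalize_y=True).fit(X, Y[:, k])
           for k in range(Y.shape[1])]

    def model_cost(params, horizon=25):
        """Predicted long-term cost of a policy, simulated with the GP means only."""
        state, cost = np.array([0.0, 0.0, np.pi, 0.0]), 0.0
        for _ in range(horizon):
            xa = np.append(state, float(params @ state)).reshape(1, -1)
            state = state + np.array([gp.predict(xa)[0] for gp in gps])
            cost += 1.0 - np.cos(state[2])
        return cost

    # 3. Improve the policy against the learned model (crude random search here).
    best_cost = model_cost(policy)
    for _ in range(30):
        candidate = policy + 0.3 * rng.normal(size=4)
        c = model_cost(candidate)
        if c < best_cost:
            best_cost, policy = c, candidate

    # 4. Run the improved policy on the real system and add the new experience.
    trans, real_cost = rollout(policy)
    data.extend(trans)
    print(f"iteration {iteration}: cost on the real system = {real_cost:.2f}")
```

The sketch plans using only the GP means; the framework in the paper additionally exploits the models' explicitly represented confidence, which the abstract identifies as the essential requirement for learning from very little data.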

Keywords

Gaussian Process · Reinforcement Learning · Goal State · Successor State · Neural Information Processing System

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Carl Edward Rasmussen (1, 2)
  • Marc Peter Deisenroth (1, 3)
  1. Department of Engineering, University of Cambridge, UK
  2. Max Planck Institute for Biological Cybernetics, Tübingen, Germany
  3. Faculty of Informatics, Universität Karlsruhe (TH), Germany
