Reinforcement Learning for Control Using Value Function Approximation

Gatsis, Konstantinos; Pappas, George J.

doi:10.1007/978-1-4471-5102-9_100067-1

Abstract

This entry provides a short introduction to a class of reinforcement learning algorithms, namely those based on value function approximation, applied to stochastic optimal control problems. It demonstrates how core ideas from dynamic programming and Bellman equations are used in common data-driven reinforcement learning algorithms, and discusses fundamental challenges of the approach.
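
As an illustration of how a Bellman equation can turn into a data-driven update, the following is a minimal sketch of temporal-difference learning, TD(0), with a linear value function approximation. It is not taken from the entry itself: the three-state Markov chain, the stage costs, the discount factor, the step size, and the function name `td0_linear` are all assumptions chosen for illustration. The features are one-hot, so the linear approximation can represent the exact value function and the learned values can be compared with the solution of the Bellman equation computed from the model.

```python
import numpy as np


def td0_linear(P, cost, Phi, gamma, alpha, n_steps, seed=0):
    """Semi-gradient TD(0) policy evaluation with linear features (illustrative).

    P is used only to simulate transitions; the update itself sees
    nothing but the sampled tuples (state, cost, next state).
    """
    rng = np.random.default_rng(seed)
    n_states = P.shape[0]
    cdf = np.cumsum(P, axis=1)   # row-wise CDFs used to sample next states
    u = rng.random(n_steps)      # one pre-drawn uniform number per transition
    w = np.zeros(Phi.shape[1])   # weights of the approximation V(s) = Phi[s] @ w
    s = 0
    for t in range(n_steps):
        # Sample the next state from row s of P by inverting its CDF.
        s_next = min(int(np.searchsorted(cdf[s], u[t], side="right")), n_states - 1)
        # Sampled Bellman residual (temporal-difference error).
        td_error = cost[s] + gamma * (Phi[s_next] @ w) - Phi[s] @ w
        # Move the weights along the feature vector of the visited state.
        w += alpha * td_error * Phi[s]
        s = s_next
    return w


# Hypothetical closed-loop system under a fixed policy: 3 states, stage costs.
P = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])
cost = np.array([0.0, 1.0, 2.0])
gamma = 0.9
Phi = np.eye(3)  # one-hot features, so the exact value function is representable

# Model-based reference: solve the Bellman equation V = cost + gamma * P @ V.
v_exact = np.linalg.solve(np.eye(3) - gamma * P, cost)

# Data-driven estimate from a single simulated trajectory.
w = td0_linear(P, cost, Phi, gamma, alpha=0.002, n_steps=200_000)
v_td = Phi @ w

print("exact:", v_exact)
print("TD(0):", v_td)
```

With features that cannot represent the exact value function, the same update would in general converge to an approximation rather than to the Bellman solution, which is one source of the challenges the abstract refers to.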

Author information

Authors and Affiliations

Department of Engineering Science, University of Oxford, Oxford, UK
Konstantinos Gatsis
Department of Electrical and Systems Engineering, University of Pennsylvania, Philadelphia, PA, USA
George J. Pappas

Corresponding author

Correspondence to Konstantinos Gatsis.

Editor information

Editors and Affiliations

Electrical and Computer Engineering, Boston University, Boston, MA, USA
John Baillieul
Automation and Control Solutions, Honeywell, Golden Valley, MN, USA
Tariq Samad

Section Editor information

Department of Aeronautics and Astronautics, Aerospace Controls Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
Jonathan P. How

About this entry

Cite this entry

Gatsis, K., Pappas, G.J. (2020). Reinforcement Learning for Control Using Value Function Approximation. In: Baillieul, J., Samad, T. (eds) Encyclopedia of Systems and Control. Springer, London. https://doi.org/10.1007/978-1-4471-5102-9_100067-1

DOI: https://doi.org/10.1007/978-1-4471-5102-9_100067-1
Published: 09 April 2020
Publisher Name: Springer, London
Print ISBN: 978-1-4471-5102-9
Online ISBN: 978-1-4471-5102-9
eBook Packages: Springer Reference Engineering; Reference Module Computer Science and Engineering