Integral Policy Iteration for Zero-Sum Games with Completely Unknown Nonlinear Dynamics

Li, Hongliang; Liu, Derong; Wang, Ding

doi:10.1007/978-3-642-42054-2_29

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8226)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

In this paper, we develop a model-free integral policy iteration algorithm that learns online the Nash equilibrium solution of two-player zero-sum differential games with completely unknown nonlinear continuous-time dynamics. The developed algorithm updates the value function, the control policy, and the disturbance policy simultaneously. To implement this algorithm, three neural networks are used to approximate the game value function, the control policy, and the disturbance policy. The least squares method is used to estimate the unknown parameters of the neural networks. The effectiveness of the developed scheme is demonstrated by a simulation example.
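The least squares step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it shows only the generic idea of fitting the weights of a linear-in-parameters value approximator from data collected over time intervals, and it omits the control and disturbance policy networks entirely. The quadratic basis `features`, the synthetic weights `true_w`, and the randomly drawn state pairs are all assumptions made for illustration; in an actual integral reinforcement learning setup the targets would be measured integrals of the running cost rather than values computed from known weights.

```python
import random


def features(x):
    """Hypothetical quadratic basis for a two-dimensional state (an assumption)."""
    x1, x2 = x
    return [x1 * x1, x1 * x2, x2 * x2]


def solve_linear(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - s) / m[i][i]
    return w


def least_squares(rows, targets):
    """Minimize sum((row . w - target)^2) via the normal equations."""
    n = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(n)]
    return solve_linear(ata, atb)


# Synthetic data: each sample pairs the state at the start and end of an
# interval. The regressor is the feature difference; the target stands in
# for the cost integrated over that interval (computed here from true_w).
rng = random.Random(0)
true_w = [1.0, 0.5, 2.0]  # hypothetical "true" weights, for illustration only
rows, targets = [], []
for _ in range(50):
    x_start = (rng.uniform(-1, 1), rng.uniform(-1, 1))
    x_end = (rng.uniform(-1, 1), rng.uniform(-1, 1))
    row = [a - b for a, b in zip(features(x_start), features(x_end))]
    rows.append(row)
    targets.append(sum(wi * ri for wi, ri in zip(true_w, row)))

w_hat = least_squares(rows, targets)
print(w_hat)
```

Because the synthetic targets are noise-free and the samples are more numerous than the unknown weights, the estimate `w_hat` should match `true_w` up to floating-point error. With measured data, more samples and sufficiently varied trajectories would be needed for the regression to be well conditioned.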


Author information

Authors and Affiliations

The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
Hongliang Li, Derong Liu & Ding Wang

Editor information

Editors and Affiliations

Kyungpook National University, 1370 Sankyuk-Dong, Puk-Gu, 702-701, Taegu, Korea
Minho Lee
The University of Tokyo, 7-3-1 Hongo, 113-8656, Bunkyo-ku, Tokyo, Japan
Akira Hirose
Key Laboratory of Complex Systems and Intelligence Science, Chinese Academy of Sciences, Institute of Automation, 100190, Beijing, China
Zeng-Guang Hou
Sungkyunkwan University, 2066, Seobu-ro, Jangan-gu, 440-746, Suwon, Korea
Rhee Man Kil

Cite this paper

Li, H., Liu, D., Wang, D. (2013). Integral Policy Iteration for Zero-Sum Games with Completely Unknown Nonlinear Dynamics. In: Lee, M., Hirose, A., Hou, ZG., Kil, R.M. (eds) Neural Information Processing. ICONIP 2013. Lecture Notes in Computer Science, vol 8226. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-42054-2_29

DOI: https://doi.org/10.1007/978-3-642-42054-2_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-42053-5
Online ISBN: 978-3-642-42054-2
eBook Packages: Computer Science, Computer Science (R0)
