Abstract
This paper addresses the robust zero-sum differential game problem using an off-policy reinforcement learning (RL) technique. A robust system model is first established from the nominal one. A control strategy is then proposed, and its asymptotic stability and optimality are rigorously proved. The off-policy RL technique is derived from the Bellman equation to generate the control policy. Because the solution is computed from collected system data, it is not affected by inaccuracies in the system dynamic model. This is the first application of the off-policy RL algorithm to this robust two-player zero-sum differential game problem. Additionally, the convergence of the final algorithm is demonstrated, and a simulation example confirms its efficacy.
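To make the idea of generating a policy from the Bellman equation concrete, the following is a minimal sketch for a scalar linear-quadratic zero-sum game. It is not the paper's off-policy, data-driven algorithm: it is a plain model-based policy iteration that assumes the dynamics are known, and every symbol in it (`a`, `b`, `d`, `q`, `r`, `gamma`, the gains `k` and `l`, and both function names) is a hypothetical choice made for illustration. The control is u = -k*x, the disturbance is w = l*x, and each iteration evaluates the current policy pair through the Bellman (Lyapunov) equation before improving both gains.

```python
def solve_scalar_zero_sum_game(a, b, d, q, r, gamma, k0, l0=0.0, iters=20):
    """Policy iteration for dx/dt = a*x + b*u + d*w with the cost
    integral of (q*x^2 + r*u^2 - gamma^2*w^2) dt.

    Illustrative assumption: the model (a, b, d) is known, unlike in a
    data-driven off-policy method. k0 must make a - b*k0 + d*l0 negative.
    """
    k, l = k0, l0
    p = None
    for _ in range(iters):
        a_cl = a - b * k + d * l  # closed-loop pole under the current policy pair
        if a_cl >= 0:
            raise ValueError("current policy pair is not stabilizing")
        # Policy evaluation: solve 2*a_cl*p + q + r*k^2 - gamma^2*l^2 = 0 for p,
        # so that V(x) = p*x^2 is the value of the current policy pair.
        p = -(q + r * k**2 - gamma**2 * l**2) / (2.0 * a_cl)
        # Policy improvement for the minimizing control and maximizing disturbance.
        k = b * p / r
        l = d * p / gamma**2
    return p, k, l


def game_riccati_residual(p, a, b, d, q, r, gamma):
    """Residual of the scalar game Riccati equation; zero at the saddle point."""
    return 2.0 * a * p + q - p**2 * (b**2 / r - d**2 / gamma**2)


# Hypothetical numbers: an unstable plant (a > 0) with an initial stabilizing gain.
params = dict(a=1.0, b=1.0, d=0.5, q=1.0, r=1.0, gamma=2.0)
p, k, l = solve_scalar_zero_sum_game(k0=2.0, **params)
residual = game_riccati_residual(p, **params)
```

An off-policy method of the kind the abstract describes would replace the policy evaluation step, which uses `a`, `b`, and `d` directly here, with an estimate computed from measured trajectories.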
Funding
This work was supported by the National Natural Science Foundation of China under Grants 62103275, U20B2054, and U20B2056, the Natural Science Foundation of Shanghai under Grant 20ZR1427000, and the Shanghai Sailing Program under Grant 20YF1421600.
Ethics declarations
Conflict of interest
The authors have not disclosed any competing interests.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhuang, H., Zhu, H., Wu, S. et al. Robust control design for zero-sum differential games problem based on off-policy reinforcement learning technique. AS (2023). https://doi.org/10.1007/s42401-023-00263-0
DOI: https://doi.org/10.1007/s42401-023-00263-0