Robust control design for zero-sum differential games problem based on off-policy reinforcement learning technique

  • Original Paper
  • Journal: Aerospace Systems

Abstract

This paper addresses the robust two-player zero-sum differential game problem using an off-policy reinforcement learning (RL) technique. The robust system model is first established from the nominal one. A control strategy is then proposed, and its asymptotic stability and optimality are rigorously proved. The off-policy RL technique is derived from the Bellman equation to generate the control policy. Because the policy is learned from collected system data, the influence of a potentially inaccurate system dynamics model is avoided. This is the first application of an off-policy RL algorithm to this robust two-player zero-sum differential game problem. Finally, the convergence of the resulting algorithm is demonstrated, and a simulation example confirms its efficacy.
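
As a rough illustration of the general idea (not the algorithm of this paper, whose details are not available in this preview), the sketch below runs a textbook-style off-policy policy iteration on a scalar linear-quadratic zero-sum game. Everything in it is an assumption made for the example: the plant x' = a x + b u + d w, the cost q x^2 + r u^2 - gamma^2 w^2, the numerical values, the exploratory input signals, and all function names. The plant parameters are used only to simulate one data set; the learning step itself sees only the recorded integrals and solves the off-policy Bellman equation by least squares for the value coefficient p and the improved gains of u = -K x (controller) and w = L x (disturbance).

```python
import math

# Hypothetical scalar plant x' = A*x + B*u + D*w, used ONLY to generate data.
A, B, D = -1.0, 1.0, 1.0
Q, R, GAMMA = 1.0, 1.0, 2.0  # running cost: Q*x^2 + R*u^2 - GAMMA^2*w^2

def behavior(t):
    """Exploratory inputs applied while recording data (arbitrary choice)."""
    u = math.sin(t) + 0.5 * math.sin(3.3 * t)
    w = 0.7 * math.cos(2.1 * t) + 0.3 * math.sin(5.7 * t)
    return u, w

def collect(x0=1.0, dt=1e-3, steps_per_window=100, windows=100):
    """Simulate once (RK4) and store, per time window, the change in x^2
    and the integrals of x*x, x*u and x*w (trapezoidal rule)."""
    def f(t, x):
        u, w = behavior(t)
        return A * x + B * u + D * w
    data, t, x = [], 0.0, x0
    for _ in range(windows):
        x_start, ixx, ixu, ixw = x, 0.0, 0.0, 0.0
        for _ in range(steps_per_window):
            u0, w0 = behavior(t)
            k1 = f(t, x)
            k2 = f(t + dt / 2, x + dt / 2 * k1)
            k3 = f(t + dt / 2, x + dt / 2 * k2)
            k4 = f(t + dt, x + dt * k3)
            x_next = x + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
            u1, w1 = behavior(t + dt)
            ixx += dt / 2 * (x * x + x_next * x_next)
            ixu += dt / 2 * (x * u0 + x_next * u1)
            ixw += dt / 2 * (x * w0 + x_next * w1)
            t, x = t + dt, x_next
        data.append((x * x - x_start * x_start, ixx, ixu, ixw))
    return data

def solve3(M, v):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    aug = [row[:] + [rhs] for row, rhs in zip(M, v)]
    for c in range(3):
        piv = max(range(c, 3), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(3):
            if r != c:
                factor = aug[r][c] / aug[c][c]
                aug[r] = [ar - factor * ac for ar, ac in zip(aug[r], aug[c])]
    return [aug[i][3] / aug[i][i] for i in range(3)]

def off_policy_iteration(data, iters=10):
    """Reuse the same data set in every iteration. For target policies
    u = -K*x, w = L*x the off-policy Bellman equation over one window is
      p*d(x^2) - 2R*K_new*int(x*(u + K*x)) - 2*GAMMA^2*L_new*int(x*(w - L*x))
        = -(Q + R*K^2 - GAMMA^2*L^2) * int(x^2),
    which is linear in the unknowns (p, K_new, L_new)."""
    K, L = 0.0, 0.0  # admissible start here because A < 0
    P = 0.0
    for _ in range(iters):
        rows, rhs = [], []
        for dx2, ixx, ixu, ixw in data:
            rows.append([dx2,
                         -2 * R * (ixu + K * ixx),
                         -2 * GAMMA**2 * (ixw - L * ixx)])
            rhs.append(-(Q + R * K * K - GAMMA**2 * L * L) * ixx)
        # Least squares via the normal equations (3 unknowns).
        M = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
        v = [sum(r[i] * y for r, y in zip(rows, rhs)) for i in range(3)]
        P, K, L = solve3(M, v)
    return P, K, L

if __name__ == "__main__":
    P, K, L = off_policy_iteration(collect())
    print("p =", P, " K =", K, " L =", L)
```

For this scalar case the learned p can be checked against the positive root of the game Riccati equation 2 a p + q - p^2 (b^2/r - d^2/gamma^2) = 0, which is the only place the model is needed. The same data set is reused in every iteration, which is what distinguishes an off-policy scheme from one that must re-collect data under each updated policy.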

Funding

This work was supported by the National Natural Science Foundation of China under Grants 62103275, U20B2054, and U20B2056; the Natural Science Foundation of Shanghai under Grant 20ZR1427000; and the Shanghai Sailing Program under Grant 20YF1421600.

Author information

Corresponding author

Correspondence to Qiang Shen.

Ethics declarations

Conflict of interest

The authors have not disclosed any competing interests.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhuang, H., Zhu, H., Wu, S. et al. Robust control design for zero-sum differential games problem based on off-policy reinforcement learning technique. AS (2023). https://doi.org/10.1007/s42401-023-00263-0

  • DOI: https://doi.org/10.1007/s42401-023-00263-0
