Adaptive critic design for nonlinear multi-player zero-sum games with unknown dynamics and control constraints

Huo, Yu; Wang, Ding; Qiao, Junfei; Li, Menghua

doi:10.1007/s11071-023-08419-5

Original Paper
Published: 12 April 2023

Volume 111, pages 11671–11683, (2023)

Nonlinear Dynamics

Yu Huo¹,
Ding Wang¹,
Junfei Qiao¹ (ORCID: orcid.org/0000-0002-1707-6074) &
Menghua Li¹

Abstract

In this paper, a novel optimal control scheme based on adaptive critic technology is established to solve the multi-player zero-sum game (ZSG) problem for continuous-time nonlinear systems with control constraints and unknown dynamics. To remove the need for known system dynamics, a neural-network-based identifier is applied to reconstruct the unknown multi-player ZSG system. Then, by developing a new nonquadratic function, the associated Hamilton-Jacobi-Isaacs (HJI) equation of the constrained ZSG is derived. An adaptive critic framework is constructed to approximate the optimal cost function, and a single critic network is used to estimate both the optimal control strategy set and the worst-case disturbance strategy set. A modified critic weight updating mechanism with an experience replay technique is then introduced to relax the persistence of excitation condition. Using the Lyapunov stability theorem, the system state of the ZSG and the critic network weight approximation error are proved to be uniformly ultimately bounded. Finally, a representative example is simulated to validate the efficacy of the constructed framework.
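
The following is a minimal Python sketch of two ingredients named in the abstract: a nonquadratic input penalty that yields a bounded control, and a critic weight update that reuses stored samples. It is not the authors' design. The abstract does not give the form of the new nonquadratic function, so the sketch assumes the tanh-based penalty commonly used in the constrained-input literature, restricted to a scalar input. The critic update is a normalized gradient step on a Bellman residual that is linear in the weights. All function names, parameters (lam, r, g, alpha), and sample values are hypothetical.

```python
import math


def bounded_control(grad_v: float, g: float, r: float, lam: float) -> float:
    """Saturated scalar control u = -lam * tanh(g * grad_v / (2 * lam * r)).

    grad_v is the value-function gradient, g the input gain, r > 0 the
    input weight, and lam > 0 the assumed symmetric bound on |u|.
    """
    return -lam * math.tanh(g * grad_v / (2.0 * lam * r))


def nonquadratic_penalty(u: float, r: float, lam: float) -> float:
    """Closed form of 2 * integral_0^u lam * r * atanh(v / lam) dv for |u| < lam."""
    return (2.0 * lam * r * u * math.atanh(u / lam)
            + lam ** 2 * r * math.log(1.0 - (u / lam) ** 2))


def replay_critic_step(w, current, memory, alpha):
    """One normalized gradient step on the residual e = w . sigma + cost.

    The gradient is summed over the current sample and every stored
    sample, which is the experience replay idea: recorded data keep
    exciting the update even when the current sample alone would not.
    """
    grad = [0.0] * len(w)
    for sigma, cost in [current] + list(memory):
        e = sum(wi * si for wi, si in zip(w, sigma)) + cost
        norm = (1.0 + sum(si * si for si in sigma)) ** 2
        for i, si in enumerate(sigma):
            grad[i] += si * e / norm
    return [wi - alpha * gi for wi, gi in zip(w, grad)]


# Hypothetical data: residuals vanish exactly at w_true, and the
# regressors span R^2, so the replayed update can recover w_true.
w_true = [1.0, 0.5]
sigmas = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
samples = [(s, -sum(a * b for a, b in zip(w_true, s))) for s in sigmas]

w = [0.0, 0.0]
for _ in range(200):
    w = replay_critic_step(w, samples[0], samples[1:], alpha=1.0)
```

In this toy setting the current sample alone excites only the first weight; the two stored samples supply the missing direction, which illustrates why a rank condition on recorded data can stand in for persistent excitation. For small inputs the penalty behaves like the quadratic term r * u**2, and the control never exceeds lam in magnitude.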

Data availability statement

No data were used in this paper.

Funding

This work was supported in part by the National Key Research and Development Program of China under Grant 2021ZD0112302; and in part by the National Natural Science Foundation of China under Grant 62222301, Grant 61890930-5, and Grant 62021003.

Author information

Authors and Affiliations

Faculty of Information Technology, The Beijing Key Laboratory of Computational Intelligence and Intelligent System, The Beijing Institute of Artificial Intelligence, and The Beijing Laboratory of Smart Environmental Protection, Beijing University of Technology, Beijing, 100124, China
Yu Huo, Ding Wang, Junfei Qiao & Menghua Li

Corresponding author

Correspondence to Junfei Qiao.

Ethics declarations

Conflict of interests

The authors declare that they have no conflict of interest.

Ethical approval

No conflict of interest exists in this submission, and the research does not involve any human participants or animals. The manuscript has been approved by all authors for publication.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Huo, Y., Wang, D., Qiao, J. et al. Adaptive critic design for nonlinear multi-player zero-sum games with unknown dynamics and control constraints. Nonlinear Dyn 111, 11671–11683 (2023). https://doi.org/10.1007/s11071-023-08419-5

Received: 12 November 2022
Accepted: 16 March 2023
Published: 12 April 2023
Issue Date: June 2023
DOI: https://doi.org/10.1007/s11071-023-08419-5
