Reinforcement learning-based unknown reference tracking control of HMASs with nonidentical communication delays

  • Research Paper
Science China Information Sciences

Abstract

This paper addresses the optimal output synchronization control problem of heterogeneous multiagent systems (HMASs) subject to nonidentical communication delays using reinforcement learning. Unlike existing studies, which assume that the leader’s precise model is accessible globally to all followers or distributively to some of them, this paper assumes that the leader’s precise dynamical model is entirely inaccessible to every follower. A data-based learning algorithm is first proposed to reconstruct the leader’s unknown system matrix online. A distributed predictor subject to communication delays is then devised to estimate the leader’s state, where the interaction delays are allowed to be nonidentical. Next, a learning-based local controller, together with a discounted performance function, is designed to achieve optimal output synchronization. Bellman equations and game algebraic Riccati equations are constructed, and a model-based reinforcement learning (RL) algorithm is developed to learn the optimal solution online without solving the regulator equations. A model-free off-policy RL algorithm is then developed to remove the model-based algorithm’s requirement that all agents’ dynamics be known. The optimal tracking control problem of HMASs subject to unknown leader dynamics and communication delays is shown to be solvable under the proposed RL algorithms. Finally, the theoretical results are verified by numerical simulations.
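To make the idea of reconstructing a leader's system matrix from data more concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes a sampled (discrete-time) leader v[k+1] = Phi v[k] with noise-free state measurements, and it fits Phi by batch least squares rather than online. The function names, the sampling step h, and the harmonic-oscillator leader are all hypothetical choices made for illustration.

```python
import numpy as np

def estimate_transition_matrix(samples):
    """Least-squares fit of Phi in v[k+1] = Phi @ v[k].

    `samples` is an (N+1, n) array whose rows are consecutive leader states.
    This batch fit is an illustrative stand-in, not the paper's online algorithm.
    """
    X = samples[:-1]  # rows are v[k]
    Y = samples[1:]   # rows are v[k+1]
    # Solve X @ Phi.T ≈ Y in the least-squares sense.
    phi_T, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return phi_T.T

def simulate_leader(phi, v0, steps):
    """Generate consecutive leader states from an assumed transition matrix."""
    states = [np.asarray(v0, dtype=float)]
    for _ in range(steps):
        states.append(phi @ states[-1])
    return np.array(states)

# Hypothetical leader: a harmonic oscillator sampled with step h,
# i.e. Phi = exp(S h) with S = [[0, 1], [-1, 0]].
h = 0.1
phi_true = np.array([[np.cos(h), np.sin(h)],
                     [-np.sin(h), np.cos(h)]])

samples = simulate_leader(phi_true, [1.0, 0.0], steps=50)
phi_hat = estimate_transition_matrix(samples)
```

The fit recovers Phi only when the collected states span the whole state space. Measurement noise, delays, and the distributed setting treated in the paper are outside the scope of this sketch.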

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. 62103047, U19662-02), the Beijing Institute of Technology Research Fund Program for Young Scholars, and the Young Elite Scientists Sponsorship Program by BAST (Grant No. BYESS2023365).

Author information

Corresponding author

Correspondence to Zheng-Guang Wu.

About this article

Cite this article

Xu, Y., Wu, ZG., Che, WW. et al. Reinforcement learning-based unknown reference tracking control of HMASs with nonidentical communication delays. Sci. China Inf. Sci. 66, 170203 (2023). https://doi.org/10.1007/s11432-022-3729-7

  • DOI: https://doi.org/10.1007/s11432-022-3729-7
