Abstract
This paper focuses on an optimal consensus problem for heterogeneous discrete-time nonlinear multi-agent systems (MASs) with partially unknown dynamics. For such systems, it is difficult to obtain the solution of the coupled Hamilton-Jacobi-Bellman (HJB) equations, which is necessary to address the optimal consensus problem. A new hierarchical and distributed optimal control strategy is developed to derive a near-optimal solution of the HJB equations. Its control structure consists of a model reference adaptive control (MRAC) layer and a distributed control layer. In the MRAC layer, an adaptive feedforward and feedback controller is designed to make the states of the followers converge to those of their corresponding reference models. The optimal consensus problem of the heterogeneous MASs is thereby reformulated as one for homogeneous MASs. In the distributed control layer, an online distributed value iteration algorithm is proposed to approximate the optimal solution of the HJB equations for the reference models. As a result, optimal consensus is also achieved for the heterogeneous MASs. Two convergence properties are analyzed to establish the MRAC performance and the optimal consensus, respectively. Simulation results verify the effectiveness of the proposed strategy.
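To give a flavor of the value iteration idea the abstract refers to, the sketch below is a deliberately minimal, single-agent illustration only: it is not the paper's distributed algorithm, and the scalar system `x_{k+1} = a*x_k + b*u_k` with quadratic cost, the function name `value_iteration`, and all parameter values are assumptions chosen for illustration. For this linear-quadratic special case, value iteration on the value function reduces to the scalar Riccati recursion shown in the comments.

```python
# Minimal single-agent sketch (illustrative only): value iteration for the
# scalar discrete-time LQR problem
#   x_{k+1} = a*x_k + b*u_k,   cost = sum_k (q*x_k^2 + r*u_k^2).
# With a quadratic value function V(x) = p*x^2, the Bellman backup
#   P_{j+1} = q + a^2*P_j - (a*b*P_j)^2 / (r + b^2*P_j)
# is the scalar Riccati recursion, and its fixed point solves the
# algebraic Riccati equation (the scalar analogue of the HJB equation).

def value_iteration(a, b, q, r, iters=200):
    """Run the scalar Riccati recursion from the zero value function."""
    p = 0.0
    for _ in range(iters):
        p = q + a**2 * p - (a * b * p)**2 / (r + b**2 * p)
    return p

# Example: an open-loop unstable system (|a| > 1) still admits a
# stabilizing optimal controller, and the iteration converges to it.
p_star = value_iteration(a=1.2, b=1.0, q=1.0, r=1.0)

# Fixed-point residual of the algebraic Riccati equation (should be ~0).
residual = abs(p_star - (1.0 + 1.2**2 * p_star
                         - (1.2 * p_star)**2 / (1.0 + p_star)))
```

In the nonlinear, multi-agent setting treated by the paper, no such closed-form recursion exists; the value function and control policy must instead be approximated (e.g., by function approximators updated online), which is what motivates the distributed value iteration algorithm described in the abstract.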
Additional information
Recommended by Ho Jae Lee under the direction of Editor Jessie (Ju H.) Park. This work is supported by the Science and Technology Project of State Grid Corporation of China under Grant 52153216000R and the National Natural Science Foundation of China under Grant 61873248.
Tao Wang received his B.S. degree in electrical engineering from Chongqing University, Chongqing, China, in 1987. His current research interests include smart grid, fault diagnosis, state appraisal, and live working for high-voltage electric equipment.
Hao Fu received his B.S. degree in machine design, manufacture, and automation, and his M.S. degree in mechatronic engineering from Hunan University of Technology, Zhuzhou, China, in 2012 and 2016, respectively. He is currently pursuing a Ph.D. degree in control science and engineering at China University of Geosciences, Wuhan, China. His current research interests include adaptive control, approximate dynamic programming, and multi-agent systems.
Jinbin Li received his B.S. degree in electrical engineering from Nanyang Technological University, Singapore, in 2011. His current research interests include smart grid, high-voltage technique, intelligent inspection, and maintenance and live working for high-voltage electric equipment.
Yaodong Zhang received his B.S. degree in electrical engineering from Xi'an Jiaotong University, Xi'an, China, in 2012. His current research interests include smart grid and parameter testing of UHV transmission lines.
Xinfeng Zhou received his M.S. degree in electrical engineering from Wuhan University of Hydraulic and Electrical Engineering, Wuhan, China, in 1995. His current research interests include management of electrical enterprises, grid production management systems, and reliability management.
Xin Chen received his B.S. degree in industrial automation and his M.S. degree in control theory and control engineering from Central South University, Changsha, China, in 1999 and 2002, respectively. In 2003, he was recommended by the National Ministry of Education to the University of Macau, Taipa, Macau S.A.R., China, to pursue his Ph.D. degree, which he received in 2007. In 2011, he completed post-doctoral research in control science and engineering at Central South University. In 2014, he moved to China University of Geosciences, Wuhan, China, where he is a professor in the School of Automation. His current research interests include multi-agent systems, robotics, process control, and intelligent control.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Wang, T., Fu, H., Li, J. et al. Optimal Consensus Control for Heterogeneous Nonlinear Multiagent Systems with Partially Unknown Dynamics. Int. J. Control Autom. Syst. 17, 2400–2413 (2019). https://doi.org/10.1007/s12555-018-0904-1