A Multi-group Multi-agent System Based on Reinforcement Learning and Flocking

Wang, Gang; Xiao, Jian; Xue, Rui; Yuan, Yongting

doi:10.1007/s12555-021-0170-5

Regular Papers
Intelligent Control and Applications
Published: 09 June 2022

International Journal of Control, Automation and Systems
Volume 20, pages 2364–2378 (2022)

Gang Wang¹,
Jian Xiao²,³,
Rui Xue⁴ (ORCID: orcid.org/0000-0002-9188-175X) &
…
Yongting Yuan⁵

264 Accesses
2 Citations

Abstract

In this paper, we present a method for inter-group confrontation and intra-group cooperation between a predator group and a prey group, and construct a multi-group multi-agent system. We model the motion of the prey group using a flocking control algorithm, so that once predators are detected the prey can cooperatively avoid them while maintaining the integrity of the group. The autonomous decision-making of the predator group is implemented with a distributed reinforcement learning algorithm. To share learning experience efficiently among agents in the predator group, we propose a distributed cooperative reinforcement learning algorithm with variable weights, which accelerates convergence of the learning algorithm. Simulations show the feasibility of the proposed method.
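
As a rough illustration of what sharing learning experience with variable weights could look like, the sketch below pairs a standard tabular Q-learning update with a fusion step in which each agent's Q-value is weighted by how often that agent has visited the state-action pair. The visit-count weighting, the function names `q_update` and `share_q_tables`, and the default values of `alpha` and `gamma` are assumptions made for illustration; the abstract does not specify the authors' weighting rule, so this should not be read as their algorithm.

```python
import numpy as np


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update, applied in place to one agent's table.

    q has shape (n_states, n_actions). alpha (learning rate) and gamma
    (discount factor) are illustrative defaults, not values from the paper.
    """
    td_target = reward + gamma * np.max(q[next_state])
    q[state, action] += alpha * (td_target - q[state, action])


def share_q_tables(q_tables, visit_counts):
    """Fuse the agents' Q-tables entry by entry with variable weights.

    ASSUMPTION: the weight of agent i for entry (s, a) is proportional to how
    many times agent i has visited (s, a). Entries that no agent has visited
    fall back to a uniform average.
    """
    q = np.stack(q_tables)                       # (n_agents, n_states, n_actions)
    n = np.stack(visit_counts).astype(float)     # same shape as q
    total = n.sum(axis=0)                        # (n_states, n_actions)
    visited = total > 0
    safe_total = np.where(visited, total, 1.0)   # avoid division by zero
    weights = np.where(visited, n / safe_total, 1.0 / len(q_tables))
    return (weights * q).sum(axis=0)
```

In a distributed setting, each predator would call `q_update` on its own table after every step and periodically replace that table with the output of `share_q_tables` computed over its neighbours' tables. Weighting by experience lets well-explored estimates dominate poorly explored ones, which is one plausible way for sharing to speed up convergence.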

Author information

Authors and Affiliations

Center for Robotics, School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
Gang Wang
School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
Jian Xiao
Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
Jian Xiao
School of Electronics and Information Engineering, Beihang University, Beijing, 100191, China
Rui Xue
No. 31435 Research Institute, Shenyang, 110000, China
Yongting Yuan

Corresponding author

Correspondence to Rui Xue.

Additional information

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was supported by the National Key Research and Development Program of China (Project No. 2017YFB0503400) and by the National Natural Science Foundation of China under Grants 61371182 and 41301459.

Gang Wang received his B.E. degree in communication engineering and his Ph.D. degree in biomedical engineering from the University of Electronic Science and Technology of China, Chengdu, China, in 1999 and 2008, respectively. He is an Associate Professor in the School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu. His current research interests include distributed signal processing and intelligent systems.

Jian Xiao was born in Hengyang, Hunan Province, China. He received his B.Eng. degree in electronic information science and technology from the Physics and Electrical Engineering College of Jishou University, Hunan, China, in 2016, and his M.Eng. degree from the School of Information and Communication Engineering, University of Electronic Science and Technology of China (UESTC), Chengdu, China, in 2020. He is currently pursuing a Ph.D. degree in the School of Information and Communication Engineering of UESTC.

Rui Xue received his B.S. and Ph.D. degrees from Beihang University in 2001 and 2010, respectively. He is a professor in the School of Electronics and Information Engineering, Beihang University. His research interests are satellite navigation error modeling, estimation, and validation. His current research interests include statistical signal processing and intelligent unmanned systems.

Yongting Yuan was born in Fugou, Henan Province, in 1972. He received his M.S. degree from Northeastern University in 2005. He is now a senior engineer at the No. 31435 Research Institute. His research interests include air traffic control technology.

About this article

Cite this article

Wang, G., Xiao, J., Xue, R. et al. A Multi-group Multi-agent System Based on Reinforcement Learning and Flocking. Int. J. Control Autom. Syst. 20, 2364–2378 (2022). https://doi.org/10.1007/s12555-021-0170-5

Received: 25 February 2021
Revised: 18 July 2021
Accepted: 01 September 2021
Published: 09 June 2022
Issue Date: July 2022
DOI: https://doi.org/10.1007/s12555-021-0170-5
