
A Multi-group Multi-agent System Based on Reinforcement Learning and Flocking

  • Regular Papers
  • Intelligent Control and Applications
International Journal of Control, Automation and Systems

Abstract

In this paper, we present an inter-group confrontation and intra-group cooperation method for a predator group and a prey group, constructing a multi-group multi-agent system. The motion of the prey group is modeled with a flocking control algorithm, so that the prey cooperatively avoid predators and maintain the integrity of the group after detecting them. Autonomous decision-making in the predator group is implemented with a distributed reinforcement learning algorithm. To share learning experience efficiently among the agents in the predator group, we propose a distributed cooperative reinforcement learning algorithm with variable weights that accelerates the convergence of learning. Simulation results demonstrate the feasibility of the proposed method.
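The paper's exact variable-weight rule is not reproduced on this page, but the idea of experience sharing with non-uniform weights can be sketched roughly as follows. Each predator runs standard tabular Q-learning locally, and the group periodically fuses Q-tables with weights proportional to each agent's accumulated experience, so better-trained agents contribute more than a plain uniform average would allow. The class and function names (`PredatorAgent`, `fuse_q_tables`) and the experience-count weighting are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

class PredatorAgent:
    """Tabular Q-learning agent (Watkins-Dayan update) for one predator."""
    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9):
        self.Q = np.zeros((n_states, n_actions))
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.updates = 0        # experience counter, used as a weighting proxy

    def learn(self, s, a, r, s_next):
        # One-step Q-learning update toward the greedy bootstrap target.
        td_error = r + self.gamma * self.Q[s_next].max() - self.Q[s, a]
        self.Q[s, a] += self.alpha * td_error
        self.updates += 1

def fuse_q_tables(agents):
    """Variable-weight fusion: weight each agent's Q-table by its share of
    the group's total experience, then broadcast the fused table back."""
    counts = np.array([ag.updates for ag in agents], dtype=float)
    if counts.sum() == 0:
        weights = np.full(len(agents), 1.0 / len(agents))
    else:
        weights = counts / counts.sum()
    q_shared = sum(w * ag.Q for w, ag in zip(weights, agents))
    for ag in agents:
        ag.Q = q_shared.copy()
    return weights
```

In a cooperative hunt, each predator would interleave `learn` steps in its own state space with periodic calls to `fuse_q_tables`, trading some communication for faster group-wide convergence.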



Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Rui Xue.


This work was supported by the National Key Research and Development Program of China (Project No. 2017YFB0503400) and by the National Natural Science Foundation of China under Grants 61371182 and 41301459.

Gang Wang received his B.E. degree in communication engineering and a Ph.D. degree in biomedical engineering from University of Electronic Science and Technology of China, Chengdu, China, in 1999 and 2008, respectively. He is an Associate Professor in the School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu. His current research interests include distributed signal processing and intelligent systems.

Jian Xiao was born in Hengyang, Hunan Province, China. He received his B.Eng. degree in electronic information science and technology from the Physics and Electrical Engineering College of Jishou University, Hunan, China, in 2016, and his M.Eng. degree from the School of Information and Communication Engineering, University of Electronic Science and Technology of China (UESTC), Chengdu, China, in 2020. He is currently pursuing a Ph.D. degree in the School of Information and Communication Engineering of UESTC.

Rui Xue received his B.S. and Ph.D. degrees from Beihang University in 2001 and 2010, respectively. He is a Professor in the School of Electronics and Information Engineering, Beihang University. His research interests include satellite navigation error modeling, estimation, and validation, as well as statistical signal processing and intelligent unmanned systems.

Yongting Yuan was born in Fugou, Henan Province, in 1972. He received his M.S. degree from Northeastern University in 2005. He is now a senior engineer at the No. 31435 Research Institute. His research interests include air traffic control technology.

About this article

Cite this article

Wang, G., Xiao, J., Xue, R. et al. A Multi-group Multi-agent System Based on Reinforcement Learning and Flocking. Int. J. Control Autom. Syst. 20, 2364–2378 (2022). https://doi.org/10.1007/s12555-021-0170-5

