
Learning cooperative strategies in multi-agent encirclement games with faster prey using prior knowledge

Original Article
Neural Computing and Applications

Abstract

Multi-agent encirclement with collision avoidance is a common challenge in the multi-agent confrontation domain, where the focus lies in developing cooperative strategies among agents. Previous studies have struggled with the dynamic encirclement of a faster prey in environments containing obstacles. This paper introduces a novel multi-agent deep reinforcement learning approach based on prior knowledge, dedicated to the encirclement-with-collision-avoidance task in which multiple slower pursuers collaboratively encircle a faster prey among obstacles. First, classic Apollonius circle theory is used as prior knowledge to guide agent action selection, narrowing the exploratory action space and accelerating strategy learning. Second, a variance descriptor restricts the pursuers' motion directions, ensuring that they continuously tighten the encirclement until the prey is successfully enclosed. Finally, experiments in an obstacle environment validate the proposed method. The results indicate that our method acquires an effective encirclement strategy, with an encirclement success rate exceeding that of previous methods by more than 10%; simulation results further demonstrate the method's effectiveness and practicality.
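To make the Apollonius-circle prior concrete, the following Python sketch (our illustration, not the authors' implementation; the names apollonius_circle and guided_heading are hypothetical) computes, for one pursuer-prey pair with speed ratio k = v_p / v_e < 1, the circle of points the pursuer can reach no later than the prey, and a heading toward the nearest point of that circle:

import numpy as np

def apollonius_circle(p, e, k):
    # Apollonius circle for pursuer position p, prey position e, and
    # speed ratio k = v_pursuer / v_prey (k < 1 for a faster prey).
    # Returns (center, radius) of the locus |x - p| = k * |x - e|, the
    # boundary of the region the pursuer reaches no later than the prey.
    p, e = np.asarray(p, float), np.asarray(e, float)
    center = (p - k ** 2 * e) / (1.0 - k ** 2)
    radius = k * np.linalg.norm(p - e) / (1.0 - k ** 2)
    return center, radius

def guided_heading(p, e, k):
    # One simple prior-guided action: move toward the nearest point of
    # the pursuer's own Apollonius circle (the pursuer lies inside it).
    center, radius = apollonius_circle(p, e, k)
    p = np.asarray(p, float)
    v = p - center
    nearest = center + radius * v / np.linalg.norm(v)
    return (nearest - p) / np.linalg.norm(nearest - p)

# Example: pursuer at the origin, prey at (2, 1), pursuer 20% slower.
print(guided_heading([0.0, 0.0], [2.0, 1.0], k=0.8))

Heading for the nearest reachable point is only one possible choice of guided action; in the paper, the prior narrows the action space explored by the learned policy, and the variance descriptor further restricts the pursuers' motion directions so that the ring around the prey keeps tightening.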


Data availability

For data access requests, interested researchers are encouraged to contact the corresponding author. In addition to data access, we provide detailed information about the experimental setup and configurations to aid in result replication: experiments were conducted using Python 3.8 on a Linux-based server with the following dependencies: OpenAI Gym (0.10.5), TensorFlow (2.4), and NumPy (1.14.5), together with the multi-agent particle environment from https://github.com/openai/multiagent-particle-envs. We are committed to fostering collaboration and transparency in research, and we encourage fellow researchers to reach out with any inquiries regarding data access or the experimental setup.
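As a rough replication aid, the particle environment linked above is typically driven as in the sketch below (assuming the make_env helper shipped at the root of the multiagent-particle-envs repository and its stock predator-prey scenario simple_tag; the paper's custom encirclement scenario with obstacles is not part of the public repository):

import numpy as np
from make_env import make_env  # helper at the root of multiagent-particle-envs

env = make_env('simple_tag')   # stock predator-prey scenario, a stand-in for the paper's task
obs_n = env.reset()
for _ in range(25):
    # The default action spaces are discrete and expect one-hot vectors.
    act_n = [np.eye(sp.n)[np.random.randint(sp.n)] for sp in env.action_space]
    obs_n, reward_n, done_n, info_n = env.step(act_n)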


Author information

Corresponding author

Correspondence to Dianxi Shi.

Ethics declarations

Conflict of interest

The authors declare that they have no financial, professional, or personal conflicts of interest that could influence the impartiality or integrity of this study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Number of agents

Based on the mathematical principles of the Apollonius circle and the encirclement task defined in Section 3.1, when the pursuers encircle the prey, the full \(360^{\circ }\) range around the prey is covered by the pursuers' capture angles, as shown in Fig. 11. Consequently, we can establish the minimum number of pursuers required to encircle the prey.

Fig. 11: The number of agents

$$n_{\min } = \left\lceil \frac{\pi }{\arcsin \frac{v_{p}}{v_{e}}} \right\rceil \tag{18}$$

Evidently, the minimum number of pursuers required for the encirclement task depends only on the speed ratio \(v_{p}/v_{e}\) between the pursuers and the prey.
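Read directly, Eq. (18) reflects that each pursuer's Apollonius circle subtends an angle of \(2\arcsin (v_{p}/v_{e})\) at the prey, so enough pursuers are needed for these capture angles to cover the full \(2\pi \). A one-function Python sketch (variable names ours):

import math

def min_pursuers(v_p, v_e):
    # Eq. (18): minimum number of pursuers needed to close the ring
    # around a faster prey; requires 0 < v_p < v_e.
    return math.ceil(math.pi / math.asin(v_p / v_e))

print(min_pursuers(0.8, 1.0))  # pursuers at 80% of the prey's speed -> 4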

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Li, T., Shi, D., Wang, Z. et al. Learning cooperative strategies in multi-agent encirclement games with faster prey using prior knowledge. Neural Comput & Applic (2024). https://doi.org/10.1007/s00521-024-09727-6

