
Multi-agent air combat with two-stage graph-attention communication

  • Original Article
  • Published:
Neural Computing and Applications

Abstract

An air-to-air combat system is a complex multi-agent system (MAS) in which a large number of unmanned combat aerial vehicles learn to fight their opponents in a highly dynamic and uncertain environment. Because each individual observes the environment only locally, classical multi-agent learning methods struggle to obtain effective cooperative strategies. Recently, communication mechanisms have been proposed to address the local observability issue in MAS. However, existing methods with predefined rules easily cause an exponential increase in state–action pairs, leading to high communication costs. Motivated by this, this paper designs a graph neural network based on a two-stage graph-attention mechanism to capture the key interaction relationships and communication connections between agents in complex air-to-air combat scenarios. Built on an essential backbone multi-agent reinforcement learning method, Multi-Agent Proximal Policy Optimization (MAPPO), the proposed hard- and soft-attention scheme dynamically adjusts the communication relationships and the ad hoc network of multiple agents: the hard attention cuts off unrelated interaction connections, while the soft attention weighs the importance of each remaining pair of agents. Finally, an experimental study in a simulation environment validates the effectiveness of the proposed method in solving large-scale air-to-air combat problems.
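
To make the two-stage attention scheme concrete, the sketch below shows one plausible way to implement such a hard- and soft-attention communication layer in PyTorch: a hard stage samples a binary adjacency matrix with the Gumbel-softmax trick (so edge pruning stays differentiable), and a soft stage computes attention weights only over the surviving edges. This is a minimal illustration under assumed design choices (MLP pair scorers, a single attention head); the class and variable names are ours, not the authors'.

    # Minimal sketch of a two-stage (hard + soft) graph-attention
    # communication layer. Illustrative only; names and design choices
    # are assumptions, not the paper's implementation.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TwoStageGraphAttention(nn.Module):
        def __init__(self, obs_dim: int, hidden_dim: int):
            super().__init__()
            self.encoder = nn.Linear(obs_dim, hidden_dim)
            # Stage 1 (hard attention): score each ordered agent pair and
            # sample a binary keep/drop decision per edge.
            self.hard_mlp = nn.Sequential(
                nn.Linear(2 * hidden_dim, hidden_dim), nn.ReLU(),
                nn.Linear(hidden_dim, 2),  # logits for (drop, keep)
            )
            # Stage 2 (soft attention): scalar importance per surviving edge.
            self.soft_mlp = nn.Sequential(
                nn.Linear(2 * hidden_dim, hidden_dim), nn.ReLU(),
                nn.Linear(hidden_dim, 1),
            )

        def forward(self, obs: torch.Tensor) -> torch.Tensor:
            # obs: (n_agents, obs_dim) -> messages: (n_agents, hidden_dim)
            h = torch.relu(self.encoder(obs))
            n = h.size(0)
            hi = h.unsqueeze(1).expand(n, n, -1)          # receiver features
            hj = h.unsqueeze(0).expand(n, n, -1)          # sender features
            pair = torch.cat([hi, hj], dim=-1)            # (n, n, 2*hidden)

            # Stage 1: Gumbel-softmax keeps the discrete cut differentiable;
            # index 1 of the one-hot sample means "keep this edge".
            keep = F.gumbel_softmax(self.hard_mlp(pair), tau=1.0, hard=True)[..., 1]
            keep = keep * (1.0 - torch.eye(n, device=h.device))  # no self-edges

            # Stage 2: soft attention restricted to the kept edges.
            scores = self.soft_mlp(pair).squeeze(-1)      # (n, n)
            scores = scores.masked_fill(keep == 0, float('-inf'))
            alpha = torch.softmax(scores, dim=-1)
            alpha = torch.nan_to_num(alpha)               # agents with no kept edges
            return alpha @ h                              # weighted message aggregation

In the full method, the aggregated messages would presumably augment each agent's local observation before being passed to the MAPPO actor and critic networks; only the communication layer is sketched here.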

Data availability statement

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.


Acknowledgements

This work was supported in part by the National Key Research and Development Program of China (No. 2021ZD0112400), the National Natural Science Foundation of China under Grants 61906032 and 62206041, the Young Elite Scientists Sponsorship Program by CAST under Grant 2022QNRC001, the NSFC-Liaoning Province United Foundation under Grant U1908214, the 111 Project under Grant D23006, the Fundamental Research Funds for the Central Universities under Grants DUT21TD107 and DUT22ZD214, and the LiaoNing Revitalization Talents Program under Grant XLYC2008017.

Author information

Corresponding author

Correspondence to Yaqing Hou.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Zhixiao Sun and Huahua Wu are joint first authors.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sun, Z., Wu, H., Shi, Y. et al. Multi-agent air combat with two-stage graph-attention communication. Neural Comput & Applic 35, 19765–19781 (2023). https://doi.org/10.1007/s00521-023-08784-7

