Abstract
An air-to-air combat system is a complex multi-agent system (MAS) in which a large number of unmanned combat aerial vehicles learn to engage their opponents in a highly dynamic and uncertain environment. Because each agent observes the environment only partially, classical multi-agent learning methods struggle to obtain effective cooperative strategies. Recently, communication mechanisms have been proposed to address the partial observability of MAS. However, existing methods based on predefined rules easily cause an exponential increase in state–action pairs, leading to high communication costs. Motivated by this, this paper designs a graph neural network with a two-stage graph-attention mechanism to capture the key interaction relationships and communication connections between agents in complex air-to-air combat scenarios. Built on Multi-Agent Proximal Policy Optimization, a strong backbone multi-agent reinforcement learning method, the proposed hard- and soft-attention scheme dynamically adjusts the communication relationships and the ad hoc network of multiple agents: hard attention cuts off unrelated interaction connections, while soft attention weighs the correlation importance between pairs of agents. Finally, an experimental study in a simulation environment validates the effectiveness of the proposed method on large-scale air-to-air combat problems.
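The two-stage scheme described above can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function name, the threshold `tau`, and the single-vector score projections `w_hard`/`w_soft` are simplifying assumptions, standing in for the learned hard- and soft-attention networks.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def two_stage_attention(h, w_hard, w_soft, tau=0.0):
    """Two-stage graph attention over agent embeddings h of shape (n, d).

    Stage 1 (hard): a binary gate prunes unrelated agent pairs.
    Stage 2 (soft): softmax weights over the surviving edges.
    Returns the aggregated features (n, d) and the weights (n, n).
    """
    n, d = h.shape
    # pairwise interaction features from concatenated embeddings
    pair = np.concatenate(
        [np.repeat(h, n, axis=0), np.tile(h, (n, 1))], axis=1)  # (n*n, 2d)
    scores = (pair @ w_hard).reshape(n, n)     # stage-1 logits
    gate = (scores > tau).astype(float)        # hard cut of weak edges
    np.fill_diagonal(gate, 0.0)                # no self-communication
    logits = (pair @ w_soft).reshape(n, n)
    logits = np.where(gate > 0, logits, -1e9)  # mask pruned edges
    # zero out rows whose every edge was pruned
    alpha = softmax(logits, axis=1) * (gate.sum(1, keepdims=True) > 0)
    return alpha @ h, alpha                    # weighted neighbor mix

# toy rollout: 4 agents with 8-dimensional observation embeddings
h = rng.standard_normal((4, 8))
w_hard = rng.standard_normal(16)  # 2 * d
w_soft = rng.standard_normal(16)
out, alpha = two_stage_attention(h, w_hard, w_soft)
```

In the paper's setting the gate would be produced by a trained hard-attention network (e.g., with a Gumbel-style discrete relaxation) rather than a fixed threshold, but the structure is the same: pruning first sparsifies the communication graph, then soft attention distributes weight only over the retained links.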
Data availability statement
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
Acknowledgements
This work was supported in part by the National Key Research and Development Program of China under Grant 2021ZD0112400, the National Natural Science Foundation of China under Grants 61906032 and 62206041, the Young Elite Scientists Sponsorship Program by CAST under Grant 2022QNRC001, the NSFC-Liaoning Province United Foundation under Grant U1908214, the 111 Project under Grant D23006, the Fundamental Research Funds for the Central Universities under Grants DUT21TD107 and DUT22ZD214, and the LiaoNing Revitalization Talents Program under Grant XLYC2008017.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Zhixiao Sun and Huahua Wu are joint first authors.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Sun, Z., Wu, H., Shi, Y. et al. Multi-agent air combat with two-stage graph-attention communication. Neural Comput & Applic 35, 19765–19781 (2023). https://doi.org/10.1007/s00521-023-08784-7