
A MADDPG-based multi-agent antagonistic algorithm for sea battlefield confrontation

  • Special Issue Paper
  • Published in: Multimedia Systems

Abstract

There is a concerted effort to build intelligent maritime systems, and numerous artificial intelligence technologies have been explored toward that goal. Deep reinforcement learning in particular has attracted growing research interest, with games as its mainstream application. Reinforcement learning has mastered chess, a complete-information game, and Texas hold'em poker, an incomplete-information game, and it has reached or even surpassed top human players in e-sports titles with huge state spaces and complex action spaces. In fields such as autonomous driving, however, reinforcement learning still faces great challenges. The main reason is that training requires an environment for the agent to interact with: constructing realistic simulation scenes is very difficult, and there is no guarantee that the agent will not encounter states it has never seen. It is therefore necessary to explore simulation scenarios first, and this paper accordingly studies reinforcement learning in a simulated setting; migrating such methods to real-world applications, especially sea missions, remains a major challenge. For the heterogeneous multi-agent game confrontation scenario, this paper proposes a sea battlefield decision algorithm based on the multi-agent deep deterministic policy gradient (MADDPG). The algorithm combines long short-term memory (LSTM) with the actor-critic framework, which both enables convergence in a huge state and action space and mitigates the problem of sparse real rewards. Imitation learning is also integrated into the decision algorithm, which accelerates convergence and greatly improves effectiveness. The results show that the algorithm can handle a variety of tactical sea battlefield scenarios, make flexible decisions as the enemy's behavior changes, and achieve an average win rate close to 90%.
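The abstract describes the method only at a high level: per-agent policies built from LSTM plus actor-critic, a MADDPG-style centralized critic, and imitation learning to bootstrap training. As a reading aid, the following is a minimal, hypothetical PyTorch sketch of such a combination; every class name, dimension, and the behavior-cloning loss are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the described architecture: a MADDPG-style
# actor-critic with LSTM actors and a behavior-cloning warm start.
# All names and dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class LSTMActor(nn.Module):
    """Per-agent policy: encodes the local observation history with an LSTM."""
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden, batch_first=True)
        self.head = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                  nn.Linear(hidden, act_dim), nn.Tanh())

    def forward(self, obs_seq, state=None):
        out, state = self.lstm(obs_seq, state)   # (batch, time, hidden)
        return self.head(out[:, -1]), state      # act on the last time step

class CentralCritic(nn.Module):
    """Centralized Q(s, a): conditions on all agents' observations and
    actions, the defining ingredient of MADDPG's centralized training."""
    def __init__(self, total_obs_dim, total_act_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(total_obs_dim + total_act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, all_obs, all_acts):
        return self.net(torch.cat([all_obs, all_acts], dim=-1))

def behavior_clone_step(actor, optimizer, expert_obs_seq, expert_act):
    """One imitation-learning warm-start step (assumed form): regress the
    actor's output onto an expert action before MADDPG training begins."""
    pred, _ = actor(expert_obs_seq)
    loss = nn.functional.mse_loss(pred, expert_act)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    # Tiny smoke test with made-up dimensions.
    actor = LSTMActor(obs_dim=10, act_dim=2)
    opt = torch.optim.Adam(actor.parameters(), lr=1e-3)
    obs_hist = torch.randn(32, 8, 10)            # 32 histories, 8 steps each
    expert_act = torch.randn(32, 2).clamp(-1, 1)
    print(behavior_clone_step(actor, opt, obs_hist, expert_act))
```

In this layout the LSTM gives each actor memory over partial observations, while the centralized critic stabilizes multi-agent training; both choices are consistent with, but not guaranteed to match, the design the abstract summarizes.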




Author information


Corresponding author

Correspondence to Wei Chen.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Chen, W., Nie, J. A MADDPG-based multi-agent antagonistic algorithm for sea battlefield confrontation. Multimedia Systems 29, 2991–3000 (2023). https://doi.org/10.1007/s00530-022-00922-w
