
A MADDPG-based multi-agent antagonistic algorithm for sea battlefield confrontation

  • Special Issue Paper
  • Published in: Multimedia Systems

Abstract

There is a concerted effort to build intelligent maritime systems, and numerous artificial intelligence technologies have been explored toward that goal. Deep reinforcement learning in particular has attracted growing research interest, with games as its mainstream application. Reinforcement learning has mastered chess, a complete-information game, and Texas hold'em poker, an incomplete-information game, and it has reached or even surpassed top human players in e-sports titles with huge state spaces and complex action spaces. In fields such as autonomous driving, however, reinforcement learning still faces great challenges. The main reason is that training requires an environment for the agent to interact with: constructing realistic simulation scenes is very difficult, and there is no guarantee that the agent will not encounter states it has never seen. It is therefore necessary to explore simulation scenarios first, and this paper accordingly studies reinforcement learning in a simulated setting; migrating such methods to real-world applications, especially sea missions, remains a major challenge. For the heterogeneous multi-agent game confrontation scenario, this paper proposes a sea battlefield decision algorithm based on the multi-agent deep deterministic policy gradient (MADDPG). The algorithm combines long short-term memory (LSTM) with the actor-critic framework, which both enables convergence in a huge state and action space and mitigates the problem of sparse real rewards. Imitation learning is also integrated into the decision algorithm, which accelerates convergence and greatly improves effectiveness. The results show that the algorithm can handle a variety of tactical sea battlefield scenarios, make flexible decisions as the enemy's behavior changes, and achieve an average win rate close to 90%.
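The abstract describes the method only at a high level: per-agent policies built from LSTM plus actor-critic, a MADDPG-style centralized critic, and imitation learning to bootstrap training. As a reading aid, the following is a minimal, hypothetical PyTorch sketch of such a combination; every class name, dimension, and the behavior-cloning loss are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the described architecture: a MADDPG-style
# actor-critic with LSTM actors and a behavior-cloning warm start.
# All names and dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class LSTMActor(nn.Module):
    """Per-agent policy: encodes the local observation history with an LSTM."""
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden, batch_first=True)
        self.head = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                  nn.Linear(hidden, act_dim), nn.Tanh())

    def forward(self, obs_seq, state=None):
        out, state = self.lstm(obs_seq, state)   # (batch, time, hidden)
        return self.head(out[:, -1]), state      # act on the last time step

class CentralCritic(nn.Module):
    """Centralized Q(s, a): conditions on all agents' observations and
    actions, the defining ingredient of MADDPG's centralized training."""
    def __init__(self, total_obs_dim, total_act_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(total_obs_dim + total_act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, all_obs, all_acts):
        return self.net(torch.cat([all_obs, all_acts], dim=-1))

def behavior_clone_step(actor, optimizer, expert_obs_seq, expert_act):
    """One imitation-learning warm-start step (assumed form): regress the
    actor's output onto an expert action before MADDPG training begins."""
    pred, _ = actor(expert_obs_seq)
    loss = nn.functional.mse_loss(pred, expert_act)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    # Tiny smoke test with made-up dimensions.
    actor = LSTMActor(obs_dim=10, act_dim=2)
    opt = torch.optim.Adam(actor.parameters(), lr=1e-3)
    obs_hist = torch.randn(32, 8, 10)            # 32 histories, 8 steps each
    expert_act = torch.randn(32, 2).clamp(-1, 1)
    print(behavior_clone_step(actor, opt, obs_hist, expert_act))
```

In this layout the LSTM gives each actor memory over partial observations, while the centralized critic stabilizes multi-agent training; both choices are consistent with, but not guaranteed to match, the design the abstract summarizes.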




Author information


Corresponding author

Correspondence to Wei Chen.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Chen, W., Nie, J. A MADDPG-based multi-agent antagonistic algorithm for sea battlefield confrontation. Multimedia Systems 29, 2991–3000 (2023). https://doi.org/10.1007/s00530-022-00922-w
