Abstract
Given the complexity of the modern maritime operational environment and the need to maintain safe navigation and reliable communication, research on the collaborative trajectory tracking problem of unmanned surface vehicle (USV) and unmanned aerial vehicle (UAV) clusters during patrol and target tracking missions is of paramount significance. This paper proposes a multi-agent deep reinforcement learning (MADRL) approach, specifically an action-constrained multi-agent deep deterministic policy gradient (MADDPG), to efficiently solve the collaborative maritime-aerial trajectory tracking problem based on distributed information fusion. The proposed approach incorporates a constraint model derived from the characteristics of the maritime-aerial distributed information fusion mode, together with two designed reward functions: a global reward for target tracking and a local reward for the cross-domain collaborative unmanned clusters. Simulation experiments under three different mission scenarios demonstrate that the proposed approach is well suited to trajectory tracking tasks in collaborative maritime-aerial settings, exhibiting strong convergence and robustness in mobile target tracking. In a complex three-dimensional simulation environment, the improved algorithm achieved an 11.04% reduction in training time to convergence and an 8.03% increase in reward values compared with the original algorithm, indicating that the introduced attention mechanisms and the reward function design enable the algorithm to learn optimal strategies more quickly and effectively.
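As a rough illustration of the two ingredients the abstract names, the sketch below combines an action constraint (clipping raw policy outputs to the platforms' feasible action box, e.g. speed and turn-rate limits) with a blended global/local reward. The function names, the clipping bounds, and the weighting factor `alpha` are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def constrain_action(raw_action, low, high):
    """Clip a raw policy output into the feasible action box imposed by
    platform dynamics (e.g., USV/UAV speed and turn-rate limits)."""
    return np.clip(raw_action, low, high)

def combined_reward(global_tracking_reward, local_rewards, alpha=0.5):
    """Blend a shared global target-tracking reward with each agent's
    local cooperation reward; alpha weights the global term."""
    local_rewards = np.asarray(local_rewards, dtype=float)
    return alpha * global_tracking_reward + (1.0 - alpha) * local_rewards

# Example: two agents, raw actions outside the feasible box get clipped.
raw = np.array([1.5, -2.0])
safe = constrain_action(raw, low=-1.0, high=1.0)      # -> [ 1.0, -1.0]
rewards = combined_reward(global_tracking_reward=2.0,
                          local_rewards=[0.4, -0.2])  # -> [ 1.2,  0.9]
```

In an actual MADDPG training loop, the constrained action would be what is executed in the environment and stored in the replay buffer, while the per-agent blended reward would drive each critic's temporal-difference target.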
Data availability
Not applicable.
Funding
The authors have not disclosed any funding.
Author information
Authors and Affiliations
Contributions
HW is the main developer of C-MADDPG, contributed to the algorithm design, and drafted the original manuscript together with MW, JW, and GW. MW was also responsible for funding acquisition and research supervision. All co-authors contributed to manuscript editing and review prior to submission.
Corresponding author
Ethics declarations
Conflict of interest
The authors have not disclosed any competing interests.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wu, H., Wang, M., Wang, J. et al. Distributed information fusion based trajectory tracking for USV and UAV clusters via multi-agent deep learning approach. Aerosp Syst (2024). https://doi.org/10.1007/s42401-024-00275-4