Abstract
Given the complexity of the modern maritime operational environment and the need to maintain safe navigation and reliable communication, research on the collaborative trajectory tracking problem of unmanned surface vehicle (USV) and unmanned aerial vehicle (UAV) clusters during patrol and target tracking missions is of paramount significance. This paper proposes a multi-agent deep reinforcement learning (MADRL) approach, specifically an action-constrained multi-agent deep deterministic policy gradient (MADDPG), to efficiently solve the collaborative maritime-aerial trajectory tracking problem based on distributed information fusion. The proposed approach incorporates a constraint model derived from the characteristics of the maritime-aerial distributed information fusion mode, together with two designed reward functions: a global reward for target tracking and a local reward for the cross-domain collaborative unmanned clusters. Simulation experiments under three different mission scenarios demonstrate that the proposed approach is well suited to trajectory tracking tasks in collaborative maritime-aerial settings, exhibiting strong convergence and robustness in mobile target tracking. In a complex three-dimensional simulation environment, the improved algorithm achieved an 11.04% reduction in training time to convergence and an 8.03% increase in reward values compared with the original algorithm, indicating that the introduced attention mechanisms and the reward function design enable the algorithm to learn optimal strategies more quickly and effectively.
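As a rough illustration of the two ingredients the abstract names, the sketch below combines an action constraint (clipping raw policy outputs to the platforms' feasible action box, e.g. speed and turn-rate limits) with a blended global/local reward. The function names, the clipping bounds, and the weighting factor `alpha` are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def constrain_action(raw_action, low, high):
    """Clip a raw policy output into the feasible action box imposed by
    platform dynamics (e.g., USV/UAV speed and turn-rate limits)."""
    return np.clip(raw_action, low, high)

def combined_reward(global_tracking_reward, local_rewards, alpha=0.5):
    """Blend a shared global target-tracking reward with each agent's
    local cooperation reward; alpha weights the global term."""
    local_rewards = np.asarray(local_rewards, dtype=float)
    return alpha * global_tracking_reward + (1.0 - alpha) * local_rewards

# Example: two agents, raw actions outside the feasible box get clipped.
raw = np.array([1.5, -2.0])
safe = constrain_action(raw, low=-1.0, high=1.0)      # -> [ 1.0, -1.0]
rewards = combined_reward(global_tracking_reward=2.0,
                          local_rewards=[0.4, -0.2])  # -> [ 1.2,  0.9]
```

In an actual MADDPG training loop, the constrained action would be what is executed in the environment and stored in the replay buffer, while the per-agent blended reward would drive each critic's temporal-difference target.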
Data availability
Not applicable.
Funding
The authors have not disclosed any funding.
Author information
Authors and Affiliations
Contributions
HW is the main developer of C-MADDPG, contributed to the algorithm design, and drafted the original manuscript together with MW, JW, and GW. MW was also responsible for funding acquisition and research supervision. All co-authors contributed to manuscript editing and review prior to submission.
Corresponding author
Ethics declarations
Conflict of interest
The authors have not disclosed any competing interests.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wu, H., Wang, M., Wang, J. et al. Distributed information fusion based trajectory tracking for USV and UAV clusters via multi-agent deep learning approach. Aerosp Syst (2024). https://doi.org/10.1007/s42401-024-00275-4