
Autonomous Obstacle Avoidance and Target Tracking of UAV Based on Deep Reinforcement Learning

  • Regular paper
  • Published in: Journal of Intelligent & Robotic Systems

Abstract

When deep reinforcement learning algorithms are used to perform Unmanned Aerial Vehicle (UAV) autonomous obstacle avoidance and target tracking tasks, problems such as slow convergence and a low success rate often arise. This paper therefore proposes a new deep reinforcement learning algorithm, the Multiple Pools Twin Delay Deep Deterministic Policy Gradient (MPTD3) algorithm. First, the state space and action space of the UAV are modeled as continuous, which is closer to engineering practice than a discrete model. Then, a multiple experience pools mechanism and gradient truncation are designed to improve the convergence of the algorithm. Furthermore, the algorithm gains generalization ability by endowing the UAV with environmental perception. Experimental results verify the effectiveness of the proposed method.
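
The abstract names the key ingredients of MPTD3 (a TD3-style learner, multiple experience pools, and gradient truncation) without implementation detail. The following minimal sketch, in Python with PyTorch, illustrates one way such a scheme could be wired up; the class and parameter names (MultiPoolBuffer, success_ratio, clip_norm) are assumptions made for illustration and are not the authors' implementation.

# Illustrative sketch only, not the authors' MPTD3 code: a two-pool replay
# buffer plus a TD3-style critic step with gradient-norm clipping (truncation).
import random
from collections import deque

import torch
import torch.nn as nn


class MultiPoolBuffer:
    """Keep successful and ordinary transitions in separate pools and mix them
    at sampling time, so rare successful episodes are not drowned out."""

    def __init__(self, capacity=100_000, success_ratio=0.3):
        self.success = deque(maxlen=capacity)
        self.ordinary = deque(maxlen=capacity)
        self.success_ratio = success_ratio

    def add(self, transition, is_success):
        (self.success if is_success else self.ordinary).append(transition)

    def sample(self, batch_size):
        n_succ = min(int(batch_size * self.success_ratio), len(self.success))
        batch = random.sample(self.success, n_succ)
        batch += random.sample(self.ordinary,
                               min(batch_size - n_succ, len(self.ordinary)))
        return batch


def critic_update(critic, target_q, states, actions, optimizer, clip_norm=1.0):
    """One TD3-style critic step: minimise the TD error of both critics against
    a precomputed target, clipping the gradient norm before the optimizer step."""
    q1, q2 = critic(states, actions)
    loss = nn.functional.mse_loss(q1, target_q) + nn.functional.mse_loss(q2, target_q)
    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(critic.parameters(), clip_norm)  # gradient truncation
    optimizer.step()
    return loss.item()

In this sketch the success pool guarantees that a fixed fraction of each training batch comes from episodes that reached the target, while the gradient-norm clip bounds the size of each critic update; both choices are illustrative assumptions about how the mechanisms described in the abstract might be realised.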



Acknowledgements

This work was supported by the National Natural Science Foundation of China under grant numbers 61903133 and 61733004.

Code or Data Availability

Source code in Python generated during the current study is available from the corresponding author on reasonable request.

Funding

This research was funded by the National Natural Science Foundation of China under grant number 61733004 (Prof. Yaonan Wang) and grant number 61903133 (Prof. Weilai Jiang).

Author information

Authors and Affiliations

Authors

Contributions

Guoqiang Xu contributed to the design, implementation and manuscript writing of the research; Weilai Jiang and Yaonan Wang contributed to the guidance of the experiment and the revision of the manuscript.

Corresponding author

Correspondence to Weilai Jiang.

Ethics declarations

Ethical Approval

Not applicable, as this study does not involve biological applications.

Consent to Participate

All authors of this research paper have consented to participate in the research study.

Consent to Publication

All authors of this research paper have read and approved the final version submitted.

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Xu, G., Jiang, W., Wang, Z. et al. Autonomous Obstacle Avoidance and Target Tracking of UAV Based on Deep Reinforcement Learning. J Intell Robot Syst 104, 60 (2022). https://doi.org/10.1007/s10846-022-01601-8


  • DOI: https://doi.org/10.1007/s10846-022-01601-8

Keywords

Navigation