
Autonomous Obstacle Avoidance and Target Tracking of UAV Based on Deep Reinforcement Learning

  • Regular paper
  • Published in: Journal of Intelligent & Robotic Systems

Abstract

When deep reinforcement learning algorithms are used to perform Unmanned Aerial Vehicle (UAV) autonomous obstacle avoidance and target tracking tasks, problems such as slow convergence and a low success rate often arise. This paper therefore proposes a new deep reinforcement learning algorithm, the Multiple Pools Twin Delay Deep Deterministic Policy Gradient (MPTD3) algorithm. First, the state space and action space of the UAV are modeled as continuous, which is closer to engineering practice than a discrete model. Then, a multiple experience pools mechanism and gradient truncation are designed to improve the convergence of the algorithm. Furthermore, the algorithm gains generalization ability by endowing the UAV with environmental perception. Experimental results verify the effectiveness of the proposed method.
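
The abstract names the key ingredients of MPTD3 (a TD3-style learner, multiple experience pools, and gradient truncation) without implementation detail. The following minimal sketch, in Python with PyTorch, illustrates one way such a scheme could be wired up; the class and parameter names (MultiPoolBuffer, success_ratio, clip_norm) are assumptions made for illustration and are not the authors' implementation.

# Illustrative sketch only, not the authors' MPTD3 code: a two-pool replay
# buffer plus a TD3-style critic step with gradient-norm clipping (truncation).
import random
from collections import deque

import torch
import torch.nn as nn


class MultiPoolBuffer:
    """Keep successful and ordinary transitions in separate pools and mix them
    at sampling time, so rare successful episodes are not drowned out."""

    def __init__(self, capacity=100_000, success_ratio=0.3):
        self.success = deque(maxlen=capacity)
        self.ordinary = deque(maxlen=capacity)
        self.success_ratio = success_ratio

    def add(self, transition, is_success):
        (self.success if is_success else self.ordinary).append(transition)

    def sample(self, batch_size):
        n_succ = min(int(batch_size * self.success_ratio), len(self.success))
        batch = random.sample(self.success, n_succ)
        batch += random.sample(self.ordinary,
                               min(batch_size - n_succ, len(self.ordinary)))
        return batch


def critic_update(critic, target_q, states, actions, optimizer, clip_norm=1.0):
    """One TD3-style critic step: minimise the TD error of both critics against
    a precomputed target, clipping the gradient norm before the optimizer step."""
    q1, q2 = critic(states, actions)
    loss = nn.functional.mse_loss(q1, target_q) + nn.functional.mse_loss(q2, target_q)
    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(critic.parameters(), clip_norm)  # gradient truncation
    optimizer.step()
    return loss.item()

In this sketch the success pool guarantees that a fixed fraction of each training batch comes from episodes that reached the target, while the gradient-norm clip bounds the size of each critic update; both choices are illustrative assumptions about how the mechanisms described in the abstract might be realised.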



Acknowledgements

This work was supported by the National Natural Science Foundation of China under grant numbers 61903133 and 61733004.

Code or Data Availability

Source code in Python generated during the current study is available from the corresponding author on reasonable request.

Funding

This research was funded by the National Natural Science Foundation of China under grant number 61733004 (Prof. Yaonan Wang) and grant number 61903133 (Prof. Weilai Jiang).

Author information

Authors and Affiliations

Authors

Contributions

Guoqiang Xu contributed to the design, implementation and manuscript writing of the research; Weilai Jiang and Yaonan Wang contributed to the guidance of the experiment and the revision of the manuscript.

Corresponding author

Correspondence to Weilai Jiang.

Ethics declarations

Ethical Approval

Not applicable, as this study does not involve biological applications.

Consent to Participate

All authors of this research paper have consented to participate in the research study.

Consent to Publication

All authors of this research paper have read and approved the final version submitted.

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Xu, G., Jiang, W., Wang, Z. et al. Autonomous Obstacle Avoidance and Target Tracking of UAV Based on Deep Reinforcement Learning. J Intell Robot Syst 104, 60 (2022). https://doi.org/10.1007/s10846-022-01601-8


  • DOI: https://doi.org/10.1007/s10846-022-01601-8

Keywords

Navigation