
Part of the book series: Studies in Computational Intelligence (SCI, volume 1093)

Abstract

UAVs are now widely used for both military and civil purposes. Rotary-wing UAVs in particular can take off and land vertically, move with six degrees of freedom, and hover in place, and this mobility has made them a working platform for many environments and mission types. When a UAV flies autonomously, it encounters both static and dynamic obstacles, so effective obstacle avoidance and path planning in unknown environments is an important research topic. Traditional path planning methods rely on map information and highly real-time algorithms, which demand large storage and computing resources. This chapter studies deep reinforcement learning for UAV path planning. To address the challenges UAVs currently face when flying autonomously among obstacles, the chapter proposes an improved DQN algorithm combined with artificial potential fields and establishes a reward function that evaluates the UAV's behavior, guiding it to reach the target point as quickly as possible while avoiding obstacles. The network structure, state space, action space, and reward function of the DQN algorithm are designed, and a reinforcement learning path planning system for the UAV is built. To verify the advantages of the proposed algorithm, a comparative experiment between the improved DQN algorithm and the standard DQN algorithm is carried out: the path planning performance of the two algorithms is compared in the same environment, with the loss function and the success rate as the comparison criteria. The experimental results show that the improved DQN algorithm plans UAV paths faster and more stably than the standard DQN algorithm, which verifies its superiority for path planning.
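The abstract's key idea, shaping the DQN reward with an artificial potential field, can be sketched in a few lines. The Python snippet below is an illustrative reconstruction only: the gains K_ATT and K_REP, the influence radius D0, the grid-world setting, and the terminal bonus values are assumptions for this sketch, not the chapter's actual design.

```python
import numpy as np

# Minimal sketch of a potential-field-shaped reward for grid-based path
# planning. All constants and the grid-world setting are assumptions for
# illustration; they are not taken from the chapter.

K_ATT = 1.0   # attractive gain toward the goal (assumed)
K_REP = 0.5   # repulsive gain near obstacles (assumed)
D0 = 3.0      # obstacle influence radius in grid cells (assumed)


def potential(pos, goal, obstacles):
    """Artificial potential field value at a position: attractive toward
    the goal, repulsive within D0 of any obstacle (Khatib-style)."""
    pos = np.asarray(pos, dtype=float)
    goal = np.asarray(goal, dtype=float)
    u = 0.5 * K_ATT * np.sum((pos - goal) ** 2)
    for obs in obstacles:
        d = np.linalg.norm(pos - np.asarray(obs, dtype=float))
        if 0.0 < d <= D0:
            u += 0.5 * K_REP * (1.0 / d - 1.0 / D0) ** 2
    return u


def reward(prev_pos, pos, goal, obstacles):
    """Dense shaping reward: the drop in potential between consecutive
    positions, plus sparse terminal terms for collision and arrival."""
    if any(np.array_equal(pos, obs) for obs in obstacles):
        return -100.0   # collision penalty (assumed value)
    if np.array_equal(pos, goal):
        return 100.0    # goal-reaching bonus (assumed value)
    return potential(prev_pos, goal, obstacles) - potential(pos, goal, obstacles)


# Example: moving one cell toward the goal yields a positive reward.
print(reward((0, 0), (1, 0), goal=(5, 0), obstacles=[(3, 1)]))
```

Shaping the reward with a potential difference in this way keeps the signal dense, so the agent receives feedback at every step rather than only at the goal, which is consistent with the faster and more stable convergence the abstract reports for the improved DQN.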

Corresponding author

Correspondence to Gang Chen.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this chapter

Dong, R., Pan, X., Wang, T., Chen, G. (2023). UAV Path Planning Based on Deep Reinforcement Learning. In: Azar, A.T., Koubaa, A. (eds) Artificial Intelligence for Robotics and Autonomous Systems Applications. Studies in Computational Intelligence, vol 1093. Springer, Cham. https://doi.org/10.1007/978-3-031-28715-2_2
