UCAV Path Planning Algorithm Based on Deep Reinforcement Learning

  • Kaiyuan Zheng
  • Jingpeng Gao (Email author)
  • Liangxi Shen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11902)


In Unmanned Combat Aerial Vehicle (UCAV) confrontation, traditional path planning algorithms run slowly and adapt poorly. This paper proposes a UCAV path planning algorithm based on deep reinforcement learning. The algorithm draws on the idea of non-cooperative games to build a confrontation model between the UCAV and the radar, in which the UCAV must reach the target area. At the same time, we process the signal with Cyclic Spectrum (CS) theory so that the radar communication signal can be identified through ResNet-50 transfer learning. Based on the kinematics of the UCAV, the radar detection probability and the distance between the UCAV and the center of the target area form one part of the reward criteria, and the signal recognition rate forms the other part. The algorithm trains the Deep Q-Network (DQN) parameters to plan the UCAV path autonomously. The simulation results show that, compared with the traditional reinforcement learning algorithm, the proposed algorithm improves the system operation speed. The accuracy reaches 90% after 300 episodes, and the signal recognition rate reaches 92.59% under the 0 dB condition. The proposed algorithm can be applied to a variety of electronic warfare environments and can improve the maneuver response time of the UCAV.
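The reward criteria described in the abstract combine three quantities: the radar detection probability, the distance from the UCAV to the center of the target area, and the signal recognition rate. The sketch below shows one way such a per-step reward could be assembled. It is an illustration only, not the authors' formulation: the detection model, the weights `w_det`, `w_dist`, `w_rec`, the scale `dist_scale`, and the function names are all hypothetical assumptions introduced here.

```python
import math


def detection_probability(radar_distance, r_ref=50.0):
    """Placeholder radar model (an assumption, not taken from the paper).

    Returns 1.0 at zero range, 0.5 at r_ref, and decays with the
    fourth power of range beyond that.
    """
    return 1.0 / (1.0 + (radar_distance / r_ref) ** 4)


def step_reward(ucav_pos, target_pos, radar_pos, recognition_rate,
                w_det=1.0, w_dist=1.0, w_rec=1.0, dist_scale=100.0):
    """Combine the three reward terms named in the abstract.

    All weights and the linear combination itself are hypothetical.
    """
    d_target = math.dist(ucav_pos, target_pos)   # distance to target center
    d_radar = math.dist(ucav_pos, radar_pos)     # distance to the radar
    p_det = detection_probability(d_radar)

    # Penalize detection risk and remaining distance; reward recognition.
    return (-w_det * p_det
            - w_dist * d_target / dist_scale
            + w_rec * recognition_rate)


if __name__ == "__main__":
    r = step_reward(ucav_pos=(80.0, 0.0), target_pos=(100.0, 0.0),
                    radar_pos=(0.0, 0.0), recognition_rate=0.9)
    print(f"example step reward: {r:.3f}")
```

In a DQN setting, a value like this would be returned by the environment at each step and used in the temporal-difference target. The relative weights would need tuning for the scenario at hand.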


Keywords: UCAV, Signal recognition, Path planning, Cyclic spectrum, Reward criteria, Deep Q-Network



This paper is funded by the International Exchange Program of Harbin Engineering University for Innovation-oriented Talents Cultivation, the Fundamental Research Funds for the Central Universities (HEUCFG201832), the Key Laboratory Foundation Project of National Defense Science and Technology Bureau (KY10800180080) and the China Shipbuilding Industry Corporation 722 Research Institute Fund Project (KY10800170051).



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. College of Information and Communication Engineering, Harbin Engineering University, Heilongjiang, China
