UCAV Path Planning Algorithm Based on Deep Reinforcement Learning
In Unmanned Combat Aerial Vehicle (UCAV) confrontation, traditional path planning algorithms suffer from slow operation speed and poor adaptability. This paper proposes a UCAV path planning algorithm based on deep reinforcement learning. The algorithm draws on the idea of non-cooperative games to build a UCAV-radar confrontation model in which the UCAV must reach the target area. In addition, to identify radar communication signals with ResNet-50 transfer learning, the signals are first processed using cyclic spectrum (CS) theory. Based on the kinematics of the UCAV, the radar detection probability and the distance between the UCAV and the center of the target area form one part of the reward criteria, and the signal recognition rate forms another. The algorithm trains the parameters of a Deep Q-Network (DQN) to realize autonomous planning of the UCAV path. Simulation results show that, compared with a traditional reinforcement learning algorithm, the proposed algorithm improves system operation speed: the accuracy reaches 90% after 300 episodes, and the signal recognition rate reaches 92.59% at 0 dB. The proposed algorithm can be applied to a variety of electronic warfare environments and improves the maneuver response time of the UCAV.
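As an illustration of the composite reward described above (a minimal sketch, not the authors' exact formulation: the weights `w_p`, `w_d`, `w_r` and the linear combination are assumptions), the three criteria could be blended as follows:

```python
def composite_reward(detect_prob, dist_to_target, max_dist, recog_rate,
                     w_p=1.0, w_d=1.0, w_r=1.0):
    """Illustrative composite reward for the UCAV agent.

    detect_prob    : radar detection probability in [0, 1] (lower is better)
    dist_to_target : current distance to the center of the target area
    max_dist       : normalization constant (e.g. map diagonal), an assumption
    recog_rate     : signal recognition rate in [0, 1] (higher is better)
    Weights and the weighted-sum form are assumptions for illustration.
    """
    r_detect = -w_p * detect_prob                     # penalize radar detection
    r_dist = w_d * (1.0 - dist_to_target / max_dist)  # reward closing on target
    r_recog = w_r * recog_rate                        # reward signal recognition
    return r_detect + r_dist + r_recog
```

In a DQN setup, a scalar reward of this shape would be returned by the environment at each step and used to update the Q-network parameters.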
Keywords: UCAV · Signal recognition · Path planning · Cyclic spectrum · Reward criteria · Deep Q-Network
This work was funded by the International Exchange Program of Harbin Engineering University for Innovation-oriented Talents Cultivation, the Fundamental Research Funds for the Central Universities (HEUCFG201832), the Key Laboratory Foundation Project of the National Defense Science and Technology Bureau (KY10800180080), and the China Shipbuilding Industry Corporation 722 Research Institute Fund Project (KY10800170051).