Target Search in Unknown Environment Based on Temporal Differential Learning

  • Conference paper
Advances in Guidance, Navigation and Control

Part of the book series: Lecture Notes in Electrical Engineering ((LNEE,volume 644))

Abstract

This paper presents a path planning algorithm for target search in an unknown environment. By updating the value function after each action, the algorithm avoids the large amount of training that traditional reinforcement learning algorithms require. The paper further extends and optimizes the algorithm according to the hardware characteristics of a UAV (Unmanned Aerial Vehicle). Taking the detection range of the sensor into account improves the efficiency of the algorithm when that range varies. To reduce the high time cost of steering, the algorithm increases the number of possible non-existing paths based on the value function. These improvements for practical problems make the algorithm better suited to UAVs. Finally, the algorithm is tested in a simulation environment to verify that it can effectively complete the path planning task of searching for the target.
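The abstract's central idea, updating the value function after every single action rather than after long training runs, can be illustrated with a standard one-step temporal-difference update. The sketch below is not the paper's algorithm, which this preview does not show. It is a generic tabular Q-learning step, and the grid actions, the reward of -1 per move, and the names `td_update`, `alpha`, and `gamma` are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical action set: unit moves on a 2-D grid (assumption, not from the paper).
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]


def td_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one one-step TD (Q-learning) update right after an action.

    q maps (state, action) pairs to estimated values; unseen pairs default to 0.
    alpha is the learning rate and gamma the discount factor (both assumed values).
    Returns the TD error used for the update.
    """
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return td_error


# Usage: the agent moves right from cell (0, 0) to (0, 1), pays an assumed
# step cost of -1, and the value table is updated immediately.
q = defaultdict(float)
first_error = td_update(q, (0, 0), (0, 1), -1.0, (0, 1))
```

Because the table is corrected after each move, the estimate for a state-action pair changes as soon as that pair is experienced; here the first update moves the value from 0 to -0.5 under the assumed `alpha` of 0.5.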

Acknowledgements

This work is supported by the National Natural Science Foundation of China (61803309, 61703343), the Fundamental Research Funds for the Central Universities (3102019ZDHKY02, 3102018JCC003), the Natural Science Foundation of Shaanxi Province (2018JQ6070, 2019JM-254), the China Postdoctoral Science Foundation (2018M633574), and the Key Research and Development Project of Shaanxi Province (2020ZDLGY06-02).

Author information

Corresponding author

Correspondence to Yiming Li.

Copyright information

© 2022 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Li, Y., Hu, J., Zhang, C., Xu, Z., Jia, C. (2022). Target Search in Unknown Environment Based on Temporal Differential Learning. In: Yan, L., Duan, H., Yu, X. (eds) Advances in Guidance, Navigation and Control. Lecture Notes in Electrical Engineering, vol 644. Springer, Singapore. https://doi.org/10.1007/978-981-15-8155-7_196
