Navigation for autonomous vehicles via fast-stable and smooth reinforcement learning

Zhang, RuiXian; Yang, JiaNan; Liang, Ye; Lu, ShengAo; Dong, YiFei; Yang, BaoQing; Zhang, LiXian

doi:10.1007/s11431-023-2483-x

Article
Published: 18 December 2023

Volume 67, pages 423–434 (2024)

Science China Technological Sciences

Abstract

This paper investigates reinforcement learning (RL) based navigation of autonomous vehicles with guarantees on both stability and smoothness. By introducing a data-based Lyapunov function with a fast-descending property, a stability criterion in mean cost is obtained. An off-policy RL algorithm is then proposed to train safe policies; it imposes a stricter constraint within the model-free RL framework to ensure fast convergence of policy generation, in contrast to existing RL methods that guarantee stability alone. In addition, constraints are imposed simultaneously on action increments and on variations of the action distribution, which effectively reduces the difference between adjacent actions and ensures the smoothness of the resulting policy, rather than only seeking similarity between the distributions of adjacent actions, as is common in the literature. A simulated navigation task for a ground differential-drive mobile vehicle demonstrates the superiority of the proposed algorithm in terms of fast stability and smoothness.
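
The abstract does not give the exact form of its constraints, so the following Python sketch is only an illustration of the two ideas it names, under stated assumptions. It assumes a diagonal Gaussian policy, measures the action increment as a squared difference between adjacent actions, measures the distribution variation as a KL divergence between adjacent action distributions, and models "fast descending" as an exponential decrease condition on a Lyapunov value. The function names, the weights w_inc and w_dist, and the rate alpha are hypothetical and are not taken from the paper.

```python
import numpy as np


def gaussian_kl(mu_p, sigma_p, mu_q, sigma_q):
    """KL(p || q) between diagonal Gaussians, summed over action dimensions."""
    mu_p, sigma_p, mu_q, sigma_q = (
        np.asarray(x, dtype=float) for x in (mu_p, sigma_p, mu_q, sigma_q)
    )
    var_p, var_q = sigma_p ** 2, sigma_q ** 2
    per_dim = (
        np.log(sigma_q / sigma_p)
        + (var_p + (mu_p - mu_q) ** 2) / (2.0 * var_q)
        - 0.5
    )
    return float(np.sum(per_dim))


def smoothness_penalty(a_t, a_next, mu_t, sigma_t, mu_next, sigma_next,
                       w_inc=1.0, w_dist=1.0):
    """Assumed penalty: weighted action increment plus distribution variation."""
    # Squared difference between adjacent actions (action increment term).
    increment = float(np.sum((np.asarray(a_next, dtype=float)
                              - np.asarray(a_t, dtype=float)) ** 2))
    # KL divergence between adjacent action distributions (variation term).
    variation = gaussian_kl(mu_t, sigma_t, mu_next, sigma_next)
    return w_inc * increment + w_dist * variation


def fast_decrease_violation(lyap_t, lyap_next, alpha=0.1):
    """Amount by which the assumed condition L(s') <= (1 - alpha) * L(s) is violated."""
    return max(0.0, lyap_next - (1.0 - alpha) * lyap_t)
```

In a sketch like this, both quantities would typically enter the policy loss as penalty or Lagrangian terms; a larger alpha demands a faster decrease of the Lyapunov value than the plain non-increase condition used for stability alone.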

Author information

Authors and Affiliations

School of Astronautics, Harbin Institute of Technology, Harbin, 150001, China
RuiXian Zhang, JiaNan Yang, Ye Liang, ShengAo Lu, YiFei Dong, BaoQing Yang & LiXian Zhang

Corresponding author

Correspondence to LiXian Zhang.

Additional information

This work was supported by the National Natural Science Foundation of China (Grant Nos. 62225305 and 12072088), the Fundamental Research Funds for the Central Universities, China (Grant Nos. HIT.OCEF.2022047, HIT.BRET.2022004 and HIT.DZIJ.2023049), the Grant JCKY2022603C016, State Key Laboratory of Robotics and System (HIT), and the Heilongjiang Touyan Team.

About this article

Cite this article

Zhang, R., Yang, J., Liang, Y. et al. Navigation for autonomous vehicles via fast-stable and smooth reinforcement learning. Sci. China Technol. Sci. 67, 423–434 (2024). https://doi.org/10.1007/s11431-023-2483-x

Received: 08 June 2023
Accepted: 10 August 2023
Published: 18 December 2023
Issue Date: February 2024
DOI: https://doi.org/10.1007/s11431-023-2483-x
