An actor-critic based learning method for decision-making and planning of autonomous vehicles

Xu, Can; Zhao, WanZhong; Chen, QingYun; Wang, ChunYan

doi:10.1007/s11431-020-1729-2

Published: 19 March 2021

Volume 64, pages 984–994, (2021)

Science China Technological Sciences


Abstract

To improve the agility and applicability of trajectory planning algorithms for autonomous vehicles, this paper proposes a novel actor-critic based learning method for decision-making and planning in complex multi-vehicle traffic. The method plans the vehicle's path and speed jointly, which makes the resulting trajectory more flexible. First, the generation of a planned trajectory from a decided action is described in terms of the end-point of the trajectory. Then, an actor-critic based learning method is built to learn an optimal policy for the decision process; it updates the policy along the gradient of the current policy's advantage. In this process, features of the real traffic are extracted from the time headway (TH) and the speed distribution, and the reward function is built from safety, efficiency, and driving comfort. Furthermore, to improve convergence, the policy network is modularized into two parts, a lane-changing network and a lane-keeping network, which decide the optimal end-point of the path candidates and of the speed candidates, respectively. Finally, a curved overtaking scenario and an interaction process with a human driver are used to illustrate the feasibility and superiority of the method. The results show that the proposed method has better real-time performance and makes the planned coupled trajectory more continuous and smoother than the existing rule-based method.
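As a rough illustration of the ideas named in the abstract, the sketch below shows a generic one-step advantage actor-critic update for a softmax policy over discrete actions, together with the common definition of time headway as gap divided by speed. It is not the authors' implementation: the function names, the tabular single-state form, the use of the one-step TD error as the advantage estimate, and all hyperparameter values are assumptions made for illustration only.

```python
import math


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def time_headway(gap_m, ego_speed_mps):
    """Time headway in seconds: gap to the lead vehicle divided by ego speed.

    Common textbook definition, assumed here; the paper may define TH differently.
    """
    if ego_speed_mps <= 0.0:
        return float("inf")
    return gap_m / ego_speed_mps


def actor_critic_step(prefs, v_s, v_next, action, reward,
                      gamma=0.95, lr_actor=0.1, lr_critic=0.1):
    """One-step advantage actor-critic update for a single state.

    prefs  : action preferences (logits) of the softmax policy in this state
    v_s    : critic's value estimate for the current state
    v_next : critic's value estimate for the next state
    Returns (new_prefs, new_v_s, advantage).
    """
    # One-step TD error, used as the advantage estimate (an assumption).
    advantage = reward + gamma * v_next - v_s
    probs = softmax(prefs)
    # Gradient of log pi(action) w.r.t. each preference: 1[a == action] - pi(a).
    new_prefs = [
        p + lr_actor * advantage * ((1.0 if a == action else 0.0) - probs[a])
        for a, p in enumerate(prefs)
    ]
    # Critic moves its estimate toward the TD target.
    new_v_s = v_s + lr_critic * advantage
    return new_prefs, new_v_s, advantage
```

With two hypothetical actions (say, keep lane and change lane), a positive advantage for the chosen action raises its preference and lowers the other, which is the sense in which the policy follows the gradient of its advantage.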



Author information

Authors and Affiliations

Department of Vehicle Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, 210016, China
Can Xu, WanZhong Zhao, QingYun Chen & ChunYan Wang


Corresponding author

Correspondence to WanZhong Zhao.

Additional information

This work was supported by the Jiangsu Key R&D Plan (Grant No. BE2018124), the National Natural Science Foundation of China (Grant Nos. 51775007 and 51875279), and the Postgraduate Research and Practice Innovation Program of Jiangsu Province (Grant No. KYCX19_0157).


About this article

Cite this article

Xu, C., Zhao, W., Chen, Q. et al. An actor-critic based learning method for decision-making and planning of autonomous vehicles. Sci. China Technol. Sci. 64, 984–994 (2021). https://doi.org/10.1007/s11431-020-1729-2


Received: 26 June 2020
Accepted: 27 September 2020
Published: 19 March 2021
Issue Date: May 2021
DOI: https://doi.org/10.1007/s11431-020-1729-2
