An actor-critic based learning method for decision-making and planning of autonomous vehicles

  • Article
  • Science China Technological Sciences

Abstract

To improve the agility and applicability of trajectory planning algorithms for autonomous vehicles, this paper proposes a novel actor-critic based learning method for decision-making and planning in complex multi-vehicle traffic. The method plans the vehicle’s path and speed jointly, which makes the trajectory more flexible. First, the generation of a planned trajectory from a decided action is described by the trajectory’s end-point. Then, an actor-critic based learning method is built to learn an optimal policy for the decision process; it updates the policy along the gradient of the current policy’s advantage. In this process, features of real traffic are extracted from the time headway (TH) and the speed distribution, and the reward function is built from safety, efficiency, and driving comfort. Furthermore, to improve convergence, the policy network is split into two modules, a lane-changing network and a lane-keeping network, which decide the optimal end-point of the path and speed candidates, respectively. Finally, a curved overtaking scenario and an interaction with a human driver are used to illustrate the method’s feasibility and superiority. The results show that the proposed method has better real-time performance and produces a more continuous and smoother coupled trajectory than the existing rule-based method.
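
The ingredients named in the abstract (a time-headway feature, a reward built from safety, efficiency, and comfort, and a policy update driven by the advantage) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not give the network architecture, the feature vector, or the reward weights, so every function name, weight, and threshold below is a hypothetical choice. The policy is a plain softmax over a few discrete end-point candidates, and the advantage is estimated by the one-step temporal-difference error, which is one common choice among several.

```python
import math


def time_headway(gap_m, ego_speed_mps):
    """Time headway (TH): gap to the lead vehicle divided by ego speed."""
    if ego_speed_mps <= 0.0:
        return math.inf  # a stopped vehicle never closes the gap
    return gap_m / ego_speed_mps


def reward(th_s, speed_mps, jerk_mps3, target_speed_mps=25.0,
           w_safe=1.0, w_eff=0.5, w_comf=0.1, th_safe_s=2.0):
    """Hypothetical weighted sum of safety, efficiency, and comfort terms.

    All weights and thresholds are assumptions made for illustration.
    """
    safety = min(th_s / th_safe_s, 1.0) - 1.0  # 0 if TH >= th_safe_s, else negative
    efficiency = -abs(speed_mps - target_speed_mps) / target_speed_mps
    comfort = -abs(jerk_mps3)
    return w_safe * safety + w_eff * efficiency + w_comf * comfort


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_step(logits, value_s, value_next, action, r,
                      gamma=0.95, lr_actor=0.1, lr_critic=0.1):
    """One-step advantage actor-critic update for a single state.

    logits     : actor preferences over discrete end-point candidates
    value_s    : critic estimate V(s)
    value_next : critic estimate V(s')
    Returns the updated logits, the updated V(s), and the advantage used.
    """
    # One-step TD error used as the advantage estimate.
    advantage = r + gamma * value_next - value_s
    probs = softmax(logits)
    # For a softmax policy, d log pi(a|s) / d logit_k = 1[k == a] - pi(k|s).
    new_logits = [
        logit + lr_actor * advantage * ((1.0 if k == action else 0.0) - p)
        for k, (logit, p) in enumerate(zip(logits, probs))
    ]
    # Critic moves V(s) toward the bootstrapped target.
    new_value = value_s + lr_critic * advantage
    return new_logits, new_value, advantage
```

In this sketch a positive advantage raises the probability of the chosen end-point candidate and a negative advantage lowers it. A modular policy like the one the abstract describes could keep one such set of logits for lane-changing decisions and another for lane-keeping decisions.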

Author information

Corresponding author

Correspondence to WanZhong Zhao.

Additional information

This work was supported by the Jiangsu Key R&D Plan (Grant No. BE2018124), the National Natural Science Foundation of China (Grant Nos. 51775007 and 51875279), and the Postgraduate Research and Practice Innovation Program of Jiangsu Province (Grant No. KYCX19_0157).

Cite this article

Xu, C., Zhao, W., Chen, Q. et al. An actor-critic based learning method for decision-making and planning of autonomous vehicles. Sci. China Technol. Sci. 64, 984–994 (2021). https://doi.org/10.1007/s11431-020-1729-2

  • DOI: https://doi.org/10.1007/s11431-020-1729-2
