Planning Maneuvers for Autonomous Driving Based on Offline Reinforcement Learning: Comparative Study

  • Conference paper
Proceedings of the Seventh International Scientific Conference “Intelligent Information Technologies for Industry” (IITI’23) (IITI 2023)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 776)

Abstract

One of the key challenges in developing autonomous vehicles is planning safe and efficient trajectories in complex environments such as intersections. This paper proposes an offline reinforcement learning (RL) approach to trajectory planning for autonomous vehicles at intersections shared with other actors. Working offline makes it possible to tune the algorithm on pre-recorded expert trajectories. We study how the quality of the collected trajectories influences various offline RL methods. Our approach has the potential to overcome the limitations of online RL and provide an effective planning solution for autonomous vehicles in dynamic environments.

Author information

Corresponding author

Correspondence to Mikhail Melkumov.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Melkumov, M., Panov, A.I. (2023). Planning Maneuvers for Autonomous Driving Based on Offline Reinforcement Learning: Comparative Study. In: Kovalev, S., Kotenko, I., Sukhanov, A. (eds) Proceedings of the Seventh International Scientific Conference “Intelligent Information Technologies for Industry” (IITI’23). IITI 2023. Lecture Notes in Networks and Systems, vol 776. Springer, Cham. https://doi.org/10.1007/978-3-031-43789-2_6
