End-to-End Urban Autonomous Navigation with Decision Hindsight

  • Conference paper

Neural Information Processing (ICONIP 2023)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1969)

Abstract

Urban autonomous navigation has broad application prospects. Reinforcement learning (RL) based navigation models can be continuously optimized through self-exploration, eliminating the need for human heuristics. However, training effective navigation models is challenging due to the dynamic nature of urban traffic and the exploration-exploitation dilemma in RL. Moreover, limited vehicle perception and traffic uncertainty introduce potential safety hazards, hampering the real-world application of RL-based navigation models. In this paper, we propose a novel end-to-end urban navigation framework with decision hindsight. Formulating the problem as a Partially Observable Markov Decision Process (POMDP), we employ a causal Transformer-based autoregressive model to process historical navigation information as supplementary observations. We then combine these historical observations with current perceptions to construct a history-feedforward state representation that enhances global awareness, improving data availability and decision predictability. Furthermore, by integrating the history-feedforward state encoding upstream, we develop an end-to-end RL learning framework that yields a navigation model with decision hindsight, enabling more reliable navigation. To validate the proposed method, we conduct experiments on challenging urban navigation tasks in the CARLA simulator. The results demonstrate that our method achieves higher learning efficiency and better driving performance, outperforming prior methods on urban navigation benchmarks.
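To make the mechanism concrete, the sketch below (PyTorch-style, not the authors' released code) illustrates one way a causal Transformer could summarize a window of past observation-action pairs, and how the resulting history embedding could be concatenated with the current perception to form the history-feedforward state consumed by an RL policy. All names and dimensions (HistoryEncoder, history_len, and so on) are hypothetical assumptions for illustration.

    import torch
    import torch.nn as nn

    class HistoryEncoder(nn.Module):
        # Causal Transformer that summarizes past (observation, action) tokens.
        # Hypothetical sketch of the history-feedforward encoding; sizes are illustrative.
        def __init__(self, obs_dim=256, act_dim=2, d_model=128,
                     n_layers=2, n_heads=4, history_len=16):
            super().__init__()
            self.obs_proj = nn.Linear(obs_dim, d_model)
            self.act_proj = nn.Linear(act_dim, d_model)
            self.pos_emb = nn.Parameter(torch.zeros(2 * history_len, d_model))
            layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                               dim_feedforward=4 * d_model,
                                               batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, n_layers)
            self.d_model = d_model

        def forward(self, past_obs, past_act):
            # past_obs: (B, T, obs_dim); past_act: (B, T, act_dim); T <= history_len.
            B, T, _ = past_obs.shape
            # Interleave tokens per timestep: o_1, a_1, o_2, a_2, ...
            tokens = torch.stack((self.obs_proj(past_obs), self.act_proj(past_act)), dim=2)
            tokens = tokens.reshape(B, 2 * T, self.d_model) + self.pos_emb[: 2 * T]
            # Causal mask: each token attends only to earlier history (autoregressive).
            mask = nn.Transformer.generate_square_subsequent_mask(2 * T).to(tokens.device)
            h = self.encoder(tokens, mask=mask)
            return h[:, -1]  # final token embedding summarizes the whole history

    class HistoryFeedforwardPolicy(nn.Module):
        # Fuses the history embedding with the current perception feature.
        def __init__(self, obs_dim=256, act_dim=2, d_model=128):
            super().__init__()
            self.history = HistoryEncoder(obs_dim, act_dim, d_model)
            self.head = nn.Sequential(
                nn.Linear(obs_dim + d_model, 256), nn.ReLU(),
                nn.Linear(256, act_dim), nn.Tanh(),  # e.g. steering/throttle in [-1, 1]
            )

        def forward(self, current_obs, past_obs, past_act):
            state = torch.cat((current_obs, self.history(past_obs, past_act)), dim=-1)
            return self.head(state)

In a design of this kind, the fused state would be fed to a standard actor-critic RL algorithm, so gradients from the driving reward flow end-to-end through both the policy head and the history encoder.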

This work was supported by Shandong Provincial Natural Science Foundation, China under Grant ZR2022LZH002.

Author information

Corresponding author

Correspondence to Ruyang Li.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Deng, Q., Liu, G., Li, R., Hu, Q., Zhao, Y., Li, R. (2024). End-to-End Urban Autonomous Navigation with Decision Hindsight. In: Luo, B., Cheng, L., Wu, ZG., Li, H., Li, C. (eds) Neural Information Processing. ICONIP 2023. Communications in Computer and Information Science, vol 1969. Springer, Singapore. https://doi.org/10.1007/978-981-99-8184-7_6

  • DOI: https://doi.org/10.1007/978-981-99-8184-7_6

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8183-0

  • Online ISBN: 978-981-99-8184-7

  • eBook Packages: Computer Science, Computer Science (R0)
