End-to-End Urban Autonomous Navigation with Decision Hindsight

  • Conference paper

Neural Information Processing (ICONIP 2023)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1969)

Abstract

Urban autonomous navigation has broad application prospects. Reinforcement learning (RL) based navigation models can be continuously optimized through self-exploration, eliminating the need for human heuristics. However, training effective navigation models is challenging due to the dynamic nature of urban traffic and the exploration-exploitation dilemma in RL. Moreover, limited vehicle perception and traffic uncertainty introduce potential safety hazards, hampering the real-world application of RL-based navigation models. In this paper, we propose a novel end-to-end urban navigation framework with decision hindsight. Formulating the problem as a Partially Observable Markov Decision Process (POMDP), we employ a causal Transformer-based autoregressive model to process historical navigation information as supplementary observations. We then combine these historical observations with current perceptions to construct a history-feedforward state representation that enhances global awareness, improving data availability and decision predictability. Furthermore, by integrating the history-feedforward state encoding upstream, we develop an end-to-end RL learning framework that yields a navigation model with decision hindsight, enabling more reliable navigation. To validate the proposed method, we conduct experiments on challenging urban navigation tasks in the CARLA simulator. The results demonstrate that our method achieves higher learning efficiency and better driving performance, outperforming prior methods on urban navigation benchmarks.
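To make the mechanism concrete, the sketch below (PyTorch-style, not the authors' released code) illustrates one way a causal Transformer could summarize a window of past observation-action pairs, and how the resulting history embedding could be concatenated with the current perception to form the history-feedforward state consumed by an RL policy. All names and dimensions (HistoryEncoder, history_len, and so on) are hypothetical assumptions for illustration.

    import torch
    import torch.nn as nn

    class HistoryEncoder(nn.Module):
        # Causal Transformer that summarizes past (observation, action) tokens.
        # Hypothetical sketch of the history-feedforward encoding; sizes are illustrative.
        def __init__(self, obs_dim=256, act_dim=2, d_model=128,
                     n_layers=2, n_heads=4, history_len=16):
            super().__init__()
            self.obs_proj = nn.Linear(obs_dim, d_model)
            self.act_proj = nn.Linear(act_dim, d_model)
            self.pos_emb = nn.Parameter(torch.zeros(2 * history_len, d_model))
            layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                               dim_feedforward=4 * d_model,
                                               batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, n_layers)
            self.d_model = d_model

        def forward(self, past_obs, past_act):
            # past_obs: (B, T, obs_dim); past_act: (B, T, act_dim); T <= history_len.
            B, T, _ = past_obs.shape
            # Interleave tokens per timestep: o_1, a_1, o_2, a_2, ...
            tokens = torch.stack((self.obs_proj(past_obs), self.act_proj(past_act)), dim=2)
            tokens = tokens.reshape(B, 2 * T, self.d_model) + self.pos_emb[: 2 * T]
            # Causal mask: each token attends only to earlier history (autoregressive).
            mask = nn.Transformer.generate_square_subsequent_mask(2 * T).to(tokens.device)
            h = self.encoder(tokens, mask=mask)
            return h[:, -1]  # final token embedding summarizes the whole history

    class HistoryFeedforwardPolicy(nn.Module):
        # Fuses the history embedding with the current perception feature.
        def __init__(self, obs_dim=256, act_dim=2, d_model=128):
            super().__init__()
            self.history = HistoryEncoder(obs_dim, act_dim, d_model)
            self.head = nn.Sequential(
                nn.Linear(obs_dim + d_model, 256), nn.ReLU(),
                nn.Linear(256, act_dim), nn.Tanh(),  # e.g. steering/throttle in [-1, 1]
            )

        def forward(self, current_obs, past_obs, past_act):
            state = torch.cat((current_obs, self.history(past_obs, past_act)), dim=-1)
            return self.head(state)

In a design of this kind, the fused state would be fed to a standard actor-critic RL algorithm, so gradients from the driving reward flow end-to-end through both the policy head and the history encoder.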

This work was supported by Shandong Provincial Natural Science Foundation, China under Grant ZR2022LZH002.

Author information

Corresponding author

Correspondence to Ruyang Li.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Deng, Q., Liu, G., Li, R., Hu, Q., Zhao, Y., Li, R. (2024). End-to-End Urban Autonomous Navigation with Decision Hindsight. In: Luo, B., Cheng, L., Wu, ZG., Li, H., Li, C. (eds) Neural Information Processing. ICONIP 2023. Communications in Computer and Information Science, vol 1969. Springer, Singapore. https://doi.org/10.1007/978-981-99-8184-7_6

  • DOI: https://doi.org/10.1007/978-981-99-8184-7_6

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8183-0

  • Online ISBN: 978-981-99-8184-7

  • eBook Packages: Computer Science, Computer Science (R0)
