PilotAttnNet: Multi-modal Attention Network for End-to-End Steering Control

Zhang, Jincan; Song, Zhenbo; Lu, Jianfeng; Qu, Xingwei; Fan, Zhaoxin

doi:10.1007/978-3-031-18913-5_14

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13536)

Included in the following conference series:

Chinese Conference on Pattern Recognition and Computer Vision (PRCV)

Abstract

Vision-based end-to-end steering control is a popular and challenging task in autonomous driving. Previous methods take a single image or an image sequence as input and predict the steering angle with deep neural networks. Images contain rich color and texture information but lack spatial information. In this work, we therefore incorporate LiDAR data to provide spatial structure and propose a novel multi-modal attention model, PilotAttnNet, for end-to-end steering angle prediction. We also present a new end-to-end self-driving dataset, Pandora-Driving, which provides synchronized LiDAR and image sequences together with the corresponding standard driving behaviors. The dataset covers a range of driving scenarios, including urban, country, and off-road. Extensive experiments on the publicly available LiVi-Set and on our Pandora-Driving dataset demonstrate the effectiveness of the proposed method.
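
The abstract does not spell out the architecture, so the following is only a minimal sketch of the general idea of attention-weighted fusion of an image feature and a LiDAR feature followed by a regression head. It is not the PilotAttnNet design: the function names, the scalar per-modality scoring, the feature dimension, and the random weights are all assumptions made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def fuse_and_predict(img_feat, lidar_feat, w_score, w_head, bias=0.0):
    """Hypothetical fusion: weight each modality by attention, then regress an angle.

    img_feat, lidar_feat: feature vectors of the same length (assumed to come
    from separate image and LiDAR encoders, which are not modeled here).
    w_score: vector used to score each modality (assumed, stands in for learned weights).
    w_head: weights of a linear steering-angle head (assumed).
    """
    feats = [img_feat, lidar_feat]
    # One scalar relevance score per modality.
    scores = [dot(w_score, f) for f in feats]
    # Attention weights are positive and sum to 1.
    weights = softmax(scores)
    # Fused feature is the attention-weighted sum of the modality features.
    fused = [
        sum(w * f[i] for w, f in zip(weights, feats))
        for i in range(len(img_feat))
    ]
    angle = dot(w_head, fused) + bias
    return angle, weights


# Toy inputs: random vectors stand in for encoder outputs and learned weights.
rng = random.Random(0)
dim = 8
img_feat = [rng.uniform(-1, 1) for _ in range(dim)]
lidar_feat = [rng.uniform(-1, 1) for _ in range(dim)]
w_score = [rng.uniform(-1, 1) for _ in range(dim)]
w_head = [rng.uniform(-1, 1) for _ in range(dim)]

angle, weights = fuse_and_predict(img_feat, lidar_feat, w_score, w_head)
print("attention weights (image, LiDAR):", weights)
print("predicted steering angle (arbitrary units):", angle)
```

In a trained model the scoring and head weights would be learned from driving data rather than sampled at random; the sketch only shows how attention weights can decide how much each modality contributes to the predicted angle.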

Author information

Authors and Affiliations

School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
Jincan Zhang, Zhenbo Song & Jianfeng Lu
Arcisstr. 21, 80333, Munich, Germany
Xingwei Qu
School of Information, Renmin University of China, Beijing, 100872, China
Zhaoxin Fan

Corresponding author

Correspondence to Jianfeng Lu.

Editor information

Editors and Affiliations

Southern University of Science and Technology, Shenzhen, China
Shiqi Yu
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Zhaoxiang Zhang
Hong Kong Baptist University, Hong Kong, China
Pong C. Yuen
Northwestern Polytechnical University, Xi'an, China
Junwei Han
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Tieniu Tan
Hong Kong Baptist University, Hong Kong, China
Yike Guo
Sun Yat-sen University, Guangzhou, China
Jianhuang Lai
Southern University of Science and Technology, Shenzhen, China
Jianguo Zhang

About this paper

Cite this paper

Zhang, J., Song, Z., Lu, J., Qu, X., Fan, Z. (2022). PilotAttnNet: Multi-modal Attention Network for End-to-End Steering Control. In: Yu, S., et al. Pattern Recognition and Computer Vision. PRCV 2022. Lecture Notes in Computer Science, vol 13536. Springer, Cham. https://doi.org/10.1007/978-3-031-18913-5_14

DOI: https://doi.org/10.1007/978-3-031-18913-5_14
Published: 27 October 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-18912-8
Online ISBN: 978-3-031-18913-5
eBook Packages: Computer Science, Computer Science (R0)
