Abstract
Vision-based end-to-end steering control is a popular and challenging task in autonomous driving. Previous methods take a single image or an image sequence as input and predict the steering angle with deep neural networks. Images contain rich color and texture information but lack spatial information. In this work, we therefore incorporate LiDAR data to provide spatial structure and propose a novel multi-modal attention model, PilotAttnNet, for end-to-end steering angle prediction. We also present a new end-to-end self-driving dataset, Pandora-Driving, which provides synchronized LiDAR and image sequences together with the corresponding standard driving behaviors. The dataset covers diverse driving scenarios, including urban, country, and off-road. Extensive experiments on the publicly available LiVi-Set and on our Pandora-Driving dataset demonstrate the effectiveness of the proposed method.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, J., Song, Z., Lu, J., Qu, X., Fan, Z. (2022). PilotAttnNet: Multi-modal Attention Network for End-to-End Steering Control. In: Yu, S., et al. Pattern Recognition and Computer Vision. PRCV 2022. Lecture Notes in Computer Science, vol 13536. Springer, Cham. https://doi.org/10.1007/978-3-031-18913-5_14
DOI: https://doi.org/10.1007/978-3-031-18913-5_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-18912-8
Online ISBN: 978-3-031-18913-5
eBook Packages: Computer Science (R0)