Abstract
Interest in autonomous vehicles has grown rapidly in recent years. While Lidar is a proven autonomous driving technology, end-to-end learning approaches have become popular as computing performance has improved. A fully end-to-end method, NVIDIA’s PilotNet, has shown that speed and steering angle can be predicted from camera images alone, matching the performance of Lidar-based methods on simple driving tasks. However, a significant drawback is that it uses no past spatiotemporal information, which makes its performance error-sensitive, especially in complex driving tasks. Motivated by this deficiency, this paper introduces two novel models, CNN + LSTM and CNN3D, aimed at complex autonomous driving tasks in indoor environments.
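The key difference from PilotNet is that both proposed models consume a short history of frames rather than a single image. A minimal sketch of that sequence windowing is shown below; the window length `seq_len` and the helper name are illustrative assumptions, not details taken from the paper.

```python
from collections import deque

def make_sequences(frames, seq_len=5):
    """Group consecutive camera frames into overlapping windows.

    PilotNet maps one frame to one control output; a CNN + LSTM or
    CNN3D model instead takes the last `seq_len` frames as a single
    input sample, giving it the spatiotemporal context PilotNet lacks.
    """
    window = deque(maxlen=seq_len)
    sequences = []
    for frame in frames:
        window.append(frame)
        if len(window) == seq_len:
            # copy the window, ordered oldest -> newest
            sequences.append(list(window))
    return sequences

# With 7 frames and a window of 5, three overlapping sequences result.
seqs = make_sequences(list(range(7)), seq_len=5)
# seqs == [[0, 1, 2, 3, 4], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6]]
```

For a CNN + LSTM, each window would be encoded frame-by-frame by the CNN before the LSTM aggregates over time; for a CNN3D, the window is stacked into one tensor and convolved across the time axis as well.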
Change history
27 May 2023
Missing Open Access funding information has been added in the Funding Note.
Acknowledgements
The authors would like to thank Anthony Ryan, Omar Anwar, and Michael Mollison for their previous work on the ModCar project. The authors would also like to thank Nyi MyoMaung for his assistance during the ModCar experiments. Finally, the authors would like to express their gratitude to NVIDIA for donating two GPUs used for the neural network training.
Funding
Open Access funding enabled and organized by CAUL and its Member Institutions.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Electronic supplementary material
Below is the link to the electronic supplementary material.
(MP4 44.3 MB)
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Lai, Z., Bräunl, T. End-to-End Learning with Memory Models for Complex Autonomous Driving Tasks in Indoor Environments. J Intell Robot Syst 107, 37 (2023). https://doi.org/10.1007/s10846-022-01801-2