Abstract
Generating depth maps from mono or stereo images is a topic of active research. This paper is dedicated to the study of different methods for generating depth and disparity maps. It includes an analysis of existing depth map generation methods as well as an investigation and improvement of the real-time neural-network-based method AnyNet. Our approach, AnyNet-M, belongs to the family of models that make predictions stage by stage, increasing the quality of disparity estimation over time. First, the proposed method was tested on the KITTI Stereo dataset and the custom OpenTaganrog dataset with images of 1242 × 375 resolution. The method was shown to run in real time at approximately 38 FPS on a single Tesla V100 GPU. Second, the quality of the proposed model was evaluated and reached a 5% Average 3-Pixel Error on the KITTI dataset. Finally, it was integrated into the Robot Operating System (ROS) for further use as part of the navigation systems of unmanned vehicles.
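For readers unfamiliar with the reported metric, the sketch below illustrates how a KITTI-style Average 3-Pixel Error can be computed from dense disparity maps. It implements the standard bad-pixel rate; the function name and the zero-valued validity mask are illustrative assumptions, not the authors' actual evaluation code.

```python
# Minimal sketch of a KITTI-style 3-pixel error metric, assuming the
# estimated and ground-truth disparities are NumPy float arrays of
# equal shape. Not the paper's exact evaluation pipeline.
import numpy as np

def three_pixel_error(disp_est: np.ndarray, disp_gt: np.ndarray) -> float:
    """Fraction of valid pixels whose absolute disparity error exceeds 3 px."""
    valid = disp_gt > 0                    # KITTI marks invalid pixels with 0
    err = np.abs(disp_est[valid] - disp_gt[valid])
    return float(np.mean(err > 3.0))       # e.g. 0.05 corresponds to 5%
```

Note that KITTI stores ground-truth disparity as 16-bit PNGs scaled by 256, so the raw values must be divided by 256 before applying such a metric.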
Acknowledgments
The theoretical investigation and methodology in this work were supported by the Russian Science Foundation under Grant 20-71-10116. The authors are also grateful to Integrant LLC for consulting and for providing a dataset obtained with a real robot for the experimental research.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Kasatkin, N., Yudin, D. (2022). Real-Time Approach to Neural Network-Based Disparity Map Generation from Stereo Images. In: Kryzhanovsky, B., Dunin-Barkowski, W., Redko, V., Tiumentsev, Y., Klimov, V.V. (eds) Advances in Neural Computation, Machine Learning, and Cognitive Research V. NEUROINFORMATICS 2021. Studies in Computational Intelligence, vol 1008. Springer, Cham. https://doi.org/10.1007/978-3-030-91581-0_35
DOI: https://doi.org/10.1007/978-3-030-91581-0_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-91580-3
Online ISBN: 978-3-030-91581-0
eBook Packages: Intelligent Technologies and Robotics (R0)