
Real-Time Approach to Neural Network-Based Disparity Map Generation from Stereo Images

  • Conference paper
Advances in Neural Computation, Machine Learning, and Cognitive Research V (NEUROINFORMATICS 2021)

Abstract

Generating depth maps from mono- or stereo images is a topic of active research. This paper studies different methods of depth and disparity map generation. It includes an analysis of existing depth map generation methods, as well as an investigation and improvement of the real-time neural-network-based method AnyNet. Our approach, AnyNet-M, belongs to the class of models that make predictions stage by stage, increasing the quality of the disparity estimate over time. First, the proposed method was tested on the KITTI Stereo Dataset and the custom OpenTaganrog dataset with images of 1242 × 375 resolution; it proved to be real-time, running at approximately 38 FPS on a single Tesla V100 GPU. Second, the quality of the proposed model was evaluated, reaching 5% Average 3-Pixel Error on the KITTI dataset. Finally, it was integrated into the Robot Operating System (ROS) for further use as part of the navigation systems of unmanned vehicles.
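The two quantitative notions in the abstract can be made concrete. The Average 3-Pixel Error is KITTI's standard disparity metric: a pixel counts as erroneous when its disparity error exceeds 3 px, and the score is the percentage of such pixels over the valid ground-truth region (KITTI's official D1 variant additionally requires the error to exceed 5% of the ground-truth disparity). A disparity map is turned into the depth map a navigation stack consumes via depth = focal × baseline / disparity. A minimal NumPy sketch, with hypothetical function names, assuming the D1-style definition:

```python
import numpy as np

def avg_3px_error(disp_pred, disp_gt, tau_px=3.0, tau_rel=0.05):
    """Percentage of bad pixels, KITTI D1 style: a pixel is bad when its
    disparity error exceeds both tau_px pixels and tau_rel * ground truth.
    Pixels without ground truth (disp_gt == 0) are ignored."""
    valid = disp_gt > 0
    err = np.abs(disp_pred[valid] - disp_gt[valid])
    bad = (err > tau_px) & (err > tau_rel * disp_gt[valid])
    return 100.0 * bad.mean()

def disparity_to_depth(disp, focal_px, baseline_m):
    """Depth (meters) from disparity (pixels) for a rectified stereo pair:
    depth = f * B / d. Zero-disparity pixels map to infinity."""
    disp = np.asarray(disp, dtype=np.float64)
    return np.where(disp > 0, focal_px * baseline_m / np.maximum(disp, 1e-9),
                    np.inf)
```

For example, with a focal length of 700 px and a 0.54 m baseline (roughly the KITTI rig), a 70 px disparity corresponds to 5.4 m of depth.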



Acknowledgments

This work was supported, in the parts concerning theoretical investigation and methodology, by the Russian Science Foundation under Grant 20-71-10116. The authors are also grateful to Integrant LLC for consulting and for providing a dataset, obtained using a real robot, for the experimental research.

Author information

Correspondence to Nikita Kasatkin.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Kasatkin, N., Yudin, D. (2022). Real-Time Approach to Neural Network-Based Disparity Map Generation from Stereo Images. In: Kryzhanovsky, B., Dunin-Barkowski, W., Redko, V., Tiumentsev, Y., Klimov, V.V. (eds) Advances in Neural Computation, Machine Learning, and Cognitive Research V. NEUROINFORMATICS 2021. Studies in Computational Intelligence, vol 1008. Springer, Cham. https://doi.org/10.1007/978-3-030-91581-0_35
