An Improved Convolutional Neural Network for Monocular Depth Estimation

  • Jing Kang
  • Anrong Dang (email author)
  • Bailing Zhang
  • Yongming Wang
  • Hang Su
  • Fei Su
  • Tianyu Ci
  • Fangping Wang
Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 617)

Abstract

Depth estimation from a single monocular image plays an essential role in artificial intelligence: it is one of the main ways of sensing the operating environment in automated driving systems and advanced driver-assistance systems. Recent approaches based on convolutional neural networks (CNNs) have significantly improved depth prediction. This paper proposes a novel CNN framework for monocular depth estimation that combines the deep ordinal regression network (DORN) with a U-Net structure. The new model is trained, validated during training, and tested on 5000 images from a simulation platform provided by “Grand Theft Auto”. To eliminate, or at least largely reduce, the impact of ground-truth pixels that have no depth value, three different training strategies were employed for network optimization. We developed an effective weighted training strategy for depth prediction that improves estimation accuracy. A comparison of our results with those of DORN demonstrates the effectiveness of our method, and the results show that the proposed method achieves state-of-the-art performance.
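The two ideas named in the abstract, ordinal depth labels and a loss that discounts pixels without ground-truth depth, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The spacing-increasing discretization follows the formula published with DORN; the function names, the convention that a ground-truth depth of 0 marks a missing value, and the optional per-pixel `weights` argument are assumptions made here for illustration only.

```python
import numpy as np

def sid_thresholds(alpha, beta, K):
    """Spacing-increasing discretization (as in DORN): K+1 bin edges
    between alpha and beta, uniformly spaced in log-depth."""
    i = np.arange(K + 1)
    return np.exp(np.log(alpha) + np.log(beta / alpha) * i / K)

def depth_to_ordinal_label(depth, edges):
    """Map each depth value to the index (0..K-1) of the bin containing it.
    Values outside [alpha, beta] are clipped to the first or last bin."""
    K = len(edges) - 1
    idx = np.searchsorted(edges, depth, side="right") - 1
    return np.clip(idx, 0, K - 1)

def masked_weighted_loss(per_pixel_loss, depth_gt, weights=None):
    """Weighted mean of a per-pixel loss over valid pixels only.
    ASSUMPTION: depth_gt == 0 marks a pixel with no ground-truth depth.
    `weights` is a hypothetical per-pixel weight map, not the paper's scheme."""
    per_pixel_loss = np.asarray(per_pixel_loss, dtype=float)
    valid = np.asarray(depth_gt) > 0
    if weights is None:
        w = valid.astype(float)
    else:
        w = np.asarray(weights, dtype=float) * valid
    total = w.sum()
    return float((per_pixel_loss * w).sum() / total) if total > 0 else 0.0

if __name__ == "__main__":
    edges = sid_thresholds(alpha=1.0, beta=80.0, K=4)
    print(depth_to_ordinal_label(np.array([1.0, 10.0, 79.9]), edges))
    print(masked_weighted_loss([1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 2.0]))
```

Because invalid pixels receive zero weight, they contribute neither to the numerator nor to the normalizer, so regions without depth (for example sky in rendered scenes) do not pull the average loss toward an arbitrary value.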

Keywords

Monocular depth estimation; CNN; Ordinal regression; Automatic-driving system; Image classification

Notes

Acknowledgements

This research is supported by the Natural Science Foundation of Beijing, “Research on the Planning Decision Making Supporting Approaches of Healthy City Planning of Beijing Based on the Analysis of Social Sensing Data” (No. 8182027), and by the open fund of the Institute for China Sustainable Urbanization, Tsinghua University, “Pre-study on new urban development strategy integrating multi-source big data” (TUCSU-K-17026-01). We are also grateful for the computational resources provided by GTA.

References

  1. Rupprecht LC, Belagiannis V, Tombari F, Navab N (2016) Deeper depth prediction with fully convolutional residual networks. In: Fourth international conference on 3D Vision (3DV), vol 1, pp 239–248
  2. Shelhamer E, Long J, Darrell T (2017) Fully convolutional networks for semantic segmentation. IEEE Trans Pattern Anal Mach Intell 39(4):640–651
  3. Tateno K, Tombari F, Laina I, Navab N (2017) CNN-SLAM: real-time dense monocular SLAM with learned depth prediction. In: IEEE conference on computer vision and pattern recognition, pp 6565–6574
  4. Wang P, Shen X, Lin Z, Cohen S, Yuille A (2015) Towards unified depth and semantic prediction from a single image. In: IEEE conference on computer vision and pattern recognition
  5. Li B, Dai Y, He M (2018) Monocular depth estimation with hierarchical fusion of dilated CNNs and soft-weighted-sum inference. Pattern Recogn 83:328–339
  6. Cao Y, Wu Z, Shen C (2018) Estimating depth from monocular images as classification using deep fully convolutional residual networks. IEEE Trans Circuits Syst Video Technol 28(11):3174–3182
  7. Li R, Xian K, Shen C, Cao Z, Lu H, Hang L (2018) Deep attention-based classification network for robust depth prediction. In: IEEE conference on computer vision and pattern recognition (CVPR)
  8. Liu F, Shen C, Lin G (2015) Deep convolutional neural fields for depth estimation from a single image. In: IEEE conference on computer vision and pattern recognition
  9. Li B, Shen C, Dai Y, Hengel AVD, He M (2015) Depth and surface normal estimation from monocular images using regression on deep features and hierarchical CRFs. In: IEEE conference on computer vision and pattern recognition
  10. Dan X, Ricci E, Ouyang W, Wang X, Sebe N (2017) Multi-scale continuous CRFs as sequential deep networks for monocular depth estimation. In: IEEE conference on computer vision and pattern recognition
  11. Fu H, Gong M, Wang C, Batmanghelich K, Tao D (2018) Deep ordinal regression network for monocular depth estimation. In: IEEE conference on computer vision and pattern recognition, pp 2002–2011
  12. Chen W, Zhao F, Yang D, Jia D (2015) Single-image depth perception in the wild. In: IEEE conference on computer vision and pattern recognition
  13. Garg R, Vijay Kumar BG, Carneiro G, Reid I (2016) Unsupervised CNN for single view depth estimation: geometry to the rescue. In: European conference on computer vision
  14. Godard C, Aodha OM, Brostow GJ (2017) Unsupervised monocular depth estimation with left-right consistency. In: IEEE conference on computer vision and pattern recognition
  15. Kuznietsov GJ, Stückler J, Leibe B (2017) Semi-supervised deep learning for monocular depth map prediction. In: IEEE conference on computer vision and pattern recognition
  16. Uhrig J, Schneider N, Schneider L, Franke U, Brox T, Geiger A (2017) Sparsity invariant CNNs. In: 2017 international conference on 3D Vision (3DV), pp 11–20
  17. Fischer RP, Brox T (2015) U-Net: convolutional networks for biomedical image segmentation. In: IEEE conference on computer vision and pattern recognition
  18. Chollet F, Keras. In: GitHub. https://github.com/fchollet/keras
  19. Abadi M et al (2016) TensorFlow: large-scale machine learning on heterogeneous distributed systems. In: Distributed, parallel, and cluster computing
  20. Sanner MF (1999) Python: a programming language for software integration and development. J Mol Graph Model 17(1):57–61
  21. Meng L (2014) Acceleration method of 3D medical images registration based on compute unified device architecture. Bio-Med Mater Eng 24(1):1109–1116

Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  • Jing Kang (1, 2, 3)
  • Anrong Dang (2, 3), email author
  • Bailing Zhang (4)
  • Yongming Wang (1)
  • Hang Su (1)
  • Fei Su (1)
  • Tianyu Ci (5)
  • Fangping Wang (1)

  1. China Transport Telecommunications & Information Center, Beijing, China
  2. Department of Urban Planning, School of Architecture, Tsinghua University, Beijing, China
  3. Institute for China Sustainable Urbanization, Tsinghua University, Beijing, China
  4. Department of Mechanical & Electronic Engineering, China University of Mining & Technology, Beijing, China
  5. College of Global Change and Earth System Science, Beijing Normal University, Beijing, China
