Non-local Spatial Propagation Network for Depth Completion

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12358)

Abstract

In this paper, we propose a robust and efficient end-to-end non-local spatial propagation network for depth completion. The proposed network takes an RGB image and a sparse depth image as inputs and estimates the non-local neighbors of each pixel and their affinities, as well as an initial depth map with pixel-wise confidences. The initial depth prediction is then iteratively refined by a non-local spatial propagation procedure guided by the predicted confidences, non-local neighbors, and corresponding affinities. Unlike previous algorithms that rely on fixed local neighborhoods, the proposed algorithm avoids irrelevant local neighbors and concentrates on relevant non-local neighbors during propagation. In addition, we introduce a learnable affinity normalization that learns affinity combinations more effectively than conventional normalization schemes. The proposed algorithm is inherently robust to the mixed-depth problem on depth boundaries, one of the major issues for existing depth estimation and completion algorithms. Experimental results on indoor and outdoor datasets demonstrate that the proposed algorithm outperforms conventional algorithms in depth completion accuracy and in robustness to the mixed-depth problem. Our implementation is publicly available on the project page (https://github.com/zzangjinsun/NLSPN_ECCV20).
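To make the propagation step concrete, below is a minimal PyTorch sketch (the public implementation is also PyTorch-based) of a single confidence-aware non-local propagation iteration. The function name `propagate_once`, the `tanh` squashing with abs-sum normalization, and the `grid_sample`-based bilinear sampling of fractional neighbor positions are illustrative assumptions, not the paper's exact formulation; the released code on the project page is authoritative.

```python
# A minimal sketch of one confidence-aware non-local propagation step,
# assuming the network has already predicted, for every pixel: K non-local
# neighbor offsets, K raw affinities, an initial depth, and a confidence map.
import torch
import torch.nn.functional as F

def propagate_once(depth, offsets, affinity, confidence):
    """One non-local propagation step (illustrative, not the authors' code).

    depth      : (B, 1, H, W) current depth estimate
    offsets    : (B, K, 2, H, W) predicted neighbor offsets in pixels, (x, y)
    affinity   : (B, K, H, W) raw, unnormalized affinities
    confidence : (B, 1, H, W) pixel-wise confidence of the depth prediction
    """
    B, K, _, H, W = offsets.shape
    device, dtype = depth.device, depth.dtype

    # Base pixel grid, later shifted by the predicted offsets.
    ys, xs = torch.meshgrid(
        torch.arange(H, device=device, dtype=dtype),
        torch.arange(W, device=device, dtype=dtype),
        indexing="ij",
    )

    neighbor_d, neighbor_c = [], []
    for k in range(K):
        # Convert the k-th offset field to normalized [-1, 1] coordinates
        # and bilinearly sample depth/confidence at the fractional positions.
        gx = 2.0 * (xs + offsets[:, k, 0]) / (W - 1) - 1.0
        gy = 2.0 * (ys + offsets[:, k, 1]) / (H - 1) - 1.0
        grid = torch.stack((gx, gy), dim=-1)  # (B, H, W, 2)
        neighbor_d.append(F.grid_sample(depth, grid, align_corners=True))
        neighbor_c.append(F.grid_sample(confidence, grid, align_corners=True))
    neighbor_d = torch.cat(neighbor_d, dim=1)  # (B, K, H, W)
    neighbor_c = torch.cat(neighbor_c, dim=1)

    # Confidence-incorporated affinities: unreliable neighbors are
    # down-weighted. The tanh + abs-sum normalization below is one stable
    # choice; the paper's learnable normalization is more general.
    w = torch.tanh(affinity) * neighbor_c
    w = w / w.abs().sum(dim=1, keepdim=True).clamp(min=1.0)

    # Each pixel keeps the leftover weight for itself, so the update is a
    # stable combination of its own depth and its non-local neighbors'.
    w_self = 1.0 - w.abs().sum(dim=1, keepdim=True)
    return w_self * depth + (w * neighbor_d).sum(dim=1, keepdim=True)
```

Refinement would then repeat this step for a fixed number of iterations, feeding each output back in as `depth`, e.g. `for _ in range(T): depth = propagate_once(depth, offsets, affinity, confidence)`.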

Keywords

Depth completion · Non-local · Spatial propagation network

Notes

Acknowledgement

This work was partially supported by the National Information Society Agency for construction of training data for artificial intelligence (2100-2131-305-107-19).

Supplementary material

Supplementary material 1: 504454_1_En_8_MOESM1_ESM.pdf (PDF, 5.4 MB)


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
  2. Robotics Institute, Carnegie Mellon University, Pittsburgh, USA
  3. Hikvision Research America, Santa Clara, USA