Sparse-to-Dense Depth Completion Revisited: Sampling Strategy and Graph Construction

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12366)

Abstract

Depth completion is a widely studied problem of predicting a dense depth map from a sparse set of measurements and a single RGB image. In this work, we address two issues that have been under-explored in the open literature: the sampling strategy (data term) and graph construction (prior term). First, instead of the popular random sampling strategy, we suggest that Poisson disk sampling is a much more effective way to create a sparse depth map from a dense one. We experimentally compare a class of quasi-random sampling strategies and demonstrate that an optimized sampling strategy can significantly improve depth-completion performance for the same number of sparse samples. Second, instead of the traditional square kernel, we suggest that dynamically constructing the local neighborhood is a better choice for interpolating the missing values. More specifically, we propose an end-to-end network with a graph convolution module. Because this module exploits the neighborhood relationships of 3D points more effectively, our approach achieves not only state-of-the-art results for depth completion of indoor scenes but also better generalization ability than competing methods.
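The Poisson disk sampling the abstract advocates can be sketched with Bridson's fast algorithm: candidate points are proposed in an annulus around existing samples and accepted only if no earlier sample lies within the minimum radius r, with a background grid making the distance check O(1). This is an illustrative sketch only, not the authors' implementation; the function name and parameters are ours.

```python
import math
import random

def poisson_disk_sample(width, height, r, k=30, rng=None):
    """Bridson's fast Poisson disk sampling in 2D.

    Returns points whose pairwise distances are all >= r.
    k is the number of candidate darts thrown per active point.
    """
    rng = rng or random.Random(0)
    cell = r / math.sqrt(2)                      # each grid cell holds at most one sample
    cols = int(math.ceil(width / cell))
    rows = int(math.ceil(height / cell))
    grid = [[None] * cols for _ in range(rows)]

    def grid_idx(p):
        return int(p[1] // cell), int(p[0] // cell)

    def fits(p):
        # Any conflicting sample must lie within a 5x5 cell neighborhood.
        gr, gc = grid_idx(p)
        for i in range(max(gr - 2, 0), min(gr + 3, rows)):
            for j in range(max(gc - 2, 0), min(gc + 3, cols)):
                q = grid[i][j]
                if q is not None and (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 < r * r:
                    return False
        return True

    first = (rng.uniform(0, width), rng.uniform(0, height))
    points, active = [first], [first]
    gr, gc = grid_idx(first)
    grid[gr][gc] = first

    while active:
        idx = rng.randrange(len(active))
        base = active[idx]
        for _ in range(k):
            ang = rng.uniform(0, 2 * math.pi)
            rad = rng.uniform(r, 2 * r)          # propose in the annulus [r, 2r)
            p = (base[0] + rad * math.cos(ang), base[1] + rad * math.sin(ang))
            if 0 <= p[0] < width and 0 <= p[1] < height and fits(p):
                points.append(p)
                active.append(p)
                gi, gj = grid_idx(p)
                grid[gi][gj] = p
                break
        else:
            active.pop(idx)                      # no candidate fit; retire this point
    return points
```

To produce a sparse depth map, the returned coordinates would be rounded to pixel indices and the dense ground-truth depth read at those locations; the minimum radius r controls the resulting sample count.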

Keywords

Depth completion · Graph neural network · Poisson disk sampling · Sparse-to-dense

Supplementary material

Supplementary material 1 (PDF, 14 MB)


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. School of AIA, Huazhong University of Science and Technology, Wuhan, China
  2. Lane Department of CSEE, West Virginia University, Virginia, USA
