Self-attention Convolution for Sparse to Dense Depth Completion

  • Conference paper
Proceedings of the International Conference on Intelligent Vision and Computing (ICIVC 2021)

Part of the book series: Proceedings in Adaptation, Learning and Optimization (PALO, volume 15)

Abstract

Depth completion from a sparse set of depth measurements and a single RGB image has been shown to be an effective way to generate high-quality depth images. However, traditional convolutional neural network methods tend to interpolate and replicate the surrounding depth values in their output. This underutilization of the sparse information leads to blurred boundaries and loss of structural information. To further improve the accuracy of depth completion, we extend the original U-Net with self-attention convolution to extract more useful information from the sparse depth measurements. Experimental results on the NYUv2 depth dataset validate the effectiveness of self-attention convolution within the U-Net architecture: the accuracy of the proposed method improves by 16.9% compared to the original U-Net.
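
The abstract does not spell out how the self-attention convolution is computed, so the following is only a minimal sketch under a stated assumption: that the operator behaves like a gated convolution, where a second convolution produces a per-pixel weight in (0, 1) that scales the feature response. The function names, the kernels, and the toy 4x4 input are all hypothetical and chosen for illustration; in a trained network the kernels would be learned, and the authors' actual layer may differ.

```python
import math


def conv2d_same(image, kernel):
    """Zero-padded 'same' cross-correlation of a 2D list with an odd-sized kernel.

    Deep-learning "convolution" layers usually compute cross-correlation
    (no kernel flip), which is what this helper does.
    """
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    y, x = i + di - ph, j + dj - pw
                    if 0 <= y < h and 0 <= x < w:  # outside pixels count as zero
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gated_conv2d(image, feature_kernel, gate_kernel, gate_bias=0.0):
    """Assumed form of the attention step: output = features * sigmoid(gate).

    The gate is computed from the same input, so each pixel's feature
    response is re-weighted by a value in (0, 1).
    """
    features = conv2d_same(image, feature_kernel)
    gate_logits = conv2d_same(image, gate_kernel)
    return [
        [f * sigmoid(g + gate_bias) for f, g in zip(f_row, g_row)]
        for f_row, g_row in zip(features, gate_logits)
    ]


# Toy sparse depth map: a single valid measurement, zeros elsewhere.
sparse_depth = [[0.0] * 4 for _ in range(4)]
sparse_depth[1][1] = 2.0

box_kernel = [[1.0] * 3 for _ in range(3)]    # hypothetical feature kernel
zero_kernel = [[0.0] * 3 for _ in range(3)]   # hypothetical gate kernel

gated = gated_conv2d(sparse_depth, box_kernel, zero_kernel)
```

With an all-zero gate kernel and zero bias, every gate equals sigmoid(0) = 0.5, so the output is simply half the plain convolution response. A learned gate kernel would instead let the layer suppress responses in regions with no nearby measurements and pass them through near valid samples, which is the intuition behind using such a layer on sparse input.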

Acknowledgment

This research was financially supported by the National Natural Science Foundation of China (Grant No. 41774027).

Author information

Corresponding author

Correspondence to Shuguo Pan.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Zhao, T., Pan, S., Zhang, H. (2022). Self-attention Convolution for Sparse to Dense Depth Completion. In: Sharma, H., Vyas, V.K., Pandey, R.K., Prasad, M. (eds) Proceedings of the International Conference on Intelligent Vision and Computing (ICIVC 2021). ICIVC 2021. Proceedings in Adaptation, Learning and Optimization, vol 15. Springer, Cham. https://doi.org/10.1007/978-3-030-97196-0_9
