Self-attention Convolution for Sparse to Dense Depth Completion

Zhao, Tao; Pan, Shuguo; Zhang, Hui

doi:10.1007/978-3-030-97196-0_9

Tao Zhao (ORCID: orcid.org/0000-0001-8195-1673), Shuguo Pan & Hui Zhang

Part of the book series: Proceedings in Adaptation, Learning and Optimization (PALO, volume 15)

Included in the following conference series:

International Conference on Intelligent Vision and Computing

Abstract

Depth completion from a sparse set of depth measurements and a single RGB image has been shown to be an effective method for generating high-quality depth images. However, traditional convolutional neural network methods tend to interpolate and replicate the output from the surrounding depth values. This underutilization of the sparse information leads to blurred boundaries and a loss of structural information. To further improve the accuracy of depth completion, we extend the original U-shaped network (U-Net) with self-attention convolution to extract more useful information from the sparse depth measurements. Experimental results on the NYUv2 depth dataset validate the effectiveness of self-attention convolution within the U-Net architecture. The proposed method improves accuracy by 16.9% compared to the original U-Net.
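
The abstract does not spell out how the self-attention convolution is computed. As a rough illustration only, the sketch below assumes one common reading of the idea: a gated convolution, in which a second convolution branch produces a sigmoid attention map in (0, 1) that rescales the feature response at each pixel. This is a swapped-in technique chosen for illustration, not confirmed as the chapter's operator. The names `conv2d_same`, `gated_conv2d`, the kernels, and the toy depth map are all hypothetical.

```python
import numpy as np

def conv2d_same(x, kernel):
    """Naive single-channel 2D cross-correlation with zero padding (output keeps input size).

    Assumes an odd-sized kernel.
    """
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(x, ((ph, ph), (pw, pw)), mode="constant")
    out = np.zeros_like(x, dtype=np.float64)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_conv2d(x, feature_kernel, gate_kernel, gate_bias=0.0):
    """Hypothetical attention-gated convolution (an assumption, not the paper's definition).

    The feature branch is multiplied element-wise by a soft mask in (0, 1)
    computed from the same input by a separate gate branch.
    """
    features = conv2d_same(x, feature_kernel)
    attention = sigmoid(conv2d_same(x, gate_kernel) + gate_bias)
    return features * attention, attention

# Toy sparse depth map: only two pixels carry a measurement, the rest are zero.
rng = np.random.default_rng(0)
sparse_depth = np.zeros((6, 6))
sparse_depth[1, 2] = 2.5
sparse_depth[4, 4] = 1.0

# Random stand-ins for weights that would normally be learned.
feature_kernel = rng.normal(size=(3, 3))
gate_kernel = rng.normal(size=(3, 3))

output, attention = gated_conv2d(sparse_depth, feature_kernel, gate_kernel)
```

In this toy setting, pixels whose 3x3 neighbourhood contains no measurement receive a neutral attention value of 0.5 and a zero feature response, while pixels near a sample are weighted according to the gate branch. In a real network the gate would be trained jointly with the features, and the layer would operate on multi-channel tensors.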

Acknowledgment

This research was financially supported by the National Natural Science Foundation of China (Grant No. 41774027).

Author information

Authors and Affiliations

School of Instrument Science and Engineering, Southeast University, Nanjing, China
Tao Zhao, Shuguo Pan & Hui Zhang

Corresponding author

Correspondence to Shuguo Pan.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Rajasthan Technical University, Kota, India
Harish Sharma
Sur University College, Sur, Oman
Vijay Kumar Vyas
Department of Mathematical Sciences, Indian Institute of Technology (BHU) Varanasi, Varanasi, Uttar Pradesh, India
Rajesh Kumar Pandey
School of Computer Science, University of Technology Sydney, Sydney, NSW, Australia
Mukesh Prasad

Cite this paper

Zhao, T., Pan, S., Zhang, H. (2022). Self-attention Convolution for Sparse to Dense Depth Completion. In: Sharma, H., Vyas, V.K., Pandey, R.K., Prasad, M. (eds) Proceedings of the International Conference on Intelligent Vision and Computing (ICIVC 2021). ICIVC 2021. Proceedings in Adaptation, Learning and Optimization, vol 15. Springer, Cham. https://doi.org/10.1007/978-3-030-97196-0_9

DOI: https://doi.org/10.1007/978-3-030-97196-0_9
Published: 24 March 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-97195-3
Online ISBN: 978-3-030-97196-0
eBook Packages: Intelligent Technologies and Robotics; Intelligent Technologies and Robotics (R0)
