Cost Volume Pyramid Network with Multi-strategies Range Searching for Multi-view Stereo

Gao, Shiyu; Li, Zhaoxin; Wang, Zhaoqi

doi:10.1007/978-3-031-23473-6_13

Shiyu Gao¹⁴,
Zhaoxin Li ORCID: orcid.org/0000-0001-5450-8131¹⁴ &
Zhaoqi Wang¹⁴

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13443))

Included in the following conference series:

Computer Graphics International Conference

1128 Accesses
3 Citations

Abstract

Multi-view stereo is an important research task in computer vision while still keeping challenging. In recent years, deep learning-based methods have shown superior performance on this task. Cost volume pyramid network-based methods which progressively refine depth map in coarse-to-fine manner, have yielded promising results while consuming less memory. However, these methods fail to take fully consideration of the characteristics of the cost volumes in each stage, leading to adopt similar range search strategies for each cost volume stage. In this work, we present a novel cost volume pyramid based network with different searching strategies for multi-view stereo. By choosing different depth range sampling strategies and applying adaptive unimodal filtering, we are able to obtain more accurate depth estimation in low resolution stages and iteratively upsample depth map to arbitrary resolution. We conducted extensive experiments on both DTU and BlendedMVS datasets, and results show that our method outperforms most state-of-the-art methods. Code is available at: https://github.com/SibylGao/MSCVP-MVSNet.git.

Supported by National Natural Science Foundation of China Under Grants (Nos. 62172392 and 61702482).

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 79.99; Price excludes VAT (USA)

Softcover Book: USD 99.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Ji, M., Gall, J., Zheng, H., Liu, Y., Fang, L.: SurfaceNet: an end-to-end 3D neural network for multiview stereopsis. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2307–2315 (2017)
Google Scholar
Yao, Y., Luo, Z., Li, S., Fang, T., Quan, L.: MVSNet: depth inference for unstructured multi-view stereo. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11212, pp. 785–801. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01237-3_47
Chapter Google Scholar
Yao, Y., Luo, Z., Li, S., Shen, T., Fang, T., Quan, L.: Recurrent MVSNet for high-resolution multi-view stereo depth inference. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5525–5534 (2019)
Google Scholar
Xue, Y., et al.: MVSCRF: learning multi-view stereo with conditional random fields. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4312–4321 (2019)
Google Scholar
Aanas, H., Jensen, R.R., Vogiatzis, G., Tola, E., Dahl, A.B.: Large-scale data for multiple-view stereopsis. Int. J. Comput. Vis. 120(2), 153–168 (2016)
Article MathSciNet Google Scholar
Knapitsch, A., Park, J., Zhou, Q.Y., Koltun, V.: Tanks and temples: benchmarking large-scale scene reconstruction. ACM Trans. Graph. (ToG) 36(4), 1–13 (2017)
Article Google Scholar
Chen, R., Han, S., Xu, J., Su, H.: Point-based multi-view stereo network. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1538–1547 (2019)
Google Scholar
Yu, Z., Gao, S.: Fast-MVSNet: sparse-to-dense multi-view stereo with learned propagation and gauss-newton refinement. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1949–1958 (2020)
Google Scholar
Yang, J., Mao, W., Alvarez, J.M., Liu, M.: Cost volume pyramid based depth inference for multi-view stereo. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4877–4886 (2020)
Google Scholar
Gu, X., et al.: Cascade cost volume for high-resolution multi-view stereo and stereo matching. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2495–2504 (2020)
Google Scholar
Yu, A., et al.: Attention aware cost volume pyramid based multi-view stereo network for 3D reconstruction. ISPRS J. Photogramm. Remote Sens. 175, 448–460 (2021)
Article Google Scholar
Zhang, J., Yao, Y., Li, S., Luo, Z., Fang, T.: Visibility-aware multi-view stereo network. arXiv:2008.07928, August 2020. https://arxiv.org/abs/2008.07928
Zhang, Y., et al.: Adaptive unimodal cost volume filtering for deep stereo matching. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 07, pp. 12926–12934, April 2020
Google Scholar
Guo, X., Yang, K., Yang, W., Wang, X., Li, H.: Group-wise correlation stereo network. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3273–3282 (2019)
Google Scholar
Chang, J. R., Chen, Y.S.: Pyramid stereo matching network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5410–5418 (2018)
Google Scholar
Cheng, S., et al.: Deep stereo using adaptive thin volume representation with uncertainty awareness. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2524–2534 (2020)
Google Scholar
Mao, Y., et al.: UASNet: uncertainty adaptive sampling network for deep stereo matching. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 6311–6319 (2021)
Google Scholar
Shen, Z., Dai, Y., Rao, Z.: CFNet: cascade and fused cost volume for robust stereo matching. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13906–13915 (2021)
Google Scholar
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Galliani, S., Lasinger, K., Schindler, K.: Massively parallel multiview stereopsis by surface normal diffusion. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 873–881 (2015)
Google Scholar
Xu, Q., Tao, W.: PVSNet: pixelwise visibility-aware multi-view stereo network. arXiv preprint arXiv:2007.07714 (2020)
Yao, Y., et al.: BlendedMVS: a large-scale dataset for generalized multi-view stereo networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1790–1799 (2020)
Google Scholar

Download references

Acknowledgements

This work was supported by National Natural Science Foundation of China under Grants (Nos. 62172392 and 61702482).

Author information

Authors and Affiliations

Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Shiyu Gao, Zhaoxin Li & Zhaoqi Wang

Authors

Shiyu Gao
View author publications
You can also search for this author in PubMed Google Scholar
Zhaoxin Li
View author publications
You can also search for this author in PubMed Google Scholar
Zhaoqi Wang
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding authors

Correspondence to Shiyu Gao or Zhaoxin Li .

Editor information

Editors and Affiliations

University of Geneva, Geneva, Switzerland
Nadia Magnenat-Thalmann
Bournemouth University, Poole, UK
Jian Zhang
University of Sydney, Sydney, NSW, Australia
Jinman Kim
University of Crete, Heraklion, Greece
George Papagiannakis
Shanghai Jiao Tong University, Shanghai, China
Bin Sheng
Swiss Federal Institute of Technology, Lausanne, Switzerland
Daniel Thalmann
University of Calgary, Calgary, AB, Canada
Marina Gavrilova

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Gao, S., Li, Z., Wang, Z. (2022). Cost Volume Pyramid Network with Multi-strategies Range Searching for Multi-view Stereo. In: Magnenat-Thalmann, N., et al. Advances in Computer Graphics. CGI 2022. Lecture Notes in Computer Science, vol 13443. Springer, Cham. https://doi.org/10.1007/978-3-031-23473-6_13

Download citation

DOI: https://doi.org/10.1007/978-3-031-23473-6_13
Published: 01 January 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-23472-9
Online ISBN: 978-3-031-23473-6
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Cost Volume Pyramid Network with Multi-strategies Range Searching for Multi-view Stereo