Skip to main content
Log in

Deep-plane sweep generative adversarial network for consistent multi-view depth estimation

  • Original Paper
  • Published:
Machine Vision and Applications Aims and scope Submit manuscript

Abstract

Owing to the improved representation ability, recent deep learning-based methods enable to estimate scene depths accurately. However, these methods still have difficulty in estimating consistent scene depths under real-world environments containing severe illumination changes, occlusions, and texture-less regions. To solve this problem, in this paper, we propose a novel depth-estimation method for unstructured multi-view images. Accordingly, we present a plane sweep generative adversarial network, where the proposed adversarial loss significantly improves the depth-estimation accuracy under real-world settings, and the consistency loss makes the depth-estimation results insensitive to the changes in viewpoints and the number of input images. In addition, 3D convolution layers are inserted into the network to enrich feature representation. Experimental results indicate that the proposed plane sweep generative adversarial network quantitatively and qualitatively outperforms state-of-the-art methods.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10

Similar content being viewed by others

Notes

  1. W and H denote the width and height of cost volumes, respectively, and D in (1) denotes the number of depth ranges.

References

  1. Aleotti, F., Tosi, F., Poggi, M., Mattoccia, S.: Generative adversarial networks for unsupervised monocular depth prediction. In: ECCVW (2018)

  2. Chang, J.R., Chen, Y.S.: Pyramid stereo matching network. In: CVPR (2018)

  3. Cui, Z., Heng, L., Yeo, Y.C., Geiger, A., Pollefeys, M., Sattler, T.: Real-time dense mapping for self-driving vehicles using fisheye cameras. In: ICRA (2019)

  4. Gallup, D., Frahm, J.M., Mordohai, P., Yang, Q., Pollefeys, M.: Real-time plane-sweeping stereo with multiple sweeping directions. In: CVPR (2007)

  5. Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial nets. In: NIPS (2014)

  6. He, K., Zhang, X., Ren, S., Sun, J.: Spatial pyramid pooling in deep convolutional networks for visual recognition. In: ECCV (2014)

  7. Huang, P.H., Matzen, K., Kopf, J., Ahuja, N., Huang, J.B.: Deepmvs: Learning multi-view stereopsis. In: CVPR (2018)

  8. Im, S., Jeon, H.G., Lin, S., Kweon, S.: DPSNet: End-to-end deep plane sweep stereo. In: ICLR (2019)

  9. Jaderberg, M., Simonyan, K., Zisserman, A., Kavukcuoglu, K.: Spatial transformer networks. In: NIPS (2015)

  10. Ji, M., Gall, J., Zheng, H., Liu, Y., Fang, L.: Surfacenet: an end-to-end 3d neural network for multiview stereopsis. In: CVPR (2017)

  11. Kendall, A., Martirosyan, H., Dasgupta, S., Henry, P., Kennedy, R., Bachrach, A., Bry, A.: End-to-end learning of geometry and context for deep stereo regression. In: ICCV (2017)

  12. Kumar, A.C., Bhandarkar, S.M., Prasad, M.: Monocular depth prediction using generative adversarial networks. In: CVPRW (2018)

  13. Li, R., Wang, S., Long, Z., Gu, D.: Undeepvo: Monocular visual odometry through unsupervised deep learning. In: ICRA (2018)

  14. Liang, Z., Feng, Y., Guo, Y., Liu, H., Qiao, L., Chen, W., Zhou, L., Zhang, J.: Learning deep correspondence through prior and posterior feature constancy. In: CVPR (2018)

  15. Lore, K.G., Reddy, K., Giering, M., Bernal, E.A.: Generative adversarial networks for depth map estimation from rgb video. In: CVPR (2018)

  16. Mayer, N., Ilg, E., Hausser, P., Fischer, P., Cremers, D., Dosovitskiy, A., Brox, T.: A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation. In: CVPR (2016)

  17. Moniz, J.R.A., Beckham, C., Rajotte, S., Honari, S., Pal, C.: Unsupervised depth estimation, 3d face rotation and replacement. In: NeurIPS (2018)

  18. Schonberger, J.L., Zheng, E., Frahm, J.M., Pollefeys, M.: Semantic stereo matching with pyramid cost volumes. In: ECCV (2016)

  19. Sinha, S.N., Scharstein, D., Szeliski, R.: Efficient high-resolution stereo matching using local plane sweeps. In: CVPR (2014)

  20. Suominen, O., Gotchev, A.: Efficient cost volume sampling for plane sweeping based multiview depth estimation. In: ICIP (2015)

  21. Ummenhofer, B., Zhou, H., Uhrig, J., Mayer, N., Ilg, E., Dosovitskiy, A., Brox, T.: Demon: Depth and motion network for learning monocular stereo. In: CVPR (2017)

  22. Won, C., Ryu, J., Lim, J.: Omnimvs: End-to-end learning for omnidirectional stereo matching. In: ICCV (2019)

  23. Wu, Z., Wu, X., Zhang, X., Wang, S., Ju, L.: Semantic stereo matching with pyramid cost volumes. In: ICCV (2019)

  24. Yao, Y., Luo, Z., Li, S., Fang, T., Quan, L.: Mvsnet: depth inference for unstructured multi-view stereo. In: ECCV (2018)

  25. Žbontar, J., LeCun, Y.: Stereo matching by training a convolutional neural network to compare image patches. J. Mach. Learn. Res. 17(65), 1–32 (2016)

    MATH  Google Scholar 

Download references

Acknowledgements

This work was partly supported by the Chung-Ang University Graduate Research Scholarship Grants in 2018 and partly supported by ‘The Cross-Ministry Giga KOREA Project’ of The Ministry of Science, ICT and Future Planning, Korea (GK19C0200, Development of full-3D mobile display terminal and its contents).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Junseok Kwon.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Shu, D.W., Jang, W., Yoo, H. et al. Deep-plane sweep generative adversarial network for consistent multi-view depth estimation. Machine Vision and Applications 33, 5 (2022). https://doi.org/10.1007/s00138-021-01258-7

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s00138-021-01258-7

Keywords

Navigation