
Deep learning-based 3D reconstruction: a survey

Published in: Artificial Intelligence Review

Abstract

Image-based 3D reconstruction is a long-standing, ill-posed problem in computer vision and graphics: the goal is to recover the 3D structure and geometry of a target object or scene from a set of input images. The task has a wide range of applications in fields such as robotics, virtual reality, and medical imaging. In recent years, learning-based methods for 3D reconstruction have attracted many researchers worldwide. These methods implicitly estimate the 3D shape of an object or scene in an end-to-end manner, eliminating the need to develop separate stages such as key-point detection and matching; furthermore, they can reconstruct object shapes from a single input image. Given the rapid advancements in this field and the many remaining opportunities to improve the performance of 3D reconstruction methods, a thorough review of algorithms in this area is warranted. This survey therefore provides a comprehensive overview of recent developments in image-based 3D reconstruction. The studied methods are examined from several viewpoints, including input types, model structures, output representations, and training strategies, and a detailed comparison is provided for the reader. Finally, unresolved challenges, underlying issues, and possible future work are discussed.
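To make the end-to-end idea in the abstract concrete — a single input image mapped directly to a 3D shape, with no separate key-point detection or matching stages — the following minimal NumPy sketch maps an image through an encoder to a latent code and decodes it into a voxel occupancy grid. All names, sizes, and weights here are hypothetical and untrained; this illustrates only the pipeline shape, not any surveyed method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "encoder": flatten a 32x32 grayscale image into a 128-dim latent code.
# Weights are random placeholders; a learned method would train them end-to-end.
W_enc = rng.standard_normal((128, 32 * 32)) * 0.01

# Toy "decoder": map the latent code to an 8x8x8 voxel occupancy grid.
W_dec = rng.standard_normal((8 * 8 * 8, 128)) * 0.01


def reconstruct(image: np.ndarray) -> np.ndarray:
    """Single image in, per-voxel occupancy probabilities out (end-to-end)."""
    z = np.tanh(W_enc @ image.reshape(-1))        # encode to latent code
    logits = W_dec @ z                            # decode to voxel logits
    occupancy = 1.0 / (1.0 + np.exp(-logits))     # sigmoid -> probabilities
    return occupancy.reshape(8, 8, 8)


image = rng.random((32, 32))   # stand-in for a real input photograph
voxels = reconstruct(image)
print(voxels.shape)            # (8, 8, 8)
```

In a real system the two weight matrices would be deep convolutional networks trained on image/shape pairs, and the occupancy grid would typically be thresholded (e.g. at 0.5) before surface extraction with an algorithm such as marching cubes.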




Funding

The authors did not receive support from any organization for the submitted work.

Author information


Corresponding author

Correspondence to Mohsen Soryani.

Ethics declarations

Conflict of interest

The authors confirm that there is no conflict of interest in publishing this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Samavati, T., Soryani, M. Deep learning-based 3D reconstruction: a survey. Artif Intell Rev 56, 9175–9219 (2023). https://doi.org/10.1007/s10462-023-10399-2

