Abstract
Three-dimensional (3D) modeling provides a stereoscopic representation of an object that aggregates multiple viewpoints and captures geometrical information. A general 3D modeling pipeline consists of data acquisition, 3D reconstruction and surface reconstruction. The core computational process in 3D modeling lies in 3D reconstruction, whose techniques can be categorized into three types: statistical models, discriminative learning models and generative learning models. Statistical models derive handcrafted feature descriptors from mathematical theory to extract the spatial and geometric features of 3D data. Correspondence matching is then performed between 3D data captured from multiple viewpoints to find the regions of maximum likelihood across datasets and to compute the best-matching affine transformation. Discriminative learning models learn the spatial coherence of 3D data through data-driven training, and the affine transformation is then computed by inference on new data. Generative models, on the other hand, are unsupervised: they ingest raw 3D data directly, learn a latent representation of the input, and later generate ambient output samples from that latent representation. In this paper, the three types of 3D reconstruction techniques are compared in detail in terms of input data structure, correspondence accuracy, precision and recall using four benchmark datasets, i.e. ModelNet10/40, ICL-NUIM, and Semantic3D. The advantages and disadvantages of the 3D reconstruction techniques are highlighted as implementation guidelines and directions for future improvement.
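As an illustration of the precision and recall measures named above, the following minimal Python sketch scores a set of predicted point correspondences against a ground-truth set. It assumes the common definition in which a predicted match counts as correct only if the same (source, target) index pair appears in the ground truth; the function name and the toy index pairs are hypothetical and are not taken from the reviewed methods or datasets.

```python
def correspondence_precision_recall(predicted, ground_truth):
    """Return (precision, recall) for predicted point correspondences.

    Each correspondence is a (source_index, target_index) pair.
    Assumption: a prediction is correct only if the exact pair is in
    the ground-truth set (no distance tolerance is applied).
    """
    predicted = set(predicted)
    ground_truth = set(ground_truth)
    true_positives = len(predicted & ground_truth)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Hypothetical toy data: indices of matched points in two viewpoints.
ground_truth = [(0, 0), (1, 1), (2, 2), (3, 3)]
predicted = [(0, 0), (1, 1), (2, 5)]

precision, recall = correspondence_precision_recall(predicted, ground_truth)
print(f"precision={precision:.3f}, recall={recall:.3f}")
```

In practice, a match is often accepted when the matched points lie within a distance threshold after applying the ground-truth transformation, rather than by exact index equality; the set-based version here only keeps the arithmetic easy to follow.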
Funding
This study is funded by the Sarawak Multimedia Authority (SMA) under project ID SMA-1077. We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Quadro P6000 GPU used for this research.
Ethics declarations
Conflict of Interests
The authors declare that they have no conflict of interest.
About this article
Cite this article
Phang, J.T.S., Lim, K.H. & Chiong, R.C.W. A review of three dimensional reconstruction techniques. Multimed Tools Appl 80, 17879–17891 (2021). https://doi.org/10.1007/s11042-021-10605-9
DOI: https://doi.org/10.1007/s11042-021-10605-9