A survey on deep geometry learning: From a representation perspective

Abstract

Deep learning has achieved great success in dealing with 2D images, and in recent years 3D computer vision and deep geometry learning have attracted increasing attention. Many advanced techniques for 3D shapes have been proposed for different applications. Unlike 2D images, which can be uniformly represented by a regular grid of pixels, 3D shapes have various representations, such as depth images, multi-view images, voxels, point clouds, meshes, and implicit surfaces. The performance achieved in different applications largely depends on the representation used, and no single representation works well for all applications. Therefore, in this survey, we review recent developments in deep learning for 3D geometry from a representation perspective, summarizing the advantages and disadvantages of each representation for different applications. We also present existing datasets in these representations and discuss future research directions.
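To make the contrast between explicit and implicit representations concrete, the sketch below converts a toy point cloud into a voxel occupancy grid and queries an implicit surface as a signed distance function (SDF). It is illustrative only: the `voxelize` and `sphere_sdf` helpers are hypothetical stand-ins, not taken from any method covered in the survey.

```python
# Two 3D shape representations in miniature:
# 1) explicit: a point cloud mapped onto a regular voxel occupancy grid;
# 2) implicit: a surface defined as the zero level set of a signed distance function.

def voxelize(points, resolution=4, bounds=(0.0, 1.0)):
    """Map a point cloud to the set of occupied voxel indices on a regular grid."""
    lo, hi = bounds
    cell = (hi - lo) / resolution
    occupied = set()
    for x, y, z in points:
        # Clamp so points exactly on the upper bound fall into the last cell.
        idx = tuple(min(int((c - lo) / cell), resolution - 1) for c in (x, y, z))
        occupied.add(idx)
    return occupied

def sphere_sdf(p, center=(0.5, 0.5, 0.5), radius=0.4):
    """Implicit sphere: negative inside, zero on the surface, positive outside."""
    d = sum((pi - ci) ** 2 for pi, ci in zip(p, center)) ** 0.5
    return d - radius

points = [(0.1, 0.1, 0.1), (0.9, 0.9, 0.9), (0.5, 0.5, 0.5)]
print(sorted(voxelize(points)))       # [(0, 0, 0), (2, 2, 2), (3, 3, 3)]
print(sphere_sdf((0.5, 0.5, 0.5)))    # -0.4 (the query point lies inside the sphere)
```

The voxel grid trades memory for regularity (it is directly convolvable, but resolution-limited), whereas the implicit function can be evaluated at arbitrary resolution, which is the distinction that motivates many of the representation choices surveyed here.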


Acknowledgements

This work was supported by the National Natural Science Foundation of China (61828204, 61872440), Beijing Municipal Natural Science Foundation (L182016), Youth Innovation Promotion Association CAS, CCF-Tencent Open Fund, Royal Society-Newton Advanced Fellowship (NAF\R2\192151), and the Royal Society (IES\R1\180126).

Author information


Corresponding author

Correspondence to Lin Gao.

Additional information

Yun-Peng Xiao received his bachelor's degree in computer science from Nankai University. He is currently a master's student at the Institute of Computing Technology, the Chinese Academy of Sciences. His research interests include computer graphics and geometric processing.

Yu-Kun Lai received his bachelor's and Ph.D. degrees in computer science from Tsinghua University in 2003 and 2008, respectively. He is currently a Reader in the School of Computer Science & Informatics, Cardiff University. His research interests include computer graphics, geometry processing, image processing, and computer vision. He is on the editorial boards of Computer Graphics Forum and The Visual Computer.

Fang-Lue Zhang is currently a lecturer at Victoria University of Wellington, New Zealand. He received his bachelor's degree from Zhejiang University, Hangzhou, in 2009, and his doctoral degree from Tsinghua University, Beijing, in 2015. His research interests include image and video editing, computer vision, and computer graphics. He is a member of IEEE and ACM. He received a Victoria Early-Career Research Excellence Award in 2019.

Chunpeng Li received his Ph.D. degree in 2008 and is now an associate professor at the Institute of Computing Technology, the Chinese Academy of Sciences. His main research interests are virtual reality, human-computer interaction, and computer graphics.

Lin Gao received his bachelor's degree in mathematics from Sichuan University and his Ph.D. degree in computer science from Tsinghua University. He is currently an associate professor at the Institute of Computing Technology, the Chinese Academy of Sciences. His research interests include computer graphics and geometric processing. He received a Newton Advanced Fellowship award from the Royal Society in 2019.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.



About this article


Cite this article

Xiao, YP., Lai, YK., Zhang, FL. et al. A survey on deep geometry learning: From a representation perspective. Comp. Visual Media 6, 113–133 (2020). https://doi.org/10.1007/s41095-020-0174-8


Keywords

  • 3D shape representation
  • geometry learning
  • neural networks
  • computer graphics