Abstract
Humans can skillfully classify real-world objects by shape or function, and mentally build a visual concept for each object class as well as visual knowledge of the surrounding real world (Pan, 2019). Pan (2021) pointed out that constructing computational representations of these visual concepts and visual knowledge is a key step toward developing next-generation artificial intelligence. Learning the three-dimensional (3D) shape space of all objects under the same visual concept is, in turn, a key step toward the computational representation of visual concepts. This paper identifies the key technical challenges in 3D shape space learning, reviews research progress in this field around these challenges, and concludes with a discussion of research trends and future directions in 3D shape space learning.
References
Bai S, Bai X, Zhou ZC, et al., 2016. GIFT: a real-time and scalable 3D shape search engine. Proc IEEE Conf on Computer Vision and Pattern Recognition, p.5023–5032. https://doi.org/10.1109/CVPR.2016.543
Cao C, Weng YL, Zhou S, et al., 2014. FaceWarehouse: a 3D facial expression database for visual computing. IEEE Trans Visual Comput Graph, 20(3):413–425. https://doi.org/10.1109/TVCG.2013.249
Chan ER, Monteiro M, Kellnhofer P, et al., 2021. pi-GAN: periodic implicit generative adversarial networks for 3D-aware image synthesis. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.5799–5809. https://doi.org/10.1109/CVPR46437.2021.00574
Chen ZQ, Zhang H, 2019. Learning implicit fields for generative shape modeling. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.5932–5941. https://doi.org/10.1109/CVPR.2019.00609
Deng Y, Yang JL, Tong X, 2021. Deformed implicit field: modeling 3D shapes with learned dense correspondence. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.10286–10296. https://doi.org/10.1109/CVPR46437.2021.01015
Deng Y, Yang J, Xiang J, et al., 2022. GRAM: generative radiance manifolds for 3D-aware image generation. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.10673–10683.
Egger B, Smith WA, Tewari A, 2020. 3D morphable face models: past, present, and future. ACM Trans Graph, 39(5):157. https://doi.org/10.1145/3395208
Gadelha M, Maji S, Wang R, 2017. 3D shape induction from 2D views of multiple objects. Proc Int Conf on 3D Vision, p.402–411. https://doi.org/10.1109/3DV.2017.00053
Groueix T, Fisher M, Kim VG, et al., 2018. A Papier-Mâché approach to learning 3D surface generation. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.216–224. https://doi.org/10.1109/CVPR.2018.00030
Hughes JF, van Dam A, McGuire M, et al., 2013. Computer Graphics: Principles and Practice (3rd Ed.). Addison-Wesley, Upper Saddle River, USA.
Jiang C, Huang J, Tagliasacchi A, et al., 2020. ShapeFlow: learnable deformation flows among 3D shapes. Advances in Neural Information Processing Systems 33, p.9745–9757.
Jin YW, Jiang DQ, Cai M, 2020. 3D reconstruction using deep learning: a survey. Commun Inform Syst, 20(4): 389–413. https://doi.org/10.4310/CIS.2020.v20.n4.a1
Li X, Dong Y, Peers P, et al., 2019. Synthesizing 3D shapes from silhouette image collections using multi-projection generative adversarial networks. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.5530–5539. https://doi.org/10.1109/CVPR.2019.00568
Liu F, Liu XM, 2020. Learning implicit functions for topology-varying dense 3D shape correspondence. Proc 34th Int Conf on Neural Information Processing Systems, p.4823–4834.
Loper M, Mahmood N, Romero J, et al., 2015. SMPL: a skinned multi-person linear model. ACM Trans Graph, 34(6):248. https://doi.org/10.1145/2816795.2818013
Lun ZL, Gadelha M, Kalogerakis E, et al., 2017. 3D shape reconstruction from sketches via multi-view convolutional networks. Proc Int Conf on 3D Vision, p.67–77. http://arxiv.org/abs/1707.06375
Masci J, Boscaini D, Bronstein MM, et al., 2015. Geodesic convolutional neural networks on Riemannian manifolds. Proc IEEE Int Conf on Computer Vision Workshop, p.832–840. https://doi.org/10.1109/ICCVW.2015.112
Měch R, Prusinkiewicz P, 1996. Visual models of plants interacting with their environment. Proc 23rd Annual Conf on Computer Graphics and Interactive Techniques, p.397–410. https://doi.org/10.1145/237170.237279
Mescheder L, Oechsle M, Niemeyer M, et al., 2019. Occupancy networks: learning 3D reconstruction in function space. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.4455–4465. https://doi.org/10.1109/CVPR.2019.00459
Mo KC, Zhu SL, Chang AX, et al., 2019. PartNet: a large-scale benchmark for fine-grained and hierarchical part-level 3D object understanding. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.909–918. https://doi.org/10.1109/CVPR.2019.00100
Müller P, Wonka P, Haegler S, et al., 2006. Procedural modeling of buildings. ACM SIGGRAPH Papers, p.614–623. https://doi.org/10.1145/1141911.1141931
Niu CJ, Li J, Xu K, 2018. Im2Struct: recovering 3D shape structure from a single RGB image. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.4521–4529. https://doi.org/10.1109/CVPR.2018.00475
Pan YH, 2019. On visual knowledge. Front Inform Technol Electron Eng, 20(8):1021–1025. https://doi.org/10.1631/FITEE.1910001
Pan YH, 2021a. Miniaturized five fundamental issues about visual knowledge. Front Inform Technol Electron Eng, 22(5):615–618. https://doi.org/10.1631/FITEE.2040000
Pan YH, 2021b. On visual understanding. Front Inform Technol Electron Eng, early access. https://doi.org/10.1631/FITEE.2130000
Park JJ, Florence P, Straub J, et al., 2019. DeepSDF: learning continuous signed distance functions for shape representation. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.165–174. https://doi.org/10.1109/CVPR.2019.00025
Paschalidou D, Katharopoulos A, Geiger A, et al., 2021. Neural parts: learning expressive 3D shape abstractions with invertible neural networks. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.3204–3215. https://doi.org/10.1109/CVPR46437.2021.00322
Qi CR, Su H, Mo KC, et al., 2017. PointNet: deep learning on point sets for 3D classification and segmentation. Proc IEEE Conf on Computer Vision and Pattern Recognition, p.77–85. https://doi.org/10.1109/CVPR.2017.16
Riegler G, Ulusoy AO, Geiger A, 2017. OctNet: learning deep 3D representations at high resolutions. Proc IEEE Conf on Computer Vision and Pattern Recognition, p.6620–6629. https://doi.org/10.1109/CVPR.2017.701
Sinha A, Bai J, Ramani K, 2016. Deep learning 3D shape surfaces using geometry images. Proc 14th European Conf on Computer Vision, p.223–240. https://doi.org/10.1007/978-3-319-46466-4_14
Su H, Maji S, Kalogerakis E, et al., 2015. Multi-view convolutional neural networks for 3D shape recognition. Proc IEEE Int Conf on Computer Vision, p.945–953. https://doi.org/10.1109/ICCV.2015.114
Sun CY, Zou QF, Tong X, et al., 2019. Learning adaptive hierarchical cuboid abstractions of 3D shape collections. ACM Trans Graph, 38(6):241. https://doi.org/10.1145/3355089.3356529
Tulsiani S, Su H, Guibas LJ, et al., 2017. Learning shape abstractions by assembling volumetric primitives. Proc IEEE Conf on Computer Vision and Pattern Recognition, p.1466–1474. https://doi.org/10.1109/CVPR.2017.160
Wang NY, Zhang YD, Li ZW, et al., 2018. Pixel2Mesh: generating 3D mesh models from single RGB images. Proc 15th European Conf on Computer Vision, p.55–71. https://doi.org/10.1007/978-3-030-01252-6_4
Wang PS, Liu Y, Guo YX, et al., 2017. O-CNN: octree-based convolutional neural networks for 3D shape analysis. ACM Trans Graph, 36(4):72. https://doi.org/10.1145/3072959.3073608
Wang PS, Liu Y, Tong X, 2022. Dual octree graph networks for learning adaptive volumetric shape representations. ACM Trans Graph, 41(4):103. https://doi.org/10.1145/3528223.3530087
Wen C, Zhang YD, Li ZW, et al., 2019. Pixel2Mesh++: multi-view 3D mesh generation via deformation. Proc IEEE/CVF Int Conf on Computer Vision, p.1042–1051. https://doi.org/10.1109/ICCV.2019.00113
Wu ZR, Song SR, Khosla A, et al., 2015. 3D ShapeNets: a deep representation for volumetric shapes. Proc IEEE Conf on Computer Vision and Pattern Recognition, p.1912–1920. https://doi.org/10.1109/CVPR.2015.7298801
Xiao YP, Lai YK, Zhang FL, et al., 2020. A survey on deep geometry learning: from a representation perspective. Comput Visual Med, 6(2):113–133. https://doi.org/10.1007/s41095-020-0174-8
Yang J, Mo KC, Lai YK, et al., 2023. DSG-Net: learning disentangled structure and geometry for 3D shape generation. ACM Trans Graph, 42(1):1. https://doi.org/10.1145/3526212
Yang KZ, Chen XJ, 2021. Unsupervised learning for cuboid shape abstraction via joint segmentation from point clouds. ACM Trans Graph, 40(4):152. https://doi.org/10.1145/3450626.3459873
Yu FG, Liu K, Zhang Y, et al., 2019. PartNet: a recursive part decomposition network for fine-grained and hierarchical shape segmentation. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.9483–9492. https://doi.org/10.1109/CVPR.2019.00972
Yu LQ, Li XZ, Fu CW, et al., 2018. PU-Net: point cloud upsampling network. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.2790–2799. https://doi.org/10.1109/CVPR.2018.00295
Zheng XY, Liu Y, Wang PS, et al., 2022. SDF-StyleGAN: implicit SDF-based StyleGAN for 3D shape generation. https://arxiv.org/abs/2206.12055
Zheng ZR, Yu T, Dai QH, et al., 2021. Deep implicit templates for 3D shape representation. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.1429–1439. https://doi.org/10.1109/CVPR46437.2021.00148
Zuffi S, Kanazawa A, Jacobs DW, et al., 2017. 3D Menagerie: modeling the 3D shape and pose of animals. Proc IEEE Conf on Computer Vision and Pattern Recognition, p.5524–5532. https://doi.org/10.1109/CVPR.2017.586
Acknowledgements
This paper is based on the author's presentations at the first and second workshops on visual knowledge and visual intelligence. The author would like to thank all workshop attendees for the insightful discussions, and Prof. Yunhe PAN and Dr. Heung-Yeung SHUM for their invaluable comments on the topics presented in this paper. Finally, the author thanks all collaborators on the research presented in Li et al. (2019), Sun et al. (2019), and Deng et al. (2021, 2022).
Ethics declarations
Xin TONG declares that he has no conflict of interest.
Cite this article
Tong, X. Three-dimensional shape space learning for visual concept construction: challenges and research progress. Front Inform Technol Electron Eng 23, 1290–1297 (2022). https://doi.org/10.1631/FITEE.2200318