Abstract
Existing unsupervised methods often fail to capture accurate 3D shapes because shape and albedo maps are ambiguous, which limits their applicability to downstream tasks. This paper therefore proposes an unsupervised shape enhancement and factorization machine network for 3D face reconstruction. First, we design a shape enhancement network that combines global and local features to restore more complete and realistic albedo images without additional supervision, yielding higher-quality 3D faces. Second, based on the principle of factorization machines, we design a decomposition module. By decomposing large matrices, the network learns to infer better results with fewer parameters, which further improves the accuracy of our model. Extensive experiments on the BFM and CelebA datasets demonstrate the effectiveness of our method.
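The parameter saving that comes from decomposing a large matrix can be illustrated with a generic low-rank factorization. The sketch below is not the authors' decomposition module, whose design the abstract does not describe. It assumes a hypothetical 256 x 512 weight matrix of rank 16 and uses a truncated SVD (NumPy) to show that two thin factors can reproduce such a matrix with far fewer stored values. All names and sizes are illustrative assumptions.

```python
import numpy as np


def dense_param_count(m, n):
    """Number of values stored by a full m x n matrix."""
    return m * n


def low_rank_param_count(m, n, r):
    """Number of values stored by factors of shape (m, r) and (r, n)."""
    return r * (m + n)


def truncated_svd_factors(W, r):
    """Return A (m, r) and B (r, n) such that A @ B is the best rank-r
    approximation of W in the Frobenius norm."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :r] * s[:r]  # scale each kept left singular vector by its singular value
    B = Vt[:r, :]
    return A, B


# Hypothetical sizes chosen only for illustration.
m, n, r = 256, 512, 16
rng = np.random.default_rng(0)

# Build a matrix that is exactly rank r, so the factorization is lossless here.
W = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

A, B = truncated_svd_factors(W, r)
rel_err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)

print("dense parameters:   ", dense_param_count(m, n))
print("low-rank parameters:", low_rank_param_count(m, n, r))
print("relative error:     ", rel_err)
```

In this toy setting the two factors store r(m + n) values instead of mn. A real weight matrix is rarely exactly low rank, so in practice the rank trades approximation error against parameter count.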
This work was supported in part by the Ningxia Graduate Education and Teaching Reform Research and Practice Project 2021, in part by the National Natural Science Foundation of China under Grant 62062056, and in part by the Ningxia Natural Science Foundation under Grant 2022AAC03327.
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Yang, L., Zhang, B., Gong, J., Wang, X., Li, X., Ma, K. (2023). Unsupervised Shape Enhancement and Factorization Machine Network for 3D Face Reconstruction. In: Iliadis, L., Papaleonidas, A., Angelov, P., Jayne, C. (eds) Artificial Neural Networks and Machine Learning – ICANN 2023. ICANN 2023. Lecture Notes in Computer Science, vol 14256. Springer, Cham. https://doi.org/10.1007/978-3-031-44213-1_18
DOI: https://doi.org/10.1007/978-3-031-44213-1_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-44212-4
Online ISBN: 978-3-031-44213-1
eBook Packages: Computer Science, Computer Science (R0)