3D Face Reconstruction and Semantic Annotation from Single Depth Image

Li, Peixin; Pei, Yuru; Guo, Yuke; Zha, Hongbin

doi:10.1007/978-3-030-63426-1_3

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1300)

Included in the following conference series:

International Conference on Computer Animation and Social Agents

Abstract

We introduce a novel data-driven approach that takes a single-view noisy depth image as input and infers a detailed 3D face with per-pixel semantic labels. The key to our method is its ability to handle depth completion at varying levels of geometric detail: it estimates expressive 3D faces by combining a low-dimensional linear subspace with non-rigid deformations based on a dense displacement field. We devise a deep neural network-based coarse-to-fine framework for 3D face reconstruction and semantic annotation that produces high-quality facial geometry while preserving large-scale context and semantics. We evaluate the semantic consistency constraint and the generative model for 3D face reconstruction and depth annotation in an extensive series of experiments. The results demonstrate that the proposed approach outperforms the compared methods not only in face reconstruction with high-quality geometric details, but also in semantic annotation, as measured by segmentation and landmark localization.
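The coarse-to-fine idea in the abstract (a low-dimensional linear subspace for the overall shape, plus a dense displacement field for fine detail) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the random orthonormal basis, the toy mesh size, the ridge-regularized least-squares fit standing in for the network's coefficient prediction, and all function names (`coarse_face`, `refine_face`, `fit_coeffs`) are assumptions made purely for illustration.

```python
import numpy as np


def coarse_face(mean_shape, basis, coeffs):
    """Coarse shape from a linear subspace: mean + basis @ coeffs.

    mean_shape: (N, 3), basis: (3N, K), coeffs: (K,).
    """
    return mean_shape + (basis @ coeffs).reshape(-1, 3)


def refine_face(coarse, displacement):
    """Fine shape: add a dense per-vertex displacement field of shape (N, 3)."""
    return coarse + displacement


def fit_coeffs(target, mean_shape, basis, reg=1e-3):
    """Ridge-regularized least-squares fit of the subspace coefficients.

    Assumption: this closed-form fit stands in for a learned regressor.
    """
    residual = (target - mean_shape).reshape(-1)
    k = basis.shape[1]
    lhs = basis.T @ basis + reg * np.eye(k)
    return np.linalg.solve(lhs, basis.T @ residual)


# Toy data (hypothetical sizes): a random mean shape and orthonormal basis.
rng = np.random.default_rng(0)
n_vertices, n_components = 500, 10
mean_shape = rng.normal(size=(n_vertices, 3))
basis, _ = np.linalg.qr(rng.normal(size=(n_vertices * 3, n_components)))

# Synthetic target: a subspace shape plus small detail outside the subspace.
true_coeffs = rng.normal(size=n_components)
detail = 0.01 * rng.normal(size=(n_vertices, 3))
target = refine_face(coarse_face(mean_shape, basis, true_coeffs), detail)

# Coarse stage: recover the subspace coefficients.
coeffs = fit_coeffs(target, mean_shape, basis)
coarse = coarse_face(mean_shape, basis, coeffs)

# Fine stage: the residual is what a refinement stage would have to predict.
displacement = target - coarse
fine = refine_face(coarse, displacement)

coarse_err = np.abs(target - coarse).max()
fine_err = np.abs(target - fine).max()
```

In this toy setting the coarse stage cannot represent the added detail because it lies outside the ten-dimensional subspace, so `coarse_err` stays well above `fine_err`; the displacement field absorbs exactly that remainder.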

Author information

Authors and Affiliations

Key Laboratory of Machine Perception (MOE), Department of Machine Intelligence, Peking University, Beijing, China
Peixin Li, Yuru Pei & Hongbin Zha
Luoyang Institute of Science and Technology, Luoyang, China
Yuke Guo

Corresponding author

Correspondence to Yuru Pei.

Editor information

Editors and Affiliations

Faculty of Science and Technology, Bournemouth University, Poole, UK
Feng Tian
Faculty of Media and Communication, Bournemouth University, Poole, UK
Xiaosong Yang
École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
Daniel Thalmann
Computer Science Department, Zhejiang University, Hangzhou, China
Weiwei Xu
Faculty of Media and Communication, Bournemouth University, Poole, UK
Jian Jun Zhang
MIRALab/C.U.I., University of Geneva, Geneva, Switzerland
Nadia Magnenat Thalmann
Faculty of Media and Communication, Bournemouth University, Poole, UK
Jian Chang

About this paper

Cite this paper

Li, P., Pei, Y., Guo, Y., Zha, H. (2020). 3D Face Reconstruction and Semantic Annotation from Single Depth Image. In: Tian, F., et al. Computer Animation and Social Agents. CASA 2020. Communications in Computer and Information Science, vol 1300. Springer, Cham. https://doi.org/10.1007/978-3-030-63426-1_3

DOI: https://doi.org/10.1007/978-3-030-63426-1_3
Published: 25 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63425-4
Online ISBN: 978-3-030-63426-1
eBook Packages: Computer Science, Computer Science (R0)
