
Towards High Fidelity Face Frontalization in the Wild

  • Published in: International Journal of Computer Vision

Abstract

Face frontalization refers to the process of synthesizing the frontal view of a face from a given profile. Because of self-occlusion and appearance distortion in the wild, it is extremely challenging to recover faithful high-resolution results while preserving texture details. This paper proposes a high fidelity pose-invariant model (HF-PIM) to produce photographic and identity-preserving results. HF-PIM frontalizes the profiles through a novel texture fusion warping procedure and leverages a dense correspondence field to bind the 2D and 3D surface spaces. We decompose the prerequisite of warping into dense correspondence field estimation and facial texture map recovery, both of which are well addressed by deep networks. Unlike reconstruction methods that rely on 3D data, we also propose adversarial residual dictionary learning to supervise facial texture map recovery with only monocular images. Furthermore, a multi-perception guided loss is proposed to address the practical misalignment between ground truth frontal and profile faces, allowing HF-PIM to effectively utilize multiple images during training. Quantitative and qualitative evaluations on five controlled and uncontrolled databases show that the proposed method not only boosts the performance of pose-invariant face recognition but also improves the visual quality of high-resolution frontalization results.
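To make the warping step concrete, the following is a minimal PyTorch sketch of the core idea: rendering a frontal view by bilinearly sampling a recovered facial texture map through a dense correspondence field. The tensor shapes, function name, and sampler choice are illustrative assumptions, not the authors' implementation; in HF-PIM both inputs are predicted by deep networks.

    import torch
    import torch.nn.functional as F

    def warp_texture_to_frontal(texture_map, corr_field):
        # texture_map: (B, 3, H, W) facial texture map in UV space.
        # corr_field:  (B, 2, H, W) dense correspondence field; each
        #              frontal-view pixel stores the (u, v) texture
        #              coordinate it reads from, normalized to [-1, 1]
        #              as grid_sample expects.
        grid = corr_field.permute(0, 2, 3, 1)           # -> (B, H, W, 2)
        return F.grid_sample(texture_map, grid,
                             mode='bilinear', align_corners=False)

    # Toy shapes only; real inputs come from the estimation networks.
    tex = torch.rand(1, 3, 128, 128)
    field = torch.rand(1, 2, 128, 128) * 2 - 1          # uniform in [-1, 1]
    frontal = warp_texture_to_frontal(tex, field)       # (1, 3, 128, 128)

Under this reading, the decomposition in the abstract falls out naturally: one network estimates corr_field from the profile, another recovers texture_map, and the warp above ties them together.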




Notes

  1. https://github.com/1adrianb/face-alignment (see the usage sketch after these notes).

  2. Visualization results produced by other methods are released by their authors. Because different methods usually report visual examples of different identities, we have tried our best to select the identities reported by most of the compared methods.
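As a usage note for footnote 1, a minimal sketch of extracting 68-point 2D landmarks with that library is given below. The input filename is a placeholder and the enum spelling varies across releases (older versions use face_alignment.LandmarksType._2D); this is an assumption-laden illustration, not the paper's actual preprocessing code.

    import face_alignment
    from skimage import io

    # 68-point 2D landmark detector; flip_input=False skips the
    # horizontally flipped second pass.
    fa = face_alignment.FaceAlignment(face_alignment.LandmarksType.TWO_D,
                                      flip_input=False, device='cpu')

    image = io.imread('profile_face.jpg')   # placeholder input path
    landmarks = fa.get_landmarks(image)     # list of (68, 2) arrays, one per face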


Acknowledgements

This work is funded by the National Key Research and Development Program of China (Grant Nos. 2016YFB1001001, 2017YFC0821602), the National Natural Science Foundation of China (Grant Nos. 61622310, 61427811, U1836217), and Beijing Natural Science Foundation (Grant No. JQ18017).

Author information


Corresponding author

Correspondence to Zhenan Sun.

Additional information

Communicated by Xavier Alameda-Pineda, Elisa Ricci, Albert Ali Salah, Nicu Sebe, Shuicheng Yan.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Cao, J., Hu, Y., Zhang, H. et al. Towards High Fidelity Face Frontalization in the Wild. Int J Comput Vis 128, 1485–1504 (2020). https://doi.org/10.1007/s11263-019-01229-6

