
Face frontalization with deep GAN via multi-attention mechanism

  • Original Paper
  • Published in Signal, Image and Video Processing

Abstract

In recent years, deep learning has driven notable advances in face synthesis, but large pose variation remains one of the most difficult obstacles. With the introduction and rapid development of the generative adversarial network (GAN), face frontalization has reached new heights. In this paper, we propose a deep generative adversarial network based on a multi-attention mechanism for multi-pose face frontalization. Specifically, we add a deep feature encoder built from attention modules and residual blocks to the generator. Meanwhile, to capture both global and local facial information, the discriminator of our model consists of four independent discriminators. Quantitative and qualitative experiments on the CAS-PEAL-R1 dataset show that our model is effective: its recognition rate matches or exceeds the best rates of competing models at several angles, e.g., 100% at β = 0°, α = 15° and 99.78% at β = 30°, α = 0°.
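To make the attention-based encoder idea concrete, the following is a minimal NumPy sketch of a generic channel-attention gate of the kind commonly placed inside such encoders (squeeze-and-excitation style). It is an illustrative assumption, not the paper's exact multi-attention module; all names and the random weights are hypothetical.

```python
import numpy as np

def channel_attention(feat, reduction=4, seed=0):
    """Generic channel-attention gate (illustrative sketch, not the paper's module).

    feat: feature map of shape (C, H, W).
    Returns the feature map with each channel reweighted by a learned gate in (0, 1).
    """
    c = feat.shape[0]
    # Squeeze: global average pooling over the spatial dimensions -> (C,)
    z = feat.mean(axis=(1, 2))
    # Hypothetical (randomly initialized) bottleneck MLP weights; in a real
    # network these would be trained jointly with the generator.
    rng = np.random.default_rng(seed)
    w1 = rng.standard_normal((c // reduction, c)) * 0.1
    w2 = rng.standard_normal((c, c // reduction)) * 0.1
    # Excitation: ReLU bottleneck followed by a sigmoid gate per channel
    h = np.maximum(w1 @ z, 0.0)
    gate = 1.0 / (1.0 + np.exp(-(w2 @ h)))  # (C,), each value in (0, 1)
    # Reweight each channel of the input feature map
    return feat * gate[:, None, None]

feat = np.ones((8, 4, 4))
out = channel_attention(feat)
print(out.shape)  # (8, 4, 4)
```

The gate lets the encoder emphasize informative channels and suppress others, which is the role attention plays alongside the residual blocks described above.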


Data availability

“CAS-PEAL-R1”: http://www.jdl.ac.cn/peal/ and https://drive.google.com/drive/folders/1OLTGh15CuhyRXA0nOXVnKKjbYAg9UOLE. “Casia-facev5”: http://biometrics.idealtest.org/.


Funding

This work was supported in part by the National Key R&D Program of China (2019YFB1311001), in part by the National Natural Science Foundation of China (61876099), and in part by the Key R&D Project of Shandong Province (2022CXGC010503).

Author information

Contributions

JC, ZC, YZ and LS wrote the main manuscript text, and JC prepared Figs. 1 and 2. All authors reviewed the manuscript.

Corresponding author

Correspondence to Zhenxue Chen.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Cao, J., Chen, Z., Zhang, Y. et al. Face frontalization with deep GAN via multi-attention mechanism. SIViP 17, 1965–1973 (2023). https://doi.org/10.1007/s11760-022-02409-7

