Abstract
In recent years, deep learning has driven advances in face synthesis, but large pose variation remains difficult to overcome. The introduction and development of generative adversarial networks (GANs) have substantially advanced face frontalization. In this paper, we propose a deep generative adversarial network based on a multi-attention mechanism for multi-pose face frontalization. Specifically, we add to the generator a deep feature encoder built from attention modules and residual blocks. Meanwhile, to capture both global and local facial information, the discriminator of our model consists of four independent discriminators. Quantitative and qualitative experiments on the CAS-PEAL-R1 dataset show that our model is effective. At some angles, the recognition rate of our model equals or exceeds the highest recognition rate of the other models, for example 100% at β = 0°, α = 15° and 99.78% at β = 30°, α = 0°.
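The abstract reports recognition rates but does not say how they are computed. As a minimal sketch, assuming a rank-1 identification protocol with cosine similarity between feature vectors (a common choice in face frontalization work, not confirmed by this text), the metric could be computed as follows. The function names, the toy feature vectors, and the identity labels are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank1_recognition_rate(probes, gallery):
    """Percentage of probes whose most similar gallery entry has the true identity.

    probes:  list of (identity, feature) pairs, e.g. features of frontalized faces
    gallery: dict mapping identity -> feature of the enrolled frontal face
    (Assumed protocol; the source does not specify the evaluation details.)
    """
    if not probes:
        raise ValueError("at least one probe is required")
    correct = 0
    for true_id, feature in probes:
        predicted = max(
            gallery, key=lambda gid: cosine_similarity(feature, gallery[gid])
        )
        if predicted == true_id:
            correct += 1
    return 100.0 * correct / len(probes)


# Toy data (hypothetical 2-D features, two identities).
gallery = {"A": [1.0, 0.0], "B": [0.0, 1.0]}
probes = [
    ("A", [0.9, 0.1]),  # closest to A: correct
    ("B", [0.2, 0.8]),  # closest to B: correct
    ("A", [0.4, 0.6]),  # closest to B: counted as an error
]
rate = rank1_recognition_rate(probes, gallery)
print(f"rank-1 recognition rate: {rate:.2f}%")
```

In a real evaluation the features would come from a pretrained face recognition network applied to the synthesized frontal images, and the rate would be reported separately for each pose angle.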
Funding
This work was supported in part by the National Key R&D Program of China (2019YFB1311001), in part by the National Natural Science Foundation of China (61876099), and in part by the Key R&D Project of Shandong Province (2022CXGC010503).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Cao, J., Chen, Z., Zhang, Y. et al. Face frontalization with deep GAN via multi-attention mechanism. SIViP 17, 1965–1973 (2023). https://doi.org/10.1007/s11760-022-02409-7
DOI: https://doi.org/10.1007/s11760-022-02409-7