
LSRF: localized and sparse receptive fields for linear facial expression synthesis based on global face context

Published in: Multimedia Tools and Applications

Abstract

Existing generative adversarial network (GAN)-based methods for facial expression synthesis require large datasets for training, and their performance degrades noticeably when trained on smaller ones. Moreover, they incur high computational and memory costs at inference, making them unsuitable for resource-constrained devices. To address these limitations, this paper presents a linear formulation that learns Localized and Sparse Receptive Fields (LSRF) for facial expression synthesis while accounting for global face context. We extend the sparsity-inducing formulation of the Orthogonal Matching Pursuit (OMP) algorithm with a locality constraint, which ensures that (i) each output pixel observes a localized region of the input face image and (ii) neighboring output pixels attend to proximate input regions. Extensive qualitative and quantitative experiments demonstrate that the proposed method generates realistic facial expressions and outperforms existing methods. Furthermore, it can be trained on significantly smaller datasets while generalizing well to out-of-distribution images.
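To make the core idea concrete, the following is a minimal sketch, not the authors' implementation, of how a locality-constrained OMP could fit one output pixel's sparse receptive field: candidate input pixels are restricted to a spatial window around the output pixel's location, and a greedy OMP loop selects a few atoms within that window. The function name `local_omp`, the square-window locality rule, and the parameters `radius` and `sparsity` are illustrative assumptions.

```python
import numpy as np

def local_omp(X, y, center, shape, radius=8, sparsity=10):
    """Fit one output pixel's receptive field with OMP restricted to
    input pixels in a (2*radius+1)-wide window around `center`.

    X: (n_samples, n_pixels) training faces, flattened row-major.
    y: (n_samples,) intensities of one output pixel across samples.
    center: (row, col) spatial location of the output pixel.
    shape: (H, W) spatial size of the input images.
    """
    H, W = shape
    rows, cols = np.unravel_index(np.arange(H * W), (H, W))
    # Locality constraint: only input pixels near the output pixel
    # may enter the support.
    local = np.flatnonzero(
        (np.abs(rows - center[0]) <= radius)
        & (np.abs(cols - center[1]) <= radius)
    )
    support, residual = [], y.astype(float).copy()
    for _ in range(sparsity):
        # Greedy OMP step: pick the local atom most correlated
        # with the current residual.
        corr = np.abs(X[:, local].T @ residual)
        support.append(int(local[np.argmax(corr)]))
        # Re-fit least squares on the selected support; the new
        # residual is orthogonal to all selected atoms.
        w, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        residual = y - X[:, support] @ w
    weights = np.zeros(X.shape[1])
    weights[support] = w
    return weights  # sparse, localized linear receptive field
```

In this per-pixel view, running `local_omp` once per output location makes each receptive field sparse and local, and the overlapping windows of neighboring output pixels keep their selected regions proximate, which is the property the LSRF formulation targets.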





Author information

Correspondence to Arbish Akram.

Ethics declarations

Conflicts of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Akram, A., Khan, N. LSRF: localized and sparse receptive fields for linear facial expression synthesis based on global face context. Multimed Tools Appl 83, 31341–31360 (2024). https://doi.org/10.1007/s11042-023-16822-8

