Neural style transfer generative adversarial network (NST-GAN) for facial expression recognition

  • Regular Paper
  • International Journal of Multimedia Information Retrieval

Abstract

With the growing number of intelligent human–computer systems, more and more research focuses on human emotion recognition. Facial expressions are an effective modality for emotion recognition and support automatic emotion analysis. Although many studies have investigated automatic facial expression recognition over the past decades, previous works were mainly designed for controlled environments. Unlike recent pure CNN-based works, we argue that it is practical and feasible to recognize an expression from a facial image. However, the extracted features may capture identity-related information and are not purely associated with the specific task of expression recognition. To reduce the influence of identity-related features, we propose in this paper a neural style transfer generative adversarial network (NST-GAN) that removes identity information from facial images. The objective is to isolate the expression information in the input image by removing identity information and transferring the expression to a synthetic identity. We evaluate the proposed method on three public facial expression databases (CK+, FER-2013, and JAFFE). Extensive experiments show that NST-GAN outperforms other methods, setting a new state of the art.

Data availability

The CK+ dataset is included in the published article [18]. The FER-2013 dataset is included in [27]. The JAFFE dataset is included in [28].

References

  1. Ekman P, Friesen WV (1971) Constants across cultures in the face and emotion. J Pers Soc Psychol 17(2):124. https://doi.org/10.1037/h0030377

  2. Martinez B, Valstar MF (2016) Advances, challenges, and opportunities in automatic facial expression recognition. Springer, Cham, pp 63–100. https://doi.org/10.1007/978-3-319-25958-1_4

  3. Pramerdorfer C, Kampel M (2016) Facial expression recognition using convolutional neural networks: state of the art. CoRR arXiv:1612.02903

  4. Zhang X, Mahoor MH, Mavadati SM (2015) Facial expression recognition using \(l_p\)-norm MKL multiclass-SVM. Mach Vis Appl 26:467–483. https://doi.org/10.1007/s00138-015-0677-y

  5. Liu Z, Wu M, Cao W, Chen L, Xu J, Zhang R, Zhou M, Mao J (2017) A facial expression emotion recognition based human–robot interaction system. IEEE/CAA J Automatica Sinica 4(4):668–676. https://doi.org/10.1109/JAS.2017.7510622

  6. Tao L, Matuszewski BJ (2016) Is 2D unlabeled data adequate for recognizing facial expressions? IEEE Intell Syst 31(3):19–29. https://doi.org/10.1109/MIS.2016.25

  7. Baltrusaitis T, Robinson P, Morency L-P (2016) Openface: an open source facial behavior analysis toolkit. In: 2016 IEEE winter conference on applications of computer vision (WACV), pp 1–10. https://doi.org/10.1109/WACV.2016.7477553

  8. Luo Y, Wu C-M, Zhang Y (2013) Facial expression recognition based on fusion feature of PCA and LBP with SVM. Optik Int J Light Electron Opt 124(17):2767–2770. https://doi.org/10.1016/j.ijleo.2012.08.040

  9. Chen J, Chen Z, Chi Z, Fu H (2014) Facial expression recognition based on facial components detection and hog features

  10. Dhall A, Goecke R, Joshi J, Sikka K, Gedeon T (2014) Emotion recognition in the wild challenge 2014: Baseline, data and protocol. In: Proceedings of the 16th international conference on multimodal interaction. ICMI ’14, pp 461–466. Association for Computing Machinery, New York. https://doi.org/10.1145/2663204.2666275

  11. Hertel L, Barth E, Käster T, Martinetz T (2015) Deep convolutional neural networks as generic feature extractors. In: 2015 International joint conference on neural networks (IJCNN), pp 1–4. https://doi.org/10.1109/IJCNN.2015.7280683

  12. Han D, Liu Q, Fan W (2018) A new image classification method using CNN transfer learning and web data augmentation. Expert Syst Appl 95:43–56. https://doi.org/10.1016/j.eswa.2017.11.028

  13. LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521:436–444. https://doi.org/10.1038/nature14539

  14. Gu J, Wang Z, Kuen J, Ma L, Shahroudy A, Shuai B, Liu T, Wang X, Wang G, Cai J, Chen T (2018) Recent advances in convolutional neural networks. Pattern Recogn 77:354–377. https://doi.org/10.1016/j.patcog.2017.10.013

  15. Xie W, Jia X, Shen L, Yang M (2019) Sparse deep feature learning for facial expression recognition. Pattern Recogn 96:106966. https://doi.org/10.1016/j.patcog.2019.106966

  16. Kaya M, Bilge HS (2019) Deep metric learning: a survey. Symmetry. https://doi.org/10.3390/sym11091066

  17. Jain N, Kumar S, Kumar A, Shamsolmoali P, Zareapoor M (2018) Hybrid deep neural networks for face emotion recognition. Pattern Recognit Lett 115:101–106. https://doi.org/10.1016/j.patrec.2018.04.010. (Multimodal fusion for pattern recognition)

  18. Jain DK, Shamsolmoali P, Sehdev P (2019) Extended deep neural network for facial emotion recognition. Pattern Recognit Lett 120:69–74. https://doi.org/10.1016/j.patrec.2019.01.008

  19. Yi Z, Zhang H, Tan P, Gong M (2018) DualGAN: unsupervised dual learning for image-to-image translation

  20. Souibgui MA, Kessentini Y (2022) De-gan: a conditional generative adversarial network for document enhancement. IEEE Trans Pattern Anal Mach Intell 44(3):1180–1191. https://doi.org/10.1109/tpami.2020.3022406

  21. Zhang F, Zhang T, Mao Q, Xu C (2018) Joint pose and expression modeling for facial expression recognition. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 3359–3368. https://doi.org/10.1109/CVPR.2018.00354

  22. Yang H, Ciftci U, Yin L (2018) Facial expression recognition by de-expression residue learning. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 2168–2177. https://doi.org/10.1109/CVPR.2018.00231

  23. Goodfellow IJ, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial networks, vol 27. arXiv:1406.2661

  24. Gulrajani I, Ahmed F, Arjovsky M, Dumoulin V, Courville A (2017) Improved training of Wasserstein GANs. In: Proceedings of the 31st international conference on neural information processing systems. NIPS’17, pp 5769–5779. Curran Associates Inc., Red Hook

  25. Ronneberger O, Fischer P, Brox T (2015) U-net: convolutional networks for biomedical image segmentation. CoRR arXiv:1505.04597

  26. Khemakhem F, Ltifi H (2019) Facial expression recognition using convolution neural network enhancing with pre-processing stages. In: 2019 IEEE/ACS 16th international conference on computer systems and applications (AICCSA), pp 1–7. https://doi.org/10.1109/AICCSA47632.2019.9035249

  27. Goodfellow IJ, Erhan D, Carrier PL, Courville A, Mirza M, Hamner B, Bengio Y (2013) Challenges in representation learning: a report on three machine learning contests. In: ICONIP, vol 8228. https://doi.org/10.1007/978-3-642-42051-1_16

  28. Lyons M, Akamatsu S, Kamachi M, Gyoba J (1998) Coding facial expressions with Gabor wavelets. In: Proceedings third IEEE international conference on automatic face and gesture recognition, pp 200–205. https://doi.org/10.1109/AFGR.1998.670949

  29. Dhall A, Goecke R, Lucey S, Gedeon T (2012) Collecting large, richly annotated facial-expression databases from movies. IEEE Multimedia 19(3):34–41. https://doi.org/10.1109/MMUL.2012.26

  30. Pantic M, Valstar M, Rademaker R, Maat L (2005) Web-based database for facial expression analysis. In: 2005 IEEE international conference on multimedia and expo, p 5. https://doi.org/10.1109/ICME.2005.1521424

  31. Zhang T, Zheng W, Cui Z, Zong Y, Li Y (2019) Spatial–temporal recurrent neural network for emotion recognition. IEEE Trans Cybern 49(3):839–847. https://doi.org/10.1109/TCYB.2017.2788081

  32. Khorrami P, Le Paine T, Brady K, Dagli C, Huang TS (2016) How deep neural networks can improve emotion recognition on video data. In: 2016 IEEE international conference on image processing (ICIP), pp 619–623. https://doi.org/10.1109/ICIP.2016.7532431

  33. Lopes AT, de Aguiar E, De Souza AF, Oliveira-Santos T (2017) Facial expression recognition with convolutional neural networks: coping with few data and the training sample order. Pattern Recognit 61:610–628. https://doi.org/10.1016/j.patcog.2016.07.026

  34. Mollahosseini A, Chan D, Mahoor MH (2016) Going deeper in facial expression recognition using deep neural networks. In: 2016 IEEE winter conference on applications of computer vision (WACV), pp 1–10

  35. van der Maaten L, Hinton G (2008) Visualizing data using t-SNE. J Mach Learn Res 9(86):2579–2605

Acknowledgements

Faten Khemakhem and Hela Ltifi declare that they have no affiliations with or involvement in any organization or entity with any financial interest, or non-financial interest in the subject matter or materials discussed in this manuscript.

Author information

Contributions

All authors contributed equally to this manuscript.

Corresponding authors

Correspondence to Faten Khemakhem or Hela Ltifi.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Khemakhem, F., Ltifi, H. Neural style transfer generative adversarial network (NST-GAN) for facial expression recognition. Int J Multimed Info Retr 12, 26 (2023). https://doi.org/10.1007/s13735-023-00285-6

  • DOI: https://doi.org/10.1007/s13735-023-00285-6
