USTST: unsupervised self-training similarity transfer for cross-domain facial expression recognition

Published in Multimedia Tools and Applications

A Correction to this article was published on 07 December 2023

Abstract

Facial expression recognition (FER) is a popular research topic in computer vision. Most deep learning FER methods achieve satisfactory results on a single dataset, but applying them to a new dataset incurs the additional cost of labeling the new data. Cross-dataset FER also suffers from difficulties such as data discrepancy and expression ambiguity. To address these issues, we propose an Unsupervised Self-Training Similarity Transfer (USTST) method for cross-domain FER. The Cross-Swin-Transformer (CST) module is designed to extract features and assign greater attention weights to similar regions of the source- and target-domain images. The Self-Training Resampling (STR) and Knowledge Transfer (KT) modules are then constructed to improve the confidence of the model's predictions for the target domain. We also design an ambiguity suppression loss and a cross-domain loss to improve the model's ability to discriminate expressions while transferring knowledge across domains. Experimental results with the RAF-DB dataset as the source domain and the CK+, JAFFE, SFEW, FER2013 and ExpW datasets as target domains show that our approach achieves much higher performance than state-of-the-art cross-domain FER methods, while requiring no labels for the new datasets.
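The self-training idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation of STR; the confidence threshold, array shapes, and the function name `select_pseudo_labels` are illustrative assumptions. The idea sketched here is only the generic pseudo-labeling step: keep the unlabeled target-domain samples whose predicted class probability is high, and discard ambiguous ones before retraining.

```python
import numpy as np

def softmax(z, axis=-1):
    # numerically stable softmax over the class axis
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def select_pseudo_labels(logits, threshold=0.9):
    """Keep target samples whose maximum predicted probability exceeds
    the confidence threshold; return their indices and hard pseudo-labels."""
    probs = softmax(logits, axis=1)
    conf = probs.max(axis=1)          # model confidence per sample
    labels = probs.argmax(axis=1)     # hard pseudo-label per sample
    keep = conf >= threshold
    return np.nonzero(keep)[0], labels[keep]

# toy logits for 4 unlabeled target images over 3 expression classes
logits = np.array([[4.0, 0.1, 0.2],   # confident -> kept, label 0
                   [1.0, 1.1, 0.9],   # ambiguous -> dropped
                   [0.2, 0.1, 5.0],   # confident -> kept, label 2
                   [2.0, 1.9, 0.1]])  # ambiguous -> dropped
idx, labels = select_pseudo_labels(logits, threshold=0.9)
# idx -> [0, 2], labels -> [0, 2]
```

In a full pipeline, the kept pairs `(idx, labels)` would be mixed back into training, and the threshold or sampling scheme would control how aggressively ambiguous expressions are suppressed.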

Data Availability Statements

Data openly available in a public repository. The data that support the findings of this study are openly available at:

  • RAF-DB: http://whdeng.cn/RAF/model1.html/data-set
  • CK+: http://www.jeffcohn.net/Resources/
  • JAFFE: https://zenodo.org/record/3451524
  • SFEW: https://cs.anu.edu.au/few/emotiw2015.html
  • FER2013: https://www.kaggle.com/datasets/msambare/fer2013
  • ExpW: http://mmlab.ie.cuhk.edu.hk/projects/social-relation/index.html

References

  1. Mijwil MM (2022) Has the future started the current growth of artificial intelligence, machine learning, and deep learning. Iraqi J Comput Sci Math Corpus ID: 249688145

  2. Zhuang F, Qi Z, Duan K et al (2020) A comprehensive survey on transfer learning. Proc IEEE 109(1):43–76

  3. Liang J, Hu D, Wang Y et al (2021) Source data-absent unsupervised domain adaptation through hypothesis transfer and labeling transfer. IEEE Trans Pattern Anal Mach Intell 44(11):8602–8617

  4. Li S, Deng W (2020) A deeper look at facial expression dataset bias. IEEE Trans Affect Comput 13(2):881–893

  5. Xie Y, Chen T, Pu T et al (2020) Adversarial graph representation adaptation for cross-domain facial expression recognition. Proceedings of the 28th ACM international conference on multimedia pp 1255–1264

  6. Yang F, Xie W, Zhong T et al (2022) Augmented feature representation with parallel convolution for cross-domain facial expression recognition. In: Biometric recognition: 16th Chinese Conference, CCBR 2022, Beijing, China, November 11–13, 2022, Proceedings. Springer Nature Switzerland, Cham, pp 297–306

  7. Xie Y, Gao Y, Lin J et al (2022) Learning consistent global-local representation for cross-domain facial expression recognition. 26th International conference on pattern recognition (ICPR). IEEE, pp 2489–2495

  8. Xu T, Chen W, Wang P et al (2021) Cdtrans: cross-domain transformer for unsupervised domain adaptation. arXiv preprint arXiv:2109.06165

  9. Ganin Y, Ustinova E, Ajakan H et al (2016) Domain-adversarial training of neural networks. J Mach Learn Res 17(1):2096–2030

  10. Pan SJ, Tsang IW, Kwok JT et al (2010) Domain adaptation via transfer component analysis. IEEE Trans Neural Netw 22(2):199–210

  11. Lucey P, Cohn JF, Kanade T et al (2010) The extended Cohn-Kanade dataset (CK+): a complete dataset for action unit and emotion-specified expression. In: 2010 IEEE computer society conference on computer vision and pattern recognition workshops. IEEE, pp 94–101

  12. Lyons M, Akamatsu S, Kamachi M, Gyoba J (1998) Coding facial expressions with Gabor wavelets. In: Proceedings third IEEE international conference on Automatic Face and Gesture Recognition, Nara, Japan, p 200–205. https://doi.org/10.1109/AFGR.1998.670949

  13. Lyons MJ (2021) “Excavating AI” Re-excavated: debunking a fallacious account of the JAFFE dataset. arXiv:2107.13998

  14. Dhall A, Goecke R, Lucey S, Gedeon T (2011) Static facial expression analysis in tough conditions: data, evaluation protocol and benchmark. In: 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), Barcelona, Spain, p 2106–2112. https://doi.org/10.1109/ICCVW.2011.6130508

  15. Goodfellow IJ, Erhan D, Carrier PL et al (2013) Challenges in representation learning: a report on three machine learning contests. In: International conference on neural information processing. Springer, Berlin, Heidelberg, pp 117–124

  16. Zhang Z, Luo P, Loy C-C, Tang X (2015) Learning social relation traits from face images. In: 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, p 3631–3639. https://doi.org/10.1109/ICCV.2015.414

  17. Li S, Deng W, Du J (2017) Reliable crowdsourcing and deep locality-preserving learning for expression recognition in the wild. Proc IEEE Conf Comput Vis Pattern Recognit 2017:2852–2861

  18. Krizhevsky A, Sutskever I, Hinton GE (2017) Imagenet classification with deep convolutional neural networks. Commun ACM 60(6):84–90

  19. Mohan K, Seal A, Krejcar O, Yazidi A (2021) Facial expression recognition using local gravitational force descriptor-based deep convolution neural networks. IEEE Trans Instrum Meas 70:1–12

  20. Wang K, Peng X, Yang J, Meng D, Qiao Y (2020) Region attention networks for pose and occlusion robust facial expression recognition. IEEE Trans Image Process 29:4057–4069

  21. She J, Hu Y, Shi H, Wang J, Shen Q, Mei T (2021) Dive into ambiguity: latent distribution mining and pairwise uncertainty estimation for facial expression recognition. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 6248–6257

  22. Ruan D, Yan Y, Lai S, Chai Z, Shen C, Wang H (2021) Feature decomposition and reconstruction learning for effective facial expression recognition. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 7656–7665

  23. Zhang X, Zhang F, Xu C (2022) Joint expression synthesis and representation learning for facial expression recognition. IEEE Trans Circ Syst Video Technol 32(3):1681–1695

  24. Long M, Cao Z, Wang J et al (2018) Conditional adversarial domain adaptation. Adv Neural Inf Process Syst 31

  25. Xu R, Li G, Yang J et al (2019) Larger norm more transferable: an adaptive feature norm approach for unsupervised domain adaptation. Proc IEEE/CVF Int Conf Comput Vis 2019:1426–1435

  26. Lee C-Y, Batra T, Baig MH et al (2019) Sliced wasserstein discrepancy for unsupervised domain adaptation. Proc IEEE/CVF Conf Comput Vis Pattern Recognit 2019:10285–10295

  27. Li S, Deng W (2018) Deep emotion transfer network for cross-database facial expression recognition. 2018 24th International conference on pattern recognition (ICPR): IEEE, pp 3092–3099

  28. Chen T, Pu T, Wu H, Xie Y, Liu L, Lin L (2021) Cross-domain facial expression recognition: a unified evaluation benchmark and adversarial graph learning. IEEE Trans Pattern Anal Mach Intell. https://doi.org/10.1109/TPAMI.2021.3131222

  29. Ji Y, Hu Y, Yang Y et al (2021) Region attention enhanced unsupervised cross-domain facial emotion recognition. IEEE Trans Knowl Data Eng

  30. Li Y, Zhang Z, Chen B et al (2022) Deep margin-sensitive representation learning for cross-domain facial expression recognition. IEEE Trans Multimedia

  31. Peng X, Gu Y, Zhang P (2022) Au-guided unsupervised domain-adaptive facial expression recognition. Appl Sci 12(9):4366

  32. Xu X, Zheng W, Zong Y et al (2022) Sample self-revised network for cross-dataset facial expression recognition. International joint conference on neural networks (IJCNN). IEEE, pp 1–8

  33. Cubuk ED, Zoph B, Shlens J et al (2020) Randaugment: practical automated data augmentation with a reduced search space. Proc IEEE/CVF Conf Comput Vis Pattern Recognit Workshops 2020:702–703

  34. Dosovitskiy A, Beyer L, Kolesnikov A et al (2020) An image is worth 16x16 words: transformers for image recognition at scale. arXiv preprint arXiv:2010.11929

  35. Liu Z, Lin Y, Cao Y et al (2021) Swin transformer: hierarchical vision transformer using shifted windows. Proc IEEE/CVF Int Conf Comput Vis 2021:10012–10022

  36. Schroff F, Kalenichenko D, Philbin J (2015) Facenet: a unified embedding for face recognition and clustering. Proc IEEE Conf Comput Vis Pattern Recognit 2015:815–823

  37. Kiran A, Qureshi SA, Khan A, Mahmood S, Idrees M, Saeed A, Assam M, Refaai MRA, Mohamed A (2022) Reverse image search using deep unsupervised generative learning and deep convolutional neural network. Appl Sci 12(10):4943

  38. Paszke A, Gross S, Massa F et al (2019) Pytorch: an imperative style, high-performance deep learning library. Adv Neural Inf Process Syst 32

  39. Zhang K, Zhang Z, Li Z et al (2016) Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Process Lett 23(10):1499–1503

  40. He K, Zhang X, Ren S et al (2016) Deep residual learning for image recognition. Proc IEEE Conf Comput Vis Pattern Recognit 2016:770–778

  41. Van der Maaten L, Hinton G (2008) Visualizing data using t-SNE. J Mach Learn Res 9(11)

Download references

Funding

This work was supported in part by the National Natural Science Foundation of China under Grant 62071384 and 62371399, the Key Research and Development Project of Shaanxi Province under Grant 2023-YBGY-239, and Natural Science Basic Research Plan in Shaanxi Province of China under Grant 2023-JC-YB-531.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Zhe Guo.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised: The original article contains errors in references 12–16. The original article has been corrected.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Guo, Z., Wei, B., Liu, J. et al. USTST: unsupervised self-training similarity transfer for cross-domain facial expression recognition. Multimed Tools Appl 83, 41703–41723 (2024). https://doi.org/10.1007/s11042-023-17317-2
