USTST: unsupervised self-training similarity transfer for cross-domain facial expression recognition

Guo, Zhe; Wei, Bingxin; Liu, Jiayi; Liu, Xuewen; Zhang, Zhibo; Wang, Yi

doi:10.1007/s11042-023-17317-2

Published: 13 October 2023

Volume 83, pages 41703–41723, (2024)

Multimedia Tools and Applications

Zhe Guo¹ (ORCID: orcid.org/0000-0001-8024-1434),
Bingxin Wei¹,
Jiayi Liu¹,
Xuewen Liu¹,
Zhibo Zhang¹ &
Yi Wang¹

190 Accesses
1 Citation

A Correction to this article was published on 07 December 2023

This article has been updated

Abstract

Facial expression recognition (FER) is a popular research topic in computer vision. Most deep learning FER methods that achieve satisfactory results on a single dataset incur additional labeling costs when they are applied to a new dataset. Cross-dataset FER also suffers from difficulties such as data discrepancy and expression ambiguity. To address these issues, we propose an Unsupervised Self-Training Similarity Transfer (USTST) method for cross-domain FER. The Cross-Swin-Transformer (CST) module is designed to extract features and assign greater attention weight to the regions that are similar between the source and target domain images. The Self-Training Resampling (STR) and Knowledge Transfer (KT) modules are then constructed to improve the confidence of the model's predictions on the target domain. We also design an ambiguity suppression loss and a cross-domain loss to improve the model's ability to discriminate expressions while transferring knowledge across domains. Experimental results with the RAF-DB dataset as the source domain and the CK+, JAFFE, SFEW, FER2013 and ExpW datasets as the target domains show that our approach achieves much higher performance than state-of-the-art cross-domain FER methods while requiring no labels for the new datasets.
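
The abstract does not spell out how the STR module selects target-domain samples, so the following is only a generic illustration of the self-training idea it builds on: keep the unlabeled target samples whose predicted class probability is high, and treat the predicted class as a pseudo-label. The function name `select_pseudo_labels`, the 0.9 threshold, and the example probabilities are all assumptions made for this sketch and do not come from the paper.

```python
from typing import List, Tuple


def select_pseudo_labels(
    probs: List[List[float]], threshold: float = 0.9
) -> List[Tuple[int, int, float]]:
    """Return (sample_index, pseudo_label, confidence) for confident predictions.

    `probs` holds one class-probability vector per unlabeled target sample.
    The default threshold is an illustrative assumption, not a paper value.
    """
    selected = []
    for idx, p in enumerate(probs):
        confidence = max(p)
        if confidence >= threshold:
            # The index of the most probable class becomes the pseudo-label.
            selected.append((idx, p.index(confidence), confidence))
    return selected


# Hypothetical predictions for four target-domain images over three classes.
target_probs = [
    [0.95, 0.03, 0.02],  # confident: kept with pseudo-label 0
    [0.40, 0.35, 0.25],  # ambiguous: skipped
    [0.05, 0.92, 0.03],  # confident: kept with pseudo-label 1
    [0.50, 0.50, 0.00],  # low confidence: skipped
]
pseudo = select_pseudo_labels(target_probs, threshold=0.9)
```

In a self-training loop of this general kind, the selected samples would be added to the training set with their pseudo-labels and the model retrained; a stricter threshold trades coverage of the target domain for fewer wrong pseudo-labels.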

Data Availability Statements

Data openly available in a public repository. The data that support the findings of this study are openly available at:
• RAF-DB: http://whdeng.cn/RAF/model1.html/data-set
• CK+: http://www.jeffcohn.net/Resources/
• JAFFE: https://zenodo.org/record/3451524.ZGrmn-3ZByUk
• SFEW: https://cs.anu.edu.au/few/emotiw2015.html
• FER2013: https://www.kaggle.com/datasets/msam-bare/fer2013
• ExpW: http://mmlab.ie.cuhk.edu.hk/projects/social-relation/index.html

Change history

07 December 2023
A Correction to this paper has been published: https://doi.org/10.1007/s11042-023-17794-5

References

Mijwil MM (2022) Has the future started? The current growth of artificial intelligence, machine learning, and deep learning. Iraqi J Comput Sci Math Corpus ID: 249688145
Zhuang F, Qi Z, Duan K et al (2020) A comprehensive survey on transfer learning. Proc IEEE 109(1):43–76
Liang J, Hu D, Wang Y et al (2021) Source data-absent unsupervised domain adaptation through hypothesis transfer and labeling transfer. IEEE Trans Pattern Anal Mach Intell 44(11):8602–8617
Li S, Deng W (2020) A deeper look at facial expression dataset bias. IEEE Trans Affect Comput 13(2):881–893
Xie Y, Chen T, Pu T et al (2020) Adversarial graph representation adaptation for cross-domain facial expression recognition. Proceedings of the 28th ACM international conference on multimedia pp 1255–1264
Yang F, Xie W, Zhong T et al (2022) Augmented feature representation with parallel convolution for cross-domain facial expression recognition. Biometric recognition: 16th Chinese Conference, CCBR 2022, Beijing, China, November 11–13, 2022, Proceedings. Springer Nature Switzerland, Cham, pp 297–306
Xie Y, Gao Y, Lin J et al (2022) Learning consistent global-local representation for cross-domain facial expression recognition. 26th International conference on pattern recognition (ICPR). IEEE, pp 2489–2495
Xu T, Chen W, Wang P et al (2021) Cdtrans: cross-domain transformer for unsupervised domain adaptation. arXiv preprint arXiv:2109.06165
Ganin Y, Ustinova E, Ajakan H et al (2016) Domain-adversarial training of neural networks. J Mach Learn Res 17(1):2096–2030
Pan SJ, Tsang IW, Kwok JT et al (2010) Domain adaptation via transfer component analysis. IEEE Trans Neural Netw 22(2):199–210
Lucey P, Cohn JF, Kanade T et al (2010) The extended Cohn-Kanade dataset (CK+): a complete dataset for action unit and emotion-specified expression. 2010 IEEE computer society conference on computer vision and pattern recognition-workshops. IEEE, pp 94–101
Lyons M, Akamatsu S, Kamachi M, Gyoba J (1998) Coding facial expressions with Gabor wavelets. In: Proceedings third IEEE international conference on Automatic Face and Gesture Recognition, Nara, Japan, pp 200–205. https://doi.org/10.1109/AFGR.1998.670949
Lyons MJ (2021) “Excavating AI” Re-excavated: debunking a fallacious account of the JAFFE dataset. arXiv:2107.13998
Dhall A, Goecke R, Lucey S, Gedeon T (2011) Static facial expression analysis in tough conditions: data, evaluation protocol and benchmark. In: 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), Barcelona, Spain, pp 2106–2112. https://doi.org/10.1109/ICCVW.2011.6130508
Goodfellow IJ, Erhan D, Carrier PL et al (2013) Challenges in representation learning: a report on three machine learning contests. Springer, Berlin, Heidelberg, International conference on neural information processing, pp 117–124
Zhang Z, Luo P, Loy C-C, Tang X (2015) Learning social relation traits from face images. In: 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, pp 3631–3639. https://doi.org/10.1109/ICCV.2015.414
Li S, Deng W, Du J (2017) Reliable crowdsourcing and deep locality-preserving learning for expression recognition in the wild. Proc IEEE Conf Comput Vis Pattern Recognit 2017:2852–2861
Krizhevsky A, Sutskever I, Hinton GE (2017) Imagenet classification with deep convolutional neural networks. Commun ACM 60(6):84–90
Mohan K, Seal A, Krejcar O, Yazidi A (2021) Facial expression recognition using local gravitational force descriptor-based deep convolution neural networks. IEEE Trans Instrum Meas 70:1–12
Wang K, Peng X, Yang J, Meng D, Qiao Y (2020) Region attention networks for pose and occlusion robust facial expression recognition. IEEE Trans Image Process 29:4057–4069
She J, Hu Y, Shi H, Wang J, Shen Q, Mei T (2021) Dive into ambiguity: latent distribution mining and pairwise uncertainty estimation for facial expression recognition. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 6248–6257
Ruan D, Yan Y, Lai S, Chai Z, Shen C, Wang H (2021) Feature decomposition and reconstruction learning for effective facial expression recognition. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 7656–7665
Zhang X, Zhang F, Xu C (2022) Joint expression synthesis and representation learning for facial expression recognition. IEEE Trans Circ Syst Video Technol 32(3):1681–1695
Long M, Cao Z, Wang J et al (2018) Conditional adversarial domain adaptation. Adv Neural Inf Process Syst 31
Xu R, Li G, Yang J et al (2019) Larger norm more transferable: an adaptive feature norm approach for unsupervised domain adaptation. Proc IEEE/CVF Int Conf Comput Vis 2019:1426–1435
Lee C-Y, Batra T, Baig MH et al (2019) Sliced wasserstein discrepancy for unsupervised domain adaptation. Proc IEEE/CVF Conf Comput Vis Pattern Recognit 2019:10285–10295
Li S, Deng W (2018) Deep emotion transfer network for cross-database facial expression recognition. 2018 24th International conference on pattern recognition (ICPR). IEEE, pp 3092–3099
Chen T, Pu T, Wu H, Xie Y, Liu L, Lin L (2021) Cross-domain facial expression recognition: a unified evaluation benchmark and adversarial graph learning. IEEE Trans Pattern Anal Mach Intell. https://doi.org/10.1109/TPAMI.2021.3131222
Ji Y, Hu Y, Yang Y et al (2021) Region attention enhanced unsupervised cross-domain facial emotion recognition. IEEE Trans Knowl Data Eng
Li Y, Zhang Z, Chen B et al (2022) Deep margin-sensitive representation learning for cross-domain facial expression recognition. IEEE Trans Multimedia
Peng X, Gu Y, Zhang P (2022) Au-guided unsupervised domain-adaptive facial expression recognition. Appl Sci 12(9):4366
Xu X, Zheng W, Zong Y et al (2022) Sample self-revised network for cross-dataset facial expression recognition. International joint conference on neural networks (IJCNN). IEEE, pp 1–8
Cubuk ED, Zoph B, Shlens J et al (2020) Randaugment: practical automated data augmentation with a reduced search space. Proceedings of the IEEE/CVF Conf Comput Vis Pattern Recognit Workshops 2020:702–703
Dosovitskiy A, Beyer L, Kolesnikov A et al (2020) An image is worth 16x16 words: transformers for image recognition at scale. arXiv preprint arXiv:2010.11929
Liu Z, Lin Y, Cao Y et al (2021) Swin transformer: hierarchical vision transformer using shifted windows. Proc IEEE/CVF Int Conf Comput Vis 2021:10012–10022
Schroff F, Kalenichenko D, Philbin J (2015) Facenet: a unified embedding for face recognition and clustering. Proc IEEE Conf Comput Vis Pattern Recognit 2015:815–823
Kiran A, Qureshi SA, Khan A, Mahmood S, Idrees M, Saeed A, Assam M, Refaai MRA, Mohamed A (2022) Reverse image search using deep unsupervised generative learning and deep convolutional neural network. Appl Sci 12(10):4943
Paszke A, Gross S, Massa F et al (2019) Pytorch: an imperative style, high-performance deep learning library. Adv Neural Inf Process Syst 32
Zhang K, Zhang Z, Li Z et al (2016) Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Process Lett 23(10):1499–1503
He K, Zhang X, Ren S et al (2016) Deep residual learning for image recognition. Proc IEEE Conf Comput Vis Pattern Recognit 2016:770–778
Van der Maaten L, Hinton G (2008) Visualizing data using t-SNE. J Mach Learn Res 9(11)


Funding

This work was supported in part by the National Natural Science Foundation of China under Grants 62071384 and 62371399, the Key Research and Development Project of Shaanxi Province under Grant 2023-YBGY-239, and the Natural Science Basic Research Plan in Shaanxi Province of China under Grant 2023-JC-YB-531.

Author information

Authors and Affiliations

School of Electronics and Information, Northwestern Polytechnical University, Xi’an, 710072, China
Zhe Guo, Bingxin Wei, Jiayi Liu, Xuewen Liu, Zhibo Zhang & Yi Wang

Corresponding author

Correspondence to Zhe Guo.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised: The original article contains errors in references 12–16. The original article has been corrected.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Guo, Z., Wei, B., Liu, J. et al. USTST: unsupervised self-training similarity transfer for cross-domain facial expression recognition. Multimed Tools Appl 83, 41703–41723 (2024). https://doi.org/10.1007/s11042-023-17317-2


Received: 17 June 2023
Revised: 01 September 2023
Accepted: 27 September 2023
Published: 13 October 2023
Issue Date: April 2024
DOI: https://doi.org/10.1007/s11042-023-17317-2
