Abstract
Large-scale, high-quality datasets are essential for facial expression recognition (FER) in the era of deep learning, but most FER datasets are relatively small. A common way to address this problem is a cross-dataset strategy. However, because acquisition conditions differ and labeling is subjective, FER datasets are inevitably inconsistent with one another, and models trained on one dataset are not robust on another. Moreover, expression datasets collected in uncontrolled environments suffer from unclear expressions and low-quality face images, which lowers the certainty of image annotations. This paper aims to improve the accuracy and generalization ability of cross-dataset expression recognition by optimizing the labels of fused large-scale datasets. Specifically, it compares feature similarity and proposes a dataset label determination method, based on distance metric learning and a teacher-student model, to improve the certainty of image labels. In addition, it provides an alternative scheme for dataset fusion. Fusing the source dataset with the target dataset provides the best trade-off between accuracy and generalization ability for cross-dataset FER, and addresses two problems in cross-dataset expression recognition: small dataset size and neglect of performance on the source dataset. Experiments show that training on the fused large-scale datasets with the proposed method achieves state-of-the-art results for cross-dataset expression recognition.
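The abstract does not give the details of the label determination step, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes that each image already has a feature vector from some learned embedding, that a teacher model supplies a class-probability dictionary per image, and that a label is replaced only when the nearest class centroid (plain Euclidean distance here, standing in for a learned metric) agrees with a sufficiently confident teacher prediction. The names `class_centroids`, `nearest_class`, `refine_labels`, and `min_confidence` are hypothetical.

```python
import math
from collections import defaultdict


def class_centroids(features, labels):
    """Return the mean feature vector of each class under the current labels."""
    sums = {}
    counts = defaultdict(int)
    for vec, lab in zip(features, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
        sums[lab] = [s + v for s, v in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}


def nearest_class(vec, centroids):
    """Return the class whose centroid is closest to vec (Euclidean distance)."""
    return min(centroids, key=lambda lab: math.dist(vec, centroids[lab]))


def refine_labels(features, labels, teacher_probs, min_confidence=0.8):
    """Relabel a sample only when the metric and the teacher agree.

    Assumption: teacher_probs[i] is a dict mapping class name to probability.
    A sample keeps its original label unless the nearest-centroid class equals
    the teacher's top class and the teacher's probability for that class is
    at least min_confidence.
    """
    centroids = class_centroids(features, labels)
    refined = []
    for vec, lab, probs in zip(features, labels, teacher_probs):
        teacher_label = max(probs, key=probs.get)
        metric_label = nearest_class(vec, centroids)
        if teacher_label == metric_label and probs[teacher_label] >= min_confidence:
            refined.append(teacher_label)
        else:
            refined.append(lab)
    return refined
```

Requiring agreement between the two signals is a conservative design choice: a sample whose original annotation is contradicted by only one of them keeps its label, which limits the risk of propagating teacher errors into the fused dataset.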
Acknowledgements
This work was supported by the Development Project of Ship Situational Intelligent Awareness System, China under Grant MC-201920-X01.
About this article
Cite this article
Meng, H., Yuan, F., Tian, Y. et al. Cross-datasets facial expression recognition via distance metric learning and teacher-student model. Multimed Tools Appl 81, 5621–5643 (2022). https://doi.org/10.1007/s11042-021-11765-4
DOI: https://doi.org/10.1007/s11042-021-11765-4