Abstract
Dynamic facial expression recognition (DFER) has garnered significant attention due to its critical role in applications such as human-computer interaction, emotion-aware systems, and mental health monitoring. Nevertheless, DFER in real-world scenarios remains a formidable task, primarily because of severe class imbalance, which degrades overall model performance and leaves minority-class expressions poorly recognized. Recent work on class imbalance in facial expression recognition (FER) focuses predominantly on spatial feature analysis, while the capacity to encode the temporal dynamics of spontaneous facial expressions remains limited. To tackle this issue, we introduce a framework for dynamic facial expression recognition in real-world scenarios (RS-DFER), which primarily comprises a spatiotemporal feature combination (STC) module and a multi-classifier dynamic participation (MCDP) module. Extensive experiments on two prevalent large-scale in-the-wild DFER datasets demonstrate that the proposed method outperforms existing state-of-the-art approaches, showcasing its efficacy and potential for practical applications.
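The MCDP module's name and the paper's title suggest that the outputs of multiple classifiers are fused under Dempster-Shafer evidence theory. As an illustrative sketch only (the paper's exact MCDP formulation is not reproduced here), the Python snippet below implements the reduced two-source Dempster combination rule commonly used in evidential deep learning, where each classifier emits per-class belief masses b and a residual uncertainty mass u with b.sum() + u = 1. The function name, toy values, and this simplified rule are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def ds_combine(b1, u1, b2, u2):
    """Fuse two classifiers' outputs with the reduced Dempster rule.

    b1, b2 : per-class belief mass vectors (each sums to <= 1)
    u1, u2 : residual uncertainty masses, so b.sum() + u == 1 per source.
    """
    # Conflict mass: evidence the two sources assign to different classes.
    conflict = np.sum(np.outer(b1, b2)) - np.sum(b1 * b2)
    scale = 1.0 / (1.0 - conflict)
    # Agreeing beliefs reinforce each other; each source's belief also
    # combines with the other source's uncertainty.
    b = scale * (b1 * b2 + b1 * u2 + b2 * u1)
    u = scale * (u1 * u2)
    return b, u

# Toy example: two classifiers over three expression classes.
b1, u1 = np.array([0.6, 0.2, 0.1]), 0.1
b2, u2 = np.array([0.5, 0.1, 0.1]), 0.3
b, u = ds_combine(b1, u1, b2, u2)
print(b, u, b.sum() + u)  # fused masses still sum to 1
```

Under this rule, a classifier with high uncertainty contributes less to the fused decision, which is one plausible mechanism for letting classifiers "dynamically participate" in the final prediction.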
Acknowledgements
This work was supported in part by the Taishan Scholars Program: Key R&D Plan of Shandong Province (No. 2020CXGC010111), Distinguished Taishan Scholars in Climbing Plan (No. tspd20181211), and Young Taishan Scholars (No. tsqn201909137).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Liu, Z., Wang, T., Zhou, S., Shu, M. (2023). Dynamic Facial Expression Recognition in Unconstrained Real-World Scenarios Leveraging Dempster-Shafer Evidence Theory. In: Iliadis, L., Papaleonidas, A., Angelov, P., Jayne, C. (eds) Artificial Neural Networks and Machine Learning – ICANN 2023. ICANN 2023. Lecture Notes in Computer Science, vol 14255. Springer, Cham. https://doi.org/10.1007/978-3-031-44210-0_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-44209-4
Online ISBN: 978-3-031-44210-0