Abstract
Facial emotion recognition is a challenging problem that has attracted considerable research attention over the last decade. In this paper, we present a system for facial emotion recognition in video sequences and evaluate it in both person-dependent and person-independent settings; depending on the purpose of the designed system, the importance of training a personalized model versus a non-personalized one differs. First, we compute 60 geometric features for the video frames of two datasets, the RML and SAVEE databases. Next, k-means clustering is applied to the geometric features to select the k most discriminant frames of each video clip. We then employ several classifiers, such as linear and Gaussian support vector machines (SVMs), to find the best representative k. Finally, five pre-trained convolutional neural networks, namely VGG-16, VGG-19, ResNet-50, AlexNet, and GoogLeNet, are used to evaluate two scenarios: person-dependent and person-independent emotion recognition. Additionally, the effect of the geometric features on keyframe selection in the person-dependent and person-independent scenarios is studied for different regions of the face, and the features extracted by the CNNs are visualized with the t-distributed stochastic neighbor embedding (t-SNE) algorithm to examine their discriminative ability in each scenario. Experiments show that person-dependent systems achieve higher accuracy and are well suited to personalized applications.
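To make the keyframe-selection step concrete, the sketch below clusters per-frame geometric feature vectors with k-means and keeps the frame nearest to each centroid, then compares linear and Gaussian SVMs as in the abstract. This is a minimal illustration, not the authors' implementation: the 60-dimensional features and the SVM kernels come from the abstract, while the centroid-nearest selection rule, the synthetic data, and the function names are assumptions for demonstration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC
from sklearn.manifold import TSNE


def select_keyframes(features, k):
    """Cluster per-frame feature vectors with k-means and return the
    index of the frame closest to each cluster centroid.

    Assumption: 'closest to centroid' stands in for the paper's
    'most discriminant frames' criterion.
    """
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(features)
    idx = [int(np.argmin(np.linalg.norm(features - c, axis=1)))
           for c in km.cluster_centers_]
    return sorted(set(idx))


rng = np.random.default_rng(0)

# A hypothetical clip: 120 frames, each with 60 geometric features.
video_feats = rng.normal(size=(120, 60))
print(select_keyframes(video_feats, k=9))

# Choosing k: score linear and Gaussian (RBF) SVMs on keyframe
# features, as a stand-in for the paper's classifier comparison.
X, y = rng.normal(size=(200, 60)), rng.integers(0, 6, size=200)
for kernel in ("linear", "rbf"):
    clf = SVC(kernel=kernel).fit(X[:150], y[:150])
    print(kernel, clf.score(X[150:], y[150:]))

# 2-D t-SNE embedding, as used in the paper to inspect how well
# the learned features separate (here on placeholder features).
emb = TSNE(n_components=2, random_state=0).fit_transform(X)
```

In practice, the clustering would run on real per-frame landmark geometry (e.g., distances and angles from facial landmarks) and the SVM scores would be compared across candidate values of k to pick the best one.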
Funding
The funding was provided by the BAP-C project of Eastern Mediterranean University (Grant No. BAP-C-02-18-0001).