Facial emotion recognition of deaf and hard-of-hearing students for engagement detection using deep learning

Published in Education and Information Technologies

Abstract

Facial expression recognition (FER) has recently drawn considerable attention from the research community in various application domains, owing to advances in deep learning. In the education field, FER has the potential to evaluate students’ engagement in a classroom environment, especially for deaf and hard-of-hearing students. Several works have detected students’ engagement from facial expressions using traditional machine learning or convolutional neural networks (CNNs) with only a few layers; however, measuring the engagement of deaf and hard-of-hearing students remains an unexplored area of experimental research. Therefore, in this study we propose a novel approach for detecting the engagement level (‘highly engaged’, ‘nominally engaged’, and ‘not engaged’) from the facial emotions of deaf and hard-of-hearing students using a deep CNN (DCNN) model and a transfer learning (TL) technique. A pre-trained VGG-16 model is employed and fine-tuned on the Japanese Female Facial Expression (JAFFE) dataset and the Karolinska Directed Emotional Faces (KDEF) dataset. The performance of the proposed model is then compared with seven other pre-trained DCNN models (VGG-19, Inception v3, DenseNet-121, DenseNet-169, MobileNet, ResNet-50, and Xception). With 10-fold cross-validation, the best test accuracies achieved with VGG-16 are 98% and 99% on the JAFFE and KDEF datasets, respectively. These results show that the proposed approach outperforms other state-of-the-art methods.
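Although the full implementation appears only in the article itself, the transfer-learning setup summarized above can be outlined in Keras, the framework the authors cite (Chollet, 2015). The following is a minimal sketch, not the authors' code: the frozen ImageNet-pretrained VGG-16 base follows the abstract, while the 256-unit dense head, dropout rate, input size, and Adam optimizer are illustrative assumptions.

```python
# Minimal sketch (assumptions noted above; not the authors' reported code) of
# VGG-16 transfer learning for facial emotion recognition.
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 7  # JAFFE and KDEF both label seven basic emotion categories

# ImageNet-pretrained convolutional base, without the original classifier head.
base = tf.keras.applications.VGG16(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3)
)
base.trainable = False  # freeze pre-trained features; train only the new head

# Illustrative classification head (the paper's exact head is not shown here).
model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)
# Training would then run model.fit(...) inside each of the 10
# cross-validation folds described in the abstract.
```

The predicted emotion would then be mapped to one of the three engagement levels named above; that mapping is defined in the full paper and is not reproduced here.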




Data availability

The JAFFE and KDEF datasets were used to support this study; their application details are available at https://zenodo.org/record/3451524#.Yc1-OGjMLIU and https://www.kdef.se/download-2/register.html, respectively. The datasets are cited at the relevant places within the text as references.
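For readers preparing these images for a VGG-16 pipeline, a hypothetical loading step using OpenCV (Bradski, 2000), one of the tools listed in the references, might look like the sketch below; the file path and target resolution are illustrative assumptions, not dataset requirements.

```python
# Hypothetical preprocessing sketch: read a dataset image with OpenCV and
# rescale it to the 224x224 RGB input commonly used with VGG-16.
import cv2
import numpy as np

def load_face_image(path: str, size: int = 224) -> np.ndarray:
    """Read an image file and return a normalized RGB array in [0, 1]."""
    img = cv2.imread(path)                      # OpenCV decodes images as BGR
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # Keras models expect RGB
    img = cv2.resize(img, (size, size))         # match the assumed input size
    return img.astype(np.float32) / 255.0

# Usage (path is hypothetical):
# x = load_face_image("KDEF/AF01/AF01NES.JPG")
```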

References

  • Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., et al. (2016). TensorFlow: A system for large-scale machine learning. In 12th USENIX symposium on operating systems design and implementation (OSDI 16) (pp. 265–283).

  • Aifanti, N., Papachristou, C., & Delopoulos, A. (2010). The MUG facial expression database. In Proceedings of the 11th international workshop on image analysis for multimedia interactive services (WIAMIS) (pp. 1–4). Desenzano del Garda, Italy: IEEE.

  • Aslan, S., Alyuz, N., Tanriover, C., Mete, S., Okur, E., D’Mello, S., & Arslan Esme, A. (2019). Investigating the impact of a real-time, multimodal student engagement analytics technology in authentic classrooms. In Proceedings of the 2019 conference on human factors in computing systems (CHI). https://doi.org/10.1145/3290605.3300534 (pp. 1–12). Glasgow, Scotland, UK: ACM.

  • Ayouni, S., Hajjej, F., Maddeh, M., & Al-Otaibi, S. (2021). A new ML-based approach to enhance student engagement in online environment. PLoS ONE, 16(11), e0258788. https://doi.org/10.1371/journal.pone.0258788.


  • Bradski, G. (2000). The OpenCV library. Dr. Dobb’s Journal of Software Tools.

  • Calvo, M., & Lundqvist, D. (2008). Facial expressions of emotion (KDEF): Identification under different display-duration conditions. Behavior Research Methods, 40(1), 109–115. https://doi.org/10.3758/BRM.40.1.109.


  • Chollet, F. (2015). Keras: The Python deep learning library. https://keras.io. Accessed 20 March 2021.

  • Chollet, F. (2017). Xception: Deep learning with depthwise separable convolutions. arXiv:1610.02357.

  • Duchi, J., Hazan, E., & Singer, Y. (2011). Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12, 2121–2159.


  • Ekman, P., & Friesen, W. (1971). Constants across cultures in the face and emotion. Journal of Personality and Social Psychology, 17(2), 124–129. https://doi.org/10.1037/h0030377.

  • Ellaban, H., & Elsaeed, E. (2017). A real-time system for facial expression recognition using support vector machines and k-nearest neighbor classifier. International Journal of Computer Applications, 159(8), 23–29. https://doi.org/10.5120/ijca2017913009.


  • Eng, S., Ali, H., Cheah, A., & Chong, Y. (2019). Facial expression recognition in JAFFE and KDEF datasets using histogram of oriented gradients and support vector machine. IOP Conference Series: Materials Science and Engineering, 705(1), 012031. https://doi.org/10.1088/1757-899x/705/1/012031.


  • Hamester, D., Barros, P., & Wermter, S. (2015). Face expression recognition with a 2-channel convolutional neural network. In Proceedings of 2015 international joint conference on neural networks (IJCNN). https://doi.org/10.1109/IJCNN.2015.7280539 (pp. 1–8). Killarney, Ireland: IEEE.

  • He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of 2016 IEEE conference on computer vision and pattern recognition (CVPR). https://doi.org/10.1109/CVPR.2016.90 (pp. 770–778). Las Vegas, NV, USA: IEEE.

  • Holder, R., & Tapamo, J. (2017). Improved gradient local ternary patterns for facial expression recognition. EURASIP Journal on Image and Video Processing, 2017, 42. https://doi.org/10.1186/s13640-017-0190-5.


  • Howard, A., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., et al. (2017). MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv:1704.04861.

  • Huang, G., Liu, Z., Van Der Maaten, L., & Weinberger, K. (2017). Densely connected convolutional networks. In Proceedings of 2017 IEEE conference on computer vision and pattern recognition (CVPR). https://doi.org/10.1109/CVPR.2017.243 (pp. 2261–2269). Honolulu, HI, USA: IEEE.

  • Jain, N., Kumar, S., Kumar, A., Shamsolmoali, P., & Zareapoor, M. (2018). Hybrid deep neural networks for face emotion recognition. Pattern Recognition Letters, 115, 101–106. https://doi.org/10.1016/j.patrec.2018.04.010.


  • Jin, B., Qu, Y., Zhang, L., & Gao, Z. (2020). Diagnosing Parkinson disease through facial expression recognition: Video analysis. Journal of Medical Internet Research, 22(7), e18697. https://doi.org/10.2196/18697.


  • Kingma, D., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv:1412.6980v9.

  • Lasri, I., Riadsolh, A., & El belkacemi, M. (2019). Facial emotion recognition of students using convolutional neural network. In Proceedings of the third international conference on intelligent computing in data sciences (ICDS) (pp. 1–6).

  • Lee, C., Shih, C., Lai, W., & Lin, P. (2012). An improved boosting algorithm and its application to facial emotion recognition. Journal of Ambient Intelligence and Humanized Computing, 3(1), 11–17. https://doi.org/10.1007/s12652-011-0085-8.


  • Leo, M., Carcagni, P., Mazzeo, P., Spagnolo, P., Cazzato, D., & Distante, C. (2020). Analysis of facial information for healthcare applications: a survey on computer vision-based approaches. Information, 11(3), 128. https://doi.org/10.3390/info11030128.


  • Liew, C., & Yairi, T. (2015). Facial expression recognition and analysis: A comparison study of feature descriptors. IPSJ Transactions on Computer Vision and Applications, 7, 104–120. https://doi.org/10.2197/ipsjtcva.7.104.


  • Liu, P., Han, S., Meng, Z., & Tong, Y. (2014). Facial expression recognition via a boosted deep belief network. In Proceedings of 2014 IEEE conference on computer vision and pattern recognition. https://doi.org/10.1109/CVPR.2014.233 (pp. 1805–1812). Columbus, OH, USA: IEEE.

  • Lucey, P., Cohn, J., Kanade, T., Saragih, J., Ambadar, Z., & Matthews, I. (2010). The extended Cohn-Kanade dataset (CK+): A complete dataset for action unit and emotion-specified expression. In Proceedings of 2010 IEEE computer society conference on computer vision and pattern recognition - workshops (CVPR Workshops). https://doi.org/10.1109/CVPRW.2010.5543262 (pp. 94–101). San Francisco, CA, USA: IEEE.

  • Lyons, M., Akamatsu, S., Kamachi, M., & Gyoba, J. (1998). Coding facial expressions with Gabor wavelets. In Proceedings of 3rd IEEE international conference on automatic face and gesture recognition. https://doi.org/10.1109/AFGR.1998.670949 (pp. 200–205). Nara, Japan: IEEE.

  • Nesterov, Y. (1983). A method of solving a convex programming problem with convergence rate O(1/k²). Soviet Mathematics Doklady, 27(2), 372–376.


  • Qian, N. (1999). On the momentum term in gradient descent learning algorithms. Neural Networks: The Official Journal of the International Neural Network Society, 12(1), 145–151. https://doi.org/10.1016/S0893-6080(98)00116-6.


  • Robbins, H., & Monro, S. (1951). A stochastic approximation method. Annals of Mathematical Statistics, 22(3), 400–407. https://doi.org/10.1214/aoms/1177729586.


  • Sari, M., Moussaoui, A., & Hadid, A. (2021). A simple yet effective convolutional neural network model to classify facial expressions. In S. Chikhi, A. Amine, A. Chaoui, D. Saidouni, & M. Kholladi (Eds.) Lecture notes in networks and systems. https://doi.org/10.1007/978-3-030-58861-8_14 (Vol. 156, pp. 188–202). Springer.

  • Shen, J., Yang, H., & Li, J. (2022). Assessing learning engagement based on facial expression recognition in MOOC’s scenario. Multimedia Systems, 28, 469–478. https://doi.org/10.1007/s00530-021-00854-x.


  • Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556.

  • Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., et al. (2015). Going deeper with convolutions. In Proceedings of 2015 IEEE conference on computer vision and pattern recognition (CVPR). https://doi.org/10.1109/CVPR.2015.7298594 (pp. 1–9). Boston, MA, USA: IEEE.

  • Thomas, C., & Jayagopi, D. (2017). Predicting student engagement in classrooms using facial behavioral cues. In Proceedings of the 1st ACM SIGCHI international workshop on multimodal interaction for education (MIE). https://doi.org/10.1145/3139513.3139514 (pp. 33–40). Glasgow, Scotland, UK: ACM.

  • Viola, P., & Jones, M. (2001). Rapid object detection using a boosted cascade of simple features. In Proceedings of the 2001 IEEE computer society conference on computer vision and pattern recognition (CVPR). https://doi.org/10.1109/CVPR.2001.990517 (pp. 511–518). Kauai, HI, USA.

  • Yin, D., Omar, S., Talip, B., Muklas, A., Norain, N., & Othman, A. (2017). Fusion of face recognition and facial expression detection for authentication: A proposed model. In Proceedings of the 11th international conference on ubiquitous information management and communication (IMCOM). https://doi.org/10.1145/3022227.3022247 (pp. 1–8). Beppu, Japan: ACM.

  • Zeiler, M. D. (2012). ADADELTA: An adaptive learning rate method. arXiv:1212.5701.

  • Zhao, X., Shi, X., & Zhang, S. (2015). Facial expression recognition via deep learning. IETE Technical Review, 32(5), 347–355. https://doi.org/10.1080/02564602.2015.1017542.



Acknowledgements

We wish to thank the seven deaf and hard-of-hearing students for their participation in the experiment.

Funding

The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.

Author information


Contributions

All authors contributed to the conception and design of the study. Imane Lasri performed the material preparation, data collection, literature survey, and analysis, and wrote the paper. Anouar Riadsolh and Imane Lasri conducted the experiment. Mourad Elbelkacemi and Anouar Riadsolh participated in manuscript revision, and all authors read and approved the final manuscript.

Corresponding author

Correspondence to Imane Lasri.

Ethics declarations

Competing interests

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Imane Lasri, Anouar Riadsolh and Mourad Elbelkacemi contributed equally to this work.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Lasri, I., Riadsolh, A. & Elbelkacemi, M. Facial emotion recognition of deaf and hard-of-hearing students for engagement detection using deep learning. Educ Inf Technol 28, 4069–4092 (2023). https://doi.org/10.1007/s10639-022-11370-4

