Recognition of student engagement in classroom from affective states

  • Regular Paper
International Journal of Multimedia Information Retrieval

Abstract

Student engagement is positively related to comprehension in the teaching–learning process. Student engagement has been widely studied in online learning environments, whereas this research focuses on recognizing student engagement in classroom environments using visual cues. To incorporate learning-centered affective states, we curated a dataset covering six such states from four public datasets. A graph convolution network (GCN)-based deep learning model with attention was designed and implemented to extract the more contributing features from input video for student engagement recognition. The proposed architecture was evaluated on the curated dataset as well as on the four public datasets. In an ablation study on the curated dataset, the best-performing model, which used minority oversampling and focal cross-entropy loss, achieved 65.35% accuracy. We also estimated student engagement in authentic classroom data and found a positive correlation between students’ engagement levels and post-lesson test scores, with a Pearson’s coefficient of 0.64. The proposed method outperformed existing state-of-the-art methods on two of the public datasets, with accuracy scores of 99.20% and 56.17%, and it achieved accuracy scores of 64.92% and 56.17% on the other two public datasets, which are better than many baseline results on them.
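Two quantities named in the abstract, the focal cross-entropy loss and Pearson's correlation coefficient, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the default values of `gamma` and `alpha`, and the sample numbers are assumptions chosen for illustration, and the loss uses the common form -alpha * (1 - p_t)^gamma * log(p_t).

```python
import math
from typing import Sequence


def focal_cross_entropy(probs: Sequence[float], target: int,
                        gamma: float = 2.0, alpha: float = 1.0) -> float:
    """Focal loss for one sample: -alpha * (1 - p_t)**gamma * log(p_t).

    probs  : predicted class probabilities (assumed to sum to 1)
    target : index of the true class
    gamma  : focusing parameter (assumed default; 0 gives plain cross-entropy)
    alpha  : class weighting factor (assumed default)
    """
    # Clamp the true-class probability to avoid log(0).
    p_t = min(max(probs[target], 1e-12), 1.0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson's correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # A confidently correct prediction is down-weighted relative to plain
    # cross-entropy, so hard (often minority-class) samples dominate training.
    easy = [0.9, 0.05, 0.05]
    print(focal_cross_entropy(easy, 0, gamma=0.0))  # plain cross-entropy
    print(focal_cross_entropy(easy, 0, gamma=2.0))  # focal, much smaller

    # Hypothetical engagement levels and test scores, invented for this demo.
    engagement = [0.2, 0.4, 0.5, 0.7, 0.9]
    test_scores = [45.0, 60.0, 52.0, 70.0, 85.0]
    print(pearson_r(engagement, test_scores))
```

With `gamma = 0` the loss reduces to ordinary cross-entropy, which makes the focusing effect easy to check against a known baseline. A coefficient such as the reported 0.64 would come from applying the same formula to per-student engagement estimates and test scores.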

Data availability

The datasets analyzed during the current study are available from the corresponding authors of the respective public datasets on reasonable request.

Author information

Contributions

Conceptualization: SM, KS; Methodology: SM, KS; Formal analysis and investigation: SM, KS, RM; Writing—original draft preparation: SM; Writing—review and editing: KS, RM; Supervision: KS, RM.

Corresponding author

Correspondence to Sandeep Mandia.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mandia, S., Singh, K. & Mitharwal, R. Recognition of student engagement in classroom from affective states. Int J Multimed Info Retr 12, 18 (2023). https://doi.org/10.1007/s13735-023-00284-7

  • DOI: https://doi.org/10.1007/s13735-023-00284-7
