Abstract
User engagement is crucial to the successful completion of education and intervention programs. Automatic measurement of engagement can give instructors valuable insight for personalizing program delivery and achieving program objectives. This paper presents a novel approach to automatically measuring users' engagement in virtual learning programs from their videos. The proposed approach uses affect states (continuous values of valence and arousal), together with a new latent affective feature vector and behavioral features extracted from consecutive video frames. Deep-learning sequential models are trained and validated on the extracted features for video-based engagement measurement. Because engagement is an ordinal variable, ordinal versions of the models are also developed, casting engagement-level measurement as an ordinal classification problem. The approach was evaluated on two publicly available video-based engagement measurement datasets containing videos of students in virtual learning programs: the Dataset for Affective States in E-Environments (DAiSEE) and the Emotion Recognition in the Wild-Engagement in the Wild (EmotiW-EW) dataset. The experiments achieved a state-of-the-art engagement-level classification accuracy of 67.4% on DAiSEE and a regression mean squared error of 0.0508 on EmotiW-EW. An ablation study showed that incorporating affect states and the ordinality of engagement significantly improves engagement measurement.
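To make the ordinal framing concrete: a common way to cast ordinal classification (not necessarily the authors' exact formulation) is to encode a level y among K ordered levels as K-1 cumulative binary targets and train a model with K-1 sigmoid outputs against them. The sketch below is a minimal, hypothetical illustration of that encoding/decoding, assuming the four engagement levels used in DAiSEE; the helper names are illustrative, not from the paper.

```python
import numpy as np

# Sketch: ordinal classification via cumulative binary decomposition.
# A level y in {0, 1, 2, 3} (DAiSEE annotates four engagement levels) is
# encoded as K-1 indicators [y > 0, y > 1, y > 2]; a model with K-1
# sigmoid outputs is trained against this target, and the predicted level
# is recovered by counting outputs above a threshold.

K = 4  # number of ordered engagement levels

def encode_ordinal(y: int, k: int = K) -> np.ndarray:
    """Encode level y as k-1 cumulative binary targets."""
    return (y > np.arange(k - 1)).astype(np.float32)

def decode_ordinal(probs: np.ndarray, threshold: float = 0.5) -> int:
    """Decode k-1 sigmoid outputs back to a level by counting exceedances."""
    return int((probs > threshold).sum())

# Example: level 2 ("engaged") becomes the target [1, 1, 0].
print(encode_ordinal(2))                           # [1. 1. 0.]
print(decode_ordinal(np.array([0.9, 0.7, 0.2])))   # 2
```

Unlike a plain softmax over four classes, this decomposition penalizes predictions more the farther they fall from the true level, which is the property that makes it suitable for ordinal labels such as engagement.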
Data Availability
The datasets analyzed during the current study are publicly available in the following repositories:
https://people.iith.ac.in/vineethnb/resources/daisee/index.html
Ethics declarations
Conflicts of interest
The authors declare that they have no conflict of interest.
Cite this article
Abedi, A., Khan, S.S. Affect-driven ordinal engagement measurement from video. Multimed Tools Appl 83, 24899–24918 (2024). https://doi.org/10.1007/s11042-023-16345-2