
Automatic Engagement Recognition for Distance Learning Systems: A Literature Study of Engagement Datasets and Methods

  • Conference paper

Augmented Cognition (HCII 2021)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12776)

Abstract

With the paradigm shift of learning from traditional classrooms to distance learning systems, recent advances in artificial intelligence research and applications, including machine learning and deep neural networks, have increasingly been leveraged to incorporate engagement state analysis into the distance learning process. Recent automatic engagement estimation methods employ several modalities, such as video, audio, and biological or neuro-sensing signals, as source input for analysis. In this paper, we provide a literature review of engagement estimation, covering datasets and algorithms as well as their discussion and evaluation. First, we present the engagement datasets, including publicly available sources and proprietary ones built for specific purposes. We then describe the engagement measurement methodologies widely used in the literature and the state-of-the-art algorithms that automate the estimation. The advantages and limitations of these algorithms are briefly discussed and summarized as a benchmark across the modalities and datasets used. Additionally, we extend this literature review with insights into the practical use of automatic engagement estimation in real educational processes, which is crucial to improving distance learning. Finally, we review the remaining challenges for robust engagement estimation as part of the effort to improve distance learning quality and performance by taking personalized engagement into account.
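
To make the typical video-based pipeline concrete, the sketch below assembles a minimal clip-level engagement classifier: per-frame facial action-unit intensities are aggregated into clip features and classified into four ordinal engagement levels with an SVM. This is an illustrative sketch under stated assumptions, not a method from the paper: the feature layout, the four-level label scheme (as in datasets such as DAiSEE), and the synthetic data are all placeholders.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(seed=0)

# Stand-in features: mean and standard deviation of 17 facial action-unit
# (AU) intensities per 10-second clip (2 * 17 = 34 dimensions). In a real
# pipeline these would come from a facial-behavior toolkit such as
# OpenFace rather than from a random generator.
n_clips, n_features = 400, 34
X = rng.normal(size=(n_clips, n_features))

# Four ordinal engagement levels (0 = disengaged ... 3 = highly engaged),
# mirroring the label scheme used by several public engagement datasets.
y = rng.integers(low=0, high=4, size=n_clips)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Feature scaling followed by an RBF-kernel support vector classifier.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
clf.fit(X_train, y_train)
print(f"Clip-level accuracy: {clf.score(X_test, y_test):.2f}")
```

With random labels the reported accuracy hovers near chance (0.25); the point of the sketch is the shape of the pipeline (frame features, clip-level aggregation, classification), which recurs across the surveyed methods.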


Author information


Corresponding author

Correspondence to Shofiyati Nur Karimah.



Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Karimah, S.N., Hasegawa, S. (2021). Automatic Engagement Recognition for Distance Learning Systems: A Literature Study of Engagement Datasets and Methods. In: Schmorrow, D.D., Fidopiastis, C.M. (eds) Augmented Cognition. HCII 2021. Lecture Notes in Computer Science (LNAI), vol 12776. Springer, Cham. https://doi.org/10.1007/978-3-030-78114-9_19

  • DOI: https://doi.org/10.1007/978-3-030-78114-9_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-78113-2

  • Online ISBN: 978-3-030-78114-9

  • eBook Packages: Computer Science, Computer Science (R0)
