Abstract
The boom of big data in education has provided an unrivalled opportunity for educators to evaluate learners' cognitive states. However, most existing cognitive state analysis methods focus on attention, ignoring the role of emotion in human learning. Therefore, this study proposes an emotion-sensitive learning cognitive state analysis framework, which automatically estimates a learner's attention from head pose and emotion from facial expression in a non-invasive way. The proposed framework comprises two modules. In the first module, a multi-task learning implementation with a cascaded convolutional neural network (CNN) performs face detection, landmark localization, and head pose estimation simultaneously. The located landmarks are used to align the faces as the preprocessing step for facial expression analysis, and the estimated head pose and landmarks are used to recognize the learner's visual focus of attention. In the second module, an expression intensity ranking CNN recognizes the facial expression and evaluates its intensity using the ordinal information of expression sequences. The learner's emotion is then estimated from the recognized facial expression. Experimental results show that this method estimates a learner's attention and emotion with accuracies of 79.5% and 88.6%, respectively, suggesting that it has strong potential as an alternative method for emotion-sensitive learning cognitive state analysis.
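The second module's "ordinal information" idea follows the ranking-CNN family of methods: instead of predicting an intensity level directly, a bank of binary sub-classifiers each answers "is the intensity greater than level k?", and the final level is the count of positive answers. The sketch below shows only this decoding step, with illustrative names and probabilities (not the authors' actual network or data):

```python
def decode_ordinal(binary_probs, threshold=0.5):
    """Turn K-1 'greater-than-level-k' probabilities into an intensity level.

    binary_probs: one probability per rank boundary, ordered from the
    lowest boundary (intensity > 0) to the highest (intensity > K-2).
    Returns an integer intensity in [0, len(binary_probs)].
    """
    # Each boundary crossed with sufficient confidence raises the level by 1.
    return sum(1 for p in binary_probs if p > threshold)

# Example: outputs of a hypothetical 5-level intensity ranker that is
# confident the expression exceeds levels 1-3 but not level 4.
probs = [0.97, 0.91, 0.62, 0.18]
print(decode_ordinal(probs))  # -> 3
```

The ordinal formulation exploits the fact that neighboring intensity levels are correlated, which a plain multi-class softmax ignores.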
Acknowledgements
This work was supported by the National Key Research and Development Program of China (Grant No. 2018YFB1004504), the National Natural Science Foundation of China (Nos. 61772380 and 61273063), the Foundation for Innovative Research Groups of Hubei Province (No. 2017CFA007), the Hubei Province Technological Innovation Major Project, and the Research Funds of CCNU from the Colleges' Basic Research and Operation of MOE (Grant No. CCNU19Z02002).
Cite this article
Xu, R., Chen, J., Han, J. et al. Towards emotion-sensitive learning cognitive state analysis of big data in education: deep learning-based facial expression analysis using ordinal information. Computing 102, 765–780 (2020). https://doi.org/10.1007/s00607-019-00722-7
Keywords
- Learning cognitive state analysis
- Multi-task learning
- Head pose estimation
- Facial expression recognition
- Expression intensity evaluation