Abstract
Learning-centered emotions play a significant role in the cognitive processes involved in learning. For this reason, virtual learning environments should consider both the cognitive and the affective state of the student. Artificial intelligence methods such as facial expression recognition and sentiment analysis have proven to be excellent alternatives for the automatic recognition of emotions. However, learning-centered emotion and opinion-based sentiment datasets commonly contain a single modality, and single modalities cannot effectively represent the complex emotions found in real life. This work presents three different fusion methods applied to three image-based and text-based datasets for learning-centered emotion recognition. Using conventional deep learning architectures, the three new multimodal datasets showed promising results compared with similar architectures trained on unimodal information. One of the methods (embedding-based representation) improved results by 4% over a single-modality model with hyperparameter optimization. The main objective of this study is to benchmark the viability of semantically fusing multimodal learning-centered emotional data from different datasets for intelligent tutoring system applications.
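The embedding-based representation mentioned in the abstract can be illustrated with a minimal sketch: per-modality feature vectors (e.g., from an image CNN and a text encoder) are normalized and concatenated into one joint embedding that a shared classifier head consumes. All dimensions, the normalization step, and the classifier here are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

# Hypothetical dimensions; the paper's real architectures differ.
IMG_DIM, TXT_DIM, NUM_EMOTIONS = 128, 64, 4

rng = np.random.default_rng(0)

def fuse_embeddings(img_feat, txt_feat):
    """Embedding-based fusion: L2-normalize each modality's feature
    vector, then concatenate them into a single joint representation."""
    img = img_feat / (np.linalg.norm(img_feat) + 1e-8)
    txt = txt_feat / (np.linalg.norm(txt_feat) + 1e-8)
    return np.concatenate([img, txt])

# Stand-ins for the outputs of an image CNN and a text encoder.
img_feat = rng.standard_normal(IMG_DIM)
txt_feat = rng.standard_normal(TXT_DIM)

joint = fuse_embeddings(img_feat, txt_feat)  # shape: (IMG_DIM + TXT_DIM,)

# A linear classifier head over the fused vector (random weights here,
# standing in for a trained layer), followed by a softmax.
W = rng.standard_normal((NUM_EMOTIONS, IMG_DIM + TXT_DIM))
logits = W @ joint
probs = np.exp(logits - logits.max())
probs /= probs.sum()  # one probability per learning-centered emotion
```

In a trained system the classifier head would be learned jointly with (or on top of) the two encoders; the point of the sketch is only that fusion happens in a shared embedding space rather than by averaging per-modality predictions.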
Data Availability
“The datasets generated and/or analyzed during the current study are not publicly available due to the privacy of the subjects in the datasets but are available from the corresponding author on reasonable request.”
Funding
The work described in this paper was fully supported by a scholarship from CONACYT (Consejo Nacional de Ciencia y Tecnología), México.
Ethics declarations
Conflict of interest
The authors declare that there is no conflict of interest regarding the publication of this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Cárdenas-López, H.M., Zatarain-Cabada, R., Barrón-Estrada, M.L. et al. Semantic fusion of facial expressions and textual opinions from different datasets for learning-centered emotion recognition. Soft Comput 27, 17357–17367 (2023). https://doi.org/10.1007/s00500-023-08076-1