Multimodal Attention CNN for Human Emotion Recognition

  • Conference paper
Cryptology and Network Security with Machine Learning (ICCNSML 2023)

Abstract

The human face is a mirror of the mind: it generally reveals what is going on in a person's heart and mind. Just by looking at the faces of people we know, we can often guess their mood. With an unfamiliar person, however, it is often hard to judge mood from the face alone, because some people have a facial structure that makes them look angry, happy, or sad by default. We therefore need to spend some time with that person and analyse other cues before drawing conclusions about their mood. The current work proposes a novel approach that integrates facial images with electroencephalography (EEG) signals for facial expression recognition tasks. While an attention-based deep CNN analyses the facial traits of the subject, a parallel Long Short-Term Memory (LSTM) network analyses the EEG signals. A late fusion network combines the features extracted from both networks, and finally a classification network predicts the subject's current mood. Combining multiple modalities for emotion recognition has shown promising results when compared with other state-of-the-art models. Emotion recognition models have many real-life applications, such as the advertisement industry, human–robot interaction, automatic depression detection, and mood-based audio/video players.
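The late-fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the two feature sets are combined, so the sketch assumes simple concatenation followed by a linear softmax classifier. The label set, feature sizes, random placeholder features, and untrained weights are all hypothetical stand-ins for the outputs of the attention CNN and the LSTM.

```python
import math
import random

# Hypothetical label set, borrowed from the moods named in the abstract.
EMOTIONS = ["angry", "happy", "sad"]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(face_features, eeg_features):
    """Fuse the two modalities by concatenation (an assumed fusion rule)."""
    return list(face_features) + list(eeg_features)


def classify(fused, weights, biases):
    """Linear layer followed by softmax over the emotion classes."""
    scores = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(scores)


# Random placeholders standing in for the outputs of the two branches.
rng = random.Random(0)
face_features = [rng.uniform(-1, 1) for _ in range(8)]  # would come from the attention CNN
eeg_features = [rng.uniform(-1, 1) for _ in range(4)]   # would come from the LSTM

fused = late_fusion(face_features, eeg_features)

# Untrained, randomly initialised classifier weights (illustration only).
weights = [[rng.uniform(-0.5, 0.5) for _ in fused] for _ in EMOTIONS]
biases = [0.0 for _ in EMOTIONS]

probs = classify(fused, weights, biases)
predicted = EMOTIONS[probs.index(max(probs))]
print(predicted, [round(p, 3) for p in probs])
```

In a trained system the two feature vectors would be produced by the learned networks and the classifier weights would be fitted jointly; here the weights are random, so the predicted label carries no meaning beyond showing the data flow.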

Author information

Corresponding author

Correspondence to Gyanendra Tiwary.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Tiwary, G., Chauhan, S., Goyal, K.K. (2024). Multimodal Attention CNN for Human Emotion Recognition. In: Chaturvedi, A., Hasan, S.U., Roy, B.K., Tsaban, B. (eds) Cryptology and Network Security with Machine Learning. ICCNSML 2023. Lecture Notes in Networks and Systems, vol 918. Springer, Singapore. https://doi.org/10.1007/978-981-97-0641-9_11

  • DOI: https://doi.org/10.1007/978-981-97-0641-9_11

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-0640-2

  • Online ISBN: 978-981-97-0641-9

  • eBook Packages: Engineering, Engineering (R0)
