Multi-facial patches aggregation network for facial expression recognition and facial regions contributions to emotion display

Multimedia Tools and Applications

Abstract

In this paper, an approach for Facial Expression Recognition (FER) based on a multi-facial patches (MFP) aggregation network is proposed. Deep features are learned from facial patches by convolutional neural sub-networks and aggregated within a single architecture for expression classification. In addition, a framework based on two data augmentation techniques is proposed to expand labeled FER training datasets. Consequently, the proposed approach, which is based on shallow convolutional neural networks (CNNs), does not need large datasets for training. The proposed framework is evaluated on three FER datasets. Results show that the proposed approach achieves the performance of state-of-the-art deep learning FER approaches when the model is trained and tested on images from the same dataset. Moreover, the proposed data augmentation techniques improve the expression recognition rate and can therefore be a solution for training deep learning FER models on small datasets. Accuracy degrades significantly when testing for dataset bias. Fine-tuning can overcome the problem of transition from laboratory-controlled conditions to in-the-wild conditions. Finally, the emotional face is mapped using the MFP-CNN, and the contribution of the different facial areas to displaying emotion, as well as their importance in the recognition of each facial expression, is studied.
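
The patch-then-aggregate idea described above can be illustrated with a minimal forward-pass sketch. It is not the authors' architecture: the patch boxes, the 12x12 input size, the seven output classes, the single 3x3 convolution per sub-network, and the random weights are all assumptions chosen to keep the example small. Each patch gets its own tiny convolutional sub-network, the resulting feature vectors are concatenated, and one linear layer with softmax produces class probabilities.

```python
import math
import random

# Hypothetical patch boxes (top, left, height, width) on a 12x12 face crop.
# The abstract does not specify the actual patch layout.
PATCHES = {
    "eyes": (0, 0, 6, 12),
    "nose": (3, 3, 6, 6),
    "mouth": (6, 0, 6, 12),
}
NUM_CLASSES = 7   # assumed number of expression classes
NUM_FILTERS = 2   # filters per patch sub-network (toy size)


def crop(image, box):
    """Cut a rectangular patch out of a 2D list of pixel values."""
    top, left, height, width = box
    return [row[left:left + width] for row in image[top:top + height]]


def conv_relu_gap(patch, kernel):
    """Valid 3x3 convolution (no kernel flip), ReLU, global average pooling."""
    k = len(kernel)
    rows, cols = len(patch) - k + 1, len(patch[0]) - k + 1
    total = 0.0
    for i in range(rows):
        for j in range(cols):
            s = sum(patch[i + a][j + b] * kernel[a][b]
                    for a in range(k) for b in range(k))
            total += max(s, 0.0)  # ReLU
    return total / (rows * cols)  # one scalar feature per filter


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_params(rng):
    """Random (untrained) weights: separate kernels per patch, shared classifier."""
    kernels = {
        name: [[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
               for _ in range(NUM_FILTERS)]
        for name in PATCHES
    }
    feat_dim = NUM_FILTERS * len(PATCHES)
    weights = [[rng.uniform(-1, 1) for _ in range(feat_dim)]
               for _ in range(NUM_CLASSES)]
    return kernels, weights


def forward(image, kernels, weights):
    features = []
    for name, box in PATCHES.items():
        patch = crop(image, box)
        # Each patch is processed only by its own sub-network.
        features.extend(conv_relu_gap(patch, k) for k in kernels[name])
    # Aggregation here is plain concatenation followed by a linear classifier.
    logits = [sum(w * f for w, f in zip(row, features)) for row in weights]
    return features, softmax(logits)


rng = random.Random(0)
kernels, weights = init_params(rng)
image = [[rng.random() for _ in range(12)] for _ in range(12)]
features, probs = forward(image, kernels, weights)
```

Here `features` holds two values per patch (six in total) and `probs` is a distribution over the assumed seven classes. Because each block of `features` comes from a single patch, a structure of this kind also makes it possible to inspect how much each facial region contributes to a prediction.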

Author information

Corresponding author

Correspondence to Alice Othmani.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Hazourli, A.R., Djeghri, A., Salam, H. et al. Multi-facial patches aggregation network for facial expression recognition and facial regions contributions to emotion display. Multimed Tools Appl 80, 13639–13662 (2021). https://doi.org/10.1007/s11042-020-10332-7

  • DOI: https://doi.org/10.1007/s11042-020-10332-7
