Deep learning enhanced attributes conditional random forest for robust facial expression recognition

Multimedia Tools and Applications

Abstract

Automated facial expression recognition (FER) remains challenging because of high inter-subject variation (e.g. differences in age, gender and ethnic background) and high intra-subject variation (e.g. low image resolution, occlusion and illumination changes). To reduce the effect of age, gender and ethnic background, we introduce a conditional random forest architecture. To reduce the effect of low image resolution, occlusion and illumination, we propose a deep multi-instance learning model. Unlike most existing models, which are trained with facial expression labels only, our model also considers other attributes related to facial expressions, such as age and gender. Extensive experiments were conducted on the public CK+, ExpW, RAF-DB and AffectNet datasets; the recognition rates reached 99% on the normalized CK+ face database and 69.72% on the challenging natural scene database. The experimental results show that the proposed method outperforms state-of-the-art methods and is robust to occlusion, noise and resolution variation in the wild.
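The abstract does not spell out how the attribute-conditioned forests are combined, so the following is only a minimal sketch of one common scheme for conditional forests: expression probabilities predicted by attribute-specific models are averaged, weighted by the estimated probability of each attribute value. The function name `conditional_predict`, the attribute groups, the class order and all numbers are hypothetical and chosen for illustration; they are not taken from the paper.

```python
from typing import Dict, List


def conditional_predict(
    attr_posterior: Dict[str, float],
    cond_probs: Dict[str, List[float]],
) -> List[float]:
    """Mix attribute-specific class distributions by the attribute posterior.

    attr_posterior: estimated probability of each attribute value for one face
                    (hypothetical values, e.g. from an age or gender classifier).
    cond_probs:     for each attribute value, the expression distribution
                    predicted by the model trained on that attribute group.
    """
    total = sum(attr_posterior.values())  # normalise in case weights do not sum to 1
    n_classes = len(next(iter(cond_probs.values())))
    mixed = [0.0] * n_classes
    for attr, weight in attr_posterior.items():
        for k, p in enumerate(cond_probs[attr]):
            mixed[k] += (weight / total) * p
    return mixed


# Hypothetical toy numbers: two age groups, two expression classes (happy, sad).
attr_posterior = {"young": 0.75, "old": 0.25}
cond_probs = {
    "young": [0.8, 0.2],  # output of the model conditioned on "young"
    "old": [0.4, 0.6],    # output of the model conditioned on "old"
}
mixed = conditional_predict(attr_posterior, cond_probs)
predicted_class = max(range(len(mixed)), key=mixed.__getitem__)
```

In this sketch a face judged more likely to be "young" is classified mostly by the model trained on young faces, while the other group's model still contributes in proportion to the remaining probability.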

Acknowledgments

We thank Yicun Ouyang and Bin Xu for their helpful comments and suggestions. This work was partially supported by the Xianning Natural Science Foundation (No. 2019kj130) and the Cultivation Fund of Hubei University of Science and Technology (No. 2020-22GP03).

Author information

Corresponding author

Correspondence to Haibin Liao.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liao, H., Wang, D., Fan, P. et al. Deep learning enhanced attributes conditional random forest for robust facial expression recognition. Multimed Tools Appl 80, 28627–28645 (2021). https://doi.org/10.1007/s11042-021-10951-8

  • DOI: https://doi.org/10.1007/s11042-021-10951-8
