Abstract
Automated facial expression recognition (FER) remains challenging because of high inter-subject variation (e.g., age, gender and ethnic background) and intra-subject variation (e.g., low image resolution, occlusion and illumination). To reduce the effect of age, gender and ethnic background, we introduce a conditional random forest architecture. In addition, we propose a deep multi-instance learning model to reduce the effect of low image resolution, occlusion and illumination. Unlike most existing models, which are trained with facial expression labels only, our model also considers other attributes related to facial expressions, such as age and gender. Extensive experiments were conducted on the public CK+, ExpW, RAF-DB and AffectNet datasets; the recognition rates reached 99% on the normalized CK+ face database and 69.72% on the challenging natural scene database. The experimental results show that the proposed method outperforms state-of-the-art methods and is robust to occlusion, noise and resolution variation in the wild.
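As a rough illustration of what conditioning a forest on attributes can mean, the sketch below mixes per-attribute expression distributions using the estimated attribute probabilities, i.e. p(y | x) = Σ_a p(y | x, a) p(a | x). This is the generic formulation used in conditional forests, not necessarily the exact scheme of this paper; the function name, the attribute groups and all numbers are hypothetical.

```python
from typing import Dict, List


def conditional_forest_predict(
    attr_probs: Dict[str, float],
    expr_probs_by_attr: Dict[str, List[float]],
) -> List[float]:
    """Mix attribute-specific expression distributions.

    attr_probs:         p(a | x), e.g. from an age or gender estimator.
    expr_probs_by_attr: p(y | x, a), one distribution per attribute group,
                        e.g. from a forest trained on that group only.
    Returns p(y | x) = sum over a of p(y | x, a) * p(a | x).
    """
    n_classes = len(next(iter(expr_probs_by_attr.values())))
    mixed = [0.0] * n_classes
    for attr, p_attr in attr_probs.items():
        for k, p_expr in enumerate(expr_probs_by_attr[attr]):
            mixed[k] += p_attr * p_expr
    return mixed


# Hypothetical values for illustration only (two expression classes).
attr_probs = {"young": 0.75, "old": 0.25}
expr_probs_by_attr = {
    "young": [0.8, 0.2],  # output of the forest conditioned on "young"
    "old": [0.4, 0.6],    # output of the forest conditioned on "old"
}
posterior = conditional_forest_predict(attr_probs, expr_probs_by_attr)
predicted_class = max(range(len(posterior)), key=posterior.__getitem__)
```

Weighting by the attribute posterior, rather than committing to the single most likely attribute, keeps the prediction stable when the attribute estimate itself is uncertain.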
Acknowledgments
We thank Yicun Ouyang and Bin Xu for their helpful comments and suggestions. This work was partially supported by the Xianning Natural Science Foundation (No. 2019kj130) and the Cultivation Fund of Hubei University of Science and Technology (No. 2020-22GP03).
Cite this article
Liao, H., Wang, D., Fan, P. et al. Deep learning enhanced attributes conditional random forest for robust facial expression recognition. Multimed Tools Appl 80, 28627–28645 (2021). https://doi.org/10.1007/s11042-021-10951-8
DOI: https://doi.org/10.1007/s11042-021-10951-8