Abstract
Facial expression recognition (FER) plays an important role in human-computer interaction. Convolutional neural networks (CNNs) and other hierarchical neural network models are widely used in automatic image classification because they can learn high-level features automatically. However, training a CNN requires a large amount of data to achieve adequate generalization. In contrast, the traditional scale-invariant feature transform (SIFT) does not need large training samples to obtain features. In this paper, we propose a feature extraction method for facial expression recognition from a single image frame. The hybrid features combine SIFT features with deep learning features extracted from different levels of a CNN model, and the combined features are classified using support vector machines (SVMs). The performance of the proposed method is evaluated on the publicly available extended Cohn-Kanade (CK+) database. To assess the generalization ability of our method, several experiments are designed and carried out in a cross-database setting. Compared with the 76.57% accuracy obtained using SIFT bag-of-features (BoF) features and the 92.87% accuracy obtained using CNN features alone, we achieve a FER accuracy of 94.82% using the proposed hybrid SIFT-CNN features. The results of the additional cross-database experiments also demonstrate the considerable potential of combining shallow features with deep learning features, and they compare favorably with state-of-the-art models. Combining shallow and deep learning features is effective when the training data are not sufficient to obtain a deep model with strong generalization ability.
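As a minimal sketch of the fusion step described above, the snippet below L2-normalizes a SIFT bag-of-features histogram and a CNN activation vector separately and then concatenates them into one hybrid descriptor. The abstract does not specify the fusion scheme or the feature dimensions, so the normalize-then-concatenate approach and the 128/256 sizes here are assumptions for illustration only.

```python
import numpy as np

def fuse_features(sift_bof, cnn_feat):
    """Illustrative hybrid-feature fusion (assumed scheme, not the paper's
    exact method): L2-normalize each feature block so neither dominates,
    then concatenate into a single vector for the SVM classifier."""
    sift_bof = np.asarray(sift_bof, dtype=np.float64)
    cnn_feat = np.asarray(cnn_feat, dtype=np.float64)
    sift_n = sift_bof / (np.linalg.norm(sift_bof) + 1e-12)
    cnn_n = cnn_feat / (np.linalg.norm(cnn_feat) + 1e-12)
    return np.concatenate([sift_n, cnn_n])

# Stand-in vectors: a 128-bin SIFT BoF histogram and a 256-dimensional
# CNN activation vector (both sizes are hypothetical).
hybrid = fuse_features(np.random.rand(128), np.random.rand(256))
print(hybrid.shape)  # (384,)
```

The resulting hybrid vector would then be fed to an SVM (e.g., a linear kernel) exactly as any single-source feature vector would; per-block normalization keeps the two feature types on a comparable scale before concatenation.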
Funding
The authors would like to thank Zhou for the use of the NLPCC 2017 Shared Task Sample Data: Emotional Conversation Generation as a training dataset. The work was supported by the State Key Program of the National Natural Science Foundation of China (61432004, 71571058, 61461045). This work was partially supported by a project funded by the China Postdoctoral Science Foundation (2017T100447). This research was also partially supported by the National Natural Science Foundation of China under Grant No. 61472117, by the Qinghai Province Science and Technology Fund for basic applied research (No. 2016-ZJ-743), and by the Open Project Program of the National Laboratory of Pattern Recognition (NLPR).
Ethics declarations
Conflict of Interest
The authors declare that they have no conflicts of interest.
Additional information
Ethical Approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Sun, X., Lv, M. Facial Expression Recognition Based on a Hybrid Model Combining Deep and Shallow Features. Cogn Comput 11, 587–597 (2019). https://doi.org/10.1007/s12559-019-09654-y