Abstract
Most facial emotion recognition algorithms assume that the face is near frontal and that its pose remains fixed during the recognition process. However, such a constraint limits their adoption in real-world applications, so a pose-invariant descriptor for emotion recognition is required. This work proposes a novel pose-invariant dynamic descriptor that encodes the relative movement of facial landmarks. The proposed feature set is able to handle speed variations and continuous head pose variations while the subject is expressing an emotion. In addition, the proposed method is fast and thus enables real-time implementation for real-world applications. Performance evaluation on three publicly available databases, Cohn-Kanade \((\hbox {CK}^{+})\), the Amsterdam Dynamic Facial Expression Set (ADFES), and the Audio Visual Emotion Challenge (AVEC 2011), showed that our proposed method outperforms state-of-the-art methods.
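To give a feel for why features built from the *relative* geometry of facial landmarks can be robust to head pose, the sketch below computes per-frame pairwise landmark distances (invariant to in-plane rotation and translation), normalizes them for scale, and differences them over time to capture relative movement. This is a minimal illustrative toy, not the paper's actual descriptor; the function name and normalization choice are assumptions for the example.

```python
import numpy as np

def relative_movement_features(landmarks):
    """Toy pose-robust dynamic feature (NOT the paper's descriptor).

    For each frame, compute all pairwise Euclidean distances between
    landmarks (invariant to in-plane rotation/translation), divide by
    their mean for scale invariance, then take temporal differences so
    the feature encodes relative movement rather than absolute shape.

    landmarks: array of shape (T, N, 2) -- T frames, N 2-D landmarks.
    Returns:   array of shape (T-1, N*(N-1)//2).
    """
    T, N, _ = landmarks.shape
    iu = np.triu_indices(N, k=1)          # indices of unique landmark pairs
    dists = np.empty((T, len(iu[0])))
    for t in range(T):
        diff = landmarks[t][:, None, :] - landmarks[t][None, :, :]
        d = np.sqrt((diff ** 2).sum(-1))  # N x N pairwise distance matrix
        dists[t] = d[iu]
        dists[t] /= dists[t].mean()       # remove global scale (distance to camera)
    return np.diff(dists, axis=0)         # frame-to-frame change = dynamic feature
```

Because only inter-landmark distances are used, the output is unchanged if every frame is rigidly rotated, translated, and uniformly scaled in the image plane; handling full out-of-plane head rotation, as the proposed method does, requires more than this sketch provides.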
Notes
The source code for nose point detection is available at: http://humansensing.cs.cmu.edu/intraface/download.html.
Acknowledgments
This research is supported by the Agency for Science, Technology and Research (A*STAR), Singapore.
Cite this article
Shojaeilangari, S., Yau, WY. & Teoh, EK. Pose-invariant descriptor for facial emotion recognition. Machine Vision and Applications 27, 1063–1070 (2016). https://doi.org/10.1007/s00138-016-0794-2