The Role of Coherence in Facial Expression Recognition

  • Lisa Graziani
  • Stefano Melacci
  • Marco Gori
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11298)


Recognizing facial expressions from static images or video sequences is a widely studied but still challenging problem. Recent progress achieved by deep neural architectures, or by ensembles of heterogeneous models, has shown that integrating multiple input representations leads to state-of-the-art results. In particular, the appearance and the shape of the input face, or the representations of some face parts, are commonly used to boost the quality of the recognizer. This paper investigates the application of Convolutional Neural Networks (CNNs) with the aim of building a versatile recognizer of expressions in static images that can be further applied to video sequences. We first study the importance of different face parts in the recognition task, focusing on appearance and shape-related features. We then cast the learning problem in the semi-supervised setting, exploiting video data in which only a few frames are supervised. The unsupervised portion of the training data is used to enforce two types of coherence: temporal coherence, and coherence among the predictions on the face parts. Our experimental analysis shows that coherence constraints can improve the quality of the expression recognizer, thus offering a suitable basis for profitably exploiting unsupervised video sequences.
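To make the two coherence notions concrete, the following is a minimal sketch of how such penalties could be computed on unsupervised frames. It is an illustration under stated assumptions, not the formulation used in the paper: the squared-difference form, the function names `temporal_coherence_penalty` and `part_coherence_penalty`, and the representation of predictions as plain lists of class probabilities are all hypothetical choices made here.

```python
def temporal_coherence_penalty(frame_probs):
    """Average squared change in predictions between consecutive frames.

    frame_probs: list of T probability vectors, one per video frame.
    ASSUMPTION: a squared-difference penalty; the paper may use another form.
    """
    total = 0.0
    for prev, curr in zip(frame_probs, frame_probs[1:]):
        total += sum((a - b) ** 2 for a, b in zip(prev, curr))
    # Divide by the number of consecutive pairs (at least 1 to avoid 0/0).
    return total / max(len(frame_probs) - 1, 1)


def part_coherence_penalty(part_probs):
    """Average squared deviation of each face-part prediction from their mean.

    part_probs: list of P probability vectors for the SAME frame, one per
    face part (or input representation).
    ASSUMPTION: agreement is measured against the mean prediction.
    """
    n_parts = len(part_probs)
    n_classes = len(part_probs[0])
    mean = [sum(p[c] for p in part_probs) / n_parts for c in range(n_classes)]
    total = sum(
        sum((p[c] - mean[c]) ** 2 for c in range(n_classes))
        for p in part_probs
    )
    return total / n_parts
```

In a semi-supervised setup of this kind, penalties like these would typically be added, with weighting coefficients, to the supervised loss computed on the few labeled frames; both vanish when predictions are stable over time and consistent across face parts.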


Facial expression recognition; Convolutional Neural Networks; Learning from constraints; Coherence constraints



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. DINFO, University of Florence, Florence, Italy
  2. DIISM, University of Siena, Siena, Italy
