
Learning Features Robust to Image Variations with Siamese Networks for Facial Expression Recognition

  • Conference paper
MultiMedia Modeling (MMM 2017)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 10132))

This paper proposes a computationally efficient method for learning features robust to image variations for facial expression recognition (FER). The proposed method minimizes the feature difference between an image subject to a given variation and a corresponding target image captured under the best conditions for FER (i.e., a frontal face image with uniform illumination). This is achieved by regularizing the objective function during learning, where a Siamese network is employed. At the test stage, the learned network parameters are transferred to a convolutional neural network (CNN), from which features robust to image variations can be obtained. Experiments were conducted on the Multi-PIE dataset to evaluate the proposed method under a large number of variations, including pose and illumination. The results show that the proposed method improves FER performance under different variations without additional computational cost.
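The objective described in the abstract can be sketched as a classification loss regularized by the feature distance between the two Siamese branches. The following is a minimal NumPy illustration under assumed names (`shared_features`, a single fully connected layer, weighting factor `lam`); it is not the authors' CNN implementation, only a sketch of the weight-sharing and regularization idea:

```python
import numpy as np

def shared_features(x, W):
    """One fully connected layer with ReLU activation.

    Both branches of the Siamese network call this with the SAME
    parameter matrix W -- the weight sharing is what makes the
    network "Siamese"."""
    return np.maximum(0.0, x @ W)

def feature_difference(x_var, x_target, W):
    """Squared L2 distance between features of the image with
    variations (e.g. non-frontal pose, non-uniform illumination)
    and the best-condition target image (frontal, uniform light)."""
    f_var = shared_features(x_var, W)
    f_tgt = shared_features(x_target, W)
    return float(np.sum((f_var - f_tgt) ** 2))

def regularized_loss(classification_loss, x_var, x_target, W, lam=0.5):
    """Objective = expression classification loss + lam * feature
    difference; lam trades off accuracy against robustness."""
    return classification_loss + lam * feature_difference(x_var, x_target, W)
```

Because both branches share `W`, minimizing `regularized_loss` pushes the shared parameters toward features that are invariant to the variation; at test time only one branch (a plain CNN with the learned parameters) is needed, which is why no extra test-time cost is incurred.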




This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (No. 2015R1A2A2A01005724).

Corresponding author

Correspondence to Wissam J. Baddar.


Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Baddar, W.J., Kim, D.H., Ro, Y.M. (2017). Learning Features Robust to Image Variations with Siamese Networks for Facial Expression Recognition. In: Amsaleg, L., Guðmundsson, G., Gurrin, C., Jónsson, B., Satoh, S. (eds) MultiMedia Modeling. MMM 2017. Lecture Notes in Computer Science, vol 10132. Springer, Cham.


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-51810-7

  • Online ISBN: 978-3-319-51811-4

  • eBook Packages: Computer Science (R0)
