
ARM: a lightweight module to amend facial expression representation

  • Original Paper
  • Published in Signal, Image and Video Processing

Abstract

Facial expression recognition (FER) classifies visual representations of human faces, so different expression features share generic facial components that can be exploited to build a concise feature extraction scheme. In addition, this paper is, to our knowledge, the first to investigate a flawed aspect of padding in convolutional layers: we find that padding accumulated across a convolutional network erodes the features and causes a partial loss of recognition performance. Although no complete replacement for padding has been established, we mitigate its detrimental effect at the representation extraction stage by reprocessing the high-dimensional features at the back end of the network. Our proposed Amend Representation Module (ARM) (1) deconstructs expression features into face and expression components, making feature extraction easier and the feature representation more accurate, and (2) weakens the weights of degraded feature pixels to obtain more discriminative representations from the eroded features. Experiments on public benchmarks show that ARM built on ResNet-18 markedly boosts FER performance.
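To make the two operations described above concrete, the following is a minimal, illustrative sketch rather than the authors' ARM implementation. It assumes a PyTorch setting with a ResNet-18 backbone producing 512 x 7 x 7 feature maps, models the shared "face" component as the spatial mean of the features, and uses a learnable sigmoid-gated spatial mask as a stand-in for the paper's re-weighting of padding-eroded pixels; all names and design choices here are hypothetical.

import torch
import torch.nn as nn


class AmendSketch(nn.Module):
    """Illustrative module: decompose features and down-weight eroded border pixels."""

    def __init__(self, channels: int = 512, spatial: int = 7, num_classes: int = 7):
        super().__init__()
        # Learnable per-pixel logits for a spatial mask; border pixels, which
        # suffer most from accumulated zero-padding, can be suppressed here.
        self.pixel_logits = nn.Parameter(torch.zeros(1, 1, spatial, spatial))
        self.classifier = nn.Linear(channels, num_classes)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # feat: (B, C, H, W) high-dimensional features from the backbone.
        # "Face" component: the spatial mean shared across the whole face.
        face = feat.mean(dim=(2, 3), keepdim=True)
        # "Expression" component: the residual after removing the shared part.
        expression = feat - face
        # Re-weight pixels with the sigmoid mask before pooling and classifying.
        amended = expression * torch.sigmoid(self.pixel_logits)
        return self.classifier(amended.mean(dim=(2, 3)))


# Usage: logits = AmendSketch()(torch.randn(8, 512, 7, 7))  # -> shape (8, 7)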


Data Availability

The three datasets used in this paper are freely available, and access is granted on request for non-profit purposes.


Funding

This work is supported by the Natural Science Foundation of Nanjing University of Posts and Telecommunications under Grant No. NY221077 and the National Natural Science Foundation of China under Grant No. 52170001.

Author information


Corresponding author

Correspondence to Songhao Zhu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Shi, J., Zhu, S., Wang, D. et al. ARM: a lightweight module to amend facial expression representation. SIViP 17, 1315–1323 (2023). https://doi.org/10.1007/s11760-022-02339-4

