STA-GCN: Spatio-Temporal AU Graph Convolution Network for Facial Micro-expression Recognition

  • Conference paper
Pattern Recognition and Computer Vision (PRCV 2021)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 13019)

Abstract

A facial micro-expression (FME) is a fast and subtle facial muscle movement that typically reflects a person’s real mental state. FME recognition is highly challenging because of the low intensity and short duration of these movements. An FME can be decomposed into a combination of facial muscle action units (AUs), and analyzing the correlation between AUs is one solution for FME recognition. In this paper, we propose a framework called spatio-temporal AU graph convolutional network (STA-GCN) for FME recognition. First, pre-divided AU-related regions are input into a 3D CNN, and inter-frame relations are encoded by inserting a Non-Local module that focuses on apex information. Moreover, to obtain the inter-AU dependencies, we construct separate graphs of their spatial relationships and activation probabilities. The relationship features obtained from the graph convolution network (GCN) are used to activate the full-face features. Our proposed algorithm achieves a state-of-the-art accuracy of 76.08% and an F1-score of 70.96% on the CASME II dataset, outperforming all baselines.
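
The abstract does not give the layer equations, so the following is only a minimal sketch of the general idea: one graph-convolution step over AU nodes, followed by a sigmoid gate on a full-face feature vector. It assumes the widely used symmetric normalization with self-loops, a single adjacency matrix (the paper builds separate spatial and activation-probability graphs), and mean pooling before gating. The names `normalize_adjacency`, `gcn_layer`, and `gate_full_face`, the toy graph, and all numbers are hypothetical and are not taken from the paper.

```python
import math

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square adjacency matrix (assumed form)."""
    n = len(adj)
    # Add self-loops so every AU node keeps part of its own feature.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj_norm, feats, weight):
    """One propagation step: ReLU(adj_norm @ feats @ weight)."""
    out = matmul(matmul(adj_norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]

def gate_full_face(face_feat, node_feats):
    """Mean-pool node features, then rescale face features by their sigmoid.

    This gating form is an assumption made for illustration only.
    """
    n = len(node_feats)
    pooled = [sum(col) / n for col in zip(*node_feats)]
    return [f / (1.0 + math.exp(-p)) for f, p in zip(face_feat, pooled)]

# Hypothetical toy graph with three AU nodes and edges 0-1 and 1-2.
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # one 2-d feature per AU node
weight = [[1.0, 0.0], [0.0, 1.0]]              # identity weights for readability

adj_norm = normalize_adjacency(adj)
node_out = gcn_layer(adj_norm, feats, weight)
gated = gate_full_face([2.0, 2.0], node_out)
```

In a trained model the weight matrix would be learned and the node features would come from the 3D CNN applied to the AU-related regions; here both are fixed by hand so that the propagation step is easy to follow.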

Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. U20B2062), the fellowship of China Postdoctoral Science Foundation (No. 2021M690354), and the Beijing Municipal Science & Technology Project (No. Z191100007419001).

Author information

Corresponding author

Correspondence to Huimin Ma.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhao, X., Ma, H., Wang, R. (2021). STA-GCN: Spatio-Temporal AU Graph Convolution Network for Facial Micro-expression Recognition. In: Ma, H., et al. Pattern Recognition and Computer Vision. PRCV 2021. Lecture Notes in Computer Science, vol 13019. Springer, Cham. https://doi.org/10.1007/978-3-030-88004-0_7

  • DOI: https://doi.org/10.1007/978-3-030-88004-0_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-88003-3

  • Online ISBN: 978-3-030-88004-0

  • eBook Packages: Computer Science (R0)
