Learn-to-Decompose: Cascaded Decomposition Network for Cross-Domain Few-Shot Facial Expression Recognition

  • Conference paper
  • First Online:
Computer Vision – ECCV 2022 (ECCV 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13679)

Included in the following conference series: European Conference on Computer Vision

Abstract

Most existing compound facial expression recognition (FER) methods rely on large-scale labeled compound expression data for training. However, collecting such data is labor-intensive and time-consuming. In this paper, we address the compound FER task in the cross-domain few-shot learning (FSL) setting, which requires only a few samples of compound expressions in the target domain. Specifically, we propose a novel cascaded decomposition network (CDNet), which cascades several learn-to-decompose modules with shared parameters based on a sequential decomposition mechanism, to obtain a transferable feature space. To alleviate the overfitting problem caused by the limited number of base classes in our task, a partial regularization strategy is designed to effectively exploit the best of both episodic training and batch training. By training across similar tasks on multiple basic expression datasets, CDNet learns a learn-to-decompose ability that can be easily adapted to identify unseen compound expressions. Extensive experiments on both in-the-lab and in-the-wild compound expression datasets demonstrate the superiority of the proposed CDNet over several state-of-the-art FSL methods. Code is available at: https://github.com/zouxinyi0625/CDNet.
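
As a rough illustration of the two ingredients described in the abstract, the minimal PyTorch-style sketch below applies a single, weight-shared decomposition block in a short cascade over backbone features and then classifies query samples against class prototypes, in the spirit of Prototypical Networks. All module names, dimensions, and the aggregation rule are simplifying assumptions made for exposition; this is not the authors' CDNet implementation, which is available at the repository linked above.

# Illustrative sketch only (not the authors' implementation): a weight-shared
# decomposition block applied sequentially, plus prototype-based few-shot
# classification. Names, dimensions, and the aggregation rule are assumptions.
import torch
import torch.nn as nn


class DecomposeModule(nn.Module):
    """One learn-to-decompose step: extract a component from the input feature
    and pass the remainder on to the next step of the cascade."""

    def __init__(self, dim: int):
        super().__init__()
        self.extract = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x: torch.Tensor):
        component = self.extract(x)   # extracted expression-related component
        residual = x - component      # remainder fed to the next cascade step
        return component, residual


class CascadedDecomposer(nn.Module):
    """Applies the SAME DecomposeModule several times (shared parameters) and
    sums the extracted components into a single embedding."""

    def __init__(self, dim: int, steps: int = 3):
        super().__init__()
        self.block = DecomposeModule(dim)  # one set of weights, reused at every step
        self.steps = steps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        components, residual = [], x
        for _ in range(self.steps):
            c, residual = self.block(residual)
            components.append(c)
        return torch.stack(components).sum(dim=0)


def prototype_logits(support, support_y, query, n_way):
    """Nearest-prototype classification over the embedded support/query sets."""
    prototypes = torch.stack([support[support_y == c].mean(dim=0) for c in range(n_way)])
    return -torch.cdist(query, prototypes)  # negative distance used as logits


if __name__ == "__main__":
    dim, n_way, k_shot, n_query = 64, 5, 1, 15            # a 5-way 1-shot episode
    model = CascadedDecomposer(dim)
    support = model(torch.randn(n_way * k_shot, dim))     # embeddings of labeled samples
    query = model(torch.randn(n_way * n_query, dim))      # embeddings of query samples
    support_y = torch.arange(n_way).repeat_interleave(k_shot)
    print(prototype_logits(support, support_y, query, n_way).shape)  # torch.Size([75, 5])

In the cross-domain setting described above, such an embedding model would be trained episodically on basic-expression tasks and then evaluated on N-way K-shot episodes drawn from unseen compound-expression classes.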

Acknowledgement

This work was partly supported by the National Natural Science Foundation of China under Grants 62071404, U21A20514, and 61872307, by the Open Research Projects of Zhejiang Lab under Grant 2021KG0AB02, by the Natural Science Foundation of Fujian Province under Grant 2020J01001, and by the Youth Innovation Foundation of Xiamen City under Grant 3502Z20206046.

Author information

Corresponding author

Correspondence to Yan Yan.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 2876 KB)

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Zou, X., Yan, Y., Xue, J.H., Chen, S., Wang, H. (2022). Learn-to-Decompose: Cascaded Decomposition Network for Cross-Domain Few-Shot Facial Expression Recognition. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13679. Springer, Cham. https://doi.org/10.1007/978-3-031-19800-7_40

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-19800-7_40

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-19799-4

  • Online ISBN: 978-3-031-19800-7

  • eBook Packages: Computer Science, Computer Science (R0)
