Tree Structure-Aware Few-Shot Image Classification via Hierarchical Aggregation

Zhang, Min; Huang, Siteng; Li, Wenbin; Wang, Donglin

doi:10.1007/978-3-031-20044-1_26

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13680)

Included in the following conference series:

European Conference on Computer Vision

Abstract

In this paper, we focus on how to learn additional feature representations for few-shot image classification through pretext tasks (e.g., rotation or color permutation). The additional knowledge generated by pretext tasks can further improve the performance of few-shot learning (FSL) because it differs from human-annotated supervision (i.e., the class labels of FSL tasks). To solve this problem, we present a plug-in Hierarchical Tree Structure-aware (HTS) method, which not only learns the relationship between FSL and pretext tasks but, more importantly, adaptively selects and aggregates the feature representations generated by pretext tasks to maximize the performance of FSL tasks. A hierarchical tree constructing component and a gated selection aggregating component are introduced to construct the tree structure and to find richer transferable knowledge that can rapidly adapt to novel classes with a few labeled images. Extensive experiments show that HTS can significantly enhance multiple few-shot methods and achieve new state-of-the-art performance on four benchmark datasets. The code is available at: https://github.com/remiMZ/HTS-ECCV22.
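The two components named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation (see the repository linked above for that): the function names `build_tree`, `embed`, and `gated_aggregate` are hypothetical, the flattening "feature extractor" is a stand-in, and the scalar sigmoid gate is an assumed simplification of the gated selection aggregating component. The sketch only shows the general idea of placing an image at the root of a tree, attaching its rotated copies (a pretext task) as children, and adding gated child features to the root.

```python
import math


def rotate90(image, k):
    """Rotate a 2-D list-of-lists image by k * 90 degrees counter-clockwise."""
    for _ in range(k % 4):
        image = [list(row) for row in zip(*image)][::-1]
    return image


def build_tree(image, rotations=(1, 2, 3)):
    """Root = original image; children = rotated copies (pretext-task views)."""
    return {"root": image, "children": [rotate90(image, k) for k in rotations]}


def embed(image):
    """Stand-in feature extractor (assumption): flatten pixels into a vector."""
    return [float(p) for row in image for p in row]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_aggregate(root_feat, child_feats, weights, bias=0.0):
    """Add each child's features to the root, scaled by a scalar sigmoid gate.

    The gate is an assumed simplification; a learned model would train
    `weights` and `bias` so that useful children receive gates close to 1.
    """
    out = list(root_feat)
    for child in child_feats:
        gate = sigmoid(sum(w * c for w, c in zip(weights, child)) + bias)
        out = [o + gate * c for o, c in zip(out, child)]
    return out


image = [[1, 2], [3, 4]]
tree = build_tree(image)
root_feat = embed(tree["root"])
child_feats = [embed(child) for child in tree["children"]]
weights = [0.0] * len(root_feat)  # zero weights make every gate equal 0.5
fused = gated_aggregate(root_feat, child_feats, weights)
print(fused)  # [5.5, 6.0, 6.5, 7.0]
```

With untrained (zero) weights every child contributes equally; the point of a learned gate is that some pretext-task views can be down-weighted when they do not help the few-shot task.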

Notes

1. Note that our method mainly focuses on how to adaptively learn the knowledge of pretext tasks and improve the performance of few-shot image classification.
2. During training in the 5-way 1-/5-shot setting, one episode takes 0.45/0.54 s (0.39/0.50 s for the baseline) with 75 query images, measured over 500 randomly sampled episodes.
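For readers unfamiliar with the episodic setup in note 2, the following is a minimal sketch of sampling one N-way K-shot episode. The function name `sample_episode` and the toy label list are hypothetical, and splitting the 75 query images evenly as 15 per class is an assumption (75 / 5 = 15), not something the note states.

```python
import random


def sample_episode(labels, n_way=5, k_shot=1, n_query=15, rng=None):
    """Return (support, query) index lists for one N-way K-shot episode."""
    rng = rng or random.Random()
    # Group example indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    # Pick the episode's classes, then disjoint support/query examples per class.
    classes = rng.sample(sorted(by_class), n_way)
    support, query = [], []
    for cls in classes:
        picked = rng.sample(by_class[cls], k_shot + n_query)
        support.extend(picked[:k_shot])
        query.extend(picked[k_shot:])
    return support, query


# Toy dataset (assumption): 10 classes with 30 images each.
labels = [c for c in range(10) for _ in range(30)]
support, query = sample_episode(labels, rng=random.Random(0))
print(len(support), len(query))  # 5 75
```

Calling it with `k_shot=5` gives the 5-shot variant: 25 support images and the same 75 query images.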

Acknowledgments

This work was supported by the National Science and Technology Innovation 2030 - Major Project (Grant No. 2022ZD0208800), and NSFC General Program (Grant No. 62176215). We thank Dr. Zhitao Wang for helpful feedback and discussions.

Author information

Authors and Affiliations

Zhejiang University, Hangzhou, China
Min Zhang
Westlake University, Hangzhou, China
Min Zhang, Siteng Huang & Donglin Wang
State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
Wenbin Li
Institute of Advanced Technology, Westlake Institute for Advanced Study, Hangzhou, China
Min Zhang, Siteng Huang & Donglin Wang

Corresponding author

Correspondence to Donglin Wang.

Editor information

Editors and Affiliations

Tel Aviv University, Tel Aviv, Israel
Shai Avidan
University College London, London, UK
Gabriel Brostow
Google AI, Accra, Ghana
Moustapha Cissé
University of Catania, Catania, Italy
Giovanni Maria Farinella
Facebook (United States), Menlo Park, CA, USA
Tal Hassner

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 634 KB)

About this paper

Cite this paper

Zhang, M., Huang, S., Li, W., Wang, D. (2022). Tree Structure-Aware Few-Shot Image Classification via Hierarchical Aggregation. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13680. Springer, Cham. https://doi.org/10.1007/978-3-031-20044-1_26

DOI: https://doi.org/10.1007/978-3-031-20044-1_26
Published: 20 October 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20043-4
Online ISBN: 978-3-031-20044-1
eBook Packages: Computer Science, Computer Science (R0)
