Global and local representation collaborative learning for few-shot learning

Published in: Journal of Intelligent Manufacturing

A Correction to this article was published on 14 March 2023

Abstract

The objective of few-shot learning (FSL) is to learn a model that can quickly adapt to novel classes from only a few examples. Recent work has shown that a powerful representation, obtained by training a base learner in both supervised and self-supervised manners, offers significant advantages over existing sophisticated FSL algorithms. In this paper, we build on this insight and propose a new framework called global and local representation collaborative learning (GLCL), which combines the complementary advantages of global equivariance and local aggregation. Global equivariance learns the internal structure of the data to improve class discrimination, while local aggregation retains important semantic information to enrich the feature representation. In addition, we design a cross-view contrastive learning objective that promotes consistent learning between the two branches and lets each implicitly explore useful knowledge from the other. Optimizing these contrastive objectives simultaneously allows the model to encode informative features while maintaining strong generalization to new tasks. We demonstrate consistent and substantial performance gains on FSL classification tasks across multiple datasets. Our code is available at https://github.com/zjgans/GLCL.
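Since the full text is behind the access wall, the following is only a minimal sketch, inferred from the abstract alone, of how such a joint objective might be wired together in PyTorch: a supervised cross-entropy term on base classes, a global-equivariance term implemented here as rotation prediction (one common choice; the paper's actual equivariance task may differ), and a cross-view InfoNCE term between two projection heads standing in for the global and local branches. Every name in the sketch (GLCLSketch, glcl_step, info_nce, the loss weights) is hypothetical and not the authors' implementation; see the linked repository for the real code.

```python
# Illustrative sketch only: a joint supervised + self-supervised objective
# of the kind described in the abstract. All names and weights are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GLCLSketch(nn.Module):
    """Toy model with a supervised head, a rotation-prediction head
    (a stand-in for global equivariance), and two projection heads
    standing in for the global and local branches."""
    def __init__(self, feat_dim=64, num_classes=64, proj_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(          # stand-in for the real backbone
            nn.Conv2d(3, feat_dim, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.cls_head = nn.Linear(feat_dim, num_classes)  # base-class supervision
        self.rot_head = nn.Linear(feat_dim, 4)            # predict 0/90/180/270 rotation
        self.proj_g = nn.Linear(feat_dim, proj_dim)       # "global" projection
        self.proj_l = nn.Linear(feat_dim, proj_dim)       # "local" projection

def info_nce(q, k, temperature=0.1):
    """Standard InfoNCE between two aligned batches of embeddings;
    matching indices are positives, all others negatives."""
    q, k = F.normalize(q, dim=1), F.normalize(k, dim=1)
    logits = q @ k.t() / temperature
    targets = torch.arange(q.size(0))
    return F.cross_entropy(logits, targets)

def glcl_step(model, images, labels, lambdas=(1.0, 1.0, 0.5)):
    """One training step: supervised CE + rotation (equivariance) CE
    + cross-view InfoNCE between the two projections."""
    rot = torch.randint(0, 4, (images.size(0),))
    rotated = torch.stack([torch.rot90(img, int(r), dims=(1, 2))
                           for img, r in zip(images, rot)])
    f = model.encoder(images)
    f_rot = model.encoder(rotated)
    loss_sup = F.cross_entropy(model.cls_head(f), labels)
    loss_equi = F.cross_entropy(model.rot_head(f_rot), rot)
    loss_cross = info_nce(model.proj_g(f), model.proj_l(f_rot))
    lam_s, lam_e, lam_c = lambdas
    return lam_s * loss_sup + lam_e * loss_equi + lam_c * loss_cross

# Usage on dummy data: one backward pass through the combined loss.
model = GLCLSketch()
images = torch.randn(8, 3, 32, 32)
labels = torch.randint(0, 64, (8,))
loss = glcl_step(model, images, labels)
loss.backward()
```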





Acknowledgements

This study was supported by the National Natural Science Foundation of China (No. U1811461) and the State Key Development Program of China (No. 2018YFC0116904).

Author information

Corresponding author

Correspondence to Qingling Cai.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised: In Table 2, the values in column 6 have been corrected.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhou, J., Cai, Q. Global and local representation collaborative learning for few-shot learning. J Intell Manuf 35, 647–664 (2024). https://doi.org/10.1007/s10845-022-02066-0

