
DisRot: boosting the generalization capability of few-shot learning via knowledge distillation and self-supervised learning

  • Original Paper
  • Published: 2024
  • Journal: Machine Vision and Applications

Abstract

Few-shot learning (FSL) aims to adapt quickly to new categories from only a handful of samples. Despite significant progress in applying meta-learning to FSL tasks, challenges such as overfitting and poor generalization persist. Building on the demonstrated importance of strong feature representations, this work proposes DisRot, a novel two-strategy training mechanism that combines knowledge distillation with a rotation prediction task during the pre-training phase of transfer learning. Knowledge distillation enables shallow networks to learn the relational knowledge contained in deep networks, while the self-supervised rotation prediction task provides class-irrelevant, transferable knowledge to complement the supervised task. Optimizing these two tasks simultaneously allows the model to learn a generalizable and transferable feature embedding. Extensive experiments on the miniImageNet and FC100 datasets demonstrate that DisRot effectively improves the generalization ability of the model and is comparable to leading FSL methods.
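The joint objective described above can be made concrete. Below is a minimal PyTorch sketch of such two-task pre-training, assuming a standard Hinton-style distillation loss at temperature T, a 4-way (0°/90°/180°/270°) rotation head, and illustrative loss weights alpha and beta; the toy backbone, the choice to apply both heads to every rotated copy, and all hyperparameter values are assumptions for illustration, not details taken from the paper.

```python
# Minimal sketch of a DisRot-style joint pre-training objective.
# Assumed/illustrative: the toy backbone, loss weights, temperature,
# and applying both heads to all four rotated copies of each image.
import torch
import torch.nn as nn
import torch.nn.functional as F


def rotate_batch(x):
    """Return the batch rotated by 0/90/180/270 degrees, plus rotation labels."""
    rotated = torch.cat([torch.rot90(x, k, dims=(2, 3)) for k in range(4)], dim=0)
    rot_labels = torch.arange(4).repeat_interleave(x.size(0))
    return rotated, rot_labels


class Net(nn.Module):
    """Encoder with a supervised class head and a self-supervised rotation head."""

    def __init__(self, feat_dim=64, num_classes=64):
        super().__init__()
        self.encoder = nn.Sequential(          # stand-in for a ResNet-style backbone
            nn.Conv2d(3, feat_dim, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.cls_head = nn.Linear(feat_dim, num_classes)
        self.rot_head = nn.Linear(feat_dim, 4)

    def forward(self, x):
        z = self.encoder(x)
        return self.cls_head(z), self.rot_head(z)


def disrot_loss(student, teacher, x, y, T=4.0, alpha=0.5, beta=1.0):
    """Cross-entropy + distillation (KL at temperature T) + rotation prediction."""
    x_rot, y_rot = rotate_batch(x)
    logits_cls, logits_rot = student(x_rot)
    loss_ce = F.cross_entropy(logits_cls, y.repeat(4))   # labels follow cat order
    loss_rot = F.cross_entropy(logits_rot, y_rot)
    with torch.no_grad():                                # frozen teacher: soft targets
        teacher_cls, _ = teacher(x_rot)
    loss_kd = F.kl_div(F.log_softmax(logits_cls / T, dim=1),
                       F.softmax(teacher_cls / T, dim=1),
                       reduction="batchmean") * T * T
    return loss_ce + alpha * loss_kd + beta * loss_rot


# Usage: one backward pass on a random 84x84 batch (miniImageNet image size).
student, teacher = Net(), Net()
teacher.eval()                                           # in practice, a pre-trained deeper network
x, y = torch.randn(8, 3, 84, 84), torch.randint(0, 64, (8,))
disrot_loss(student, teacher, x, y).backward()
```

The T * T factor in the distillation term restores gradient magnitudes after temperature softening, following common knowledge-distillation practice.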



Acknowledgements

This work was partially supported by the Young and Middle-aged Science and Technology Talents Promotion Project of Qinghai Province (2022), the Science and Technology Project of Qinghai Province (No. 2023-QY-208), and the National Natural Science Foundation of China (Nos. 62062059, 62162053, 42265010).

Author information

Corresponding author

Correspondence to Jinfang Jia.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ma, C., Jia, J., Huang, J. et al. DisRot: boosting the generalization capability of few-shot learning via knowledge distillation and self-supervised learning. Machine Vision and Applications 35, 51 (2024). https://doi.org/10.1007/s00138-024-01529-z
