Optimization model based on attention mechanism for few-shot image classification

Liao, Ruizhi; Zhai, Junhai; Zhang, Feng

doi:10.1007/s00138-023-01502-2

Research
Published: 19 January 2024

Volume 35, article number 19, (2024)
Machine Vision and Applications

Ruizhi Liao¹,
Junhai Zhai¹ &
Feng Zhang¹

Abstract

Deep learning has emerged as the leading approach for pattern recognition, but its reliance on large labeled datasets poses challenges in real-world applications where annotated samples are difficult to obtain. Few-shot learning, inspired by human learning, enables fast adaptation to new concepts from limited examples. Optimization-based meta-learning has gained popularity as a few-shot learning method. However, it struggles to capture long-range dependencies among gradients and converges slowly, which makes it hard to extract features from limited samples. To overcome these issues, we propose MLAL, an attention-based optimization model for few-shot learning. The model comprises two parts: the attention-LSTM meta-learner, which optimizes gradients hierarchically using the self-attention mechanism, and the cross-attention base-learner, which uses the cross-attention mechanism to jointly learn the common category features of the support and query sets in a meta-task. Extensive experiments on two benchmark datasets, miniImageNet and tieredImageNet, show that MLAL achieves exceptional 1-shot and 5-shot classification accuracy. The code for our proposed method is available at https://github.com/wflrz123/MLAL.
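As a rough illustration of the cross-attention idea mentioned in the abstract, the sketch below lets each query-set feature vector attend over the support-set feature vectors using plain scaled dot-product attention. It is not the MLAL implementation: the function names, the toy 2-D features, and the absence of learned projection matrices are all assumptions made for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, supports):
    """Scaled dot-product attention from query vectors to support vectors.

    Hypothetical helper for illustration only (no learned projections).
    Returns (attended, weights): attended[i] is the attention-weighted
    average of the support vectors for query i, and weights[i] sums to 1.
    """
    dim = len(supports[0])
    attended, weights = [], []
    for q in queries:
        # Similarity of this query to every support vector, scaled by sqrt(dim).
        scores = [
            sum(a * b for a, b in zip(q, s)) / math.sqrt(dim) for s in supports
        ]
        w = softmax(scores)
        weights.append(w)
        # Weighted average of the support vectors, one coordinate at a time.
        attended.append(
            [sum(w[j] * supports[j][k] for j in range(len(supports)))
             for k in range(dim)]
        )
    return attended, weights


# Toy 2-way 1-shot episode with made-up 2-D features (one support per class).
support_features = [[1.0, 0.0], [0.0, 1.0]]
query_features = [[2.0, 0.1], [0.1, 2.0]]
attended, weights = cross_attention(query_features, support_features)
```

In this toy episode each query places most of its attention weight on the support vector it resembles, which is the behavior a cross-attention base-learner relies on when relating support and query samples of the same category.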

Acknowledgements

This research is supported by the National Key R&D Plan Project of China (2022YFE0196100), by the Key R&D Program of the Science and Technology Foundation of Hebei Province, China (19210310D), by the Natural Science Foundation of Hebei Province, China (F2021201020), and by the Innovation Capacity Enhancement Program-Science and Technology Platform Project, Hebei Province (22567623H).

Author information

Authors and Affiliations

Hebei Key Laboratory of Machine Learning and Computational Intelligence, College of Mathematics and Information Science, Hebei University, Baoding, 071002, Hebei, China
Ruizhi Liao, Junhai Zhai & Feng Zhang

Corresponding author

Correspondence to Junhai Zhai.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liao, R., Zhai, J. & Zhang, F. Optimization model based on attention mechanism for few-shot image classification. Machine Vision and Applications 35, 19 (2024). https://doi.org/10.1007/s00138-023-01502-2

Received: 18 June 2023
Revised: 22 October 2023
Accepted: 18 December 2023
Published: 19 January 2024
DOI: https://doi.org/10.1007/s00138-023-01502-2
