
Attentive fine-grained recognition for cross-domain few-shot classification

  • Original Article
Neural Computing and Applications

Abstract

Cross-domain few-shot classification aims to recognize images from new categories and domains given only a few previously unseen examples. To address two problems that fine-grained recognition poses for cross-domain few-shot classification, namely the small overall discrepancy in feature distributions and the pronounced fine-grained differences across datasets, this paper proposes a simple and effective attentive fine-grained recognition (AFGR) model. Specifically, residual attention modules are stacked into a feature encoder based on the residual network, which linearly enhances different semantic feature information and helps the metric function better locate fine-grained features in the image. In addition, a bilinear metric-function structure is proposed to learn and fuse different fine-grained image features separately, since the weights of the two bilinear metric functions are not shared. Finally, the predictions of the bilinear metric functions are merged through posterior probability multiplication to obtain the classification result. Ablation and comparative experiments are carried out with the typical few-shot dataset mini-ImageNet as the training domain and the CUB, Cars, Places and Plantae datasets as the test domains. The experimental results demonstrate that the proposed AFGR method is effective, improving recognition accuracy by up to 13.82% and 7.95% over the latest results under the 5-way 1-shot and 5-way 5-shot settings, respectively, which also confirms the fine-grained recognition problems in cross-domain few-shot classification.
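The fusion step described above, merging the two bilinear metric branches by posterior probability multiplication, can be illustrated with a minimal sketch. This is not the authors' implementation; the branch posteriors `p1` and `p2` below are hypothetical 5-way outputs, and only the elementwise-product fusion with renormalization is shown.

```python
import numpy as np

def fuse_posteriors(p1, p2):
    """Merge two branch posteriors by elementwise multiplication,
    then renormalize so the result is again a probability distribution."""
    fused = p1 * p2
    return fused / fused.sum(axis=-1, keepdims=True)

# Hypothetical 5-way posteriors from the two (non-weight-sharing) metric branches
p1 = np.array([0.10, 0.40, 0.20, 0.20, 0.10])
p2 = np.array([0.05, 0.50, 0.15, 0.20, 0.10])

fused = fuse_posteriors(p1, p2)
pred = int(np.argmax(fused))  # class index with highest fused posterior
```

Multiplying posteriors rewards classes that both branches agree on: a class must score well under both fine-grained views to dominate the fused distribution.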



Acknowledgements

This work is supported by the Beijing Natural Science Foundation (4202015) and the Liaoning Natural Science Foundation (2020-KF-23-06).

Author information


Corresponding author

Correspondence to Chongchong Yu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Sa, L., Yu, C., Ma, X. et al. Attentive fine-grained recognition for cross-domain few-shot classification. Neural Comput & Applic 34, 4733–4746 (2022). https://doi.org/10.1007/s00521-021-06627-x

