Abstract
Deep learning has been successfully applied to a wide range of computer vision tasks, but its success depends on abundant annotations. Few-shot learning, in contrast, aims to learn a classifier that can recognize new classes from only a few labeled examples, which remains a key challenge in visual recognition. Most existing methods rely on image-level features or uni-directional local similarity measures, which suffer from interference by non-dominant objects. To address this limitation, we propose a Bi-Directional Local Alignment (BDLA) approach for few-shot visual classification. Specifically, building on the episodic training mechanism, we first use a shared embedding network to encode each image as a 3D feature tensor that preserves semantic and spatial geometric information. We then construct a forward and a backward distance via nearest-neighbor search, matching each local descriptor of the query set to its semantically corresponding region in the support set, and vice versa. This bi-directional distance encourages alignment between similar semantic regions while filtering out interfering information. Finally, we merge the two directional distances through a convex combination and optimize the network end-to-end. Extensive experiments show that our approach outperforms several previous methods on four standard few-shot classification datasets.
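To make the abstract's description concrete, the following is a minimal sketch of a bi-directional local-alignment score between two sets of local descriptors. It is an illustrative reconstruction, not the authors' implementation: the function name, the use of cosine similarity, and the weight `lam` for the convex combination are assumptions based on the abstract.

```python
import numpy as np

def bidirectional_local_alignment(q, s, lam=0.5):
    """Hypothetical sketch of a bi-directional local-alignment score.

    q: (n_q, d) query local descriptors; s: (n_s, d) support descriptors.
    Forward direction: each query descriptor finds its nearest support
    descriptor. Backward direction: each support descriptor finds its
    nearest query descriptor. The two directions are merged by a convex
    combination with weight lam.
    """
    # L2-normalize descriptors so the dot product is cosine similarity.
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    s = s / np.linalg.norm(s, axis=1, keepdims=True)
    sim = q @ s.T                      # (n_q, n_s) pairwise similarities
    forward = sim.max(axis=1).mean()   # query -> support nearest neighbors
    backward = sim.max(axis=0).mean()  # support -> query nearest neighbors
    return lam * forward + (1.0 - lam) * backward
```

In an episodic setting, a query image would be scored against each class's support descriptors this way, with the class of the highest score predicted; averaging both directions is what suppresses one-sided matches to non-dominant regions.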
Acknowledgements
This work was supported in part by the Key Program of the National Natural Science Foundation of China under Grant No. 62136003, the National Natural Science Foundation of China under Grant Nos. 61772200 and 61772201, the Shanghai Pujiang Talent Program under Grant No. 17PJ1401900, and the Shanghai Economic and Information Commission "Special Fund for Information Development" under Grant No. XX-XXFZ-02-20-2463.
Cite this article
Zheng, Z., Feng, X., Yu, H. et al. BDLA: Bi-directional local alignment for few-shot learning. Appl Intell 53, 769–785 (2023). https://doi.org/10.1007/s10489-022-03479-3