Abstract
Few-shot object detection, the detection of novel foreground objects from only a few annotated images, frees a detector from its dependence on large-scale annotated datasets. The practical challenge lies in establishing correlations among the few available instances and in balancing sensitivity between base and novel categories. In this paper, we propose a few-shot detector that uses instance-level feature correlation, built on an interactive self-attention module, to mine discriminative representations from scarce novel instances. In addition, a feature aggregation procedure based on an extended soft threshold shrinkage is introduced to eliminate redundant information while enhancing the representation sensitivity between base and novel categories. In the training phase, an orthogonal loss is applied to further improve the distinguishability of features across categories. Finally, we compare the proposed approach with competitive detectors on the PASCAL VOC07/12 and MS-COCO benchmarks; the results verify its superior detection precision on the AP, mAP and AR measurements.
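For readers unfamiliar with the term, the sketch below shows the standard soft-thresholding operator, which shrinks each value toward zero by a threshold and zeroes anything whose magnitude does not exceed it. This is only the textbook operator; the "extended" variant used in the paper is not specified in the abstract, and the function name `soft_threshold`, the threshold `tau`, and the sample values are illustrative assumptions.

```python
def soft_threshold(x, tau):
    """Textbook soft thresholding (an assumption, not the paper's extended variant).

    Values with |x| <= tau become 0; larger values move toward zero by tau.
    """
    if tau < 0:
        raise ValueError("tau must be non-negative")
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


# Hypothetical feature responses: small ones are treated as redundant and removed.
features = [0.05, -0.3, 1.2, -0.02]
shrunk = [soft_threshold(v, 0.1) for v in features]
```

In a shrinkage-based aggregation step, the threshold would typically be learned or derived from the features rather than fixed by hand as it is here.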
Acknowledgements
The research was supported by the National Natural Science Foundation of China (62062048).
About this article
Cite this article
Wang, M., Ning, H. & Liu, H. Object detection based on few-shot learning via instance-level feature correlation and aggregation. Appl Intell 53, 351–368 (2023). https://doi.org/10.1007/s10489-022-03399-2
DOI: https://doi.org/10.1007/s10489-022-03399-2