Abstract
Few-shot object detection has made notable progress in recent years. However, most research assumes that base and novel classes come from the same domain; in real-world applications they often do not, and existing methods adapt poorly across this gap. To address this problem, we design an adaptive few-shot object detection framework. Building on Meta R-CNN, we add an image-level domain classifier after the last layer of the backbone to reduce domain discrepancy. To avoid the class-feature confusion that aligning image-level feature distributions can cause, we further add a class-aware feature filter module (CAFFM) that filters out features irrelevant to each specific class. On three base/novel splits, our method yields significant gains over the Meta R-CNN baseline: mAP50 improves by about 8% on split 2 and by about 3% on the other two splits. Our method also outperforms state-of-the-art methods in most cases across the three splits, validating the efficacy and generality of our approach.
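The abstract names two components but gives no implementation details: an adversarial image-level domain classifier (which, in domain-adaptive detection, is typically trained through a gradient reversal layer) and a class-aware channel filter (CAFFM, which resembles a squeeze-and-excitation gate). The following NumPy sketch is purely illustrative; the function names, bottleneck ratio, and weight shapes are assumptions, not the authors' code.

```python
import numpy as np

def grad_reverse_forward(x):
    """Identity in the forward pass; the reversal acts only on gradients."""
    return x

def grad_reverse_backward(grad, lam=1.0):
    """Flip and scale the incoming gradient so the backbone learns to
    *confuse* the domain classifier (adversarial feature alignment)."""
    return -lam * grad

def channel_filter(feat, w1, b1, w2, b2):
    """SE-style channel gate: squeeze -> bottleneck MLP -> sigmoid gate.
    feat: (C, H, W) feature map; w1: (C//r, C); w2: (C, C//r)."""
    squeezed = feat.mean(axis=(1, 2))                 # global average pool, (C,)
    hidden = np.maximum(0.0, w1 @ squeezed + b1)      # bottleneck + ReLU, (C//r,)
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden + b2)))  # per-channel weight in (0, 1)
    return feat * gate[:, None, None]                 # suppress irrelevant channels

# Demo with random weights (C = 8, reduction ratio r = 2).
rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 5, 5))
w1, b1 = rng.normal(size=(4, 8)), np.zeros(4)
w2, b2 = rng.normal(size=(8, 4)), np.zeros(8)
filtered = channel_filter(feat, w1, b1, w2, b2)
```

Because the gate lies strictly in (0, 1), the filter can only attenuate channels, never amplify them; which channels survive would be driven by the class-specific training signal in the full method.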
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Yan, J., Wang, H., Liu, X. (2024). An Adaptive Detector for Few Shot Object Detection. In: Luo, B., Cheng, L., Wu, ZG., Li, H., Li, C. (eds) Neural Information Processing. ICONIP 2023. Lecture Notes in Computer Science, vol 14447. Springer, Singapore. https://doi.org/10.1007/978-981-99-8079-6_45
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8078-9
Online ISBN: 978-981-99-8079-6