Multi-scale Positive Sample Refinement for Few-Shot Object Detection

Wu, Jiaxi; Liu, Songtao; Huang, Di; Wang, Yunhong

doi:10.1007/978-3-030-58517-4_27

Jiaxi Wu^12,13,14,
Songtao Liu^12,13,14,
Di Huang ORCID: orcid.org/0000-0002-2412-9330^12,13,14 &
…
Yunhong Wang^12,14

Part of the book series: Lecture Notes in Computer Science ((LNIP,volume 12361))

Included in the following conference series:

European Conference on Computer Vision

4486 Accesses
134 Citations

Abstract

Few-shot object detection (FSOD) helps detectors adapt to unseen classes with few training instances, and is useful when manual annotation is time-consuming or data acquisition is limited. Unlike previous attempts that exploit few-shot classification techniques to facilitate FSOD, this work highlights the necessity of handling the problem of scale variations, which is challenging due to the unique sample distribution. To this end, we propose a Multi-scale Positive Sample Refinement (MPSR) approach to enrich object scales in FSOD. It generates multi-scale positive samples as object pyramids and refines the prediction at various scales. We demonstrate its advantage by integrating it as an auxiliary branch to the popular architecture of Faster R-CNN with FPN, delivering a strong FSOD solution. Several experiments are conducted on PASCAL VOC and MS COCO, and the proposed approach achieves state of the art results and significantly outperforms other counterparts, which shows its effectiveness. Code is available at https://github.com/jiaxi-wu/MPSR.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 84.99; Price excludes VAT (USA)

Softcover Book: USD 109.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Bertinetto, L., Henriques, J.F., Valmadre, J., Torr, P.H.S., Vedaldi, A.: Learning feed-forward one-shot learners. In: Advances in Neural Information Processing Systems (NIPS) (2016)
Google Scholar
Bilen, H., Vedaldi, A.: Weakly supervised deep detection networks. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
Google Scholar
Cai, Z., Vasconcelos, N.: Cascade R-CNN: delving into high quality object detection. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Google Scholar
Chen, H., Wang, Y., Wang, G., Qiao, Y.: LSTD: a low-shot transfer detector for object detection. In: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (2018)
Google Scholar
Dong, X., Zheng, L., Ma, F., Yang, Y., Meng, D.: Few-example object detection with model communication. IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI) 41, 1641–1654 (2019)
Article Google Scholar
Everingham, M., Eslami, S.M.A., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The Pascal visual object classes challenge: a retrospective. Int. J. Comput. Vision 111(1), 98–136 (2014). https://doi.org/10.1007/s11263-014-0733-5
Article Google Scholar
Everingham, M., Gool, L.V., Williams, C.K.I., Winn, J.M., Zisserman, A.: The pascal visual object classes (VOC) challenge. Int. J. Comput. Vision 88, 303–338 (2010). https://doi.org/10.1007/s11263-009-0275-4
Article Google Scholar
Fan, Q., Zhuo, W., Tang, C.K., Tai, Y.W.: Few-shot object detection with attention-RPN and multi-relation detector. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2020)
Google Scholar
Finn, C., Abbeel, P., Levine, S.: Model-agnostic meta-learning for fast adaptation of deep networks. In: International Conference on Machine Learning (ICML) (2017)
Google Scholar
Gao, J., Wang, J., Dai, S., Li, L.J., Nevatia, R.: NOTE-RCNN: noise tolerant ensemble RCNN for semi-supervised object detection. In: IEEE International Conference on Computer Vision (ICCV) (2019)
Google Scholar
Girshick, R.B.: Fast R-CNN. In: IEEE International Conference on Computer Vision (ICCV) (2015)
Google Scholar
Girshick, R.B., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2014)
Google Scholar
He, K., Gkioxari, G., Dollár, P., Girshick, R.B.: Mask R-CNN. In: IEEE International Conference on Computer Vision (ICCV) (2017)
Google Scholar
He, K., Zhang, X., Ren, S., Sun, J.: Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI) 37, 1904–1916 (2015)
Article Google Scholar
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
Google Scholar
Kang, B., Liu, Z., Wang, X., Yu, F., Feng, J., Darrell, T.: Few-shot object detection via feature reweighting. In: IEEE International Conference on Computer Vision (ICCV) (2019)
Google Scholar
Karlinsky, L., et al.: RepMet: representative-based metric learning for classification and few-shot object detection. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Google Scholar
Kim, Y., Kang, B.-N., Kim, D.: SAN: learning relationship between convolutional features for multi-scale object detection. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11209, pp. 328–343. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01228-1_20
Chapter Google Scholar
Koch, G., Zemel, R., Salakhutdinov, R.: Siamese neural networks for one-shot image recognition. In: ICML Deep Learning Workshop (2015)
Google Scholar
Li, F., Fergus, R., Perona, P.: One-shot learning of object categories. IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI) 28, 594–611 (2006)
Article Google Scholar
Li, Z., et al.: Thoracic disease identification and localization with limited supervision. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Google Scholar
Lin, T., Dollár, P., Girshick, R.B., He, K., Hariharan, B., Belongie, S.J.: Feature pyramid networks for object detection. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
Google Scholar
Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
Chapter Google Scholar
Liu, S., Huang, D., Wang, Y.: Receptive field block net for accurate and fast object detection. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11215, pp. 404–419. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01252-6_24
Chapter Google Scholar
Liu, W., et al.: SSD: single shot MultiBox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 21–37. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0_2
Chapter Google Scholar
Misra, I., Shrivastava, A., Hebert, M.: Watch and learn: semi-supervised learning of object detectors from videos. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015)
Google Scholar
Munkhdalai, T., Yu, H.: Meta networks. In: International Conference on Machine Learning (ICML) (2017)
Google Scholar
Qiao, S., Liu, C., Shen, W., Yuille, A.L.: Few-shot image recognition by predicting parameters from activations. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Google Scholar
Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
Google Scholar
Redmon, J., Farhadi, A.: Yolo9000: better, faster, stronger. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
Google Scholar
Ren, S., He, K., Girshick, R.B., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems (NIPS) (2015)
Google Scholar
Russakovsky, O., et al.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vision 115(3), 211–252 (2015). https://doi.org/10.1007/s11263-015-0816-y
Article MathSciNet Google Scholar
Singh, B., Davis, L.S.: An analysis of scale invariance in object detection-SNIP. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Google Scholar
Singh, B., Najibi, M., Davis, L.S.: SNIPER: efficient multi-scale training. In: Advances in Neural Information Processing Systems (NIPS) (2018)
Google Scholar
Sung, F., Yang, Y., Zhang, L., Xiang, T., Torr, P.H.S., Hospedales, T.M.: Learning to compare: relation network for few-shot learning. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Google Scholar
Tang, P., Wang, X., Bai, X., Liu, W.: Multiple instance detection network with online instance classifier refinement. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
Google Scholar
Tang, P., et al.: Weakly supervised region proposal network and object detection. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11215, pp. 370–386. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01252-6_22
Chapter Google Scholar
Vinyals, O., Blundell, C., Lillicrap, T., Kavukcuoglu, K., Wierstra, D.: Matching networks for one shot learning. In: Advances in Neural Information Processing Systems (NIPS) (2016)
Google Scholar
Wan, F., Liu, C., Ke, W., Ji, X., Jiao, J., Ye, Q.: C-MIL: continuation multiple instance learning for weakly supervised object detection. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Google Scholar
Wang, X., Huang, T.E., Darrell, T., Gonzalez, J.E., Yu, F.: Frustratingly simple few-shot object detection. In: International Conference on Machine Learning (ICML) (2020)
Google Scholar
Yan, X., Chen, Z., Xu, A., Wang, X., Liang, X., Lin, L.: Meta R-CNN: towards general solver for instance-level low-shot learning. In: IEEE International Conference on Computer Vision (ICCV) (2019)
Google Scholar

Download references

Acknowledgment

This work is funded by the Research Program of State Key Laboratory of Software Development Environment (SKLSDE-2019ZX-03) and the Fundamental Research Funds for the Central Universities.

Author information

Authors and Affiliations

BAIC for BDBC, Beihang University, Beijing, 100191, China
Jiaxi Wu, Songtao Liu, Di Huang & Yunhong Wang
SKLSDE, Beihang University, Beijing, 100191, China
Jiaxi Wu, Songtao Liu & Di Huang
SCSE, Beihang University, Beijing, 100191, China
Jiaxi Wu, Songtao Liu, Di Huang & Yunhong Wang

Authors

Jiaxi Wu
View author publications
You can also search for this author in PubMed Google Scholar
Songtao Liu
View author publications
You can also search for this author in PubMed Google Scholar
Di Huang
View author publications
You can also search for this author in PubMed Google Scholar
Yunhong Wang
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Di Huang .

Editor information

Editors and Affiliations

University of Oxford, Oxford, UK
Andrea Vedaldi
Graz University of Technology, Graz, Austria
Horst Bischof
University of Freiburg, Freiburg im Breisgau, Germany
Thomas Brox
University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Jan-Michael Frahm

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 100 KB)

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Wu, J., Liu, S., Huang, D., Wang, Y. (2020). Multi-scale Positive Sample Refinement for Few-Shot Object Detection. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science(), vol 12361. Springer, Cham. https://doi.org/10.1007/978-3-030-58517-4_27

Download citation

DOI: https://doi.org/10.1007/978-3-030-58517-4_27
Published: 10 October 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58516-7
Online ISBN: 978-3-030-58517-4
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics