Skip to main content

Few-Shot Object Counting and Detection

  • Conference paper
  • First Online:
Computer Vision – ECCV 2022 (ECCV 2022)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13680))

Included in the following conference series:

Abstract

We tackle a new task of few-shot object counting and detection. Given a few exemplar bounding boxes of a target object class, we seek to count and detect all objects of the target class. This task shares the same supervision as the few-shot object counting but additionally outputs the object bounding boxes along with the total object count. To address this challenging problem, we introduce a novel two-stage training strategy and a novel uncertainty-aware few-shot object detector: Counting-DETR. The former is aimed at generating pseudo ground-truth bounding boxes to train the latter. The latter leverages the pseudo ground-truth provided by the former but takes the necessary steps to account for the imperfection of pseudo ground-truth. To validate the performance of our method on the new task, we introduce two new datasets named FSCD-147 and FSCD-LVIS. Both datasets contain images with complex scenes, multiple object classes per image, and a huge variation in object shapes, sizes, and appearance. Our proposed approach outperforms very strong baselines adapted from few-shot object counting and few-shot object detection with a large margin in both counting and detection metrics. The code and models are available at https://github.com/VinAIResearch/Counting-DETR.

T. Nguyen and C. Pham—Equal contribution.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. Abousamra, S., Hoai, M., Samaras, D., Chen, C.: Localization in the crowd with topological constraints. In: Proceedings of AAAI Conference on Artificial Intelligence (AAAI) (2021)

    Google Scholar 

  2. Arteta, C., Lempitsky, V., Noble, J.A., Zisserman, A.: Detecting overlapping instances in microscopy images using extremal region trees. Med. Image Anal. 27, 3–16 (2016). discrete Graphical Models in Biomedical Image Analysis

    Google Scholar 

  3. Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S.: End-to-end object detection with transformers. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12346, pp. 213–229. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58452-8_13

    Chapter  Google Scholar 

  4. Chen, H., Wang, Y., Wang, G., Qiao, Y.: Lstd: a low-shot transfer detector for object detection. In: Proceedings of the AAAI Conference on Artificial Intelligence (2018)

    Google Scholar 

  5. Chen, L., Yang, T., Zhang, X., Zhang, W., Sun, J.: Points as queries: weakly semi-supervised object detection by points. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8823–8832 (2021)

    Google Scholar 

  6. Fan, Q., Zhuo, W., Tang, C.K., Tai, Y.W.: Few-shot object detection with attention-RPN and multi-relation detector. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2020)

    Google Scholar 

  7. Fan, Z., Ma, Y., Li, Z., Sun, J.: Generalized few-shot object detection without forgetting. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2021)

    Google Scholar 

  8. Fang, Y., Zhan, B., Cai, W., Gao, S., Hu, B.: Locality-constrained spatial transformer network for video crowd counting. arXiv preprint arXiv:1907.07911 (2019)

  9. Feng, D., Rosenbaum, L., Timm, F., Dietmayer, K.: Labels are not perfect: improving probabilistic object detection via label uncertainty. arXiv preprint arXiv:2008.04168 (2020)

  10. Feng, D., et al.: Labels are not perfect: inferring spatial uncertainty in object detection. IEEE Trans. Intell. Transp. Syst. 23(8), 9981–9994 (2021)

    Article  Google Scholar 

  11. Goldman, E., Herzig, R., Eisenschtat, A., Goldberger, J., Hassner, T.: Precise detection in densely packed scenes. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)

    Google Scholar 

  12. Gupta, A., Dollar, P., Girshick, R.: Lvis: a dataset for large vocabulary instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)

    Google Scholar 

  13. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)

    Google Scholar 

  14. He, Y., Zhu, C., Wang, J., Savvides, M., Zhang, X.: Bounding box regression with uncertainty for accurate object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)

    Google Scholar 

  15. Hsieh, M.R., Lin, Y.L., Hsu, W.H.: Drone-based object counting by spatially regularized regional proposal networks. In: The IEEE International Conference on Computer Vision (ICCV), IEEE (2017)

    Google Scholar 

  16. Hu, D., Mou, L., Wang, Q., Gao, J., Hua, Y., Dou, D., Zhu, X.X.: Ambient sound helps: audiovisual crowd counting in extreme conditions. arXiv preprint (2020)

    Google Scholar 

  17. Idrees, H., et al.: Composition loss for counting, density map estimation and localization in dense crowds. In: Proceedings of the European Conference on Computer Vision (ECCV) (2018)

    Google Scholar 

  18. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. In: International Conference on Machine Learning (ICML), PMLR (2015)

    Google Scholar 

  19. Kang, B., Liu, Z., Wang, X., Yu, F., Feng, J., Darrell, T.: Few-shot object detection via feature reweighting. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (2019)

    Google Scholar 

  20. Lee, Y., Hwang, J.W., Kim, H.I., Yun, K., Kwon, Y.: Localization uncertainty estimation for anchor-free object detection. arXiv preprint arXiv:2006.15607 (2020)

  21. Lian, D., Li, J., Zheng, J., Luo, W., Gao, S.: Density map regression guided detection network for rgb-d crowd counting and localization. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019

    Google Scholar 

  22. Lin, H., Hong, X., Wang, Y.: Object counting: you only need to look at one. arXiv preprint arXiv:2112.05993 (2021)

  23. Lin, T.Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2017)

    Google Scholar 

  24. Liu, W., Salzmann, M., Fua, P.: Context-aware crowd counting. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019

    Google Scholar 

  25. Loshchilov, I., Hutter, F.: Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101 (2017)

  26. Lu, E., Xie, W., Zisserman, A.: Class-agnostic counting. In: Asian Conference on Computer Vision (ACCV) (2018)

    Google Scholar 

  27. Mundhenk, T.N., Konjevod, G., Sakla, W.A., Boakye, K.: A large contextual dataset for classification, detection and counting of cars with deep learning. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9907, pp. 785–800. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46487-9_48

    Chapter  Google Scholar 

  28. Paszke, A., et al.: Pytorch: an imperative style, high-performance deep learning library. In: Wallach, H., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E., Garnett, R. (eds.) Advances in Neural Information Processing Systems (NeurIPS), vol. 32, pp. 8024–8035. Curran Associates, Inc. (2019)

    Google Scholar 

  29. Peng, D., Sun, Z., Chen, Z., Cai, Z., Xie, L., Jin, L.: Detecting heads using feature refine net and cascaded multi-scale architecture. In: 2018 24th International Conference on Pattern Recognition (ICPR), IEEE (2018)

    Google Scholar 

  30. Ranjan, V., Hoai, M.: Vicinal counting networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4221–4230 (2022)

    Google Scholar 

  31. Ranjan, V., Le, H., Hoai, M.: Iterative crowd counting. In: Proceedings of European Conference on Computer Vision (ECCV) (2018)

    Google Scholar 

  32. Ranjan, V., Sharma, U., Nguyen, T., Hoai, M.: Learning to count everything. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2021)

    Google Scholar 

  33. Ranjan, V., Wang, B., Shah, M., Hoai, M.: Uncertainty estimation and sample selection for crowd counting. In: Proceedings of the Asian Conference on Computer Vision (ACCV) (2020)

    Google Scholar 

  34. Ren, S., He, K., Girshick, R., Sun, J.: Faster r-cnn: Towards real-time object detection with region proposal networks. Adv. Neural Inf. Process. Syst. (NeurIPS) 28 (2015)

    Google Scholar 

  35. Rezatofighi, H., Tsoi, N., Gwak, J., Sadeghian, A., Reid, I., Savarese, S.: Generalized intersection over union: a metric and a loss for bounding box regression. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 658–666 (2019)

    Google Scholar 

  36. Russakovsky, O., et al.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 115(3), 211–252 (2015). https://doi.org/10.1007/s11263-015-0816-y

    Article  MathSciNet  Google Scholar 

  37. Sindagi, V.A., Yasarla, R., Patel, V.M.: Pushing the frontiers of unconstrained crowd counting: new dataset and benchmark method. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (2019)

    Google Scholar 

  38. Tian, Z., Shen, C., Chen, H., He, T.: Fcos: fully convolutional one-stage object detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (2019)

    Google Scholar 

  39. Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems (NeurIPS) (2017)

    Google Scholar 

  40. Wang, B., Liu, H., Samaras, D., Hoai, M.: Distribution matching for crowd counting. In: Advances in Neural Information Processing Systems (NeurIPS) (2020)

    Google Scholar 

  41. Wang, Q., Gao, J., Lin, W., Li, X.: Nwpu-crowd: a large-scale benchmark for crowd counting and localization. IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI) 43(6), 2141–2149 (2021)

    Article  Google Scholar 

  42. Wang, X., Huang, T.E., Darrell, T., Gonzalez, J.E., Yu, F.: Frustratingly simple few-shot object detection. arXiv preprint arXiv:2003.06957 (2020)

  43. Wang, Y., Zhang, X., Yang, T., Sun, J.: Anchor detr: query design for transformer-based detector. arXiv preprint arXiv:2109.07107 (2021)

  44. Wu, J., Liu, S., Huang, D., Wang, Y.: Multi-scale positive sample refinement for few-shot object detection. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12361, pp. 456–472. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58517-4_27

    Chapter  Google Scholar 

  45. Xiao, Y., Marlet, R.: Few-shot object detection and viewpoint estimation for objects in the wild. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12362, pp. 192–210. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58520-4_12

    Chapter  Google Scholar 

  46. Xie, W., Noble, J.A., Zisserman, A.: Microscopy cell counting and detection with fully convolutional regression networks. Comput. Methods Biomech. Biomed. Eng. Imaging Vis. 6(3), 283–292 (2018)

    Article  Google Scholar 

  47. Yan, X., Chen, Z., Xu, A., Wang, X., Liang, X., Lin, L.: Meta r-cnn : towards general solver for instance-level low-shot learning. In: Proceedings of IEEE International Conference on Computer Vision (ICCV) (2019)

    Google Scholar 

  48. You, Z., Yang, K., Luo, W., Lu, X., Cui, L., Le, X.: Iterative correlation-based feature refinement for few-shot counting. arXiv preprint arXiv:2201.08959 (2022)

  49. Zhang, C., Li, H., Wang, X., Yang, X.: Cross-scene crowd counting via deep convolutional neural networks. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015)

    Google Scholar 

  50. Zhang, G., Luo, Z., Cui, K., Lu, S.: Meta-detr: few-shot object detection via unified image-level meta-learning. arXiv preprint arXiv:2103.11731 (2021)

  51. Zhang, Y., Zhou, D., Chen, S., Gao, S., Ma, Y.: Single-image crowd counting via multi-column convolutional neural network. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)

    Google Scholar 

  52. Zhou, X., Wang, D., Krähenbühl, P.: Objects as points. In: arXiv preprint arXiv:1904.07850 (2019)

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Khoi Nguyen .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Nguyen, T., Pham, C., Nguyen, K., Hoai, M. (2022). Few-Shot Object Counting and Detection. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13680. Springer, Cham. https://doi.org/10.1007/978-3-031-20044-1_20

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-20044-1_20

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-20043-4

  • Online ISBN: 978-3-031-20044-1

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics