
From Open Set to Closed Set: Supervised Spatial Divide-and-Conquer for Object Counting

Published in: International Journal of Computer Vision

Abstract

Visual counting, the task of estimating the number of objects in an image or video, is open-set by nature: the count can in theory take any value in \([0,+\infty )\). In reality, however, collected data are limited, so only a closed set of counts is ever observed. Existing methods typically model counting as regression, but they are prone to fail on unseen scenes whose counts lie outside the observed closed set. Counting, in fact, has an interesting and exclusive property—it is spatially decomposable. A dense region can always be divided until every sub-region count falls within the previously observed closed set. We therefore introduce the idea of spatial divide-and-conquer (S-DC), which transforms open-set counting into a closed-set problem. This idea is implemented by a novel Supervised Spatial Divide-and-Conquer Network (SS-DCNet), which learns from a closed set yet generalizes to open-set scenarios via S-DC. We provide mathematical analyses and a controlled experiment on synthetic data that demonstrate why closed-set modeling works well. Experiments show that SS-DCNet achieves state-of-the-art performance in crowd counting, vehicle counting, and plant counting. SS-DCNet also demonstrates superior transferability under the cross-dataset setting. Code and models are available at: https://git.io/SS-DCNet.
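The S-DC idea described above can be illustrated with a minimal sketch: a counter trained on a closed set of counts \([0, C_{max}]\) handles denser regions by recursively splitting them until each sub-region's count falls inside the observed range. Here `predict_count`, `C_MAX`, and the toy quadrant split are hypothetical stand-ins for demonstration, not the actual SS-DCNet architecture.

```python
C_MAX = 20  # assumed upper bound of the closed count set seen during training

def predict_count(region):
    """Hypothetical closed-set counter: reliable only when the true count <= C_MAX.

    For demonstration we fake the prediction from a stored ground-truth count;
    a real model would predict from pixels and saturate near C_MAX.
    """
    return min(region["count"], C_MAX)

def split_into_quadrants(region):
    """Toy split: distribute the true count evenly over four half-size quadrants."""
    q_count = region["count"] / 4
    return [{"count": q_count, "size": region["size"] // 2} for _ in range(4)]

def sdc_count(region, min_size=1):
    """Spatial divide-and-conquer: divide until the prediction is inside the closed set."""
    c = predict_count(region)
    if c < C_MAX or region["size"] <= min_size:
        return c
    # Prediction saturated at C_MAX: divide into quadrants and recurse.
    return sum(sdc_count(q, min_size) for q in split_into_quadrants(region))

# A region with 100 objects exceeds the closed set (C_MAX = 20); one division
# yields four sub-regions of 25 (still saturated), so S-DC divides again until
# each sub-count is representable, then sums the pieces.
dense = {"count": 100, "size": 8}
print(sdc_count(dense))  # recovers 100.0 despite the counter saturating at 20
```

The point of the sketch is that the counter never needs to output a value it has not seen during training; the open-set behavior comes entirely from the recursive division and summation.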



Author information

Corresponding author

Correspondence to Zhiguo Cao.

Additional information

Communicated by Esa Rahtu.


This work is supported by the Natural Science Foundation of China under Grant Nos. 62106080 and 61876211. Part of the work was done when H. Xiong and L. Liu were visiting The University of Adelaide and when H. Lu and C. Shen were with The University of Adelaide.


About this article


Cite this article

Xiong, H., Lu, H., Liu, C. et al. From Open Set to Closed Set: Supervised Spatial Divide-and-Conquer for Object Counting. Int J Comput Vis 131, 1722–1740 (2023). https://doi.org/10.1007/s11263-023-01782-1
