Weakly-Supervised Crowd Counting Learns from Sorting Rather Than Locations

Yang, Yifan; Li, Guorong; Wu, Zhe; Su, Li; Huang, Qingming; Sebe, Nicu

doi:10.1007/978-3-030-58598-3_1

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12353)

Included in the following conference series:

European Conference on Computer Vision

Abstract

In crowd counting datasets, location labels are costly to annotate, yet they play no part in the evaluation metrics. Moreover, existing multi-task approaches rely on high-level auxiliary tasks to improve counting accuracy, a trend that further increases the demand for annotations. In this paper, we propose a weakly-supervised counting network that directly regresses crowd numbers without location supervision. We also train the network to count by exploiting the relationships among images: alongside the counting network, we propose a soft-label sorting network that sorts the given images by their crowd numbers. The sorting network explicitly drives the shared backbone CNN to become sensitive to density. The proposed method therefore improves counting accuracy by exploiting the information hidden in crowd numbers rather than by learning from extra labels such as locations and perspectives. We evaluate the method on three crowd counting datasets, where it performs favorably against fully supervised state-of-the-art approaches.
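The idea of supervising with crowd numbers alone can be illustrated with a small sketch. It is not the soft-label sorting network of the paper, whose details are not given in the abstract; it substitutes a plain pairwise hinge ranking loss as a stand-in for the sorting objective. The function names, the `margin` and `weight` parameters, and the separate `scores` output of a hypothetical sorting head are all assumptions made for illustration.

```python
from itertools import combinations


def counting_loss(pred, true):
    """Mean absolute error between predicted and annotated crowd counts."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)


def pairwise_ranking_loss(scores, true, margin=1.0):
    """Hinge loss over all image pairs whose annotated counts differ.

    The only supervision is the ordering implied by the crowd numbers;
    no head locations are needed. (Stand-in for a sorting objective.)
    """
    total, n_pairs = 0.0, 0
    for i, j in combinations(range(len(true)), 2):
        if true[i] == true[j]:
            continue  # ties carry no ordering information
        # hi indexes the image with the larger annotated count
        hi, lo = (i, j) if true[i] > true[j] else (j, i)
        total += max(0.0, margin - (scores[hi] - scores[lo]))
        n_pairs += 1
    return total / n_pairs if n_pairs else 0.0


def joint_loss(pred, scores, true, weight=0.5, margin=1.0):
    """Counting loss plus a weighted ranking term (weight is hypothetical)."""
    return counting_loss(pred, true) + weight * pairwise_ranking_loss(
        scores, true, margin
    )


if __name__ == "__main__":
    true_counts = [12, 250, 48]       # annotated crowd numbers per image
    pred_counts = [10, 240, 60]       # output of a counting head
    sort_scores = [0.1, 2.0, 0.5]     # output of a hypothetical sorting head
    print(joint_loss(pred_counts, sort_scores, true_counts))
```

In a real model both heads would share a backbone, so gradients from the ranking term would also shape the features used for counting; the sketch only shows how the two losses can be computed from count labels.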

Acknowledgements

This work was supported in part by the Italy-China collaboration project TALENT: 2018YFE0118400; in part by the National Natural Science Foundation of China: 61620106009, 61772494, 61931008, U1636214, 61836002 and 61976069; in part by the Key Research Program of Frontier Sciences, CAS: QYZDJ-SSW-SYS013; and in part by the Youth Innovation Promotion Association CAS.

Author information

Authors and Affiliations

School of Computer Science and Technology, UCAS, Beijing, China
Yifan Yang, Guorong Li, Zhe Wu, Li Su & Qingming Huang
Key Lab of Big Data Mining and Knowledge Management, UCAS, Beijing, China
Guorong Li, Li Su & Qingming Huang
Key Lab of Intelligent Information Processing, ICT, CAS, Beijing, China
Qingming Huang
University of Trento, Trento, Italy
Nicu Sebe

Corresponding author

Correspondence to Guorong Li.

Editor information

Editors and Affiliations

University of Oxford, Oxford, UK
Andrea Vedaldi
Graz University of Technology, Graz, Austria
Horst Bischof
University of Freiburg, Freiburg im Breisgau, Germany
Thomas Brox
University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Jan-Michael Frahm

About this paper

Cite this paper

Yang, Y., Li, G., Wu, Z., Su, L., Huang, Q., Sebe, N. (2020). Weakly-Supervised Crowd Counting Learns from Sorting Rather Than Locations. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12353. Springer, Cham. https://doi.org/10.1007/978-3-030-58598-3_1

DOI: https://doi.org/10.1007/978-3-030-58598-3_1
Published: 07 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58597-6
Online ISBN: 978-3-030-58598-3
eBook Packages: Computer Science, Computer Science (R0)