Density Level Aware Network for Crowd Counting

Zhong, Wencai; Wang, Wei; Lu, Hongtao

doi:10.1007/978-3-030-63830-6_23

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12532)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

Crowd counting has wide applications in video surveillance and public safety, but it remains an extremely challenging task because of large scale variation and diverse crowd distributions. In this paper, we present a novel method called Density Level Aware Network (DLA-Net) to improve density map estimation in scenes of varying density. Specifically, we divide the input into multiple regions according to their density levels and handle the regions independently. Dense regions (with small-scale heads) require higher-resolution features from shallow layers, while sparse regions (with large heads) need deep features with a broader receptive field. Based on this requirement, we propose to predict multiple density maps, each focusing on regions of a particular density level. Inspired by the U-Net architecture, our density map estimators borrow features from shallow layers to improve the estimation of dense regions. Moreover, we design a Density Level Aware Loss (DLA-Loss) to better supervise those density maps in different regions. We conduct extensive experiments on three crowd counting datasets (ShanghaiTech, UCF-CC-50 and UCF-QNRF) to validate the effectiveness of the proposed method. The results demonstrate that our DLA-Net achieves the best performance compared with other state-of-the-art approaches.
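
The abstract does not say how density levels are assigned or what form the DLA-Loss takes, so the following is only an illustrative sketch of the general idea: partition the image into regions by density level, then supervise each predicted density map only inside the regions of its own level. The block size, the count thresholds, the hard masks, the squared-error penalty, and all function names (`density_levels`, `level_aware_loss`, `merge_predictions`) are assumptions made for this example, not details taken from the paper.

```python
from typing import List

Grid = List[List[float]]


def density_levels(gt: Grid, block: int, thresholds: List[float]) -> List[List[int]]:
    """Label every pixel with the density level of the block it falls in.

    The level is the number of thresholds that the block's ground-truth
    count exceeds, so 0 is the sparsest level (assumed scheme).
    """
    h, w = len(gt), len(gt[0])
    levels = [[0] * w for _ in range(h)]
    for by in range(0, h, block):
        for bx in range(0, w, block):
            ys = range(by, min(by + block, h))
            xs = range(bx, min(bx + block, w))
            count = sum(gt[y][x] for y in ys for x in xs)
            level = sum(count > t for t in thresholds)
            for y in ys:
                for x in xs:
                    levels[y][x] = level
    return levels


def level_aware_loss(preds: List[Grid], gt: Grid, levels: List[List[int]]) -> float:
    """Sum of squared errors where preds[k] is penalised only on level-k pixels."""
    total = 0.0
    for k, pred in enumerate(preds):
        for y, row in enumerate(gt):
            for x, target in enumerate(row):
                if levels[y][x] == k:
                    total += (pred[y][x] - target) ** 2
    return total


def merge_predictions(preds: List[Grid], levels: List[List[int]]) -> Grid:
    """Build one density map by taking each pixel from the map of its level."""
    return [
        [preds[levels[y][x]][y][x] for x in range(len(levels[0]))]
        for y in range(len(levels))
    ]


# Toy 4x4 ground truth: a dense top-left block and one isolated head.
gt = [
    [0.5, 0.5, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.25, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
levels = density_levels(gt, block=2, thresholds=[1.0])

sparse_pred = [[0.0] * 4 for _ in range(4)]
sparse_pred[2][2] = 0.25                      # correct on sparse regions
dense_pred = [[0.5] * 4 for _ in range(4)]    # correct only on the dense block

loss = level_aware_loss([sparse_pred, dense_pred], gt, levels)
merged = merge_predictions([sparse_pred, dense_pred], levels)
```

In this toy case each map is wrong outside its own regions, yet the masked loss ignores those errors and the merged map reproduces the ground truth, which is the behaviour the region-wise supervision is meant to encourage.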

W. Zhong: Work done as an intern at Alibaba Group.

Acknowledgments

This work is supported by Alibaba Group (Grant No. SccA50202002101), NSFC (No. 61772330, 61533012, 61876109), and Major Scientific Research Project of Zhejiang Lab (No. 2019DB0ZX01).

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
Wencai Zhong, Wei Wang & Hongtao Lu
Alibaba Group, Hangzhou, China
Wencai Zhong

Corresponding author

Correspondence to Hongtao Lu.

Editor information

Editors and Affiliations

Department of AI, Ping An Life, Shenzhen, China
Haiqin Yang
Faculty of Information Technology, King Mongkut's Institute of Technology Ladkrabang, Bangkok, Thailand
Kitsuchart Pasupa
City University of Hong Kong, Kowloon, Hong Kong
Andrew Chi-Sing Leung
Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong, Hong Kong
James T. Kwok
School of Information Technology, King Mongkut’s University of Technology Thonburi, Bangkok, Thailand
Jonathan H. Chan
The Chinese University of Hong Kong, New Territories, Hong Kong
Irwin King

Cite this paper

Zhong, W., Wang, W., Lu, H. (2020). Denstity Level Aware Network for Crowd Counting. In: Yang, H., Pasupa, K., Leung, A.CS., Kwok, J.T., Chan, J.H., King, I. (eds) Neural Information Processing. ICONIP 2020. Lecture Notes in Computer Science, vol 12532. Springer, Cham. https://doi.org/10.1007/978-3-030-63830-6_23

DOI: https://doi.org/10.1007/978-3-030-63830-6_23
Published: 19 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63829-0
Online ISBN: 978-3-030-63830-6
eBook Packages: Computer Science, Computer Science (R0)
