An attentive hierarchy ConvNet for crowd counting in smart city

Cluster Computing

Abstract

Crowd counting plays a crucial role in the development of smart cities. However, scale variation and background interference degrade crowd-counting performance in real-world scenarios. To address these problems, this paper proposes a novel attentive hierarchy ConvNet (AHNet). AHNet extracts hierarchical features with a dedicated discriminative feature extractor and mines semantic features in a coarse-to-fine manner through a hierarchical fusion strategy. In addition, a re-calibrated attention (RA) module is applied at multiple levels to suppress background interference, and a feature enhancement (FE) module is built to recognize head regions at various scales. Experimental results on five crowd (people) datasets and two cross-domain vehicle datasets show that AHNet achieves competitive accuracy and generalization.
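The abstract does not describe the internals of the RA module, so the following is only a loose illustration of what channel re-calibration commonly means: a squeeze-and-excitation-style gate that scales each feature channel by a learned weight between 0 and 1. It is a minimal sketch, not the authors' implementation. The function name `recalibrate_channels`, the weight matrices `w1` and `w2`, and the toy values are all hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def recalibrate_channels(features, w1, w2):
    """SE-style channel re-calibration (illustrative assumption, not AHNet's RA module).

    features: list of C channels, each an H x W nested list of floats.
    w1: R x C weight matrix (maps C channel descriptors to R hidden units).
    w2: C x R weight matrix (maps R hidden units back to C gates).
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in features
    ]
    # Excitation: FC -> ReLU -> FC -> sigmoid yields a gate in (0, 1) per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]
    # Scale: channels with small gates (e.g. background responses) are suppressed.
    scaled = [[[v * g for v in row] for row in ch] for ch, g in zip(features, gates)]
    return scaled, gates


# Toy input: two 2x2 channels and hand-picked (hypothetical) weights.
features = [[[1.0, 1.0], [1.0, 1.0]], [[4.0, 4.0], [4.0, 4.0]]]
w1 = [[0.5, 0.5]]        # one hidden unit
w2 = [[1.0], [-1.0]]     # boosts channel 0, suppresses channel 1
scaled, gates = recalibrate_channels(features, w1, w2)
```

With these toy weights the first channel receives a gate above 0.5 and the second a gate below 0.5, so the second channel's responses are attenuated. In a trained network the weights would be learned rather than hand-picked.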

Acknowledgements

This work is supported by the National Natural Science Foundation of China (Nos. 61601266 and 61801272) and National Natural Science Foundation of Shandong Province (Nos. ZR2021QD041 and ZR2020MF127).

Author information

Corresponding author

Correspondence to Mingliang Gao.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhai, W., Gao, M., Souri, A. et al. An attentive hierarchy ConvNet for crowd counting in smart city. Cluster Comput 26, 1099–1111 (2023). https://doi.org/10.1007/s10586-022-03749-2

  • DOI: https://doi.org/10.1007/s10586-022-03749-2
