SA-DCPNet: Scale-aware deep convolutional pyramid network for crowd counting

  • Original Article
  • Neural Computing and Applications

Abstract

Crowd counting is one of the most challenging research topics in computer vision. The task must cope with severe occlusion, scale variation, and complex backgrounds. Multi-column networks are commonly used for crowd counting, but they suffer from scale variation and feature similarity, which leads to poor analysis of crowd sequences. To address these issues, we propose SA-DCPNet, a scale-aware deep convolutional pyramid network for crowd counting. We introduce a scale-aware deep convolutional pyramid module (SA-DPCM) by integrating message passing and global attention mechanisms into a multi-column network. The proposed network uses SA-DPCM to reduce the effect of scale variation and a multi-column variance loss function to address feature similarity. Experiments on the ShanghaiTech and UCF-CC-50 datasets show that the proposed method performs better in terms of mean absolute error and root mean square error.
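The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes each test image yields one predicted count and one ground-truth count; the function names `mean_absolute_error` and `root_mean_square_error` and the sample counts are hypothetical and are not taken from the article.

```python
import math


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and true counts."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def root_mean_square_error(predicted, actual):
    """Square root of the average squared count difference."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(predicted))


# Hypothetical per-image counts, for illustration only.
predicted_counts = [10.0, 20.0]
true_counts = [12.0, 17.0]

mae = mean_absolute_error(predicted_counts, true_counts)
rmse = root_mean_square_error(predicted_counts, true_counts)
```

Lower values are better for both metrics. Because errors are squared before averaging, the root mean square error penalizes large per-image miscounts more heavily than the mean absolute error does.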

Data availability

The data associated with this work will be provided upon reasonable request.

Author information

Corresponding author

Correspondence to Rajiv Singh.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest regarding this manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Tyagi, B., Nigam, S. & Singh, R. SA-DCPNet: Scale-aware deep convolutional pyramid network for crowd counting. Neural Comput & Applic 36, 9283–9295 (2024). https://doi.org/10.1007/s00521-024-09572-7

  • DOI: https://doi.org/10.1007/s00521-024-09572-7
