Skip to main content
Log in

PDDNet: lightweight congested crowd counting via pyramid depth-wise dilated convolution

  • Published:
Applied Intelligence Aims and scope Submit manuscript

Abstract

The accuracy of crowd counting is susceptible to scale variations of crowd head in the congested scene. Some counting networks, such as crowd density pre-classification networks or multi-column counting networks, are proposed to model the different scales of crowd head. However, most of them own a complex network structure with many network parameters, making deploying a crowd counting network in practical application scenarios challenging. To this end, we propose a lightweight crowd counting network termed PDDNet. The front-end of the PDDNet chooses the first 13 layers of GhostNet to extract the crowd feature, and the back-end of the PDDNet is implemented with the proposed lightweight pyramidal convolution modules (LPC) to extract the multi-scale features. Finally, the extracted multi-scale features are fed to transposed convolution layers to regress the final crowd density map. We conduct extensive experiments on the commonly-used crowd counting datasets, i.e., ShanghaiTech, UCF_QNRF, and NWPU_Crowd. The experiment results show the superiority of our model compared with state-of-the-art methods.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3

Similar content being viewed by others

References

  1. Zeng X, Wu Y, Hu S, Wang R, Ye Y (2020) Dspnet: deep scale purifier network for dense crowd counting. Expert Syst Appl 141:112977

    Article  Google Scholar 

  2. Sam DB, Babu RV (2018) Top-down feedback for crowd counting convolutional neural network. In: Thirty-second AAAI conference on artificial intelligence

  3. Cheng Z-Q, Li J-X, Dai Q, Wu X, He J-Y, Hauptmann AG (2019) Improving the learning of multi-column convolutional neural network for crowd counting. In: Proceedings of the 27th ACM international conference on multimedia, pp 1897– 1906

  4. Hossain M, Hosseinzadeh M, Chanda O, Wang Y (2019) Crowd counting using scale-aware attention networks. In: 2019 IEEE winter conference on applications of computer vision (WACV), pp 1280–1288

  5. Zhang L, Shi M, Chen Q (2018) Crowd counting via scale-adaptive convolutional neural network. In: 2018 IEEE winter conference on applications of computer vision (WACV), pp 1113–1121

  6. Cao X, Wang Z, Zhao Y, Su F (2018) Scale aggregation network for accurate and efficient crowd counting. In: Proceedings of the european conference on computer vision (ECCV), pp 734–750

  7. Shi Z, Zhang L, Liu Y, Cao X, Ye Y, Cheng M-M, Zheng G (2018) Crowd counting with deep negative correlation learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5382–5390

  8. Jiang X, Xiao Z, Zhang B, Zhen X, Cao X, Doermann D, Shao L (2019) Crowd counting and density estimation by trellis encoder-decoder networks. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 6133–6142

  9. Li Y, Zhang X, Chen D (2018) Csrnet: dilated convolutional neural networks for understanding the highly congested scenes. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1091–1100

  10. Guo D, Li K, Zha Z-J, Wang M (2019) Dadnet: dilated-attention-deformable convnet for crowd counting. In: Proceedings of the 27th ACM international conference on multimedia, pp 1823–1832

  11. Yan Z, Yuan Y, Zuo W, Tan X, Wang Y, Wen S, Ding E (2019) Perspective-guided convolution networks for crowd counting. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 952–961

  12. Liu W, Salzmann M, Fua P (2019) Context-aware crowd counting. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 5099–5108

  13. Ma Z, Wei X, Hong X, Gong Y (2019) Bayesian loss for crowd count estimation with point supervision. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 6142–6151

  14. Xie Y, Lu Y, Wang S (2020) Rsanet: deep recurrent scale-aware network for crowd counting. In: IEEE international conference on image processing, pp 1531–1535

  15. Liang L, Zhao H, Zhou F, Zhang Q, Song Z, Shi Q (2022) Sc2net: scale-aware crowd counting network with pyramid dilated convolution. Appl Intell:1–14

  16. Simonyan K, Zisserman A (2015) Very deep convolutional networks for large-scale image recognition. In: 3rd International conference on learning representations, ICLR

  17. Sindagi VA, Patel VM (2019) Ha-ccn: hierarchical attention-based crowd counting network. IEEE Trans Image Process 29:323–335

    Article  MathSciNet  MATH  Google Scholar 

  18. Wang S, Lu Y, Zhou T, Di H, Lu L, Zhang L (2020) Sclnet: spatial context learning network for congested crowd counting. Neurocomputing 404:227–239

    Article  Google Scholar 

  19. Chu H, Tang J, Hu H (2021) Attention guided feature pyramid network for crowd counting. J Vis Commun Image Represent 80:103319

    Article  Google Scholar 

  20. Amirgholipour S, Jia W, Liu L, Fan X, Wang D, He X (2021) Pdanet: pyramid density-aware attention based network for accurate crowd counting. Neurocomputing 451:215–230

    Article  Google Scholar 

  21. Liu Y-B, Jia R-S, Liu Q-M, Zhang X-L, Sun H-M (2021) Crowd counting method based on the self-attention residual network. Appl Intell 51(1):427–440

    Article  Google Scholar 

  22. Gu L, Pang C, Zheng Y, Lyu C, Lyu L (2022) Context-aware pyramid attention network for crowd counting. Appl Intell 52(6):6164–6180

    Article  Google Scholar 

  23. Shi Y, Sang J, Wu Z, Wang F, Liu X, Xia X, Sang N (2022) Mgsnet: a multi-scale and gated spatial attention network for crowd counting. Appl Intell:1–11

  24. Li Y-C, Jia R-S, Hu Y-X, Han D-N, Sun H-M (2022) Crowd density estimation based on multi scale features fusion network with reverse attention mechanism. Appl Intell:1–17

  25. Zhang Y, Zhou D, Chen S, Gao S, Ma Y (2016) Single-image crowd counting via multi-column convolutional neural network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 589–597

  26. Sindagi VA, Patel VM (2017) Cnn-based cascaded multi-task learning of high-level prior and density estimation for crowd counting. In: 2017 14th IEEE international conference on advanced video and signal based surveillance (AVSS), pp 1–6

  27. Shen Z, Xu Y, Ni B, Wang M, Hu J, Yang X (2018) Crowd counting via adversarial cross-scale consistency pursuit. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5245–5254

  28. Gao J, Wang Q, Li X (2019) Pcc net: perspective crowd counting via spatial convolutional network. IEEE Trans Circuits Syst Video Technol 30(10):3486–3498

    Article  Google Scholar 

  29. Shi X, Li X, Wu C, Kong S, Yang J, He L (2020) A real-time deep network for crowd counting. In: ICASSP 2020-2020 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 2328–2332

  30. Howard AG, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, Andreetto M, Adam H (2017) Mobilenets: efficient convolutional neural networks for mobile vision applications. arXiv:1704.04861

  31. Ma N, Zhang X, Zheng H-T, Sun J (2018) Shufflenet v2: practical guidelines for efficient cnn architecture design. In: Proceedings of the european conference on computer vision (ECCV), pp 116–131

  32. Zhang X, Zhou X, Lin M, Sun J (2018) Shufflenet: an extremely efficient convolutional neural network for mobile devices. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 6848–6856

  33. Han K, Wang Y, Tian Q, Guo J, Xu C, Xu C (2020) Ghostnet: more features from cheap operations. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 1580–1589

  34. Topkaya IS, Erdogan H, Porikli F (2014) Counting people by clustering person detector outputs. In: 2014 11th IEEE international conference on advanced video and signal based surveillance (AVSS), pp 313–318

  35. Li Z, Zhang L, Fang Y, Wang J, Xu H, Yin B, Lu H (2016) Deep people counting with faster r-cnn and correlation tracking. In: Proceedings of the international conference on internet multimedia computing and service, pp 57–60

  36. Babu Sam D, Surya S, Venkatesh Babu R (2017) Switching convolutional neural network for crowd counting. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5744–5752

  37. Tian Y, Lei Y, Zhang J, Wang JZ (2019) Padnet: pan-density crowd counting. IEEE Trans Image Process 29:2714–2727

    Article  MATH  Google Scholar 

  38. Bai S, He Z, Qiao Y, Hu H, Wu W, Yan J (2020) Adaptive dilated network with self-correction supervision for counting. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 4594–4603

  39. Wang W, Liu Q, Wang W (2022) Pyramid-dilated deep convolutional neural network for crowd counting. Appl Intell 52(2):1825–1837

    Article  Google Scholar 

  40. Sindagi VA, Patel VM (2017) Generating high-quality crowd density maps using contextual pyramid cnns. In: Proceedings of the IEEE conference on computer vision, pp 1861–1870

  41. Idrees H, Tayyab M, Athrey K, Zhang D, Al-Maadeed S, Rajpoot N, Shah M (2018) Composition loss for counting, density map estimation and localization in dense crowds. In: Proceedings of the european conference on computer vision (ECCV), pp 532–546

  42. Laradji IH, Rostamzadeh N, Pinheiro PO, Vazquez D, Schmidt M (2018) Where are the blobs: counting by localization with point supervision. In: Proceedings of the european conference on computer vision (ECCV), pp 547–562

  43. Liu C, Weng X, Mu Y (2019) Recurrent attentive zooming for joint crowd counting and precise localization. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 1217–1226

  44. Chen X, Yu X, Di H, Wang S (2021) Sa-internet: scale-aware interaction network for joint crowd counting and localization. In: Chinese conference on pattern recognition and computer vision, pp 203–215

  45. Zhou T, Wang S, Zhou Y, Yao Y, Li J, Shao L (2020) Motion-attentive transition for zero-shot video object segmentation. In: Proceedings of the AAAI conference on artificial intelligence, vol 34, pp 13066–13073

  46. Zhou T, Li J, Wang S, Tao R, Shen J (2020) Matnet: motion-attentive transition network for zero-shot video object segmentation. IEEE Trans Image Process 29:8326–8338

    Article  MATH  Google Scholar 

  47. Zhou T, Li L, Li X, Feng C-M, Li J, Shao L (2021) Group-wise learning for weakly supervised semantic segmentation. IEEE Trans Image Process 31:799–811

    Article  Google Scholar 

  48. Lai Q, Zhou T, Khan S, Sun H, Shen J, Shao L (2022) Weakly supervised visual saliency prediction. IEEE Trans Image Process 31:3111–3124

    Article  Google Scholar 

  49. Zhang S, Zhang X, Li H, He H, Song D, Wang L (2022) Hierarchical pyramid attentive network with spatial separable convolution for crowd counting. Eng Appl Artif Intell 108:104563

    Article  Google Scholar 

  50. Song Q, Wang C, Wang Y, Tai Y, Wang C, Li J, Wu J, Ma J (2021) To choose or to fuse? scale selection for crowd counting. In: Proceedings of the AAAI conference on artificial intelligence, vol 35, pp 2576–2583

  51. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778

  52. Ma J, Dai Y, Tan Y-P (2019) Atrous convolutions spatial pyramid network for crowd counting and density estimation. Neurocomputing 350:91–101

    Article  Google Scholar 

  53. Sandler M, Howard A, Zhu M, Zhmoginov A, Chen L-C (2018) Mobilenetv2: inverted residuals and linear bottlenecks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4510–4520

  54. Wu B, Wan A, Yue X, Jin P, Zhao S, Golmant N, Gholaminejad A, Gonzalez J, Keutzer K (2018) Shift: a zero flop, zero parameter alternative to spatial convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 9127–9135

  55. Yang Z, Wang Y, Liu C, Chen H, Xu C, Shi B, Xu C, Xu C (2019) Legonet: efficient convolutional neural networks with lego filters. In: International conference on machine learning. PMLR, pp 7005–7014

  56. Wang W, Yu Z, Fu C, Cai D, He X (2021) Cop: customized correlation-based filter level pruning method for deep cnn compression. Neurocomputing 464:533–545

    Article  Google Scholar 

  57. He Y, Zhang X, Sun J (2017) Channel pruning for accelerating very deep neural networks. In: Proceedings of the IEEE international conference on computer vision, pp 1389–1397

  58. Wan D, Shen F, Liu L, Zhu F, Huang L, Yu M, Shen HT, Shao L (2020) Deep quantization generative networks. Pattern Recogn 105:107338

    Article  Google Scholar 

  59. Chen H, Wang Y, Xu C, Yang Z, Liu C, Shi B, Xu C, Xu C, Tian Q (2019) Data-free learning of student networks. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 3514–3522

  60. Liu L, Chen J, Wu H, Chen T, Li G, Lin L (2020) Efficient crowd counting via structured knowledge transfer. In: Proceedings of the 28th ACM international conference on multimedia, pp 2645–2654

  61. Wang S, Zhou T, Lu Y, Di H (2021) Contextual transformation network for lightweight remote-sensing image super-resolution. IEEE Trans Geosci Remote Sens 60:1–13

    Google Scholar 

  62. Paoletti ME, Haut JM, Pereira NS, Plaza J, Plaza A (2021) Ghostnet for hyperspectral image classification. IEEE Trans Geosci Remote Sens 59(12):10378–10393

    Article  Google Scholar 

  63. Kazerouni IA, Dooly G, Toal D (2021) Ghost-unet: an asymmetric encoder-decoder architecture for semantic segmentation from scratch. IEEE Access 9:97457–97465

    Article  Google Scholar 

  64. Chen X, Bin Y, Sang N, Gao C (2019) Scale pyramid network for crowd counting. In: Winter conference on applications of computer vision, pp 1941–1950

  65. Wang Q, Gao J, Lin W, Li X (2020) Nwpu-crowd: a large-scale benchmark for crowd counting and localization. IEEE Trans Pattern Anal Mach Intell 43(6):2141–2149

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Huailin Zhao.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Liang, L., Zhao, H., Zhou, F. et al. PDDNet: lightweight congested crowd counting via pyramid depth-wise dilated convolution. Appl Intell 53, 10472–10484 (2023). https://doi.org/10.1007/s10489-022-03967-6

Download citation

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10489-022-03967-6

Keywords

Navigation