
Multiscale aggregation network via smooth inverse map for crowd counting

  • 1229: Multimedia Data Analysis for Smart City Environment Safety
Multimedia Tools and Applications

Abstract

Crowd counting is a practical and essential research topic in computer vision that benefits diverse applications in smart city environment safety. Most existing methods regress a Gaussian density map, which serves as the learning objective during model training. However, because occlusion between individuals and scale variation are unavoidable in crowd images, the corresponding Gaussian density map is degraded and fails to provide reliable supervision for optimization. To address this problem, we propose replacing the traditional Gaussian density map with a better alternative, namely the smooth inverse map (SIM). The proposed SIM reflects head locations spatially and provides a smooth gradient that stabilizes model learning. In addition, to cope with large scale variations, the model should learn more discriminative features. We therefore introduce a multiscale aggregation (MA) module that adaptively fuses features from different hierarchies to exploit semantic information under diverse receptive fields. SIM and MA are designed as complementary modules that guide the model toward learning an accurate density map. Extensive experiments on benchmark datasets demonstrate the effectiveness of the proposed method compared with state-of-the-art techniques.
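
For readers unfamiliar with the Gaussian density map paradigm that the abstract takes as its starting point, the following is a minimal sketch of how such a target is commonly built from head annotations. It is not the authors' implementation, and it does not implement SIM, whose definition is not given in this abstract. The function names, the (row, col) annotation format, and the fixed `sigma` are assumptions made for illustration only.

```python
import math


def gaussian_density_map(heads, height, width, sigma=2.0):
    """Place one normalized Gaussian per annotated head (illustrative sketch).

    heads: list of (row, col) head centers inside the image (assumed format).
    Each kernel is truncated at 3*sigma and renormalized over the pixels that
    fall inside the image, so every head contributes exactly one unit of mass.
    """
    density = [[0.0] * width for _ in range(height)]
    radius = int(math.ceil(3 * sigma))
    for r0, c0 in heads:
        # Unnormalized kernel weights for in-bounds pixels around this head.
        weights = {}
        for r in range(max(0, r0 - radius), min(height, r0 + radius + 1)):
            for c in range(max(0, c0 - radius), min(width, c0 + radius + 1)):
                d2 = (r - r0) ** 2 + (c - c0) ** 2
                weights[(r, c)] = math.exp(-d2 / (2.0 * sigma ** 2))
        total = sum(weights.values())
        # Renormalize and accumulate; nearby heads overlap and blur together.
        for (r, c), w in weights.items():
            density[r][c] += w / total
    return density


def count_from_density(density):
    """The predicted count is the integral (sum) of the density map."""
    return sum(sum(row) for row in density)


# Hypothetical annotations: two heads close together, one far away.
example_heads = [(5, 5), (5, 7), (20, 30)]
example_map = gaussian_density_map(example_heads, height=32, width=48)
example_count = count_from_density(example_map)
```

Because each kernel is renormalized, the map sums to the number of annotated heads. The two nearby heads in the example produce overlapping kernels, which illustrates in a small way why dense or occluded regions yield ambiguous supervision.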

Data Availability

Data sharing is not applicable to this article, as no datasets were generated or analyzed during the current study.

Acknowledgements

This work is supported in part by the National Natural Science Foundation of China (Nos. 61601266 and 61801272) and the Natural Science Foundation of Shandong Province (Nos. ZR2021QD041 and ZR2020MF127).

Author information

Corresponding author

Correspondence to Mingliang Gao.

Ethics declarations

Ethics approval and consent to participate

We declare that there is no ethics issue.

Conflict of Interests

We declare that we have no conflict of interest.

Additional information

This article belongs to the Topical Collection: 1229: Multimedia Data Analysis for Smart City Environment Safety

Guest Editors: Alessandro Bruno, Aladine Chetouani, Zoheir Sabeur, Marouane Tliba, Evangelos Maltezos, Miguel Gonzalez San Emeterio

About this article

Cite this article

Guo, X., Gao, M., Zhai, W. et al. Multiscale aggregation network via smooth inverse map for crowd counting. Multimed Tools Appl (2022). https://doi.org/10.1007/s11042-022-13664-8

  • DOI: https://doi.org/10.1007/s11042-022-13664-8
