
TFA-CNN: an efficient method for dealing with crowding and noise problems in crowd counting

  • Regular Paper
  • Multimedia Systems

Abstract

Crowd counting helps reveal the spatial distribution of crowds in various scenes. In practice, heavy occlusion and large scale variations make accurate counting in crowded venues extremely challenging. To address these problems, this paper designs a crowd density estimation network that maintains good accuracy in scenes that are both crowded and subject to large scale changes: the Texture Feature Attention Convolutional Neural Network (TFA-CNN). Specifically: (1) a Differential Texture Module (DT Module) is proposed to identify various texture features of the bottom feature map and to better distinguish background from foreground regions; (2) a Multi-Channel Threshold Replacement Attention Module (MTRA Module) is proposed, which combines channel and spatial attention mechanisms so that the network focuses more on the head positions in the crowd, thereby reducing the counting error. TFA-CNN was evaluated in multiple experiments on several challenging public datasets, and its results are superior to those of many state-of-the-art (SOTA) methods, demonstrating excellent generalization and robustness.
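As background for the terms "density estimation" and "counting error", the following is a minimal sketch of how a density map relates to a crowd count. It is not the TFA-CNN implementation: the function names, the fixed Gaussian width `sigma`, and the toy head coordinates are all assumptions made for illustration. Each annotated head is spread into a unit-mass Gaussian, so the sum of the map equals the number of people, and the counting error is taken here as the mean absolute error between predicted and true counts.

```python
import numpy as np


def density_map(points, height, width, sigma=2.0):
    """Build a density map with one unit-mass Gaussian per head at (row, col).

    Hypothetical helper for illustration; sigma is an assumed fixed width.
    """
    rows = np.arange(height)[:, None]
    cols = np.arange(width)[None, :]
    dmap = np.zeros((height, width), dtype=np.float64)
    for r, c in points:
        kernel = np.exp(-((rows - r) ** 2 + (cols - c) ** 2) / (2.0 * sigma ** 2))
        # Renormalise over the image so every head contributes exactly 1.
        dmap += kernel / kernel.sum()
    return dmap


def mean_absolute_error(predicted_counts, true_counts):
    """Average absolute difference between predicted and true counts."""
    predicted = np.asarray(predicted_counts, dtype=np.float64)
    true = np.asarray(true_counts, dtype=np.float64)
    return float(np.mean(np.abs(predicted - true)))


# Toy annotation: three heads in a 64 x 64 image (made-up coordinates).
heads = [(10, 12), (30, 40), (31, 42)]
dmap = density_map(heads, height=64, width=64)

# Integrating (summing) the density map recovers the head count.
estimated_count = float(dmap.sum())
```

In this sketch a network would be trained to regress `dmap` from the image, and its predicted count would be the sum of its output map.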


Data availability

The tagged data set used in this article is available on request from the corresponding author.


Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (Nos. 62067002, 61967006, and 62062033), in part by the Science and Technology Project of the Transportation Department of Jiangxi Province, China (No. 2022X0040), and in part by the Natural Science Foundation of Jiangxi Province under Grant 20232BAB202018.

Author information

Contributions

XL and LZ wrote the main manuscript, HX optimized it, and ZY and HP collected part of the data. All authors read the manuscript.

Corresponding author

Correspondence to Zhida Li.

Ethics declarations

Conflict of interest

The authors declare that there are no competing interests related to the content of this article.

Additional information

Communicated by T. Li.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Xiong, L., Li, Z., Huang, X. et al. TFA-CNN: an efficient method for dealing with crowding and noise problems in crowd counting. Multimedia Systems 29, 3259–3276 (2023). https://doi.org/10.1007/s00530-023-01194-8

  • DOI: https://doi.org/10.1007/s00530-023-01194-8
