Metro passengers counting and density estimation via dilated-transposed fully convolutional neural network

  • Regular Paper
  • Knowledge and Information Systems

Abstract

Metro passenger counting and density estimation are crucial for traffic scheduling and risk prevention. Although deep learning has achieved great success in passenger counting, most existing methods ignore fundamental appearance information, which leads to low-quality density maps. To address this problem, we propose a novel counting method called the “dilated-transposed fully convolutional neural network” (DT-CNN), which combines a feature extraction module (FEM) and a feature recovery module (FRM) to generate high-quality density maps and accurately estimate passenger counts in highly congested metro scenes. Specifically, the FEM is composed of a CNN and a set of dilated convolutional layers, which extract 2D features relevant to scenes containing crowded human objects. The density map produced by the FEM is then processed by the FRM to learn potential features, which are used to restore feature map pixels. The DT-CNN is end-to-end trainable and independent of the backbone fully convolutional network architecture. In addition, we introduce a new metro passenger counting dataset (Zhengzhou_MT++) that contains 396 images with 3,978 annotations. Extensive experiments conducted on self-built datasets and three representative crowd-counting datasets show that the proposed method achieves superior performance relative to other state-of-the-art methods in terms of counting accuracy and density map quality. The Zhengzhou_MT++ dataset is available at https://github.com/YellowChampagne/Zhengzhou_MT.
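The two layer types named in the method can be illustrated with a little size arithmetic. The sketch below is not the authors' implementation: the kernel sizes, strides, padding, and all function names are assumptions chosen for illustration. It shows how a dilated convolution widens the span a kernel covers without adding weights, and how a transposed convolution enlarges a map again, using one-dimensional pure-Python versions of both operations.

```python
def effective_kernel(k: int, dilation: int) -> int:
    """Span covered by a k-tap kernel whose taps are `dilation` apart."""
    return k + (k - 1) * (dilation - 1)


def conv_out(size: int, k: int, stride: int = 1, pad: int = 0, dilation: int = 1) -> int:
    """Output length of a (possibly dilated) convolution along one axis."""
    return (size + 2 * pad - effective_kernel(k, dilation)) // stride + 1


def transposed_conv_out(size: int, k: int, stride: int = 1, pad: int = 0) -> int:
    """Output length of a transposed convolution along one axis."""
    return (size - 1) * stride - 2 * pad + k


def dilated_conv1d(x, w, dilation=1):
    """'Valid' 1D dilated convolution (cross-correlation, no kernel flip)."""
    span = effective_kernel(len(w), dilation)
    return [
        sum(w[j] * x[i + j * dilation] for j in range(len(w)))
        for i in range(len(x) - span + 1)
    ]


def transposed_conv1d(x, w, stride=1):
    """1D transposed convolution: each input value stamps a scaled copy of w."""
    out = [0.0] * transposed_conv_out(len(x), len(w), stride)
    for i, xi in enumerate(x):
        for j, wj in enumerate(w):
            out[i * stride + j] += xi * wj
    return out


# Assumed settings: a 3-tap kernel with dilation 2 spans 5 inputs.
span = effective_kernel(3, 2)
# With pad equal to the dilation, a 3-tap dilated conv keeps the size.
same = conv_out(64, k=3, pad=2, dilation=2)
# A stride-2, kernel-2 transposed conv doubles the size.
doubled = transposed_conv_out(32, k=2, stride=2)

# Toy density row: its sum plays the role of the (fractional) count.
density = [0.0, 0.5, 1.0, 0.5]
# Kernel weights sum to 1 here, so upsampling preserves the total count.
upsampled = transposed_conv1d(density, [0.5, 0.5], stride=2)
```

In this toy setting the upsampled row is twice as long as the input while its sum is unchanged, which is the property one would want when restoring the resolution of a density map whose integral is read as a passenger count. A learned transposed convolution would not preserve the sum by construction; that is a feature of the hand-picked weights used here.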

Acknowledgements

This work is supported by the National Natural Science Foundation of China (NSFC) General program (61673353) and Young Scientist Fund of NSFC (61603344).

Author information

Corresponding author

Correspondence to Jun Zhang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Zhu, G., Zeng, X., Jin, X. et al. Metro passengers counting and density estimation via dilated-transposed fully convolutional neural network. Knowl Inf Syst 63, 1557–1575 (2021). https://doi.org/10.1007/s10115-021-01563-7

  • DOI: https://doi.org/10.1007/s10115-021-01563-7
