Abstract
Metro passenger counting and density estimation are crucial for traffic scheduling and risk prevention. Although deep learning has achieved great success in passenger counting, most existing methods ignore fundamental appearance information, which results in low-quality density maps. To address this problem, we propose a novel counting method called the “dilated-transposed fully convolutional neural network” (DT-CNN), which combines a feature extraction module (FEM) and a feature recovery module (FRM) to generate high-quality density maps and accurately estimate passenger counts in highly congested metro scenes. Specifically, the FEM consists of a CNN and a set of dilated convolutional layers that extract 2D features relevant to crowded scenes. The density map produced by the FEM is then processed by the FRM, which learns latent features that are used to restore feature-map pixels. The DT-CNN is end-to-end trainable and independent of the backbone fully convolutional network architecture. In addition, we introduce a new metro passenger counting dataset (Zhengzhou_MT++) that contains 396 images with 3,978 annotations. Extensive experiments on self-built datasets and three representative crowd-counting datasets show that the proposed method outperforms other state-of-the-art methods in counting accuracy and density map quality. The Zhengzhou_MT++ dataset is available at https://github.com/YellowChampagne/Zhengzhou_MT.
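The interplay between dilated and transposed convolutions described above can be illustrated with simple size arithmetic. The following is a minimal sketch, not the DT-CNN configuration: the input size, kernel sizes, strides, paddings, and number of stages are all assumptions chosen for illustration, and the helper names (`conv_out`, `tconv_out`, `effective_kernel`, `density_count`) are hypothetical. It shows that a dilated layer enlarges the receptive field without shrinking the feature map, that transposed layers can restore the resolution lost to downsampling, and that a count estimate is obtained by summing a density map.

```python
# Illustrative size arithmetic only. All layer settings below are assumptions,
# not values reported for DT-CNN.

def conv_out(n, k, s=1, p=0, d=1):
    """Output length of a convolution (or pooling) along one spatial axis."""
    return (n + 2 * p - d * (k - 1) - 1) // s + 1

def tconv_out(n, k, s=1, p=0, d=1, out_pad=0):
    """Output length of a transposed convolution along one spatial axis."""
    return (n - 1) * s - 2 * p + d * (k - 1) + out_pad + 1

def effective_kernel(k, d):
    """Spatial extent covered by a k-tap kernel with dilation rate d."""
    return d * (k - 1) + 1

def density_count(density_map):
    """Estimated count: the sum of all density-map values."""
    return sum(sum(row) for row in density_map)

# Assumed front end: three 2x downsampling stages on a 512-pixel side.
side = 512
for _ in range(3):
    side = conv_out(side, k=2, s=2)           # 512 -> 256 -> 128 -> 64

# Assumed dilated layer: 3x3 kernel, dilation 2, padding 2 keeps the size.
dilated = conv_out(side, k=3, p=2, d=2)       # stays at 64
extent = effective_kernel(3, 2)               # covers a 5-pixel extent

# Assumed recovery: three stride-2 transposed convolutions (k=4, p=1).
restored = dilated
for _ in range(3):
    restored = tconv_out(restored, k=4, s=2, p=1)  # 64 -> 128 -> 256 -> 512

# Toy density map whose mass corresponds to exactly one person.
toy_map = [[0.25, 0.25], [0.25, 0.25]]
print(side, dilated, extent, restored, density_count(toy_map))
```

Under these assumed settings, the recovered map has the same side length as the input, so the predicted density map can be compared with the ground truth pixel by pixel.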
Acknowledgements
This work is supported by the National Natural Science Foundation of China (NSFC) General Program (61673353) and the Young Scientist Fund of NSFC (61603344).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Zhu, G., Zeng, X., Jin, X. et al. Metro passengers counting and density estimation via dilated-transposed fully convolutional neural network. Knowl Inf Syst 63, 1557–1575 (2021). https://doi.org/10.1007/s10115-021-01563-7
DOI: https://doi.org/10.1007/s10115-021-01563-7