Scene-adaptive crowd counting method based on meta learning with dual-input network DMNet

Zhao, Haoyu; Min, Weidong; Xu, Jianqiang; Wang, Qi; Zou, Yi; Fu, Qiyan

doi:10.1007/s11704-021-1207-x

Research Article
Published: 08 August 2022

Volume 17, article number 171304, (2023)

Frontiers of Computer Science

Haoyu Zhao¹, Weidong Min²,³, Jianqiang Xu¹, Qi Wang¹, Yi Zou¹ & Qiyan Fu¹

Abstract

Crowd counting, which aims to count the number of people in crowded scenes, has recently become a hot research topic. Existing methods mainly follow a train-then-test pattern and rely on large amounts of training data, so they fail to count crowds accurately in real-world scenes because of the models' limited generalization capability. To alleviate this issue, this paper proposes a scene-adaptive crowd counting method based on meta-learning with a Dual-illumination Merging Network (DMNet). Built on learning-to-learn and few-shot learning, the proposed method can adapt to different scenes that contain only a few labeled images. To generate high-quality density maps and count crowds in low-light scenes, DMNet contains a Multi-scale Feature Extraction module and an Element-wise Fusion module. The Multi-scale Feature Extraction module extracts image features with multi-scale convolutions, which helps to improve network accuracy. The Element-wise Fusion module fuses the low-light features with the illumination-enhanced features, which supplements the illumination missing in low-light environments. Experimental results on the WorldExpo’10, DISCO, UCSD, and Mall benchmarks show that the proposed method outperforms existing state-of-the-art methods in accuracy and achieves satisfactory results.
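The scene-adaptation idea in the abstract (learn an initialization that adapts to a new scene from a few labeled images) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not state the update rule, so the sketch uses a first-order, MAML-style inner and outer loop; a one-parameter linear model `count = w * x` stands in for DMNet; and the scenes, slopes, learning rates, and helper names (`make_scene`, `adapt`, `meta_train`) are hypothetical.

```python
import random

def mse(w, samples):
    """Mean squared error of the toy model count = w * x."""
    return sum((w * x - y) ** 2 for x, y in samples) / len(samples)

def mse_grad(w, samples):
    """Gradient of the mean squared error with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in samples) / len(samples)

def adapt(w, support, inner_lr=0.1, steps=1):
    """Inner loop: a few gradient steps on the few labeled samples of one scene."""
    for _ in range(steps):
        w = w - inner_lr * mse_grad(w, support)
    return w

def make_scene(rng, slope, n):
    """Hypothetical scene: n (feature, count) pairs sharing a scene-specific slope."""
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    return [(x, slope * x) for x in xs]

def meta_train(scenes, w0=0.0, outer_lr=0.05, epochs=200, k=5):
    """Outer loop: move the shared initialization so that one inner step works well."""
    w = w0
    for _ in range(epochs):
        for scene in scenes:
            support, query = scene[:k], scene[k:]
            w_adapted = adapt(w, support)
            # First-order approximation (assumption): apply the query gradient
            # taken at the adapted weight, ignoring second-derivative terms.
            w = w - outer_lr * mse_grad(w_adapted, query)
    return w

rng = random.Random(0)
train_scenes = [make_scene(rng, slope, 10) for slope in (2.0, 3.0, 4.0)]
w_meta = meta_train(train_scenes)

# Few-shot adaptation to an unseen scene using only 5 labeled samples.
new_scene = make_scene(rng, 3.5, 10)
support, query = new_scene[:5], new_scene[5:]
w_new = adapt(w_meta, support, steps=5)
loss_before = mse(w_meta, query)
loss_after = mse(w_new, query)
print(f"query loss before adaptation: {loss_before:.4f}, after: {loss_after:.4f}")
```

In this toy setting the adapted weight moves from the meta-learned initialization toward the new scene's slope, so the query error drops after a handful of gradient steps. A real counting network would replace the scalar model with density-map regression, and the count would be read off as the sum of the predicted density map.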

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. 62076117 and 61762061), the Natural Science Foundation of Jiangxi Province, China (20161ACB20004) and Jiangxi Key Laboratory of Smart City (20192BCD40002).

Author information

Authors and Affiliations

School of Information Engineering, Nanchang University, Nanchang, 330031, China
Haoyu Zhao, Jianqiang Xu, Qi Wang, Yi Zou & Qiyan Fu
School of Software, Nanchang University, Nanchang, 330047, China
Weidong Min
Jiangxi Key Laboratory of Smart City, Nanchang, 330047, China
Weidong Min

Corresponding author

Correspondence to Weidong Min.

Additional information

Haoyu Zhao received the BE degree in computer science and technology from Nanchang University, China in 2019. He is currently a postgraduate student at Nanchang University, China. His research interests include computer vision and deep learning.

Weidong Min received the BE, ME and PhD degrees in computer application from Tsinghua University, China in 1989, 1991 and 1995, respectively. He is currently a Professor and the Dean of the School of Software, Nanchang University, China. He is an Executive Director of China Society of Image and Graphics. His current research interests include image and video processing, artificial intelligence, big data, distributed systems and smart city information technology.

Jianqiang Xu obtained the ME degree from Information Engineering School of Nanchang University, China in 2010. He is currently pursuing the PhD degree with the Information Engineering School of Nanchang University, China. His research interests include computer vision, pattern recognition, machine learning, computer image and video processing.

Qi Wang obtained the ME degree in computer science and technology from Nanchang University, China in 2017. He is currently pursuing the PhD degree at Nanchang University, China. His current research focuses on computer vision, particularly vehicle re-identification.

Yi Zou received the BE degree in computer science and technology from Nanchang University, China in 2021. She is currently a postgraduate student at Nanchang University, China. Her research interests include image processing and deep learning.

Qiyan Fu received the ME degree in Electronic and Communication Engineering from Nanchang University, China in 2017. She is currently pursuing the PhD degree at Nanchang University, China. Her current research focuses on artificial intelligence and computer vision.
