Abstract
Research in pedestrian detection has made remarkable progress in recent years, but robust detection in crowded scenes remains a considerable challenge. Many methods either rely on additional dataset annotations (visible body or head) or develop attention mechanisms to alleviate the difficulties posed by occlusion. These methods, however, rarely use contextual information to strengthen the features extracted by the backbone network. This paper aims to extract more effective and discriminative pedestrian features for robust detection under heavy occlusion. To this end, we propose a Global Context-Aware module that exploits contextual information for pedestrian detection. Fusing global context with the information derived from the visible parts of occluded pedestrians enhances the feature representations. Experimental results on two challenging benchmarks, CrowdHuman and CityPersons, demonstrate the effectiveness and merits of the proposed method. Code and models are available at: https://github.com/FlyingZstar/crowded-pedestrian-detection.
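The abstract does not say how the Global Context-Aware module computes or fuses global context. Purely as an illustration of the general idea, the sketch below assumes two things the source does not confirm: that global context is obtained by global average pooling over each channel, and that it is fused with local features by weighted addition. The names `global_context`, `fuse_with_context`, and `weight` are hypothetical and are not taken from the paper or its repository.

```python
from typing import List

# A toy feature map: channels x height x width (illustrative layout).
FeatureMap = List[List[List[float]]]


def global_context(features: FeatureMap) -> List[float]:
    """Global average pooling: one scalar per channel summarising the whole map."""
    context = []
    for channel in features:
        total = sum(sum(row) for row in channel)
        count = sum(len(row) for row in channel)
        context.append(total / count)
    return context


def fuse_with_context(features: FeatureMap, weight: float = 0.5) -> FeatureMap:
    """Add the weighted per-channel context to every spatial location.

    Additive fusion is an assumption made for this sketch only.
    """
    context = global_context(features)
    return [
        [[value + weight * ctx for value in row] for row in channel]
        for channel, ctx in zip(features, context)
    ]


# Toy 2-channel, 2x2 feature map.
features = [
    [[1.0, 3.0], [5.0, 7.0]],  # channel mean is 4.0
    [[0.0, 0.0], [0.0, 8.0]],  # channel mean is 2.0
]
fused = fuse_with_context(features, weight=0.5)
```

In this toy setting every location receives a share of the image-wide statistics, so a location covering only the visible part of an occluded pedestrian still carries some scene-level information. A learned module would normally replace the fixed `weight` with trainable parameters.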
Funding
This work is supported by the National Key Research and Development Program of China under Grant 2017YFC1601800 and by the National Natural Science Foundation of China under Grants 61876072, 61902153, and 62072243.
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by Zhenxing Liu. The first draft of the manuscript was written by Zhenxing Liu, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors have no competing interests to declare that are relevant to the content of this article.
About this article
Cite this article
Liu, Z., Song, X., Feng, Z. et al. Global Context-Aware Feature Extraction and Visible Feature Enhancement for Occlusion-Invariant Pedestrian Detection in Crowded Scenes. Neural Process Lett 55, 803–817 (2023). https://doi.org/10.1007/s11063-022-10910-w
DOI: https://doi.org/10.1007/s11063-022-10910-w