Abstract
Pedestrian detection is a fundamental research topic in object detection. It can provide useful information for intelligent security applications and for crowded public places such as shopping malls and scenic spots. In crowded areas or scenes, pedestrian occlusion and overlapping figures cause tracked targets to be lost and lower the detection and recognition rate. To address this problem, an improved YOLOv5 algorithm that integrates an attention mechanism is proposed. The attention mechanism is introduced to mine the relationship between the feature channels and the spatial information of the feature map, which further enhances feature extraction in the visual area of pedestrian targets. To improve feature fusion, a Bidirectional Feature Pyramid Network (BiFPN) is used to enrich the cross-scale connections, preserve shallow-layer features, and improve detection accuracy. To improve the convergence of the model, the EIoU loss replaces the original loss function of YOLOv5 to optimize anchor box regression, which reduces the training difficulty of the network and improves the detection rate under occlusion. Compared with the baseline YOLOv5 algorithm, the improved algorithm proposed in this paper achieves higher accuracy and a lower miss rate for pedestrian detection in crowded areas or scenes, while maintaining the real-time performance of the original algorithm.
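The abstract names EIoU as the replacement regression loss but does not give its formula. As a rough illustration only, the sketch below uses the commonly cited form of EIoU (Zhang et al., "Focal and efficient IOU loss"): the IoU term plus three penalties for center distance, width difference, and height difference, each normalized by the smallest enclosing box. The function name `eiou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` constant are assumptions made for this example and are not taken from the paper.

```python
def eiou_loss(pred, target, eps=1e-9):
    """EIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Illustrative sketch of the commonly cited EIoU definition;
    not the authors' implementation.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Widths and heights of the predicted and target boxes.
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between the two box centers.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2

    # Penalties: center distance over the squared enclosing diagonal,
    # width gap over squared enclosing width, height gap likewise.
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term


if __name__ == "__main__":
    # Two partially overlapping boxes of equal size.
    print(eiou_loss((0, 0, 2, 2), (1, 1, 3, 3)))
```

Because the width and height gaps are penalized directly, the loss still provides a gradient signal when two boxes share a center but differ in shape, which is one plausible reason it is reported to ease training on heavily overlapping pedestrians.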
Ethics declarations
Funding Statement
This work is funded by the National Key R&D Program of China (2021YFC3320300) and the 2022 Industrial Internet Innovation and Development Engineering Project (TC220H055).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhao, S., Tian, Y., Hao, N., Zhou, J., Zhang, X. (2024). Improved YOLOv5 Algorithm for Intensive Pedestrian Detection. In: Li, S. (eds) Computational and Experimental Simulations in Engineering. ICCES 2023. Mechanisms and Machine Science, vol 146. Springer, Cham. https://doi.org/10.1007/978-3-031-44947-5_47
DOI: https://doi.org/10.1007/978-3-031-44947-5_47
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-44946-8
Online ISBN: 978-3-031-44947-5
eBook Packages: Engineering, Engineering (R0)