Abstract
To address the low detection accuracy, missed detections, and poor real-time performance on embedded devices that affect helmet-wearing detection for distant, small targets on construction sites, this paper proposes an improved YOLO v5 for small-target helmet-wearing detection. Building on YOLO v5, a Transformer self-attention mechanism and a Swin Transformer module are introduced in the feature fusion step to enlarge the receptive field of the convolution kernel and globally model the high-level semantic features extracted by the backbone network, so that the model focuses more on learning helmet features. Some convolution operators are replaced with lighter and more efficient involution operators to reduce the number of parameters. The Concat connection is also modified by adding a 1 × 1 convolution. Compared with YOLO v5, experimental results show that the improved helmet detection model is 17.8% smaller, occupying only 33.2 MB, that FPS increases by 5%, and that mAP@0.5 reaches 94.9%. The approach effectively improves the accuracy of small-target helmet-wearing detection and meets the deployment requirements of embedded devices with low computing power.
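The parameter saving from swapping a convolution for an involution operator can be illustrated with a rough weight count. The sketch below is not the authors' implementation. It assumes the commonly described involution design, in which the kernel is generated by two 1 × 1 convolutions (a channel reduction followed by a span to K × K × G values), and it ignores biases and normalization layers. The function names and the channel, kernel, group, and reduction values are hypothetical and chosen only for illustration.

```python
def conv_params(channels: int, kernel_size: int) -> int:
    """Weights in a standard K x K convolution with equal input and
    output channels (bias ignored)."""
    return channels * channels * kernel_size * kernel_size


def involution_params(channels: int, kernel_size: int,
                      groups: int, reduction: int) -> int:
    """Weights in the kernel-generating branch of an involution layer
    (bias and normalization ignored). Assumed structure:

    reduce: 1x1 convolution, C -> C / r
    span:   1x1 convolution, C / r -> K * K * G
    """
    hidden = channels // reduction
    reduce_weights = channels * hidden
    span_weights = hidden * kernel_size * kernel_size * groups
    return reduce_weights + span_weights


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    c = 256
    standard = conv_params(c, kernel_size=3)
    inv = involution_params(c, kernel_size=7, groups=16, reduction=4)
    print(f"3x3 convolution weights: {standard}")
    print(f"7x7 involution weights:  {inv}")
    print(f"ratio: {standard / inv:.2f}")
```

Under these assumptions the involution layer uses fewer weights than a 3 × 3 convolution even with a larger 7 × 7 kernel, because its weight count grows with C²/r rather than C²K².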
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Hu, J., Li, J., Zhang, Q. (2023). Small Target Helmet Wearing Detection Algorithm Based on Improved YOLO V5. In: Yu, Z., et al. Data Science. ICPCSEE 2023. Communications in Computer and Information Science, vol 1879. Springer, Singapore. https://doi.org/10.1007/978-981-99-5968-6_6
DOI: https://doi.org/10.1007/978-981-99-5968-6_6
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-5967-9
Online ISBN: 978-981-99-5968-6
eBook Packages: Computer Science, Computer Science (R0)