Abstract
The complexity of infrastructure scenarios has led to a sustained increase in the global number of worksite-related fatalities and injuries, so safety helmets play an essential role in protecting construction workers from accidents. On a smart construction site, it is essential to detect whether helmets are worn correctly. However, the complexity of construction environments makes it challenging to detect helmet wearing precisely and in real time. This paper proposes an enhanced version of You Only Look Once version 5 (YOLOv5) that improves detection accuracy by integrating a bi-directional feature pyramid network (BiFPN), an attention mechanism, and transfer learning. The BiFPN replaces the original feature pyramid network (FPN) by adding extra cross-layer edges with adaptive connection weights. An attention mechanism is added at the end of the backbone and neck networks so that the network focuses on the regions of interest. Transfer learning is adopted for model training: the model is pre-trained on a head detection database and then fine-tuned on the helmet database. The proposed enhanced YOLOv5 is tested on the public GDU-HWD dataset, in which both the helmet and its color can be identified. It achieves an accuracy of 93.3%, which is 4.8% higher than that of the original YOLOv5, without adding much computational burden to the network. We believe the enhanced version can also be applied successfully to other similar detection tasks.
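The abstract does not say how the adaptive connection weights are computed. As a minimal sketch, assuming the "fast normalized fusion" rule from the EfficientDet work that introduced BiFPN, each incoming feature map gets a non-negative learnable weight and the weighted sum is divided by the total weight. The function name `fast_normalized_fusion`, the `eps` value, and the flattened toy feature maps are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Sequence


def fast_normalized_fusion(
    features: Sequence[Sequence[float]],
    weights: Sequence[float],
    eps: float = 1e-4,  # assumed small constant to avoid division by zero
) -> List[float]:
    """Fuse same-length (flattened) feature maps with one scalar weight each."""
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per input feature map")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all feature maps must have the same length")

    # ReLU keeps every weight non-negative, so the normalised
    # weights stay between 0 and 1.
    w = [max(0.0, x) for x in weights]
    total = sum(w) + eps

    # Weighted sum over the inputs, position by position, divided by the total.
    return [
        sum(wi * f[k] for wi, f in zip(w, features)) / total
        for k in range(length)
    ]


# Toy example: fuse two hypothetical feature maps at the same pyramid level.
p_in = [1.0, 2.0, 3.0]   # e.g. the input feature at this level
p_td = [3.0, 2.0, 1.0]   # e.g. the feature arriving over a cross-layer edge

fused_equal = fast_normalized_fusion([p_in, p_td], weights=[1.0, 1.0])
fused_gated = fast_normalized_fusion([p_in, p_td], weights=[1.0, -5.0])
```

With equal weights the result is close to the element-wise mean of the two inputs. A negative weight is clipped to zero, so the second input is effectively ignored. In a trained network the weights would be learned parameters and a convolution would normally follow the fusion.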
Data Availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
Ethics declarations
Competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Fang, Y., Ma, Y., Zhang, X. et al. Enhanced YOLOv5 algorithm for helmet wearing detection via combining bi-directional feature pyramid, attention mechanism and transfer learning. Multimed Tools Appl 82, 28617–28641 (2023). https://doi.org/10.1007/s11042-023-14395-0
DOI: https://doi.org/10.1007/s11042-023-14395-0