Skip to main content
Log in

Enhanced YOLOv5 algorithm for helmet wearing detection via combining bi-directional feature pyramid, attention mechanism and transfer learning

  • Published:
Multimedia Tools and Applications Aims and scope Submit manuscript

Abstract

The complexity of infrastructure scenarios has led to a sustained increase in the global number of worksites related fatalities and injuries. Therefore, safety helmets play an essential role in protecting construction workers from accidents. It is essential to detect whether the helmets are correctly worn on the heads for smart construction site. However, due to the complex construction environments, it is challenging to precisely detect safety helmet wearing in real-time. This paper proposes an enhanced version of You Only Look Once version 5 (YOLOv5) to improve the detection accuracy, where bi-directional feature pyramid network (BiFPN), attention mechanism, and transfer learning are fully integrated. The BiFPN is taken to replace the original feature pyramid network (FPN) via adding additional cross layer edges with adaptive connecting weights. Attention mechanism is added after the end of backbone and neck network to let the network pay more attention on the interested region. Transfer learning is adopted for model training. The model is pre-trained by a head detection database and then fine-tuned by the helmet database. The proposed enhanced YOLOv5 is tested on a public GDU-HWD dataset, where both helmet and its color can be identified. This study achieves the accuracy at 93.3%, which is 4.8% higher than that of the original YOLOv5, but does not bring in much computing burden to the network. It is believed that the enhance version can also be successfully used in other similar detection tasks.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14
Fig. 15
Fig. 16

Similar content being viewed by others

Data Availability

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

References

  1. Colantonio A, Mcvittie D, Lewko J, Yin J (2009) Traumatic brain injuries in the construction industry. Brain injury : BI 23(11):873–8. https://doi.org/10.1080/02699050903036033

    Article  Google Scholar 

  2. Dakhli Z, Danel T, Lafhaj Z (2019) Smart construction site: ontology of information system architecture. Modular Offsite Construct (MOC) Summit Proceed:41–50. https://doi.org/10.29173/mocs75

  3. Deng L, Li H, Liu H, Gu J (2022) A lightweight yolov3 algorithm used for safety helmet detection. Sci Reports 12(1):1–15. https://doi.org/10.1038/s41598-022-15272-w

    Google Scholar 

  4. Dewi C, Chen R-, Jiang X, Yu H (2022) Deep convolutional neural network for enhancing traffic sign recognition developed on yolo v4. Multimed Tools Appl:1–25. https://doi.org/10.1007/s11042-022-12962-5

  5. Ge Z, Liu S, Wang F, Li Z, Sun J (2021) Yolox: exceeding yolo series in 2021. arXiv:2107.08430

  6. Han G, Zhu M, Zhao X, Gao H (2021) Method based on the cross-layer attention mechanism and multiscale perception for safety helmet-wearing detection. Comput Electr Eng 95:107458. https://doi.org/10.1016/j.compeleceng.2021.107458

    Article  Google Scholar 

  7. Hu J, Shen L, Sun G (2018) Squeeze-and-excitation networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7132–7141. https://doi.org/10.1109/cvpr.2018.00745

  8. Jiang B, Luo R, Mao J, Xiao T, Jiang Y (2018) Acquisition of localization confidence for accurate object detection. In: Proceedings of the European conference on computer vision (ECCV), pp 784–799. https://doi.org/10.1007/978-3-030-01264-9_48

  9. Li J, Liu H, Wang T, Jiang M, Wang S, Li K, Zhao X (2017) Safety helmet wearing detection based on image processing and machine learning. In: 2017 Ninth international conference on advanced computational intelligence (ICACI). IEEE, pp 201–205. https://doi.org/10.1109/icaci.2017.7974509

  10. Li G, Song Z, Fu Q (2018) A new method of image detection for small datasets under the framework of yolo network. In: 2018 IEEE 3rd advanced information technology, electronic and automation control conference (IAEAC). IEEE, pp 1031–1035. https://doi.org/10.1109/iaeac.2018.8577214

  11. Lin T-Y, Dollár P, Girshick R, He K, Hariharan B, Belongie S (2017) Feature pyramid networks for object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2117–2125. https://doi.org/10.1109/cvpr.2017.106

  12. Lin T-Y, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Zitnick CL (2014) Microsoft coco: common objects in context. In: European conference on computer vision. Springer, pp 740–755. https://doi.org/10.1007/978-3-319-10602-1_48

  13. Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C-Y, Berg AC (2016) Ssd: single shot multibox detector. In: European conference on computer vision. Springer, pp 21–37. https://doi.org/10.1007/978-3-319-46448-0_2

  14. Liu S, Qi L, Qin H, Shi J, Jia J (2018) Path aggregation network for instance segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 8759–8768. https://doi.org/10.1109/cvpr.2018.00913

  15. Lu J, Behbood V, Hao P, Zuo H, Xue S, Zhang G (2015) Transfer learning using computational intelligence: a survey. Knowl-Based Syst 80:14–23. https://doi.org/10.1016/j.knosys.2015.01.010

    Article  Google Scholar 

  16. Man CK, Quddus M, Theofilatos A (2022) Transfer learning for spatio-temporal transferability of real-time crash prediction models. Accid Anal Prev 165:106511. https://doi.org/10.1016/j.aap.2021.106511

    Article  Google Scholar 

  17. Mneymneh BE, Abbas M, Khoury H (2019) Vision-based framework for intelligent monitoring of hardhat wearing on construction sites. J Comput Civil Eng 33(2):04018066. https://doi.org/10.1109/icpr48806.2021.9412103

    Article  Google Scholar 

  18. Nie M, Wang K (2018) Pavement distress detection based on transfer learning. In: 2018 5th International conference on systems and informatics (ICSAI). IEEE, pp 435–439. https://doi.org/10.1109/icsai.2018.8599473

  19. Pan SJ, Yang Q (2009) A survey on transfer learning. IEEE Trans Knowl Data Eng 22(10):1345–1359. https://doi.org/10.1109/TKDE.2009.191

    Article  Google Scholar 

  20. Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: unified, real-time object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 779–788. https://doi.org/10.1109/cvpr.2016.91

  21. Redmon J, Farhadi A (2018) Yolov3: an incremental improvement. arXiv:1804.02767

  22. Ren S, He K, Girshick R, Sun J (2016) Faster r-cnn: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell 39(6):1137–1149. https://doi.org/10.1109/tpami.2016.2577031

    Article  Google Scholar 

  23. Rezatofighi H, Tsoi N, Gwak J, Sadeghian A, Reid I, Savarese S (2019) Generalized intersection over union: a metric and a loss for bounding box regression. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 658–666. https://doi.org/10.1109/cvpr.2019.00075

  24. Siebert FW, Lin H (2020) Detecting motorcycle helmet use with deep learning. Accid Anal Prev 134:105319. https://doi.org/10.1016/j.aap.2019.105319

    Article  Google Scholar 

  25. Song R, Wang Z (2022) Rbfpdet: an anchor-free helmet wearing detection method. Appl Intell:1–16. https://doi.org/10.1007/s10489-022-03664-4

  26. Stewart R, Andriluka M, Ng AY (2016) End-to-end people detection in crowded scenes. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2325–2333. https://doi.org/10.1109/cvpr.2016.255

  27. Tan M, Le Q (2019) Efficientnet: rethinking model scaling for convolutional neural networks. In: International conference on machine learning. PMLR, pp 6105–6114. arXiv:1905.11946

  28. Tan M, Pang R, Le QV (2020) Efficientdet: scalable and efficient object detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 10781–10790. https://doi.org/10.1109/cvpr42600.2020.01079

  29. (2020). Ultralytics.yolov5 online. https://github.com/ultralytics/yolov5

  30. Wang C-Y, Liao H-YM, Wu Y-H, Chen P-Y, Hsieh J-W, Yeh I-H (2020) Cspnet: a new backbone that can enhance learning capability of cnn. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops, pp 390–391. https://doi.org/10.1109/cvprw50498.2020.00203

  31. Woo S, Park J, Lee J-Y, Kweon IS (2018) Cbam: convolutional block attention module. In: Proceedings of the European conference on computer vision (ECCV), pp 3–19. https://doi.org/10.1007/978-3-030-01234-2_1

  32. Wu J, Cai N, Chen W, Wang H, Wang G (2019) Automatic detection of hardhats worn by construction personnel: a deep learning approach and benchmark dataset. Autom Constr 106:102894. https://doi.org/10.1016/j.autcon.2019.102894

    Article  Google Scholar 

  33. Yue S, Zhang Q, Shao D, Fan Y, Bai J (2022) Safety helmet wearing status detection based on improved boosted random ferns. Multimed Tools Appl 81(12):16783–16796. https://doi.org/10.1007/s11042-022-12014-y

    Article  Google Scholar 

  34. Zhao J, Li C, Xu Z, Jiao L, Zhao Z, Wang Z (2022) Detection of passenger flow on and off buses based on video images and yolo algorithm. Multimed Tools Appl 81(4):4669–4692. https://doi.org/10.1007/s11042-021-10747-w

    Article  Google Scholar 

  35. Zheng Y, Bao H, Meng C, Ma N (2021) A method of traffic police detection based on attention mechanism in natural scene. Neurocomputing 458:592–601. https://doi.org/10.1016/j.neucom.2019.12.144

    Article  Google Scholar 

  36. Zheng Z, Wang P, Ren D, Liu W, Ye R, Hu Q, Zuo W (2021) Enhancing geometric factors in model learning and inference for object detection and instance segmentation. IEEE Trans Cybern:1–13

  37. Zhu X, Cheng D, Zhang Z, Lin S, Dai J (2019) An empirical study of spatial attention mechanisms in deep networks. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 6688–6697. https://doi.org/10.1109/iccv.2019.00679

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Xuguang Zhang.

Ethics declarations

Competing interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Fang, Y., Ma, Y., Zhang, X. et al. Enhanced YOLOv5 algorithm for helmet wearing detection via combining bi-directional feature pyramid, attention mechanism and transfer learning. Multimed Tools Appl 82, 28617–28641 (2023). https://doi.org/10.1007/s11042-023-14395-0

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11042-023-14395-0

Keywords

Navigation