Abstract
Deep learning has achieved impressive results in many machine learning tasks such as image recognition and computer vision. Its applicability to supervised problems is, however, constrained by the availability of high-quality training data, which typically consists of very large numbers (e.g., millions) of human-annotated examples. To overcome this limitation, the AI community has increasingly turned to artificially generated images and video sequences produced with photo-realistic rendering engines, such as those used in entertainment applications. In this way, large sets of training images can be created cheaply to train deep learning algorithms. In this paper, we generate photo-realistic synthetic image sets to train deep learning models to recognize the correct use of personal protective equipment (e.g., worker safety helmets, high-visibility vests, ear protection devices) during at-risk work activities. We then perform domain adaptation to real-world images using a very small set of real examples. We show that training on the generated synthetic set, followed by this domain adaptation phase, is an effective solution for applications where no real-world training set is available.
Acknowledgements
This work was partially supported by “Automatic Data and documents Analysis to enhance human-based processes” (ADA), funded by CUP CIPE D55F17000290009, and by the AI4EU project, funded by EC (H2020 - Contract n. 825619). We gratefully acknowledge the support of NVIDIA Corporation with the donation of a Jetson TX2 board used for this research.
Di Benedetto, M., Carrara, F., Meloni, E. et al. Learning accurate personal protective equipment detection from virtual worlds. Multimed Tools Appl 80, 23241–23253 (2021). https://doi.org/10.1007/s11042-020-09597-9