
CDNet: a real-time and robust crosswalk detection network on Jetson nano based on YOLOv5

  • Original Article
  • Published in Neural Computing and Applications

Abstract

Realizing real-time, robust crosswalk (zebra crossing) detection in complex scenarios under limited computing power is a key difficulty for current intelligent traffic management systems (ITMS). Constrained edge computing capability and real-world conditions such as cloudy, sunny, rainy, and foggy weather, as well as night scenes, challenge this task simultaneously. In this study, a crosswalk detection network (CDNet) based on YOLOv5 is proposed to achieve fast and accurate crosswalk detection from the view of a vehicle-mounted camera, with real-time detection implemented on a Jetson Nano device. A powerful convolutional neural network feature extractor handles complex environments; a squeeze-and-excitation (SE) attention module is embedded in the network and a negative samples training (NST) method is applied to improve accuracy; a region of interest (ROI) algorithm further increases detection speed; a novel slide receptive field short-term vector memory (SSVM) algorithm is proposed to improve the accuracy of vehicle-crossing behavior detection; and a synthetic fog augmentation algorithm makes the model adaptable to foggy scenarios. With a detection speed of 33.1 FPS on the Jetson Nano, the model achieves an average F1 score of 94.83% across the above complex scenarios; under favorable weather such as sunny and cloudy days, the F1 score exceeds 98%. This work provides a reference for applying neural network optimization methods on edge computing devices. The datasets, tutorials, and source code are available on GitHub.
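The full pipeline is described in the body of the article, but two of the named components follow well-known formulations and can be sketched. Below is a minimal, illustrative Python sketch of a squeeze-and-excitation attention block and a synthetic fog augmentation based on the standard atmospheric scattering model I = J*t + A*(1 - t); the reduction ratio, the pseudo-depth map, and all parameter values are assumptions for illustration, not CDNet's published configuration.

```python
# Illustrative sketches only; assumed details are marked in the comments.
import numpy as np
import torch
import torch.nn as nn


class SEBlock(nn.Module):
    """Squeeze-and-excitation channel attention.
    The reduction ratio of 16 and the insertion point within YOLOv5
    are assumptions, not taken from the CDNet paper."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze: global average over H x W
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),  # excitation: per-channel weight in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # rescale each feature channel


def add_synthetic_fog(img: np.ndarray, beta: float = 2.5,
                      airlight: float = 0.9) -> np.ndarray:
    """Blend a uniform airlight into an RGB uint8 image using the
    atmospheric scattering model I = J*t + A*(1 - t), t = exp(-beta * d).
    Scene depth d is approximated here by normalized distance from the
    image centre; this pseudo-depth and the parameter values are
    assumptions, not CDNet's exact augmentation recipe."""
    h, w = img.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    d = np.hypot(ys - h / 2.0, xs - w / 2.0) / max(h, w)  # pseudo-depth map
    t = np.exp(-beta * d)[..., None]  # transmission map, broadcast over channels
    foggy = img.astype(np.float32) / 255.0 * t + airlight * (1.0 - t)
    return (np.clip(foggy, 0.0, 1.0) * 255.0).astype(np.uint8)
```

In use, a block like `SEBlock(channels)` would wrap the output of a backbone stage, and `add_synthetic_fog` would be applied to a fraction of training images to simulate fog; both names and parameters here are hypothetical.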





Acknowledgements

This work was supported by the National Natural Science Foundation of China [Grant Number 61873163]. We also acknowledge the Center for High Performance Computing at Shanghai Jiao Tong University for providing computing resources.

Author information


Contributions

Zhengde Zhang contributed to conceptualization, methodology, software, formal analysis, data curation, writing—original draft and visualization. Menglu Tan contributed to methodology, data curation, writing—original draft and writing—review and editing. Zhicai Lan contributed to conceptualization, investigation and writing—review and editing. Haichun Liu contributed to validation and investigation. Ling Pei contributed to resources, funding acquisition and methodology. Wenxian Yu contributed to resources, supervision, funding acquisition and project administration.

Corresponding authors

Correspondence to Zheng-De Zhang or Wen-Xian Yu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Zhang, ZD., Tan, ML., Lan, ZC. et al. CDNet: a real-time and robust crosswalk detection network on Jetson nano based on YOLOv5. Neural Comput & Applic 34, 10719–10730 (2022). https://doi.org/10.1007/s00521-022-07007-9

