
Real-time detection network for tiny traffic sign using multi-scale attention module


Abstract

As one of the key technologies of intelligent vehicles, traffic sign detection remains a challenging task because of the tiny size of its target objects. To address this challenge, we present a novel detection network, improved from yolo-v3, for tiny traffic signs with high precision in real time. First, a visual multi-scale attention module (MSAM), a lightweight yet effective module, is devised to fuse multi-scale feature maps with channel weights and spatial masks. It increases the representational power of the network by emphasizing useful features and suppressing unnecessary ones. Second, we effectively exploit fine-grained features of tiny objects from the shallower layers by modifying the Darknet-53 backbone and adding one prediction head to yolo-v3. Finally, a receptive field block is added to the neck of the network to broaden the receptive field. Experiments demonstrate the effectiveness of our network both quantitatively and qualitatively: its mAP@0.5 reaches 0.965 and its detection speed is 55.56 FPS for 512 × 512 images on the challenging Tsinghua-Tencent 100K (TT100K) dataset.
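The abstract only outlines how the MSAM fuses multi-scale feature maps with channel weights and spatial masks. The following is a rough illustration of that general idea, written as a minimal PyTorch sketch under our own assumptions rather than the authors' implementation; the class name, layer sizes, pooling choices, and fusion order are hypothetical.

    # Sketch of an MSAM-style fusion block (illustrative, not the paper's code).
    # Assumptions: two feature maps with the same channel count but different
    # scales are fused after the coarser one is upsampled; channel weights come
    # from global average pooling, the spatial mask from a 7x7 convolution.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MultiScaleAttention(nn.Module):
        def __init__(self, channels: int, reduction: int = 16):
            super().__init__()
            # Channel attention: squeeze with global average pooling, then re-weight.
            self.channel_fc = nn.Sequential(
                nn.Conv2d(channels, channels // reduction, kernel_size=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels // reduction, channels, kernel_size=1),
            )
            # Spatial attention: a single-channel mask from pooled descriptors.
            self.spatial_conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

        def forward(self, shallow: torch.Tensor, deep: torch.Tensor) -> torch.Tensor:
            # Upsample the deeper (coarser) map to the shallow map's resolution and fuse.
            deep = F.interpolate(deep, size=shallow.shape[-2:], mode="nearest")
            fused = shallow + deep
            # Channel weights emphasize informative channels.
            weights = torch.sigmoid(self.channel_fc(F.adaptive_avg_pool2d(fused, 1)))
            fused = fused * weights
            # Spatial mask emphasizes informative locations (e.g., tiny signs).
            avg_map = fused.mean(dim=1, keepdim=True)
            max_map, _ = fused.max(dim=1, keepdim=True)
            mask = torch.sigmoid(self.spatial_conv(torch.cat([avg_map, max_map], dim=1)))
            return fused * mask

In the full network, such a fused map would then feed a yolo-v3-style prediction head; the sketch shows only the attention-based fusion step described in the abstract.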




Author information


Correspondence to TingTing Yang or Chao Tong.

Additional information

This work was supported by the National Key R&D Program of China (Grant Nos. 2018YFB2101100 and 2019YFB2101600), the National Natural Science Foundation of China (Grant No. 62176016), the Guizhou Province Science and Technology Project: Research and Demonstration of Science and Technology Big Data Mining Technology Based on Knowledge Graph (Qiankehe[2021] General 382), the Training Program of the Major Research Plan of the National Natural Science Foundation of China (Grant No. 92046015), and the Beijing Natural Science Foundation Program and Scientific Research Key Program of Beijing Municipal Commission of Education (Grant No. KZ202010025047).


Cite this article

Yang, T., Tong, C. Real-time detection network for tiny traffic sign using multi-scale attention module. Sci. China Technol. Sci. 65, 396–406 (2022). https://doi.org/10.1007/s11431-021-1950-9

