
YOLOF-F: you only look one-level feature fusion for traffic sign detection

  • Original article, published in The Visual Computer

Abstract

This paper proposes a detector that addresses multi-scale detection and improves performance on small traffic signs, which are hard to detect. The detector, called YOLOF-F (you only look one-level feature fusion), is a single-stage detector that extracts multi-scale feature information from a single fused feature level. First, we propose a feature fusion module (FFM) to fuse features at different scales. Next, we introduce a new encoder, the corner dilated encoder (CDE), which enhances corner information in the feature map and improves position-regression accuracy while maintaining a fast detection speed. Finally, YOLOF-F achieves 74.57% and 77.23% AP on the GTSDB and CTSD datasets, respectively, and runs at 32 FPS. Extensive experiments validate that YOLOF-F is faster and more effective than most traffic sign detection methods.
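The abstract does not give FFM's internal design, so the following is only a minimal illustration of the general idea it names: collapsing a multi-scale feature pyramid into a single fused level. The nearest-neighbour upsampling, the elementwise-sum fusion, and all function names here are assumptions for the sketch, not the paper's actual module.

```python
import numpy as np

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) feature map."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def fuse_to_one_level(pyramid):
    """Fuse a coarse-to-fine feature pyramid into the finest level:
    upsample every map to the finest resolution, then sum elementwise."""
    target = pyramid[-1].shape[1]  # finest spatial size (square maps assumed)
    fused = np.zeros_like(pyramid[-1], dtype=np.float64)
    for feat in pyramid:
        factor = target // feat.shape[1]
        fused += upsample_nearest(feat, factor)
    return fused

# Toy pyramid: 2 channels at 4x4, 8x8, and 16x16 resolutions
pyramid = [np.ones((2, s, s)) for s in (4, 8, 16)]
fused = fuse_to_one_level(pyramid)
print(fused.shape)     # (2, 16, 16)
print(fused[0, 0, 0])  # 3.0 — each of the three levels contributes 1
```

A real implementation would typically use learned 1x1 convolutions to align channel counts and bilinear interpolation for upsampling; the sum here simply shows how every scale ends up represented at one resolution.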



Acknowledgements

This work was supported by the National Science Foundation of China under Grant U1803261 and funded by the National Natural Science Foundation of China (61966035), the Xinjiang Uygur Autonomous Region Innovation Team (XJEDU2017T002), the international cooperation project of the China-region Science and Technology Department "Data-driven China-Russia cloud computing sharing platform construction" (2020E01023), research on depth learning labeling methods based on multi-feature fusion (2020D01A34), and research on video information processing technology based on public safety (U1803261).

Author information


Corresponding author

Correspondence to Yurong Qian.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest in this work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wei, H., Zhang, Q., Qin, Y. et al. YOLOF-F: you only look one-level feature fusion for traffic sign detection. Vis Comput 40, 747–760 (2024). https://doi.org/10.1007/s00371-023-02813-1
