
An improved anchor-free method for traffic scene object detection

Published in: Multimedia Tools and Applications

Abstract

Most anchor-free detectors suffer from low detection accuracy, while anchor-based detectors are slow. To balance the detection accuracy and speed of traffic scene object detection, this paper proposes a new anchor-free detector called FABNet. The method consists of three components: a feature pyramid fusion module (FPFM), a cascade attention module (CAM), and a boundary feature extraction module (BFEM). First, the feature pyramid fusion module generates richer semantic information, which both improves detection accuracy and handles objects of different sizes. Second, the cascade attention module combines hierarchical, spatial, and channel attention to obtain local feature representations, strengthening the representation ability of the detection head. Finally, to recover more foreground information under the influence of complex backgrounds, the boundary feature extraction module extracts object boundary features effectively. Extensive experiments on three public datasets (BDD100K, PASCAL VOC, and KITTI) show that the method achieves state-of-the-art performance in both accuracy and speed.
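The abstract does not give equations for the cascade attention module, so the following is only a minimal, parameter-free NumPy sketch of the general idea of cascading channel attention and spatial attention over a feature map: each stage computes a sigmoid gate by pooling over the other dimensions and rescales the features, and the stages are applied in series. The function names and the pooling-based gates are illustrative assumptions, not the authors' implementation (a real CAM would use learned convolutional weights, plus the hierarchical attention described in the paper).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    # feat: (C, H, W). Pool over spatial dims to get one score
    # per channel, squash to (0, 1), and rescale each channel.
    gate = sigmoid(feat.mean(axis=(1, 2)))      # shape (C,)
    return feat * gate[:, None, None]

def spatial_attention(feat):
    # Pool over the channel dim to get one score per spatial
    # location, squash to (0, 1), and rescale each location.
    gate = sigmoid(feat.mean(axis=0))           # shape (H, W)
    return feat * gate[None, :, :]

def cascade_attention(feat):
    # Apply the two attention stages in series (the "cascade"):
    # channel attention first, then spatial attention.
    return spatial_attention(channel_attention(feat))

# Toy feature map with 8 channels on a 4x4 grid.
feat = np.random.randn(8, 4, 4)
out = cascade_attention(feat)
assert out.shape == feat.shape
```

Because both gates lie in (0, 1), the cascade only attenuates features, never amplifies them; the relative weighting across channels and locations is what carries the attention effect.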


Data availability

The data that support the findings of this study are available from the corresponding author on reasonable request.


Acknowledgments

This work was supported in part by NSFC (61572286 and 61472220), the NSFC Joint Key Project with Zhejiang on the Integration of Informatization and Industrialization (U1609218), and the Fostering Project of Dominant Discipline and Talent Team of Shandong Province Higher Education.

Author information

Corresponding author

Correspondence to Tianping Li.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Ding, T., Feng, K., Yan, Y. et al. An improved anchor-free method for traffic scene object detection. Multimed Tools Appl 82, 34703–34724 (2023). https://doi.org/10.1007/s11042-023-15077-7

