FESSD:SSD target detection based on feature fusion and feature enhancement

Qian, Huaming; Wang, Huilin; Feng, Shuai; Yan, Shuya

doi:10.1007/s11554-023-01258-y

Original Research Paper
Published: 27 January 2023

Journal of Real-Time Image Processing, Volume 20, article number 2 (2023)

Abstract

In recent years, significant breakthroughs have been made in target detection. However, although existing two-stage target detection algorithms achieve high precision, their detection speed is too slow to meet real-time requirements. One-stage target detection algorithms can meet real-time requirements but have weaker detection capability, especially for small targets. In this paper, we propose an end-to-end feature fusion and feature enhancement SSD (FESSD) target detection algorithm to improve the capability of one-stage target detection. First, a deeper ResNet-50 replaces VGG16 as the backbone network to obtain richer semantic information, and five extra layers are added to generate feature maps of different sizes for multi-scale target detection. Then, the feature maps are fused by the maximum pooling feature fusion module (MPFFM) and the upsampling feature fusion module (UPFFM) to generate a new feature pyramid, which introduces semantic information into the shallow feature maps. Finally, the feature enhancement module (FEM) is used to expand the receptive field of the output feature maps, introduce more context information, and further enhance the feature expression ability of the model. Experimental results on the PASCAL VOC and MS COCO datasets demonstrate the effectiveness of the method.
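The abstract names max pooling and upsampling as the two ways feature maps of different sizes are brought together, but it does not give the wiring of MPFFM or UPFFM. The following is a minimal sketch of the general idea only, in plain Python on tiny single-channel maps. The function names, the 2x scale factor, nearest-neighbour upsampling, and element-wise summation as the fusion rule are all assumptions made for illustration, not the authors' design.

```python
# Illustrative sketch only: the abstract does not specify how MPFFM/UPFFM
# are built. Assumption: maps are resized to a common resolution and then
# summed element-wise. All names below are hypothetical.

def max_pool_2x2(fmap):
    """Downsample a 2D map (list of rows) by taking the max of each 2x2 block."""
    h, w = len(fmap), len(fmap[0])
    return [
        [max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
         for j in range(0, w - 1, 2)]
        for i in range(0, h - 1, 2)
    ]

def upsample_nearest_2x(fmap):
    """Upsample a 2D map by repeating every value over a 2x2 block."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def fuse(a, b):
    """Element-wise sum of two equally sized maps (assumed fusion rule)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# A 4x4 "shallow" map (fine detail) and a 2x2 "deep" map (coarse semantics).
shallow = [[1, 2, 3, 4],
           [5, 6, 7, 8],
           [9, 10, 11, 12],
           [13, 14, 15, 16]]
deep = [[10, 20],
        [30, 40]]

# Max pooling path: shrink the shallow map to the deep map's resolution.
fused_at_deep = fuse(max_pool_2x2(shallow), deep)

# Upsampling path: enlarge the deep map to the shallow map's resolution.
fused_at_shallow = fuse(shallow, upsample_nearest_2x(deep))
```

In this toy setting the upsampling path is what carries deep, semantic values into the high-resolution map, which is the effect the abstract attributes to the new feature pyramid. A real detector would apply the same resizing per channel on tensors and would typically follow the fusion with learned convolutions.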

Acknowledgements

This work was supported by the Key-Area Research and Development Program of Guangdong Province (Grant No. 2020B0909020001) and the National Natural Science Foundation of China (Grant No. 61573113).

Author information

Authors and Affiliations

College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin, 150001, China
Huaming Qian, Huilin Wang, Shuai Feng & Shuya Yan

Corresponding author

Correspondence to Huilin Wang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Qian, H., Wang, H., Feng, S. et al. FESSD:SSD target detection based on feature fusion and feature enhancement. J Real-Time Image Proc 20, 2 (2023). https://doi.org/10.1007/s11554-023-01258-y

Received: 13 July 2022
Accepted: 11 November 2022
Published: 27 January 2023
DOI: https://doi.org/10.1007/s11554-023-01258-y
