An enhanced SSD with feature cross-reinforcement for small-object detection

Gong, Lixiong; Huang, Xiao; Chao, Yinkang; Chen, Jialin; Lei, Binwen

doi:10.1007/s10489-023-04544-1

Published: 07 March 2023

Applied Intelligence, volume 53, pages 19449–19465 (2023)

Lixiong Gong¹, Xiao Huang¹ (ORCID: orcid.org/0009-0007-9490-8595), Yinkang Chao¹, Jialin Chen¹ & Binwen Lei¹

A Correction to this article was published on 06 May 2023

This article has been updated

Abstract

Because small objects carry limited feature information, it is difficult for a single-shot multibox detector (SSD) to quickly locate the important regions of such objects in an image. We propose an enhanced SSD based on feature cross-reinforcement (FCR-SSD). For shallow sampling, an improved group shuffling-efficient channel attention (GS-ECA) mechanism makes the model focus on object areas rather than the background. An FCR module then passes multiscale information from the shallow layer to the subsequent layer and fuses it into an enhanced feature map, which improves the utilization of the context information associated with small objects. We also develop an adaptive algorithm for selecting positive and negative samples: it computes the intersection over union (IoU) between candidate boxes and ground-truth boxes and adaptively determines the selection threshold for each ground-truth box. The proposed FCR-SSD algorithm achieves 79.6% mean average precision (mAP) on the PASCAL VOC 2007 dataset and 30.1% mAP on the MS COCO dataset at 34.2 frames per second (FPS) when run on an RTX 3080Ti GPU. The experimental results show that the FCR-SSD model achieves high accuracy and good detection speed in small-object detection tasks.
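The sample-selection step mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the paper's actual threshold formula, so the rule below (mean plus population standard deviation of the IoUs for one ground-truth box) is only an assumption chosen for illustration. The names `iou`, `adaptive_threshold`, and `select_samples` are hypothetical and do not come from the article.

```python
from statistics import mean, pstdev
from typing import List, Tuple

# A box is (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def adaptive_threshold(gt: Box, candidates: List[Box]) -> float:
    """Per-ground-truth IoU threshold.

    ASSUMPTION: mean + population std of the candidate IoUs.
    This is an illustrative stand-in, not the formula from the paper.
    """
    if not candidates:
        raise ValueError("at least one candidate box is required")
    ious = [iou(gt, c) for c in candidates]
    return mean(ious) + pstdev(ious)


def select_samples(gt: Box, candidates: List[Box]):
    """Split candidate indices into positives and negatives for one ground-truth box."""
    thr = adaptive_threshold(gt, candidates)
    positives = [i for i, c in enumerate(candidates) if iou(gt, c) >= thr]
    negatives = [i for i, c in enumerate(candidates) if iou(gt, c) < thr]
    return thr, positives, negatives


if __name__ == "__main__":
    gt_box = (0.0, 0.0, 10.0, 10.0)
    boxes = [(0.0, 0.0, 10.0, 10.0), (5.0, 0.0, 15.0, 10.0), (20.0, 20.0, 30.0, 30.0)]
    thr, pos, neg = select_samples(gt_box, boxes)
    print(f"threshold={thr:.3f} positives={pos} negatives={neg}")
```

Because the threshold is derived from the IoU statistics of each ground-truth box, a small object whose candidates all overlap it weakly would still receive some positives, whereas a fixed global threshold could leave it with none. Whether the paper's rule behaves the same way cannot be confirmed from the abstract alone.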

Change history

06 May 2023
A Correction to this paper has been published: https://doi.org/10.1007/s10489-023-04640-2

Author information

Lixiong Gong and Xiao Huang have contributed to the work equally and should be regarded as co-first authors.

Authors and Affiliations

School of Mechanical Engineering, Hubei University of Technology, Wuhan, China
Lixiong Gong, Xiao Huang, Yinkang Chao, Jialin Chen & Binwen Lei

Corresponding author

Correspondence to Xiao Huang.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised: The incorrect Figures 5, 6, and 7 were captured.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Gong, L., Huang, X., Chao, Y. et al. An enhanced SSD with feature cross-reinforcement for small-object detection. Appl Intell 53, 19449–19465 (2023). https://doi.org/10.1007/s10489-023-04544-1

Accepted: 24 February 2023
Published: 07 March 2023
Issue Date: August 2023
DOI: https://doi.org/10.1007/s10489-023-04544-1
