A small object detection network for remote sensing based on CS-PANet and DSAN

Multimedia Tools and Applications

Abstract

Small objects occupy only a small fraction of a remote sensing image, so their features are easily lost or disturbed by the surrounding complex background during detection. To solve this problem, a remote sensing small object detection network (CD-YOLOX) is proposed. It is built on YOLOX and adds an enhanced feature pyramid network (CS-PANet) and a global-local combination module (DSAN). Firstly, CS-PANet addresses the loss of small object features and the background interference caused by the repeated convolution and feature stacking operations in PANet. It applies channel attention at the input of PANet to strengthen the network's focus on the feature channels that are effective for small objects. Meanwhile, the channel-attended input is connected across layers to the output of PANet, so that PANet retains richer original information about small objects. Secondly, to further reduce the interference of the surrounding complex background on small objects, a DSAN module consisting of a self-attention mechanism, dilated convolution and a residual connection is placed before network prediction. This module combines self-attention with dilated convolution so that the network focuses on the global feature region of the small object in the feature map while effectively complementing the local context of that region, and it preserves the original information through the residual connection. Finally, the method is evaluated on the remote sensing dataset NWPU VHR-10 and the general datasets KITTI and PASCAL VOC. Compared with the original network, the method improves detection accuracy by 5.56% on NWPU VHR-10 and by 1.93% and 2.51% on KITTI and PASCAL VOC, respectively, which verifies its effectiveness for detecting small objects in remote sensing images and its ability to detect general-purpose objects.
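The two components described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give layer sizes, the exact form of the channel attention, or how the parts of DSAN are wired together, so everything below is an assumption made for illustration: the channel attention is squeeze-and-excitation style, the cross-layer connection is modelled as element-wise addition, DSAN applies self-attention followed by a depthwise 3x3 dilated convolution, and the function names (`channel_attention`, `cs_panet_level`, `dsan`) and the `panet_fn` placeholder are hypothetical. Weights are random, so the sketch shows data flow only, not the authors' trained network.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def channel_attention(x, w1, w2):
    """Squeeze-and-excitation style channel attention (assumed form); x is (C, H, W)."""
    squeezed = x.mean(axis=(1, 2))            # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)   # bottleneck + ReLU -> (C // r,)
    weights = sigmoid(w2 @ hidden)            # per-channel gates in (0, 1)
    return x * weights[:, None, None]


def cs_panet_level(x, panet_fn, w1, w2):
    """Channel attention at the PANet input plus a cross-layer skip to its output."""
    attended = channel_attention(x, w1, w2)
    return panet_fn(attended) + attended      # skip modelled as addition (assumption)


def self_attention(x, wq, wk, wv):
    """Single-head self-attention over all H*W positions of a (C, H, W) map."""
    c, h, w = x.shape
    tokens = x.reshape(c, h * w).T            # (N, C) with N = H * W
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    attn = softmax(q @ k.T / np.sqrt(q.shape[1]), axis=-1)  # (N, N)
    return (attn @ v).T.reshape(c, h, w)


def dilated_conv3x3(x, kernel, dilation=2):
    """Depthwise 3x3 dilated convolution with zero padding that keeps H and W."""
    c, h, w = x.shape
    padded = np.pad(x, ((0, 0), (dilation, dilation), (dilation, dilation)))
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            dy, dx = i * dilation, j * dilation
            out += kernel[:, i, j][:, None, None] * padded[:, dy:dy + h, dx:dx + w]
    return out


def dsan(x, wq, wk, wv, kernel, dilation=2):
    """Global self-attention, then local dilated context, plus a residual connection."""
    return x + dilated_conv3x3(self_attention(x, wq, wk, wv), kernel, dilation)


# Toy feature map and random weights (illustrative sizes, not from the paper).
rng = np.random.default_rng(0)
C, H, W, r = 8, 6, 6, 4
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C))
w2 = rng.standard_normal((C, C // r))
wq, wk, wv = (0.1 * rng.standard_normal((C, C)) for _ in range(3))
kernel = 0.1 * rng.standard_normal((C, 3, 3))

# `lambda f: 0.5 * f` is a stand-in for the real PANet fusion path.
fused = cs_panet_level(x, lambda f: 0.5 * f, w1, w2)
out = dsan(fused, wq, wk, wv, kernel)
print(fused.shape, out.shape)
```

In this sketch the skip path and the residual connection both leave the spatial size and channel count unchanged, which is what lets the original small-object features be added back after the fusion and attention stages.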

Availability of data and materials

The data used to support the findings of this study are available from the corresponding author upon request.

Code availability

Not applicable.

Funding

This work is supported by grants from the National Science Foundation of China (No. 62102373, 62006213) and the Science and Technology Research Project of Henan Province (No. 212102310053, 222102210118).

Author information

Contributions

Jie Zhang: Conceptualization; Methodology; Analysis; Resources; Writing, review and editing; Investigation; Supervision. Bowen Liu: Data curation; Investigation; Software; Validation; Writing, original draft and editing. Hongyan Zhang: Conceptualization; Methodology; Validation; Analysis; Review and editing; Visualization. Lei Zhang: Conceptualization; Methodology; Analysis; Resources; Investigation; Supervision. Fengxian Wang: Data curation; Investigation; Software; Resources. Yibin Chen: Investigation; Software; Resources; Data curation.

Corresponding author

Correspondence to Lei Zhang.

Ethics declarations

Ethics approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Conflicts of interest

No potential conflict of interest was reported by the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, J., Liu, B., Zhang, H. et al. A small object detection network for remote sensing based on CS-PANet and DSAN. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18397-4

  • DOI: https://doi.org/10.1007/s11042-024-18397-4
