Improved remote sensing image target detection based on YOLOv7

Optoelectronics Letters

Abstract

Remote sensing images are captured from high altitude, so they contain complex spatial scenes and many target types. Target detection in large-scale remote sensing images is hampered by small target sizes and dense target distributions. This paper proposes an improved remote sensing image detection model based on you only look once version 7 (YOLOv7). First, a small-scale detection layer is added to reacquire tracking frames, improving the network's ability to recognize small-scale targets. Second, Bottleneck Transformers are fused into the backbone so that the combined convolutional neural network (CNN) + Transformer architecture strengthens the network's feature extraction. Third, the convolutional block attention module (CBAM) is added in the head to improve the model's handling of small-scale targets. Finally, the non-maximum suppression (NMS) of the YOLOv7 algorithm is replaced with distance intersection over union non-maximum suppression (DIOU-NMS) to improve the detection of overlapping targets. The method is tested on the NWPU-VHR10 and DOTA1.0 datasets. The results show that it improves the detection rate of small-scale targets in remote sensing images and effectively addresses the problem of highly overlapping targets: compared with the standard YOLOv7 algorithm, the accuracy of the improved model increases by 6.3% and 4.2% on the two datasets, respectively.
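
The DIOU-NMS step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definition of distance-IoU (IoU minus the squared distance between box centers, normalized by the squared diagonal of the smallest enclosing box) and axis-aligned boxes given as (x1, y1, x2, y2). The function names `diou` and `diou_nms` and the default threshold are hypothetical choices for illustration; this is not the authors' implementation.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed axis-aligned


def diou(a: Box, b: Box) -> float:
    """Distance-IoU: IoU minus the normalized squared center distance."""
    # Intersection area (zero if the boxes do not overlap).
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between the two box centers.
    d2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
        + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both boxes.
    enc_w = max(a[2], b[2]) - min(a[0], b[0])
    enc_h = max(a[3], b[3]) - min(a[1], b[1])
    c2 = enc_w ** 2 + enc_h ** 2

    return iou - (d2 / c2 if c2 > 0 else 0.0)


def diou_nms(boxes: Sequence[Box], scores: Sequence[float],
             threshold: float = 0.5) -> List[int]:
    """Return indices of kept boxes, highest score first.

    A candidate is suppressed when its DIoU with an already kept,
    higher-scoring box is at or above `threshold` (assumed value).
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Keep only candidates whose DIoU with the selected box is low enough.
        order = [i for i in order if diou(boxes[best], boxes[i]) < threshold]
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    scores = [0.9, 0.8, 0.7]
    print(diou_nms(boxes, scores, threshold=0.5))
```

Because the center-distance penalty lowers the overlap score of boxes whose centers are far apart, two boxes that plain IoU-based NMS would merge can both survive, which is the property that matters for densely packed, overlapping targets.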

Author information

Corresponding author

Correspondence to Zhihong Chen.

Ethics declarations

Conflicts of interest

The authors declare no conflict of interest.

Additional information

This work was supported by the National Natural Science Foundation of China (Nos. 62005196 and 61974104).

About this article

Cite this article

Xu, S., Chen, Z., Zhang, H. et al. Improved remote sensing image target detection based on YOLOv7. Optoelectron. Lett. 20, 234–242 (2024). https://doi.org/10.1007/s11801-024-3063-z

  • DOI: https://doi.org/10.1007/s11801-024-3063-z
