Combining YOLO and background subtraction for small dynamic target detection

  • Research
  • Journal: The Visual Computer

Abstract

YOLO, an important algorithm for target detection, performs poorly on small dynamic targets. In this paper, we use background subtraction, which is highly sensitive to dynamic pixels, to provide YOLO with the locations and features of small dynamic targets, thereby reducing the missed detection rate for small targets. The method uses background subtraction to obtain the target's mask and YOLO to obtain its class. If both the mask and the class of a target are detected, the "features of YOLO and masks" data module is constructed or updated with the target's features and class. Conversely, if only the mask is obtained, the mask is passed to the same data module for similarity detection, which determines the target's class. Finally, YOLO performs forced detection of the target at the mask's coordinates with the determined class. Validated on the SBMnet dataset, the experimental results show that, for dynamic targets at three different line-of-sight distances, the proposed method improves precision by 2.3%, recall by 3.5%, and F1-score by 3.1%.
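
The decision logic described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how masks are matched to YOLO boxes, what the features are, or which similarity measure is used, so IoU matching, plain feature vectors, cosine similarity, the two thresholds, and the names `FeatureStore` and `fuse` are all assumptions made for illustration. The "forced detection" step is reduced here to emitting the mask's box with the inferred class.

```python
from dataclasses import dataclass, field
from math import sqrt


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def cosine(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = sqrt(sum(x * x for x in u))
    nv = sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


@dataclass
class FeatureStore:
    """Hypothetical stand-in for the 'features of YOLO and masks' data module."""
    entries: dict = field(default_factory=dict)  # class name -> latest feature vector

    def update(self, cls, feat):
        # Construct or overwrite the stored feature for this class.
        self.entries[cls] = list(feat)

    def classify(self, feat, min_sim=0.8):
        # Return the most similar stored class, or None if none reaches min_sim (assumed threshold).
        best_cls, best_sim = None, min_sim
        for cls, ref in self.entries.items():
            sim = cosine(feat, ref)
            if sim >= best_sim:
                best_cls, best_sim = cls, sim
        return best_cls


def fuse(masks, detections, store, iou_thr=0.5):
    """masks: list of (box, feature) from background subtraction.
    detections: list of (box, class) from the detector.
    Returns a list of (box, class, source) tuples."""
    results = []
    for m_box, feat in masks:
        match = next((c for d_box, c in detections if iou(m_box, d_box) >= iou_thr), None)
        if match is not None:
            # Mask and class both available: update the store.
            store.update(match, feat)
            results.append((m_box, match, "yolo"))
        else:
            # Mask only: infer the class by similarity, then force a detection at the mask.
            cls = store.classify(feat)
            if cls is not None:
                results.append((m_box, cls, "forced"))
    return results


store = FeatureStore()
# Frame 1: the target is found by both background subtraction and the detector.
r1 = fuse([((10, 10, 30, 30), [1.0, 0.0])], [((11, 9, 31, 29), "person")], store)
# Frame 2: the target is small and only background subtraction responds.
r2 = fuse([((50, 50, 56, 56), [0.9, 0.1])], [], store)
```

In this toy run the first frame populates the store, and the second frame recovers the class of a mask that the detector missed, which is the mechanism the abstract credits for the lower missed detection rate.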

Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Code availability

Code is available. Please contact the corresponding author if you need the code.

Funding

This work was supported by the National Natural Science Foundation of China (Grant No. 62263023).

Author information

Contributions

JW and JX contributed conceptualization and resources; JW was responsible for methodology, formal analysis, writing (original draft preparation), and visualization; JW and MT handled software and investigation; JW, JX, MT, YH, PX, and HG performed validation; JW, JX, and MT carried out data curation; JX was responsible for writing (review and editing), project administration, and funding acquisition; JX and MT supervised the work.

Corresponding author

Correspondence to Ming Tang.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest/competing interests to declare that are relevant to the content of this article. All authors declare that they have no financial interests.

Ethics approval

All authors affirm that human research participants provided informed consent for publication of the images in Fig. 6. The rest of the images are from the SBMnet dataset.

Consent for publication

All authors have read and agreed to the published version of the manuscript.

Employment

All authors’ organizations receive the same financial benefits.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Xiong, J., Wu, J., Tang, M. et al. Combining YOLO and background subtraction for small dynamic target detection. Vis Comput (2024). https://doi.org/10.1007/s00371-024-03342-1

  • DOI: https://doi.org/10.1007/s00371-024-03342-1
