Combining YOLO and background subtraction for small dynamic target detection

Xiong, Jian; Wu, Jie; Tang, Ming; Xiong, Pengwen; Huang, Yushui; Guo, Hang

doi:10.1007/s00371-024-03342-1


Research
Published: 21 March 2024

The Visual Computer (2024)

Jian Xiong¹, Jie Wu², Ming Tang¹, Pengwen Xiong¹, Yushui Huang² & Hang Guo²

Abstract

YOLO, an important algorithm for target detection, performs poorly on small dynamic targets. In this paper, we use background subtraction, which is highly sensitive to dynamic pixels, to provide YOLO with the location and features of small dynamic targets, thereby reducing the missed detection rate for such targets. The method uses background subtraction to obtain the target's mask and YOLO to obtain its class. If both the mask and the class are detected, the features of YOLO and Masks data module is constructed or updated with the target's features and class. Conversely, if only the mask is obtained, the mask is passed to the features of YOLO and Masks data module for similarity detection to determine the target's class. Finally, YOLO performs forced detection of the target at the mask coordinates with the determined class. Validated on the SBMnet dataset, the experimental results show that, for dynamic targets at three different line-of-sight distances, the proposed method improves precision by 2.3%, recall by 3.5%, and F1-score by 3.1%.
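The decision logic described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: plain frame differencing stands in for their background subtraction, the YOLO output is passed in as an optional label rather than computed, and every name here (`foreground_box`, `mask_feature`, `FeatureStore`, `resolve_target`), the threshold, and the two-number feature are assumptions made only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Frame = List[List[int]]          # grayscale image as rows of 0..255 values
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive
Feature = Tuple[float, float]


def foreground_box(background: Frame, frame: Frame, thresh: int = 30) -> Optional[Box]:
    """Frame differencing (assumed stand-in): box around pixels that changed."""
    changed = [(x, y)
               for y, (bg_row, row) in enumerate(zip(background, frame))
               for x, (b, p) in enumerate(zip(bg_row, row))
               if abs(p - b) > thresh]
    if not changed:
        return None
    xs = [x for x, _ in changed]
    ys = [y for _, y in changed]
    return (min(xs), min(ys), max(xs), max(ys))


def mask_feature(frame: Frame, box: Box) -> Feature:
    """Hypothetical feature: (normalized mean intensity, width/height ratio)."""
    x0, y0, x1, y1 = box
    pixels = [frame[y][x] for y in range(y0, y1 + 1) for x in range(x0, x1 + 1)]
    return (sum(pixels) / len(pixels) / 255.0, (x1 - x0 + 1) / (y1 - y0 + 1))


@dataclass
class FeatureStore:
    """Hypothetical stand-in for the 'features of YOLO and Masks' data module."""
    features: Dict[str, Feature] = field(default_factory=dict)

    def update(self, label: str, feat: Feature) -> None:
        self.features[label] = feat

    def most_similar(self, feat: Feature) -> Optional[str]:
        # Nearest stored class by squared Euclidean distance.
        if not self.features:
            return None
        return min(self.features,
                   key=lambda k: sum((a - b) ** 2 for a, b in zip(self.features[k], feat)))


def resolve_target(store: FeatureStore, background: Frame, frame: Frame,
                   yolo_label: Optional[str]) -> Optional[Tuple[str, Box]]:
    box = foreground_box(background, frame)
    if box is None:
        return None                      # no dynamic pixels, nothing to resolve
    feat = mask_feature(frame, box)
    if yolo_label is not None:
        store.update(yolo_label, feat)   # mask and class both found: update store
        return (yolo_label, box)
    label = store.most_similar(feat)     # mask only: class from similarity
    return None if label is None else (label, box)


def blank() -> Frame:
    return [[0] * 4 for _ in range(4)]


background = blank()
store = FeatureStore()

car = blank()
car[1][1] = car[1][2] = 200              # wide bright blob, labelled by the detector
resolve_target(store, background, car, "car")

person = blank()
person[0][0] = person[1][0] = person[2][0] = 90   # tall dim blob
resolve_target(store, background, person, "person")

missed = blank()
missed[2][2] = missed[2][3] = 190        # wide blob the detector missed
result = resolve_target(store, background, missed, None)
print(result)
```

In this toy run the third frame has a mask but no detector label, so the class is taken from the nearest stored feature and returned together with the mask coordinates, which is where a forced detection would then be applied. A real system would use a proper background model and richer features than the two numbers assumed here.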

Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Code availability

Code is available. Please contact the corresponding author if you need the code.

Funding

This work was supported by the National Natural Science Foundation of China (Grant No. 62263023).

Author information

Authors and Affiliations

School of Advanced Manufacturing, Nanchang University, Nanchang, People’s Republic of China
Jian Xiong, Ming Tang & Pengwen Xiong
School of Information Engineering, Nanchang University, Nanchang, People’s Republic of China
Jie Wu, Yushui Huang & Hang Guo

Contributions

JW and JX contributed conceptualization and resources. JW carried out the methodology, formal analysis, original draft preparation, and visualization. JW and MT were responsible for software and investigation. JW, JX, MT, YH, PX, and HG performed validation. JW, JX, and MT curated the data. JX handled review and editing, project administration, and funding acquisition. JX and MT supervised the work.

Corresponding author

Correspondence to Ming Tang.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest/competing interests to declare that are relevant to the content of this article. All authors declare that they have no financial interests.

Ethics approval

All authors affirm that human research participants provided informed consent for publication of the images in Fig. 6. The rest of the images are from the SBMnet dataset.

Consent for publication

All authors have read and agreed to the published version of the manuscript.

Employment

All authors’ organizations receive the same financial benefits.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Xiong, J., Wu, J., Tang, M. et al. Combining YOLO and background subtraction for small dynamic target detection. Vis Comput (2024). https://doi.org/10.1007/s00371-024-03342-1

Accepted: 25 February 2024
Published: 21 March 2024
DOI: https://doi.org/10.1007/s00371-024-03342-1
