Abstract
YOLO, a widely used object detection algorithm, performs poorly on small dynamic targets. In this paper, we use background subtraction, which is highly sensitive to dynamic pixels, to supply YOLO with the location and features of small dynamic targets, thereby reducing the missed detection rate for small targets. The method uses background subtraction to obtain the target's mask and YOLO to obtain its class. If both the mask and the class of a target are detected, the features of YOLO and Masks data module is constructed or updated with the target's features and class. If only the mask is obtained, the mask is passed to the features of YOLO and Masks data module for similarity detection, which determines the target's class. Finally, YOLO performs forced detection of the target at the mask coordinates using the determined class. Validated on the SBMnet dataset, the experimental results show that, for dynamic targets at three different line-of-sight distances, the proposed method improves precision by 2.3%, recall by 3.5%, and F1-score by 3.1%.
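The decision flow described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes that the masks from background subtraction and the YOLO detections have already been reduced to bounding boxes, that each mask carries a feature vector, that a mask and a detection are matched by IoU, and that similarity is measured by cosine similarity. The names `FeatureStore`, `fuse`, `iou_thr`, and `min_sim`, and both threshold values, are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from math import sqrt


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def cosine(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm > 0 else 0.0


@dataclass
class FeatureStore:
    """Hypothetical stand-in for the features of YOLO and Masks data module."""
    by_class: dict[str, list[float]] = field(default_factory=dict)

    def update(self, cls, feat):
        # Assumption: keep only the most recent feature vector per class.
        self.by_class[cls] = list(feat)

    def classify(self, feat, min_sim=0.8):
        # Return the most similar stored class, or None if nothing is close enough.
        best_cls, best_sim = None, min_sim
        for cls, ref in self.by_class.items():
            sim = cosine(feat, ref)
            if sim >= best_sim:
                best_cls, best_sim = cls, sim
        return best_cls


def fuse(masks, detections, store, iou_thr=0.5):
    """masks: [(box, feature)], detections: [(box, cls)] -> [(box, cls, source)]."""
    results, unmatched = [], []
    for m_box, feat in masks:
        cls = next((c for d_box, c in detections if iou(m_box, d_box) >= iou_thr), None)
        if cls is not None:
            # Mask and class both available: build or update the store.
            store.update(cls, feat)
            results.append((m_box, cls, "yolo"))
        else:
            unmatched.append((m_box, feat))
    for m_box, feat in unmatched:
        # Mask only: infer the class by similarity, then force a detection there.
        cls = store.classify(feat)
        if cls is not None:
            results.append((m_box, cls, "forced"))
    return results


store = FeatureStore()
# Frame 1: YOLO and background subtraction agree on one target.
frame1 = fuse([((0, 0, 10, 10), [1.0, 0.0])], [((0, 0, 10, 10), "car")], store)
# Frame 2: YOLO misses everything; one mask resembles the stored target, one does not.
frame2 = fuse(
    [((2, 2, 12, 12), [0.9, 0.1]), ((50, 50, 55, 55), [0.0, 1.0])], [], store
)
```

In this sketch the second frame recovers the target that YOLO missed and labels it as a forced detection, while the dissimilar mask is discarded because no stored class is close enough.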
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Code availability
Code is available. Please contact the corresponding author if you need the code.
Funding
This work was supported by the National Natural Science Foundation of China (Grant No. 62263023).
Author information
Contributions
JW and JX contributed conceptualization and resources. JW carried out methodology, formal analysis, writing (original draft preparation), and visualization. JW and MT handled software and investigation. JW, JX, MT, YH, PX, and HG performed validation. JW, JX, and MT performed data curation. JX was responsible for writing (review and editing), project administration, and funding acquisition. JX and MT supervised the work.
Ethics declarations
Conflict of interest
The authors have no conflicts of interest/competing interests to declare that are relevant to the content of this article. All authors declare that they have no financial interests.
Ethics approval
All authors affirm that human research participants provided informed consent for publication of the images in Fig. 6. The rest of the images are from the SBMnet dataset.
Consent for publication
All authors have read and agreed to the published version of the manuscript.
Employment
All authors’ organizations receive the same financial benefits.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Xiong, J., Wu, J., Tang, M. et al. Combining YOLO and background subtraction for small dynamic target detection. Vis Comput (2024). https://doi.org/10.1007/s00371-024-03342-1
DOI: https://doi.org/10.1007/s00371-024-03342-1