MSF-YOLO: A multi-scale features fusion-based method for small object detection

Yang, Fengyu; Zhou, Jiaqi; Chen, Yuan; Liao, Jie; Yang, Mingxiang

doi:10.1007/s11042-023-17818-0

Published: 06 January 2024

Multimedia Tools and Applications

Fengyu Yang^1,2, Jiaqi Zhou^1, Yuan Chen^1, Jie Liao^2 & Mingxiang Yang^3

Abstract

Small object detection is widely used in real-world applications, such as detecting small objects in UAV imagery and locating small surface defects on materials during industrial inspection. In existing networks, the width of each layer is insufficient to represent rich multi-scale information, which can leave the model insensitive to small objects and reduce detection accuracy. To address these issues, we propose MSF-YOLO, a model based on the YOLOv3 algorithm. First, multi-scale image features are fused: relative to the original ResNet cell, the single convolutional scale is increased to four convolutional scales, and the features obtained under each receptive field are fused to capture rich hierarchical information from images. Second, the initial anchor boxes are optimized: K-means clustering is applied twice to optimize the initial anchor box sizes and improve anchor box overlap, further improving model accuracy. Finally, model convergence is accelerated: initializing with weight parameters obtained from training on the COCO dataset optimizes the training process and speeds up convergence. Experimental results on two public datasets show that MSF-YOLO outperforms YOLOv3, with average accuracies of 98.67% and 97.51%, and performs very well on mAP and IoU metrics compared with state-of-the-art models. In addition, an industrial dataset is introduced for evaluation, and the results show a 31.54% improvement over the original YOLOv3. In summary, the proposed MSF-YOLO model is adaptable to small object detection tasks in many different scenarios.
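
To make the anchor box step concrete, the following is a minimal sketch of the k-means anchor clustering commonly used with YOLO-family detectors, where the distance between a ground-truth box and an anchor is 1 - IoU of their widths and heights. The abstract does not say how the two K-means passes are combined, so this shows a single pass only. The function names (`wh_iou`, `kmeans_anchors`, `mean_best_iou`) and the deterministic initialization are illustrative assumptions, not taken from the paper or its repository.

```python
from typing import List, Tuple

Box = Tuple[float, float]  # (width, height) of a ground-truth box


def wh_iou(a: Box, b: Box) -> float:
    """IoU of two boxes aligned at the same top-left corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    union = a[0] * a[1] + b[0] * b[1] - inter
    return inter / union


def kmeans_anchors(boxes: List[Box], k: int, iters: int = 100) -> List[Box]:
    """Cluster (w, h) pairs into k anchors, using 1 - IoU as the distance.

    Hypothetical helper: a single k-means pass, not the paper's two-pass scheme.
    """
    # Deterministic init (an assumption): k boxes evenly spaced in area order.
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    step = len(ordered) / k
    anchors = [ordered[int(i * step + step / 2)] for i in range(k)]
    for _ in range(iters):
        # Assignment step: each box joins the anchor it overlaps most.
        groups: List[List[Box]] = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda j: wh_iou(box, anchors[j]))
            groups[best].append(box)
        # Update step: each anchor moves to the mean (w, h) of its group.
        updated = []
        for anchor, group in zip(anchors, groups):
            if group:
                w = sum(b[0] for b in group) / len(group)
                h = sum(b[1] for b in group) / len(group)
                updated.append((w, h))
            else:
                updated.append(anchor)  # an empty cluster keeps its anchor
        if updated == anchors:
            break  # converged: no anchor moved
        anchors = updated
    return sorted(anchors, key=lambda b: b[0] * b[1])


def mean_best_iou(boxes: List[Box], anchors: List[Box]) -> float:
    """Average IoU between each box and its best-matching anchor."""
    return sum(max(wh_iou(b, a) for a in anchors) for b in boxes) / len(boxes)


if __name__ == "__main__":
    # Toy (w, h) data: three small boxes and three large ones.
    sample = [(10, 10), (12, 12), (11, 11), (50, 60), (52, 58), (48, 62)]
    found = kmeans_anchors(sample, k=2)
    print(found, mean_best_iou(sample, found))
```

A higher `mean_best_iou` indicates that the anchors overlap the labeled boxes better, which is the quantity anchor optimization tries to raise before training starts.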

Data availability

The UAV-View and S2TLD datasets are available in refs 15, 26. Airplane-Rive is an industrial dataset that is not publicly available under the partner's policy, but it is available from the corresponding author on reasonable request. The source code of the paper is available at https://github.com/798911956/MSF-YOLO.

References

Pathak AR, Pandey M, Rautaray S (2018) Application of deep learning for object detection. Procedia Comput Sci 132:1706–1717
Sharma V, Mir RN (2020) A comprehensive and systematic look up into deep learning based object detection techniques: A review. Comput Sci Rev 38:100301
Zhao ZQ, Zheng P, Xu S et al (2019) Object detection with deep learning: A review. IEEE Trans Neural Netw Learn Syst 30(11):3212–3232
Dhillon A, Verma GK (2020) Convolutional neural network: a review of models, methodologies and applications to object detection. Prog Artif Intell 9(2):85–112
Lin T-Y, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollar P, Zitnick CL (2014) Microsoft coco: common objects in context. In: Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, 6–12 September 2014. Proceedings, Part V 13. Springer, pp 740–755
Everingham M, Van Gool L, Williams CKI et al (2010) The pascal visual object classes (voc) challenge. Int J Comput Vision 88(2):303–338
Liu Y, Sun P, Wergeles N et al (2021) A survey and performance evaluation of deep learning methods for small object detection. Expert Syst Appl 172(4):114602
Tong K, Wu Y, Zhou F (2020) Recent advances in small object detection based on deep learning: A review. Image Vis Comput 97:103910
Peters M, Neumann M, Iyyer M et al (2018) Deep contextualized word representations. https://doi.org/10.18653/v1/N18-1202
He K, Chen X, Xie S, Li Y, Dollar P, Girshick R (2022) Masked autoencoders are scalable vision learners. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 16000–16009
Weng L (2017) Object detection for dummies part 3: R-cnn family. lilianweng.github.io/lil-log
Girshick R (2015) Fast r-cnn. In: Proceedings of the IEEE International Conference on Computer Vision, pp 1440–1448
Sun X, Wu P, Hoi SCH (2018) Face detection using deep learning: An improved faster RCNN approach. Neurocomputing 299:42–50
Du J (2018) Understanding of object detection based on CNN family and YOLO. J Phys Conf Ser 1004(1):012029
Loey M, Manogaran G, Taha MHN et al (2021) Fighting against COVID-19: A novel deep learning model based on YOLO-v2 with ResNet-50 for medical face mask detection. Sustain Cities Soc 65:102600
Wang C-Y, Bochkovskiy A, Liao H-YM (2023) Yolov7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 7464–7475
Kisantal M, Wojna Z, Murawski J, Naruniec J, Cho K (2019) Augmentation for small object detection. arXiv preprint arXiv:1902.07296
Ren Y, Zhu C, Xiao S (2018) Small object detection in optical remote sensing images via modified faster R-CNN. Appl Sci 8(5):813
Fu K, Chang Z, Zhang Y et al (2020) Rotation-aware and multi-scale convolutional neural network for object detection in remote sensing images. ISPRS J Photogramm Remote Sens 161:294–308
Hu Y, Wu X, Zheng G, Liu X (2019) Object detection of UAV for anti-UAV based on improved yolo v3. In: 2019 Chinese Control Conference (CCC). IEEE, pp 8386–8390
Pham MT, Courtrai L, Friguet C et al (2020) YOLO-Fine: one-stage detector of small objects under various backgrounds in remote sensing images. Remote Sens 12(15):2501
Hu G, Yang Z, Hu L, Huang L, Han J (2018) Small object detection with multiscale features. Int J Digit Multim Broadcast 2018
Liu M, Wang X, Zhou A et al (2020) UAV-YOLO: Small object detection on unmanned aerial vehicle perspective. Sensors 20(8):2238
Zhang C, Benz P, Argaw DM, Lee S, Kim J, Rameau F, Bazin J-C, Kweon IS (2021) Resnet or densenet? Introducing dense shortcuts to resnet. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp 3550–3559
Zhong Y, Wang J, Peng J, Zhang L (2020) Anchor box optimization for object detection. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp 1286–1294
Anand R, Shanthi T, Nithish MS et al (2020) Face recognition and classification using GoogleNET architecture. In: Soft computing for problem solving. Springer, Singapore, pp 261–269
Cheng B, Girshick R, Dollar P, Berg AC, Kirillov A (2021) Boundary IOU: improving object-centric image segmentation evaluation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 15334–15342
Yang X, Yan J, Liao W, Yang X, Tang J, He T (2022) Scrdet++: detecting small, cluttered and rotated objects via instance-level feature denoising and rotation loss smoothing. IEEE Trans Pattern Anal Mach Intell 45(2):2384–2399

Funding

This work was supported by the Jiangxi Provincial Department of Science and Technology (Grant number: 20202BBEL53002) and the Beijing Science and Technology Planning Project (Grant number: Z201100001820022).

Author information

Authors and Affiliations

Department of Software, Nanchang HangKong University, Jiangxi, China
Fengyu Yang, Jiaqi Zhou & Yuan Chen
Hongdu-Aviation-Industry-Group, Jiangxi, China
Fengyu Yang & Jie Liao
China Institute of Water Resources and Hydropower Research, Beijing, China
Mingxiang Yang

Corresponding author

Correspondence to Jiaqi Zhou.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yang, F., Zhou, J., Chen, Y. et al. MSF-YOLO: A multi-scale features fusion-based method for small object detection. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-023-17818-0

Received: 03 June 2022
Revised: 19 October 2023
Accepted: 04 December 2023
Published: 06 January 2024
DOI: https://doi.org/10.1007/s11042-023-17818-0
