YOLO-TUF: An Improved YOLOv5 Model for Small Object Detection

Chen, Hua; Yang, Wenqian; Wang, Wei; Liu, Zhicai

doi:10.1007/978-981-97-1277-9_37


Part of the book series: Communications in Computer and Information Science (CCIS, volume 2058)

Included in the following conference series:

International Artificial Intelligence Conference

Abstract

Traditional object detection algorithms often struggle to detect objects accurately in complex, cluttered environments. This paper presents an optimized YOLOv5-based model that mitigates these limitations. Our contributions are threefold. First, we improve the upsampling procedure by combining transposed convolution with the CBAM attention mechanism, strengthening the network’s fine-grained feature extraction. Second, we introduce an optimized feature-processing module that improves feature utilization while keeping the architecture lightweight. Third, we integrate EfficientNet into the backbone to improve feature extraction performance. We validate the approach on the PASCAL VOC dataset, achieving an mAP0.5 of 84.00% and an mAP0.5:0.95 of 62.10% while maintaining a modest parameter size of 13.22 MB. These results are improvements of 4.50% ± 0.12% and 8.20% ± 0.09%, respectively, over the benchmark, demonstrating a favorable trade-off between computational efficiency and detection accuracy. The proposed model outperforms conventional YOLOv5 algorithms and remains competitive with contemporary state-of-the-art object detection techniques. Code is available at https://github.com/chenxz0906chenxz/YOLO-TUF/.
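
For readers less familiar with the two reported metrics: mAP0.5 counts a detection as correct when its intersection over union (IoU) with a ground-truth box is at least 0.5, while mAP0.5:0.95 averages the result over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. The following is a minimal sketch of that thresholding idea only, not the authors' evaluation code and not a full average-precision computation. The function names `iou` and `matched_thresholds` and the example boxes are hypothetical, chosen purely for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# The ten IoU thresholds behind mAP0.5:0.95: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred_box, gt_box):
    """Return the thresholds at which this prediction counts as a match."""
    overlap = iou(pred_box, gt_box)
    return [t for t in IOU_THRESHOLDS if overlap >= t]


# Hypothetical boxes: a prediction shifted slightly off the ground truth.
gt = (10, 10, 50, 50)
pred = (14, 12, 54, 52)

print(f"IoU = {iou(pred, gt):.3f}")
print("Counts as a match at thresholds:", matched_thresholds(pred, gt))
```

A slightly misaligned box like this one still counts at the 0.5 threshold but fails the stricter ones, which is why mAP0.5:0.95 is typically lower than mAP0.5 and is more sensitive to localization quality.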

Author information

Authors and Affiliations

School of Computer and Software Engineering, Xihua University, Chengdu, 610039, People’s Republic of China
Hua Chen, Wenqian Yang, Wei Wang & Zhicai Liu

Corresponding author

Correspondence to Zhicai Liu.

Editor information

Editors and Affiliations

Huazhong University of Science and Technology, Wuhan, Hubei, China
Hai Jin
Chinese Academy of Sciences, Shenzhen, China
Yi Pan
Nanjing University of Science and Technology, Nanjing, China
Jianfeng Lu

About this paper

Cite this paper

Chen, H., Yang, W., Wang, W., Liu, Z. (2024). YOLO-TUF: An Improved YOLOv5 Model for Small Object Detection. In: Jin, H., Pan, Y., Lu, J. (eds) Artificial Intelligence and Machine Learning. IAIC 2023. Communications in Computer and Information Science, vol 2058. Springer, Singapore. https://doi.org/10.1007/978-981-97-1277-9_37

DOI: https://doi.org/10.1007/978-981-97-1277-9_37
Published: 03 April 2024
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-1276-2
Online ISBN: 978-981-97-1277-9
eBook Packages: Computer Science, Computer Science (R0)
