Abstract
Traditional target detection algorithms often struggle to detect objects accurately in complex, cluttered environments. This paper presents an optimized YOLOv5-based model that addresses these limitations. Our contributions are threefold. First, we enhance the upsampling procedure by combining transposed convolution with the CBAM attention mechanism, which strengthens the network's fine-grained feature extraction. Second, we introduce an optimized feature-processing module that improves feature utilization while keeping the architecture lightweight. Third, we integrate EfficientNet into the backbone to improve feature extraction. We validate our approach on the PASCAL VOC dataset, achieving an mAP0.5 of 84.00% and an mAP0.5:0.95 of 62.10% with a modest parameter size of 13.22 MB. These results are improvements of 4.50% ± 0.12% and 8.20% ± 0.09%, respectively, over the benchmark, demonstrating a favorable trade-off between computational efficiency and detection accuracy. The proposed model outperforms conventional YOLOv5 algorithms and remains competitive with contemporary state-of-the-art object detection techniques. Code is available at https://github.com/chenxz0906chenxz/YOLO-TUF/.
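For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how mAP0.5 and mAP0.5:0.95 are related, assuming the common COCO-style convention in which the latter averages AP over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. It is not the paper's evaluation code: the function names are hypothetical, it handles a single class on a single image, and its greedy matching is a simplification of what full evaluation scripts do.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), corner format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(
    preds: Sequence[Tuple[float, Box]], gts: Sequence[Box], thr: float
) -> float:
    """AP for one class at one IoU threshold (hypothetical helper).

    preds holds (confidence, box) pairs; gts holds ground-truth boxes.
    """
    if not gts:
        return 0.0
    matched = [False] * len(gts)
    hits: List[int] = []
    # Visit predictions from most to least confident.
    for _, box in sorted(preds, key=lambda p: p[0], reverse=True):
        best_j, best_iou = -1, thr
        for j, gt in enumerate(gts):
            if not matched[j]:
                v = iou(box, gt)
                if v >= best_iou:
                    best_j, best_iou = j, v
        if best_j >= 0:
            matched[best_j] = True  # true positive
            hits.append(1)
        else:
            hits.append(0)  # false positive
    # Cumulative precision and recall after each prediction.
    tp = 0
    precisions: List[float] = []
    recalls: List[float] = []
    for k, h in enumerate(hits, start=1):
        tp += h
        precisions.append(tp / k)
        recalls.append(tp / len(gts))
    # Make precision non-increasing from right to left (envelope).
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Area under the precision-recall curve.
    ap, prev_r = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_r) * p
        prev_r = r
    return ap


def ap_50_95(preds: Sequence[Tuple[float, Box]], gts: Sequence[Box]) -> float:
    """Mean AP over IoU thresholds 0.50, 0.55, ..., 0.95 (assumed convention)."""
    thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]
    return sum(average_precision(preds, gts, t) for t in thresholds) / len(thresholds)
```

Because the stricter thresholds reject loosely localized boxes, the averaged metric is never higher than the value at 0.5, which is consistent with the gap between the 84.00% and 62.10% figures above. The reported mAP values additionally average over all object classes and the whole test set.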
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Chen, H., Yang, W., Wang, W., Liu, Z. (2024). YOLO-TUF: An Improved YOLOv5 Model for Small Object Detection. In: Jin, H., Pan, Y., Lu, J. (eds) Artificial Intelligence and Machine Learning. IAIC 2023. Communications in Computer and Information Science, vol 2058. Springer, Singapore. https://doi.org/10.1007/978-981-97-1277-9_37
DOI: https://doi.org/10.1007/978-981-97-1277-9_37
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-1276-2
Online ISBN: 978-981-97-1277-9
eBook Packages: Computer Science (R0)