
Space to depth convolution bundled with coordinate attention for detecting surface defects

  • Original Paper
  • Published in: Signal, Image and Video Processing

Abstract

Surface defects on steel plates inevitably arise during industrial production owing to complex manufacturing processes, and they typically exhibit irregular shapes, random positions, and varying sizes. Detecting these surface defects with high accuracy is therefore crucial for producing high-quality products in practice. In this paper, an improved high-performance network based on You Only Look Once version 5 (YOLOv5) is proposed for detecting surface defects on steel plates. First, Space-to-Depth Convolution (SPD-Conv) is used to transfer feature information from the spatial dimension to the depth (channel) dimension, preserving discriminative feature information as completely as possible during down-sampling. Second, the coordinate attention mechanism is introduced and embedded into the bottleneck of the C3 modules to effectively enhance the weights of important feature channels, helping to capture more salient feature information across channels after the SPD-Conv operations. Finally, the Spatial Pyramid Pooling Fast (SPPF) module is replaced by the Spatial Pyramid Pooling Fully Connected Spatial Pyramid Convolution (SPPFCSPC) module to further strengthen the feature expression capability and efficiently realize multi-scale feature fusion. Experimental results on the NEU-DET dataset show that, compared with YOLOv5, the mAP and mAP50 increase markedly from 51.7% and 87.0% to 61.4% and 92.6%, respectively. Meanwhile, a frame rate of 250 FPS indicates that the improved network still achieves good real-time performance. The proposed algorithm may also be applied in the future to recognize surface defects on aluminum plates, plastic plates, armor plates, and so on.
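To illustrate the idea behind SPD-Conv, the space-to-depth rearrangement can be sketched in a few lines of NumPy (a minimal illustration, not the authors' implementation; the function name and layout are assumptions): a (C, H, W) feature map with scale s becomes (C·s², H/s, W/s), so the spatial resolution shrinks while every value survives in the channel dimension, unlike strided convolution or pooling, which discard information.

```python
import numpy as np

def space_to_depth(x: np.ndarray, scale: int = 2) -> np.ndarray:
    """Rearrange a (C, H, W) feature map into (C*scale^2, H/scale, W/scale).

    No values are discarded: each scale x scale spatial block is
    moved into the channel axis.
    """
    c, h, w = x.shape
    assert h % scale == 0 and w % scale == 0, "H and W must be divisible by scale"
    x = x.reshape(c, h // scale, scale, w // scale, scale)
    x = x.transpose(2, 4, 0, 1, 3)  # (scale, scale, C, H/scale, W/scale)
    return x.reshape(c * scale * scale, h // scale, w // scale)

feat = np.arange(2 * 4 * 4, dtype=np.float32).reshape(2, 4, 4)
out = space_to_depth(feat, 2)
print(out.shape)  # (8, 2, 2)
```

A convolution with stride 1 applied to the rearranged tensor would then play the role of the learnable part of SPD-Conv.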

[Figures 1–7 are available in the full article.]


Code availability

Code generated or used during the study is available from the corresponding author by request.


Acknowledgements

This work was supported by the Natural Science Foundation of Jiangxi Province under Grant No. 20224ACB201010 and the Postgraduate Student Innovation Fund of Jiangxi Province under Grant No. YC2023-S273.

Author information

Authors and Affiliations

Authors

Contributions

The overall study was supervised by GL; methodology, software, and original draft preparation were carried out by WW; review and editing were performed by LW and BW; the results were analyzed and validated by HY and KS. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Gang Liu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Conflict of interest

All the authors of this paper have no conflicts of interest, financial or otherwise.


Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wan, W., Wang, L., Wang, B. et al. Space to depth convolution bundled with coordinate attention for detecting surface defects. SIViP 18, 4861–4874 (2024). https://doi.org/10.1007/s11760-024-03122-3
