Abstract
Surface defect detection in industrial processes is crucial for ensuring product quality and reducing material waste, and deep learning has become central to automating it. However, accurate and automatic defect segmentation remains a significant challenge, especially at the fine precision required for high-quality products. Traditional defect segmentation approaches have difficulty preserving fine details and contextual information, which leads to poor segmentation performance. There is therefore a need for segmentation algorithms that incorporate multi-scale contextual information, preserve fine details, and handle complex and subtle defects. In this paper, we propose a novel approach for steel defect segmentation called multi-scale cross-patch attention with dilated convolution (MCPAD-UNet). This approach employs a subsampled module that achieves the same dimensionality reduction as max-pooling while preserving the fine precision of the features. Additionally, MCPAD-UNet uses a cross-patch attention module with dilated convolution that simultaneously collects channel–spatial information and integrates relevant multi-scale features to reduce the semantic gap and enhance detailed information. To prevent overfitting, we apply dropout after each hybrid dilated convolution block. Extensive testing on the public Severstal: Steel Defect Detection dataset demonstrates the effectiveness of our approach, which achieves a Dice score of 95.3%, outperforming the competition's overall score by 5.2%.
Our proposed method has the potential to significantly improve defect detection in industrial processes, thereby reducing material waste and improving product quality.
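For readers unfamiliar with the Dice score reported above, the following is a minimal sketch of how it can be computed for a single pair of binary masks. It is an illustration only, not the evaluation code used in this study: the function name `dice_score`, the nested-list mask representation, and the convention that two empty masks score 1.0 are assumptions made for the example.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as nested lists of 0/1.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Flatten both masks and coerce every pixel to 0 or 1.
    flat_pred = [int(bool(v)) for row in pred for v in row]
    flat_target = [int(bool(v)) for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must contain the same number of pixels")

    # |X ∩ Y|: pixels marked as defect in both masks.
    intersection = sum(p * t for p, t in zip(flat_pred, flat_target))
    # |X| + |Y|: defect pixels in each mask, counted separately.
    total = sum(flat_pred) + sum(flat_target)

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [[0, 1, 1],
                 [0, 1, 0]]
    ground_truth = [[0, 1, 0],
                    [0, 1, 1]]
    # Two overlapping pixels, three defect pixels in each mask: 2*2 / (3+3).
    print(round(dice_score(predicted, ground_truth), 4))
```

A dataset-level score is typically an average of such per-mask values; how the averaging is done (per image, per class, or both) depends on the benchmark's evaluation protocol.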
Data availability
The data are available on request.
Funding
This study was supported by Sakarya University of Applied Sciences BAP and the TUBITAK 1505 Program under project numbers 078-2022 and 5220125.
Contributions
The author contributed to the study's conception and design. Experimental analyses were performed by AFK. The first draft of the manuscript was written by AFK and all authors commented on previous versions of the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals.
About this article
Cite this article
Kamanli, A.F. A novel multi-scale cross-patch attention with dilated convolution (MCPAD-UNET) for metallic surface defect detection. SIViP 18, 485–494 (2024). https://doi.org/10.1007/s11760-023-02745-2
DOI: https://doi.org/10.1007/s11760-023-02745-2