Abstract
Visual defect detection is widely used in intelligent manufacturing to automate the inspection of product quality. Two main challenges remain in industrial applications: the scarcity of defect samples and the weak texture variation of industrial defects. These problems limit the applicability of RGB image-based industrial defect segmentation. To this end, we propose a multi-modal background-aware network (MMBA-Net) for few-shot defect (2D+3D) segmentation with limited data, which can segment textural and structural defects in both seen and unseen domains (objects). To combine the perception capabilities of different imaging conditions, MMBA-Net exploits the point cloud to provide spatial information complementary to the RGB images. Furthermore, we observe that background regions are perceptually consistent within an industrial image, which can be leveraged to discriminate between foreground and background regions. To implement this idea, we model correlation learning between multi-modal query samples and multi-modal normal (defect-free) samples as an optimal transport problem, establishing robust background correlations between query and normal samples across modalities. Experiments on real-world industrial product and food datasets demonstrate that the proposed method can perform effective base learning and meta-learning from a small number of defective samples (approximately 15–25 defective training samples) and segment defects in both seen and unseen domains.
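The correlation-learning step described above casts the matching between query features and normal-sample features as an optimal transport problem. As a minimal illustration of that formulation (not the paper's actual implementation; feature sizes, the cosine cost, and the `sinkhorn` helper are all illustrative assumptions), entropy-regularized Sinkhorn iterations compute a soft correspondence plan between the two feature sets:

```python
import numpy as np

def sinkhorn(cost, eps=0.05, n_iters=100):
    """Entropy-regularized optimal transport between two uniform
    distributions over feature locations (Sinkhorn iterations)."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)   # mass of query features
    b = np.full(m, 1.0 / m)   # mass of normal-sample features
    K = np.exp(-cost / eps)   # Gibbs kernel
    u = np.ones(n)
    for _ in range(n_iters):
        u = a / (K @ (b / (K.T @ u)))   # alternate row/column scaling
    v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]  # transport plan, columns sum to b

# Toy example: cosine-distance cost between random "query" and
# "normal" feature vectors (stand-ins for multi-modal features).
rng = np.random.default_rng(0)
q = rng.normal(size=(6, 32)); q /= np.linalg.norm(q, axis=1, keepdims=True)
s = rng.normal(size=(8, 32)); s /= np.linalg.norm(s, axis=1, keepdims=True)
cost = 1.0 - q @ s.T
plan = sinkhorn(cost)
print(plan.shape)  # (6, 8): one transport weight per query-normal pair
```

High transport weight between a query location and a normal-sample location then indicates a background-consistent match, which is the intuition behind using the plan to separate background from (defective) foreground.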
Acknowledgements
This work was supported by National Natural Science Foundation of China (Grant No. 61973066), Major Science and Technology Projects of Liaoning Province (Grant No. 2021JH1/10400049) and Foundation of Key Laboratory of Equipment Reliability (Grant No. WD2C20205500306).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they do not have any commercial or associative interest that represents a conflict of interest in connection with the work submitted.
About this article
Cite this article
Shan, D., Zhang, Y. & Liu, S. Multi-modal background-aware for defect semantic segmentation with limited data. J Intell Manuf (2024). https://doi.org/10.1007/s10845-024-02373-8