Hierarchical multi-scale network for cross-scale visual defect detection

Journal of Intelligent Manufacturing

Abstract

A growing number of researchers apply deep-learning-based object detection methods to visual defect detection in industrial manufacturing. However, the large variation in defect scale limits further improvement in detection accuracy. We therefore propose a hierarchical multi-scale block (HMS-Block), equipped with hierarchical representation and multi-scale embedding, that provides scale-rich features for multi-scale defect detection. Specifically, the hierarchical representation is implemented by a cascade learning stage that extracts features from local to global at the channel level. Building on this representation, a cross-branch shortcut is embedded to alleviate the large-scale-variation problem. Finally, the hierarchical multi-scale network (HMSNet) is built by stacking a number of HMS-Blocks. The proposed methods improve defect detection at all scales and outperform the ResNet50 baseline by a large margin with minor time overhead and fewer parameters, indicating that the HMS-Block has high practical utility in industrial applications. Moreover, the proposed HMSNet can also be applied to other detection-based tasks, where it greatly surpasses existing methods. Concretely, the proposed HMSNets achieve 42.4/42.7 mAP on the NEU and COCO datasets, respectively, surpassing recent backbones (i.e., HRNetV2) by 2.6/1.2 mAP.
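The abstract does not give the internal wiring of the HMS-Block, so the following is only a toy sketch of the general idea it describes: channels are split into groups, the groups are processed in a cascade, and each stage also receives the previous stage's output through an additive shortcut. The names `split_channels` and `cascade_block`, the additive form of the shortcut, and the final residual connection are all assumptions made for illustration, loosely modeled on Res2Net-style channel splitting; this is not the authors' implementation.

```python
from typing import Callable, List

Vector = List[float]


def split_channels(x: Vector, groups: int) -> List[Vector]:
    """Split a flat list of channel values into equally sized groups."""
    if groups <= 0 or len(x) % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    size = len(x) // groups
    return [x[i * size:(i + 1) * size] for i in range(groups)]


def cascade_block(
    x: Vector,
    groups: int,
    transform: Callable[[Vector], Vector],
) -> Vector:
    """Toy channel-wise cascade (hypothetical, not the paper's HMS-Block).

    Each channel group is transformed in turn. From the second group on,
    the previous stage's output is added to the group before the transform
    (an assumed additive shortcut), so later stages see progressively wider
    context. Stage outputs are concatenated and added back to the input
    (an assumed residual connection).
    """
    outputs: List[Vector] = []
    previous: Vector = []
    for branch in split_channels(x, groups):
        if previous:
            # Assumed shortcut: merge the previous stage's output into this branch.
            stage_in = [a + b for a, b in zip(branch, previous)]
        else:
            stage_in = branch
        previous = transform(stage_in)
        outputs.append(previous)
    merged = [value for out in outputs for value in out]
    # Assumed residual connection around the whole block.
    return [a + b for a, b in zip(x, merged)]


def double(values: Vector) -> Vector:
    """Stand-in for a learned per-stage transform (e.g. a convolution)."""
    return [2.0 * v for v in values]


if __name__ == "__main__":
    print(cascade_block([1.0, 2.0, 3.0, 4.0], groups=2, transform=double))
    # prints [3.0, 6.0, 13.0, 20.0]
```

In this sketch the second group's output depends on the first group's output, which is what lets later stages aggregate information beyond their own channels. A real block would use learned convolutions in place of `double`.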

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant 52075480, in part by the Key Research and Development Program of Zhejiang Province under Grant 2022C01064, and in part by the High-level Talent Special Support Plan of Zhejiang Province under Grant 2020R52004.

Author information

Corresponding author

Correspondence to Guifang Duan.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Tang, R., Liu, Z., Song, Y. et al. Hierarchical multi-scale network for cross-scale visual defect detection. J Intell Manuf 35, 1141–1157 (2024). https://doi.org/10.1007/s10845-023-02097-1

  • DOI: https://doi.org/10.1007/s10845-023-02097-1
