
Pseudo-labeling and clustering-based active learning for imbalanced classification of wafer bin map defects

  • Original Paper
Signal, Image and Video Processing

Abstract

Wafer bin map (WBM) defect patterns play a crucial role in identifying the root cause of manufacturing defects in the semiconductor industry. Although various deep learning-based approaches have been proposed for automated defect pattern classification, they often demand a large amount of labeled data for effective training, and manual labeling is a costly, time-consuming process that requires specialized expertise. To address this challenge, this work introduces a novel active learning framework that reduces labeling cost by strategically selecting which WBMs should be labeled. To mitigate class imbalance, the WBM patterns from identified classes are clustered, and a set of samples is selected from each cluster, with particular emphasis on classes with limited labeled data. This selection process reduces human labeling effort and mitigates the problems associated with class-imbalanced training. The proposed approach yields significant improvements over other active learning methods. Notably, it achieves a state-of-the-art F1 score of 91.6% on the large-scale public WBM dataset WM-811K using only 4.3K labeled WBM images, whereas existing approaches require over 100K labeled images to achieve similar results. This outcome demonstrates the efficiency and practicality of the proposed approach. The code is available at https://github.com/M-Siyamalan/PLAL.
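The selection step described in the abstract can be illustrated with a minimal sketch. This is not the author's implementation (see the linked repository for that); it only shows one plausible reading of "cluster the patterns of each identified class and pick samples per cluster, favoring classes with few labels". The function names, the inverse-count budget weighting, the plain Lloyd's k-means with random initial centers, and the "nearest to the cluster center" pick are all assumptions made for illustration, and "identified classes" is taken here to mean the classes predicted (pseudo-labeled) by the current model.

```python
import random


def allocate_budget(labeled_counts, budget):
    """Split a labeling budget across classes, favoring classes with few labels.

    Weights are proportional to 1 / (count + 1); this rule is an assumption,
    not the weighting used in the paper.
    """
    weights = {c: 1.0 / (n + 1) for c, n in labeled_counts.items()}
    total = sum(weights.values())
    shares = {c: int(budget * w / total) for c, w in weights.items()}
    # Give the slots lost to rounding to the rarest classes first.
    leftover = budget - sum(shares.values())
    for c in sorted(labeled_counts, key=labeled_counts.get)[:leftover]:
        shares[c] += 1
    return shares


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means with random initial centers; returns the centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            groups[j].append(p)
        # Recompute each center as the mean of its group (keep it if empty).
        centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers


def select_queries(features, pseudo_labels, labeled_counts, budget, seed=0):
    """Return indices of unlabeled pool samples to send for manual labeling."""
    shares = allocate_budget(labeled_counts, budget)
    chosen = []
    for cls, k in shares.items():
        idx = [i for i, y in enumerate(pseudo_labels) if y == cls]
        k = min(k, len(idx))  # cannot pick more samples than the class has
        if k == 0:
            continue
        centers = kmeans([features[i] for i in idx], k, seed=seed)
        picked = set()
        for c in centers:
            # Take the not-yet-picked sample nearest to this cluster center.
            best = min((i for i in idx if i not in picked),
                       key=lambda i: sq_dist(features[i], c))
            picked.add(best)
        chosen.extend(sorted(picked))
    return chosen


# Toy pool: 20 samples pseudo-labeled as class 0 and 20 as class 1.
features = [[float(i), 0.0] for i in range(20)] + [[float(i), 10.0] for i in range(20)]
pseudo = [0] * 20 + [1] * 20
# Class 1 has far fewer labels so far, so it receives most of the budget.
picked = select_queries(features, pseudo, {0: 90, 1: 10}, budget=10)
```

In this toy setting the under-represented class receives nine of the ten queries. If a class has fewer pool samples than its share, the sketch simply spends less than the full budget; a real system would need a rule for redistributing the remainder.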




Data availability

The publicly available WM-811K dataset was used (see Note 1).

Notes

  1. https://www.kaggle.com/datasets/qingyi/wm811k-wafer-map.


Funding

No funding was received to assist with the preparation of this manuscript.

Author information

Contributions

All the work was done by Siyamalan Manivannan.

Corresponding author

Correspondence to Siyamalan Manivannan.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethical approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Manivannan, S. Pseudo-labeling and clustering-based active learning for imbalanced classification of wafer bin map defects. SIViP 18, 2391–2401 (2024). https://doi.org/10.1007/s11760-023-02915-2


  • DOI: https://doi.org/10.1007/s11760-023-02915-2
