Abstract
In real-world production data, missing values occur either at random or systematically, and in many different missing patterns. They must be handled properly to build effective prediction models. This paper presents a novel method based on graph representation and graph neural networks for improving prediction when values are missing. To use all the information in a training dataset without directly manipulating the data, each instance is represented as a graph whose nodes are its observed input variables and whose edges are their pairwise relationships, so graph size varies across instances. Prediction models learn from these graph representations and can then predict unknown labels for new instances with arbitrary missing patterns. The proposed method was evaluated on seven product failure prediction tasks from a home appliance manufacturer, where it outperformed all other methods in six of the seven tasks.
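As an illustration of the graph representation described in the abstract, the following minimal sketch converts one instance with missing values into a graph over its observed variables only. The abstract does not specify how node features or edges are built, so the function name `instance_to_graph`, the use of NaN to mark missing values, and the choice of a complete graph (one edge per pair of observed variables) are assumptions made for this example, not the paper's exact construction.

```python
import math
from itertools import combinations


def instance_to_graph(x):
    """Turn one instance (list of floats, NaN = missing) into a graph.

    Returns (nodes, edges):
      nodes: list of (variable_index, value) for observed variables only
      edges: list of undirected pairs of node positions, one per pair of
             observed variables (complete graph; an assumption of this sketch)
    """
    # Keep only observed variables; missing ones produce no node at all.
    nodes = [(j, v) for j, v in enumerate(x) if not math.isnan(v)]
    # Connect every pair of observed variables.
    edges = list(combinations(range(len(nodes)), 2))
    return nodes, edges


# Two hypothetical instances with different missing patterns.
row_a = [0.7, float("nan"), 1.2, 3.4]
row_b = [float("nan"), float("nan"), 0.5, 2.0]

nodes_a, edges_a = instance_to_graph(row_a)
nodes_b, edges_b = instance_to_graph(row_b)
```

Here `row_a` yields a three-node graph and `row_b` a two-node graph, so no value is imputed or discarded; a graph neural network that accepts graphs of varying sizes could then be trained on such representations.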
Acknowledgements
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT; Ministry of Science and ICT) (Nos. NRF-2019R1A4A1024732 and NRF-2020R1C1C1003232).
Ethics declarations
Conflict of interest
The author declares no conflict of interest.
About this article
Cite this article
Kang, S. Product failure prediction with missing data using graph neural networks. Neural Comput & Applic 33, 7225–7234 (2021). https://doi.org/10.1007/s00521-020-05486-2
DOI: https://doi.org/10.1007/s00521-020-05486-2