
Detection and diagnosis of process fault using unsupervised learning methods and unlabeled data

International Journal of Advances in Engineering Sciences and Applied Mathematics

Abstract

Supervised learning methods, commonly used for process monitoring, require labeled historical datasets for the normal condition as well as for each faulty condition, which demands significant effort in data mining. This article proposes a methodology that combines principal component analysis (PCA) with the k-means clustering algorithm to automate the detection and diagnosis of faults from unlabeled data. The k-means algorithm is used for fault detection and diagnosis, exploiting PCA for data mining. PCA is able to precisely detect and diagnose faults from a large set of unlabeled historical data. The proposed method improves online diagnosis by using the clustering algorithm in the monitoring stage. Based on the Euclidean distance between each data sample and the cluster centroids of the training data, the k-means clustering algorithm decides whether the process is in a normal state or belongs to a particular faulty state. To illustrate its effectiveness, the proposed method is applied to two industrial processes: (i) a separator unit from an offshore gas processing platform and (ii) a distillation column of a crude refining unit. The results show that the proposed method avoids the data labeling exercise and is effective in detecting and diagnosing faults in large-scale industrial processes.
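The monitoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that PCA is used to reduce autoscaled data to a few scores, that k-means is then run on those scores, and that a new sample is diagnosed by its nearest training centroid in Euclidean distance. The synthetic data, the `MIXING` matrix, the farthest-point initialization, and the choices of two components and three clusters are all hypothetical.

```python
import numpy as np

# Hypothetical mixing of two latent process states into four correlated measurements.
MIXING = np.array([[1.0, 0.0, 1.0, 1.0],
                   [0.0, 1.0, 1.0, -1.0]])

def make_states(rng, n_per_state=100):
    """Synthetic stand-in for unlabeled plant data (not the paper's data)."""
    centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])  # normal, fault A, fault B
    latent = np.vstack([c + 0.3 * rng.standard_normal((n_per_state, 2)) for c in centers])
    X = latent @ MIXING + 0.05 * rng.standard_normal((len(latent), 4))
    truth = np.repeat(np.arange(3), n_per_state)  # used only to check the clustering
    return X, truth

def project(X, model):
    """Autoscale with the training statistics and project onto the retained loadings."""
    return ((X - model["mean"]) / model["std"]) @ model["loadings"]

def assign(T, centroids):
    """Index of the nearest centroid (Euclidean distance) for each row of T."""
    dist = np.linalg.norm(T[:, None, :] - centroids[None, :, :], axis=2)
    return dist.argmin(axis=1)

def kmeans(T, k, n_iter=100):
    """Lloyd's k-means with a deterministic farthest-point start (an assumption)."""
    centroids = [T[0]]
    for _ in range(1, k):
        gap = np.min([np.linalg.norm(T - c, axis=1) for c in centroids], axis=0)
        centroids.append(T[gap.argmax()])
    centroids = np.array(centroids, dtype=float)
    for _ in range(n_iter):
        labels = assign(T, centroids)
        # Recompute each centroid; keep the old one if a cluster is empty.
        updated = np.array([T[labels == j].mean(axis=0) if np.any(labels == j)
                            else centroids[j] for j in range(k)])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids

def fit_monitor(X, n_components, k):
    """Training stage: PCA via SVD on autoscaled data, then k-means on the scores."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # guard against constant variables
    _, _, vt = np.linalg.svd((X - mean) / std, full_matrices=False)
    model = {"mean": mean, "std": std, "loadings": vt[:n_components].T}
    scores = project(X, model)
    model["centroids"] = kmeans(scores, k)
    model["labels"] = assign(scores, model["centroids"])
    return model

def diagnose(X_new, model):
    """Monitoring stage: nearest training centroid for each new sample."""
    return assign(project(X_new, model), model["centroids"])

if __name__ == "__main__":
    X, _ = make_states(np.random.default_rng(0))
    model = fit_monitor(X, n_components=2, k=3)
    sample = np.array([[10.0, 0.0]]) @ MIXING  # noise-free sample from the "fault A" region
    print("cluster of new sample:", diagnose(sample, model)[0])
```

The cluster indices returned here are arbitrary. Deciding which cluster corresponds to normal operation and which to a particular fault still requires process knowledge or inspection of the cluster contents.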

Funding

The authors would like to thank the Libyan government for funding this research through student scholarships.

Author information

Corresponding author

Correspondence to Salim Ahmed.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Rahoma, A., Imtiaz, S., Ahmed, S. et al. Detection and diagnosis of process fault using unsupervised learning methods and unlabeled data. Int J Adv Eng Sci Appl Math 15, 24–36 (2023). https://doi.org/10.1007/s12572-023-00327-6

  • DOI: https://doi.org/10.1007/s12572-023-00327-6
