Abstract
Imbalance among fault modes is prevalent in industrial equipment monitoring. Many existing methods for imbalanced fault diagnosis only resample the labeled fault dataset, which limits diagnostic performance because the information in the unlabeled fault dataset is lost. To exploit the information in both the unlabeled and labeled datasets, this study proposes a semi-supervised ensemble learning method, termed SSTI, for imbalanced fault diagnosis. First, the information carried by each sample is evaluated based on the Mahalanobis distance, and a novel sample information-based synthetic minority oversampling technique (SI-SMOTE) is presented for balancing the labeled dataset. Second, a tri-training architecture-based imbalanced co-training technique (Tri-ImCT) is developed to exploit the information contained in the unlabeled dataset. In Tri-ImCT, rebalancing of the training subsets and variable weighted voting are used to improve the performance of the proposed method. To verify this performance, experiments were carried out on several imbalanced datasets derived from two bearing datasets and one subway wheel dataset. Three indicators, G-mean, average precision, and average F-score, were used to evaluate the classifiers. Experimental results show that the proposed method outperforms the other methods compared and comes very close to the upper bound given by fully supervised performance. This indicates that the study provides a promising methodology for imbalanced fault diagnosis.
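The three evaluation indicators named above can be illustrated with a minimal sketch. The exact definitions used in the paper are not given in this abstract, so the code assumes the common multi-class conventions: G-mean as the geometric mean of per-class recalls, and average precision and average F-score as unweighted (macro) means over classes. The function name `imbalance_metrics` and the toy labels are hypothetical.

```python
import numpy as np

def imbalance_metrics(y_true, y_pred):
    """Return (G-mean, macro precision, macro F-score).

    Assumed definitions (not confirmed by the abstract):
    G-mean is the geometric mean of per-class recalls; the other
    two indicators are unweighted means over the classes in y_true.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls, precisions, f_scores = [], [], []
    for c in np.unique(y_true):
        tp = np.sum((y_true == c) & (y_pred == c))
        fn = np.sum((y_true == c) & (y_pred != c))
        fp = np.sum((y_true != c) & (y_pred == c))
        recall = tp / (tp + fn)  # tp + fn > 0 because c occurs in y_true
        precision = tp / (tp + fp) if tp + fp > 0 else 0.0
        denom = precision + recall
        f_score = 2 * precision * recall / denom if denom > 0 else 0.0
        recalls.append(recall)
        precisions.append(precision)
        f_scores.append(f_score)
    g_mean = float(np.prod(recalls) ** (1.0 / len(recalls)))
    return g_mean, float(np.mean(precisions)), float(np.mean(f_scores))

# Toy example: class 0 is the majority (normal), class 1 the minority (fault).
y_true = [0, 0, 0, 0, 1, 1]
always_majority = [0, 0, 0, 0, 0, 0]
print(imbalance_metrics(y_true, always_majority))
```

A classifier that always predicts the majority class reaches an accuracy of 4/6 on this toy data but a G-mean of zero, because the recall of the minority class is zero. This is why indicators of this kind are preferred over plain accuracy when the fault classes are imbalanced.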
Acknowledgements
This project was supported by the Guangdong Provincial Key Laboratory of Cyber-Physical Systems (No. 2016B030301008) and Key Project of Youth Fund of Guangdong University of Technology (No. 17QNZD001).
About this article
Cite this article
Jian, C., Ao, Y. Imbalanced fault diagnosis based on semi-supervised ensemble learning. J Intell Manuf 34, 3143–3158 (2023). https://doi.org/10.1007/s10845-022-01985-2
DOI: https://doi.org/10.1007/s10845-022-01985-2