Imbalanced fault diagnosis based on semi-supervised ensemble learning

Journal of Intelligent Manufacturing

Abstract

Imbalance among fault modes is common in industrial equipment monitoring. Many existing methods for imbalanced fault diagnosis only resample the labeled fault dataset, which limits diagnostic performance because the information in the unlabeled fault dataset is lost. To exploit the information in both the labeled and unlabeled datasets, this study proposes a semi-supervised ensemble learning method, termed SSTI, for imbalanced fault diagnosis. First, sample information is evaluated based on the Mahalanobis distance, and a novel sample information-based synthetic minority oversampling technique (SI-SMOTE) is presented for balancing the labeled dataset. Second, a tri-training architecture-based imbalanced co-training technique (Tri-ImCT) is developed to exploit the information contained in the unlabeled dataset. In Tri-ImCT, rebalancing of the training subsets and variable weighted voting are used to improve the performance of the proposed method on imbalanced fault diagnosis. To verify its performance, experiments were carried out on several imbalanced datasets derived from two bearing datasets and one subway wheel dataset. Three indicators, G-mean, average precision, and average F-score, were used to evaluate the classifiers. Experimental results show that the proposed method outperforms the other methods compared and comes very close to the upper bound set by fully supervised performance. These results indicate that the study provides a promising methodology for imbalanced fault diagnosis.
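The three indicators named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so this assumes the common multi-class definitions: G-mean as the geometric mean of per-class recalls, and average precision and average F-score as unweighted (macro) averages over classes. The function names and the toy labels below are hypothetical and are not taken from the article.

```python
import math


def safe_div(num, den):
    """Divide, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def imbalance_metrics(y_true, y_pred):
    """Compute G-mean, macro precision and macro F-score.

    Assumed definitions (not confirmed by the article):
      G-mean        = geometric mean of per-class recall
      avg_precision = unweighted mean of per-class precision
      avg_f_score   = unweighted mean of per-class F1
    Only classes that occur in y_true are scored.
    """
    labels = sorted(set(y_true))
    recalls, precisions, f_scores = [], [], []
    for c in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        recall = safe_div(tp, tp + fn)
        precision = safe_div(tp, tp + fp)
        recalls.append(recall)
        precisions.append(precision)
        f_scores.append(safe_div(2 * precision * recall, precision + recall))
    k = len(labels)
    return {
        "g_mean": math.prod(recalls) ** (1.0 / k),
        "avg_precision": sum(precisions) / k,
        "avg_f_score": sum(f_scores) / k,
    }


# Toy imbalanced example: 8 "normal" samples and 2 "inner_race" fault samples.
y_true = ["normal"] * 8 + ["inner_race"] * 2
y_pred = ["normal"] * 7 + ["inner_race"] + ["inner_race", "normal"]
metrics = imbalance_metrics(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(metrics, accuracy)
```

In this toy case the plain accuracy is 0.8, while the G-mean is about 0.66 because the minority class is recalled only half of the time. This is one reason class-wise indicators like these are often preferred over accuracy on imbalanced data.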

Acknowledgements

This project was supported by the Guangdong Provincial Key Laboratory of Cyber-Physical Systems (No. 2016B030301008) and Key Project of Youth Fund of Guangdong University of Technology (No. 17QNZD001).

Author information

Corresponding author

Correspondence to Chuanxia Jian.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Jian, C., Ao, Y. Imbalanced fault diagnosis based on semi-supervised ensemble learning. J Intell Manuf 34, 3143–3158 (2023). https://doi.org/10.1007/s10845-022-01985-2

  • DOI: https://doi.org/10.1007/s10845-022-01985-2
