Journal of Intelligent Manufacturing

Volume 29, Issue 2, pp 333–351

Identifying maximum imbalance in datasets for fault diagnosis of gearboxes

  • Pedro Santos
  • Jesús Maudes
  • Andres Bustillo


Research into fault diagnosis in rotating machinery that operates under a wide range of variable loads and speeds, such as the gearboxes of wind turbines, is of great industrial interest. Although appropriate sensors have been identified, an intelligent system that classifies machine states remains an open issue, owing to a paucity of datasets with sufficient fault cases. Many of the proposed solutions have been tested on balanced datasets, which contain roughly equal percentages of wind-turbine failure instances and instances of correct performance. In practice, however, balanced datasets cannot be obtained under real operating conditions. Our objective is to identify the classification technique whose performance depends least on the level of imbalance in the dataset. We start by analysing different metrics for comparing classification techniques on imbalanced datasets. Our results point to the Unweighted Macro Average of the F-measure, which we consider the most suitable metric for this diagnosis. An extensive set of classification techniques was then tested on datasets with varying levels of imbalance. Our conclusion is that a Rotation Forest ensemble of C4.4 decision trees, with the training phase of the classifier modified by a cost-sensitive approach, is the most suitable prediction model for this industrial task. It maintained good performance even when the minority classes rate was as low as 6.5%, whereas the majority of the other classifiers were more sensitive to the level of dataset imbalance and failed standard performance objectives when the minority classes rate was lower than 10.5%.
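As an illustration of why an unweighted macro average of the F-measure suits imbalanced multi-class data, the following minimal sketch computes it from scratch and compares it with plain accuracy. The class names, the toy label counts and the convention of scoring a class with no true or predicted instances as 0.0 are assumptions made for this example only; they are not taken from the study's dataset or software.

```python
from collections import Counter


def macro_f_measure(y_true, y_pred, labels=None):
    """Unweighted macro average of the per-class F-measure (F1).

    Every class contributes equally to the mean, regardless of how
    many instances it has.
    """
    if labels is None:
        labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p wrongly
            fn[t] += 1  # missed an instance of true class t
    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # Assumed convention: a class that never occurs scores 0.0.
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def accuracy(y_true, y_pred):
    """Fraction of instances classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical imbalanced toy set: 93 healthy instances, 7 faulty ones.
y_true = ["ok"] * 93 + ["fault_a"] * 4 + ["fault_b"] * 3
# A degenerate classifier that always predicts the majority class.
y_pred = ["ok"] * 100

acc = accuracy(y_true, y_pred)
macro_f = macro_f_measure(y_true, y_pred)
print(f"accuracy = {acc:.3f}, macro F-measure = {macro_f:.3f}")
```

In this toy case the always-majority classifier reaches high accuracy while detecting no faults, and the macro F-measure drops sharply because the two fault classes each contribute a score of zero to the unweighted mean.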


Keywords: Fault diagnosis · Multi-class imbalance · Wind turbines · Ensembles · Metrics · Gearbox



This research project has received funding from the Spanish government through Projects CENIT-2008-1028, TIN2011-24046 and IPT-2011-1265-020000 of the Ministerio de Economía y Competitividad [Ministry of Economy and Competitiveness]. Special thanks to Roberto Arnanz, Dr. Luisa F. Villa and Dr. Aníbal Reñones of the CARTIF FOUNDATION for providing the original dataset and for performing all the experimental tests and to Dr. Juan J. Rodríguez from the University of Burgos for his kind-spirited and useful advice.



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. University of Burgos, Burgos, Spain
