1 State of the art

Rotating systems are the beating heart of energy production and transmission. Timely maintenance and fault diagnosis of these machines are essential because they carry heavy and expensive attachments. Troubleshooting techniques for rotary machines have a long history, and many researchers have analyzed the vibration signals of such devices to find signatures of failure. Unbalance is among the most prevalent faults in any rotary device, but other defects can occur at the same time: misalignment is another widespread fault in such systems, and cracks can cause catastrophic failures. As a result, the preventive diagnosis of misalignment and cracks in rotor systems has attracted increasing attention from engineers and researchers [1, 2].

Misalignment can occur in almost all rotor systems. The shaft of an electrical motor or generator must be connected to a driven or driving shaft through specific joints, and in many cases, due to long-term operation or incorrect assembly, the two shafts move slightly so that their central axes are no longer aligned. A crack in a rotating shaft is less prevalent than misalignment, but it can grow rapidly and bring about catastrophic failures. Consequently, the study of appropriate methods for distinguishing misalignment and cracks in rotor systems has increased during the last decade.

While the vast majority of faults can be detected during periodic maintenance programs thanks to portable machine monitoring instruments and highly specialized technicians, the shortage of professional workforce and the risk of human error have pushed companies toward automatic fault identification systems.

If a shaft fails as a result of damage, a series of adverse effects may follow. First, a halt in the production of energy or its transformation can result in financial loss. Second, the other components may experience a series of related faults. For instance, if an unbalance in a rotating system is not caught in time, the cyclic load may turn into a fatigue load and result in a fatigue crack. Last but not least, a rotor system with a failing shaft has a very high destructive potential. This incident has the potential to completely destroy an industrial shed and put the lives of the workers nearby in grave danger.

In computerized fault diagnosis, classification is regarded as one of the most reliable methods, and there are two primary approaches. The first is conventional machine learning, which can be divided into three types: supervised, unsupervised, and reinforcement learning; in this procedure, the user must supply a feature vector for each class. The second is deep learning, which uses networks with two or more hidden layers and can extract features automatically. Convolutional neural networks (CNNs) are a type of deep learning classifier that can assign objects or graphs to distinct classes based on their shapes or visual characteristics [3].

In the following, some of the important and recent investigations into the diagnosis of these two faults, i.e., shaft cracks and misalignment, are reviewed.

The main effect of misalignment in a rotating system is a change in the stiffness matrix and the creation of additional reaction forces in the shaft element that carries the coupling. In [4, 5], Gibbons and Sekhar studied the reaction forces and moments caused by parallel and angular misalignments in rotor systems, respectively.

In [4], Gibbons assumed that misalignment in a multi-rotor system creates moments and constant forces in the coupling. The magnitude of these forces and moments is a function of the severity of the misalignment and does not depend on the rotating speed of the system.

In [5], Sekhar and Prabhu studied the impacts of a misalignment in turbomachinery in bending mode. They mainly focused on the sensitivity analysis of the system to the position of the coupling along the shaft.

Sinha et al. investigated an estimation technique to identify unbalancing and misalignment in a rotor-bearing system. The transient response (during the run-down) was studied, and finally, the proposed method was checked through a sensitivity analysis [6].

Jalan and Mohanty studied a model-based technique for defect diagnosis in rotor-bearing devices. The residual generation method was used to generate the residual vibration of a system that was out of balance and misaligned. The research was conducted through physical experiments, and the experimentally calculated residual forces were compared with the theoretical ones due to these two faults. The location and condition of the defects were detected successfully; in addition, the coupling stiffness matrix for a four-degree-of-freedom system was obtained [7].

Misalignment in a rotary machine can produce axial and radial vibrations. In [8], Sudhakar and Sekhar studied modeling procedures for two types of coupling, i.e., flexible and gear couplings, in a misaligned rotor system. The effects of parallel and angular misalignment on the stiffness matrix and on the vibration responses were investigated. It is stated that misalignment can produce a 2× harmonic component in the frequency response, although this fault does not affect the amplitude of the 1× bending vibration.

Patel and Darpe researched the harmonic components that stem from parallel and angular misalignment. First, a rotor system with six degrees of freedom was modeled by means of the finite element (FE) method; then a test rig was built to verify the model. In the frequency analysis, it is stated that the 2× and 3× harmonics can be considered signatures of misalignment in the rotor system. Moreover, orbit plots for misaligned rotors were presented [9].

Although researchers have examined different approaches in recent years to identify the symptoms and dynamics of a misaligned shaft more accurately, improvements in instrumentation have helped as well. In [10], Arebi et al. installed a wireless sensor on a rotor system to observe misalignment with higher accuracy. This method is more effective because the sensor increased the signal-to-noise ratio (SNR).

Qu et al. employed a combined method to distinguish misalignment in a rotor system. Average Rotor Centerline (ARC) and 1X-orbit were synchronously plotted to identify 1X segments arising from various faults like unbalancing or transient bending. Comparing changes in these two graphs with those caused by other failures, new symptoms for misalignment were introduced [11].

In [12], Umbrajkaar et al. utilized machine learning techniques to identify and measure the amount of misalignment in rotating machinery. In this work, the rank-based feature selection enhanced the accuracy of the classification process up to 89.7%.

Sathujoda applied wavelet transformation to reveal features of misalignment in a rotor system. Due to the presence of misalignment, some subcritical speeds, i.e., 50%, 33%, and 25% of the critical speed were revealed in the wavelet transform (WT) plots of vibration signals [13].

Kumar et al. have compiled an extensive range of previous work on the detection of misalignment in rotor machines [14].

Misalignment is a quite common fault in rotor systems; cracks in rotating machines, however, can lead to catastrophic breakdowns. Several significant issues arise for cracked systems, including the study of crack propagation, the strength of cracked structures, and the simulation of crack breathing behavior in rotating parts, where nonlinear phenomena appear in the vibrational responses.

It has been shown that a crack in an element reduces the local stiffness of that element relative to adjacent elements. In one of the primary works on cracked rotors [15], Nelson and Nataraj investigated the dynamic behavior of a cracked rotor system and derived the stiffness matrix of a cracked element. The breathing behavior of the crack, i.e., its variation during the rotation, was represented by a Fourier series expansion.

Darpe proposed a crack detection methodology that exploits the coupling between bending and torsional vibrations together with the breathing behavior of the crack; a short-time torsional excitation was applied, and its effect on the lateral vibration was investigated. To reveal signatures of the resonant bending vibrations, the wavelet transformation was applied. The changes in the peak absolute value of the WT coefficients of the lateral vibration response, as well as the angle at which the torsional excitation was applied, were evaluated [16].

In [17], Caputo et al. studied the fracture resistance of an aluminum flat stiffened panel. They also introduced a method to improve the residual strength of damaged panels using a stochastic design improvement approach.

Numerous earlier works have looked into the effects of multiple cracks as well as the potential effects of changing the position of a crack in different structures [18,19,20,21].

Lu et al. introduced a new Kriging surrogate FE method that is an updating model-based procedure. This process was utilized in the crack detection of rotating rotors employing super-harmonic nonlinear characteristics. To check the effectiveness of the proposed method, an experiment was done; in addition, Gaussian white noise was added to the captured signal [22].

In [23], Prabhakar studied the transient vibration signals of a slant-cracked rotary machine. The FE method was employed to simulate the defective system; a torque that was varying harmonically and an unbalance force were added to the system. To find a crack and its location, fast Fourier transform (FFT) and WT were utilized.

Zhang et al. studied the effects of a crack on the energy tracks. For a cracked rotor system, the potential energy expression was derived, and then the fundamentals of energy tracks were proposed. It was shown that the deeper the crack, the more the energy tracks change [24].

In [25], Kushwaha and Patel categorized a wide number of previous works that have been done on the crack detection area in rotating systems.

Several previous works have noted that a crack and misalignment produce similar symptoms in the frequency domain, i.e., the 2× components in the FFT diagram. It is therefore necessary to study these two defects together in the same work to find a proper method for distinguishing them. In [26], Sinha used higher-order spectra (HOS) to distinguish the higher harmonics resulting from cracked rotors from those of misaligned ones. That was the first time the bi-spectrum and tri-spectrum were applied to the fault diagnosis of rotary devices.

Patel et al. investigated a misaligned rotor system that also contained a crack. To show the fault-specific whirl signatures, spectrum analysis was carried out on the axial and torsional vibration signals in steady-state operation. Based on the differences between backward and forward whirling in the first two harmonic components, i.e., 1× and 2×, two novel whirling parameters were presented [27].

Azeem et al. investigated two prevalent faults, i.e., crack and misalignment, in a Spectra Quest Machinery Fault Simulator (MFS) system by employing order analysis. It was shown that the 2× harmonic component can be considered a signature of misalignment in the system, while a crack gives rise to both 2× and 3× harmonic components [28].

The use of intelligent methods in fault identification has grown considerably in recent years. The main advantages of such procedures are that defects can be identified at relatively early stages and that they do not require highly trained personnel. In [29], Zhao et al. employed CNNs to classify misaligned and cracked rotors. Raw signals were introduced directly as the training input data; at the testing stage, Gaussian noise was added to the signal. In the experimental work, a combination of probabilistic principal component analysis (PPCA), variational mode decomposition (VMD), and principal component analysis (PCA) was utilized for noise reduction and feature extraction. Finally, the results of five classification procedures, i.e., plain CNN, VMD-CNN, VMD-PPCA-SVM, VMD-PCA-CNN, and VMD-PPCA-CNN, were compared at different signal-to-noise ratios (SNRs).

Rezazadeh and Fallahy utilized a deep learning procedure based on WT to classify cracked rotor systems. In this work, the discrete wavelet transformation (DWT) was employed to reduce noise from experimental vibration signals, then the relative wavelet energy and wavelet entropy were used in forming the feature vector [30].

Rezazadeh et al. applied convolutional neural networks (CNNs) to identify cracked rotating machinery with different crack depths. The scalogram of the continuous wavelet transformation and the spectrogram of the short-time Fourier transform of transient signals were used as two separate sets of training data [31].

In [32], Jin et al. worked on the classification of a cracked hollow shaft according to the crack location. The amplitude-frequency responses of the defective rotary machine were introduced as the input data; in addition, CNNs and deep metric learning procedures were utilized for the classification.

Rezazadeh et al. employed CNNs based on the persistence spectrum to distinguish rotors with shallow cracks from a healthy rotor system. The steady-state vibration responses of rotating machinery modeled with the FE method were used to obtain the persistence spectrums [33].

In [34], Rodrigues et al. compared several methods for the classification of defective rotating machinery. For this purpose, five types of faults (crack, misalignment, hydrodynamic instability, unbalance, and rotor–stator rub) were modeled numerically. The spectral image of the vibration orbits throughout the start-up was introduced as the input data for CNNs, and the feature vectors for the other classification methods were calculated by processing the same spectral images.

Looking at previous research into the detection of cracked and misaligned rotors, it can be seen that, while some intelligent methods have been used in this area, there are two major issues: first, the methods tried so far have shown low accuracy, and second, they are insensitive to shallow cracks.

In the present paper, classification processes of cracked, unbalanced, and misaligned rotating systems are compared. The steps in the research are as follows. To begin, a rotor-bearing-disk system with an imbalance is modeled using the FE approach and the Timoshenko beam theory. A transverse crack in the shaft, as well as a parallel misalignment, is simulated in the rotary machine in the following. The systems are then operated for a variety of initial and physical conditions, and the responses are numerically captured in steady-state operation. At the feature extraction stage, four different methods are used: statistical, frequency domain, time–frequency domain, and uncertainty. Features are extracted for the three classes, i.e., unbalanced, cracked, and misaligned, and feature vectors are constructed. The T test is performed on the extracted feature vectors as a semi-final step, and the improper features are eliminated. Finally, the three classes are classified using the SVM, Naive Bayes, decision tree, and KNN algorithms.

On the other hand, the input data for CNNs are created by graphing the persistence spectrum of raw signals in the three classes. As a classifier, CNNs with AlexNet architecture are utilized. Confusion matrices are used to demonstrate the accuracy of various methods. The procedures used in this article are visualized in Fig. 1.

Fig. 1 The article working process

2 Materials and methods

The FE method is used to model the system in this paper. The system includes the disk, coupling, bearing supports, rotor shafts, and electrical motor. Figure 2 presents a graphic representation of the modeled system. It should be noted that the driving motor is not shown in the figure, and the analysis has been performed from bearing number 1. In the finite element formulation, two-node elements are employed for each shaft, as shown in Fig. 3, where each nodal point has four degrees of freedom (DoF): rotations around Y and Z axes; translations along the same axes.

Fig. 2 Schematic of the rotor-bearing-disk system with coupling, and the finite element model of the system

Fig. 3 The coordinate system and the shaft beam element

The equation of motion for a system with damping can be expressed in the following general form:

$$M\left\{\ddot{q}\right\}+C\left\{\dot{q}\right\}+K\left\{q\right\}=f\left(t\right)$$
(1)

where \(M\), \(C\), and \(K\) are the mass, damping, and stiffness matrices, respectively; \(f(t)\) and \(\{q\}\) are the force and coordinate vectors, respectively. The results from [35] are used to create these matrices. Each node contributes a \(4\times 4\) block to these matrices. After calculating the characteristic matrices for the shaft elements (eight elements here), disk, bearings, and coupling, and assembling them into global matrices using a connectivity table, a \(36\times 36\) matrix (nine nodes with four DoF each) is obtained for each characteristic matrix.

For elements containing journal bearings, the added damping and extra stiffness should be accounted for [35]. Furthermore, the stiffness matrix of the element that carries the coupling (non-frictional flexible) differs from those of the elements to its left and right. The stiffness matrix of the element carrying the coupling is stated below; this matrix should be inserted into the global stiffness matrix of the entire system [7].
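The assembly step described above can be sketched as follows. This is a minimal illustration, assuming a simple chain connectivity (element \(e\) joins nodes \(e\) and \(e+1\)) and using a placeholder \(8\times 8\) element matrix; the function name `assemble_global` and the placeholder values are assumptions for illustration and do not come from [35].

```python
import numpy as np

DOF_PER_NODE = 4  # translations along Y, Z and rotations about Y, Z


def assemble_global(element_matrices, connectivity, n_nodes):
    """Scatter-add 8x8 element matrices into a global matrix.

    connectivity lists the (left node, right node) pair of each element,
    i.e. it plays the role of the connectivity table.
    """
    size = DOF_PER_NODE * n_nodes
    global_matrix = np.zeros((size, size))
    for k_e, (i, j) in zip(element_matrices, connectivity):
        dofs = np.concatenate([
            np.arange(i * DOF_PER_NODE, (i + 1) * DOF_PER_NODE),
            np.arange(j * DOF_PER_NODE, (j + 1) * DOF_PER_NODE),
        ])
        global_matrix[np.ix_(dofs, dofs)] += k_e
    return global_matrix


# Eight shaft elements joining nine nodes in a chain (assumed layout).
connectivity = [(e, e + 1) for e in range(8)]
placeholder = np.eye(8)  # NOT a physical element matrix, illustration only
K_global = assemble_global([placeholder] * 8, connectivity, n_nodes=9)
print(K_global.shape)
```

Interior nodes receive contributions from the two elements that share them, which is why the diagonal blocks of those nodes are summed during assembly.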

$$\left[{\text{k}}_{\text{coupling}}\right]\text{=}\frac{\text{EI}}{{\text{l}}^{3}} \, \left[\begin{array}{cccccccc}{12}& {0}& {0}& {\text{6l}}& -{12}& {0}& {0}& {\text{6l}}\\ {0}& {12}& -{\text{6l}}& {0}& {0}& -{12}& -{\text{6l}}& {0}\\ {0}& -{\text{6l}}& {4}{\text{l}}^{2}& {0}& {0}& {\text{6l}}& -{2}{\text{l}}^{2}& {0}\\ {\text{6l}}& {0}& {0}& {4}{\text{l}}^{2}& -{\text{6l}}& {0}& {0}& {2}{\text{l}}^{2}\\ -{12}& {0}& {0}& -{\text{6l}}& {12}& {0}& {0}& -{\text{6l}}\\ {0}& -{12}& {\text{6l}}& {0}& {0}& {12}& {\text{6l}}& {0}\\ {0}& -{\text{6l}}& -{2}{\text{l}}^{2}& {0}& {0}& {\text{6l}}& {4}{\text{l}}^{2}& {0}\\ {\text{6l}}& {0}& {0}& {2}{\text{l}}^{2}& -{\text{6l}}& {0}& {0}& {4}{\text{l}}^{2}\end{array}\right],$$
(2)

where \(EI\) and \(l\) are the flexural rigidity and length of each shaft element, respectively.

2.1 Fault modeling

Due to a specific failure, one or all the characteristic matrices can change. For some cases, only the force vector should be updated, while in other cases, there may be some changes in the other characteristic matrices such as stiffness matrix. These adjustments will be explained in the following sections for unbalanced, cracked, and misaligned rotors.

2.1.1 Unbalanced rotor

Unbalancing, the most common defect in all rotary machines, occurs when the mass center of a rotating subject, such as a disk, does not coincide with its geometric center. The main effect of an imbalanced rotating disk is to generate harmonic forces in the element that carries the disk. The force vector in a rotor system that suffers unbalancing is defined as follows [36]:

$$\begin{aligned} & \left\{ {f\left( t \right)} \right\} = \left[ {F_{Y} ,F_{Z} ,F_{\theta Y} ,F_{\theta Z} } \right]^{{\text{T}}} \\ & F_{Y} = m_{d} e\omega^{2} \cos \omega t \\ & F_{Z} = m_{d} e\omega^{2} \sin \omega t \\ & F_{\theta Y} = 0 \\ & F_{\theta Z} = 0 \\ \end{aligned}$$
(3)

where \(F_{Y}\) and \(F_{Z}\) are the force components in the \(Y\) and \(Z\) directions, and \(F_{\theta Y}\) and \(F_{\theta Z}\) are the moment components about the \(Y\) and \(Z\) axes, respectively. Furthermore, \(m_{d}\), \(e\), and \(\omega\) represent the disk mass, unbalance eccentricity, and shaft angular speed, respectively.
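As a quick numerical illustration of Eq. (3), the following sketch evaluates the unbalance force vector at given time instants. The function name `unbalance_force` and the numerical values of the disk mass and eccentricity are assumptions chosen for illustration only; the values actually used are those of Table 1.

```python
import numpy as np


def unbalance_force(m_d, e, omega, t):
    """Return [F_Y, F_Z, F_thetaY, F_thetaZ] of Eq. (3) at time(s) t."""
    t = np.asarray(t, dtype=float)
    amplitude = m_d * e * omega**2
    zeros = np.zeros_like(t)
    return np.stack(
        [amplitude * np.cos(omega * t), amplitude * np.sin(omega * t), zeros, zeros],
        axis=-1,
    )


# Hypothetical disk mass (kg) and eccentricity (m); omega = 100 rad/s.
t = np.arange(0.0, 0.1, 0.001)
f = unbalance_force(m_d=2.0, e=1e-4, omega=100.0, t=t)
print(f.shape)
```

The resultant of \(F_{Y}\) and \(F_{Z}\) has the constant magnitude \(m_{d} e \omega^{2}\) and rotates synchronously with the shaft, which is why unbalance shows up as a 1× component.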

2.1.2 Cracked rotor

A crack in a rotating shaft alters the local flexibility of the affected element [37]. As a result, the stiffness matrix of a cracked element differs from those of the neighboring elements. To determine the stiffness matrix of a cracked shaft element under the open-crack assumption, \(\left[ {K_{{\text{o}}}^{{\text{e}}} } \right]\), the extra flexibility matrix caused by the crack, \(\left[ {C_{{\text{c}}} } \right]\), is calculated first; it is then added to the flexibility matrix of the healthy shaft element, \(\left[ {C_{{{\text{uc}}}} } \right]\); finally, the inverse of the resulting matrix, \(\left[ {C_{{\text{o}}} } \right]\), is pre-multiplied by a transfer matrix \(\left[ T \right]\) and post-multiplied by its transpose [35]. Three crack loading modes (tensile, sliding, and tearing) have been taken into account when calculating the effect of a crack on the local flexibility of a cracked shaft element; the calculation of these factors is widely used in solid mechanics [38]. The non-dimensional coefficients in the extra flexibility matrix of the cracked element were calculated using the stress intensity factor of each of these modes [16].

$$\begin{aligned} \left[ {C_{{\text{o}}} } \right] & = \left[ {C_{{{\text{uc}}}} } \right] + \left[ {C_{{\text{c}}} } \right] \\ \left[ {K_{{\text{o}}}^{{\text{e}}} } \right]_{8*8} & = \left[ T \right]_{8*4} *\left[ {C_{o} } \right]_{4*4}^{ - 1} *\left[ T \right]_{4*8}^{{\text{T}}} \\ \end{aligned}$$
(4)
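The matrix operations of Eq. (4) can be sketched as follows. The flexibility values and the transfer matrix used here are placeholders, assumed only to make the example runnable; the actual entries follow from [35] and from the stress intensity factors of the crack, and the function name `cracked_element_stiffness` is hypothetical.

```python
import numpy as np


def cracked_element_stiffness(C_uc, C_c, T):
    """Eq. (4): K_o^e = T (C_uc + C_c)^(-1) T^T, with T of size 8x4."""
    C_o = C_uc + C_c
    # Solve instead of forming the inverse explicitly.
    return T @ np.linalg.solve(C_o, T.T)


# Placeholder data (assumptions, not taken from the paper).
C_uc = np.diag([2.0e-6, 2.0e-6, 1.0e-6, 1.0e-6])   # healthy flexibility
C_c = np.diag([4.0e-7, 2.0e-7, 1.0e-7, 3.0e-7])    # extra crack flexibility
T = np.vstack([-np.eye(4), np.eye(4)])             # simplified transfer matrix

K_healthy = cracked_element_stiffness(C_uc, np.zeros((4, 4)), T)
K_cracked = cracked_element_stiffness(C_uc, C_c, T)
print(np.trace(K_cracked) < np.trace(K_healthy))
```

Because the crack only adds flexibility, the cracked element is never stiffer than the healthy one, whatever transfer matrix is used.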

The breathing behavior of a crack should be considered in practical applications. This phenomenon occurs in very heavy rotating shafts, where the crack closes and opens periodically under the shaft weight. A truncated cosine function consisting of four terms is used in this study to simulate this effect [33]. In this article, the crack is located in the element beside the disk; Fig. 4 depicts the cracked system.

Fig. 4 Schematic of the cracked rotor system showing its location

In the current investigation, a constant torsional torque is applied at the location of node 1 (from Fig. 2), but the possible impacts of changing the position of this load have not been investigated separately. The probable effects should be on the breathing behavior of the crack; this has been considered by supposing a varying stiffness matrix during a complete rotation.

2.1.3 Misaligned rotor

In real applications, perfect alignment of two shafts connected by a coupling is rare. Even if perfect alignment is set up initially, it is difficult to maintain over a long period. Foundation settling, thermal expansion and contraction of the foundation, unbalance, wear and tear, and temperature fluctuations caused by friction or the process are all probable causes. There are three types of misalignment: parallel, angular, and combined, which is a combination of the previous two. Moreover, there are two kinds of coupling, i.e., flexible and rigid. Flexible couplings can accommodate a certain amount of misalignment, whereas a rigid coupling behaves like a direct connection between two beam elements. With a rigid coupling, the driven and driving shafts cannot move radially or axially; such couplings cannot be used when shock or high-amplitude vibrations due to other probable faults are expected.

Couplings can also be classified as frictionless or with friction. Although the frictionless type is used in this paper, a friction coupling can protect the system against overload and reduce the amount of parallel misalignment. Misalignment of the connected parts causes reaction forces and moments that act on the bearings. The vibration is caused by the reaction forces that a misaligned coupling imposes on the machine, not by the misalignment itself. For a parallel misalignment, the reaction forces and moments were calculated by Gibbons in [4], and these reactions should be added at the locations of the two bearings. Figure 5 presents a rotor system with parallel misalignment.

Fig. 5 Schematic of the misaligned rotor system showing its measure

2.2 Preparation of the data set

After calculating the system matrices and force vectors and placing them in the corresponding global matrices, the equations of motion of the systems are solved numerically in MATLAB R2022a using the Houbolt method with a time step of 0.001 s [39].
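For readers unfamiliar with the Houbolt scheme, a minimal sketch is given below. It is written in Python rather than MATLAB and is not the code used in this study. The scheme replaces acceleration and velocity at the new step with backward differences over the last three displacements; the start-up procedure (a backward Taylor expansion for the two fictitious displacements before \(t=0\)) and the function name `houbolt` are assumptions made for this illustration.

```python
import numpy as np


def houbolt(M, C, K, force, q0, v0, dt, n_steps):
    """Integrate M q'' + C q' + K q = f(t) with the Houbolt method."""
    M, C, K = (np.atleast_2d(np.asarray(A, dtype=float)) for A in (M, C, K))
    q0 = np.asarray(q0, dtype=float)
    v0 = np.asarray(v0, dtype=float)
    a0 = np.linalg.solve(M, force(0.0) - C @ v0 - K @ q0)
    # Assumed start-up: fictitious displacements at t = -dt and t = -2 dt.
    q_prev1 = q0 - dt * v0 + 0.5 * dt**2 * a0
    q_prev2 = q0 - 2.0 * dt * v0 + 2.0 * dt**2 * a0
    K_eff = 2.0 * M / dt**2 + 11.0 * C / (6.0 * dt) + K
    history = [q_prev2, q_prev1, q0]  # oldest, middle, newest
    out = np.empty((n_steps + 1, q0.size))
    out[0] = q0
    for n in range(1, n_steps + 1):
        oldest, middle, newest = history
        rhs = (force(n * dt)
               + M @ (5.0 * newest - 4.0 * middle + oldest) / dt**2
               + C @ (18.0 * newest - 9.0 * middle + 2.0 * oldest) / (6.0 * dt))
        q_new = np.linalg.solve(K_eff, rhs)
        out[n] = q_new
        history = [middle, newest, q_new]
    return out


# Undamped single-DoF check case with natural frequency 10 rad/s.
q = houbolt([[1.0]], [[0.0]], [[100.0]], lambda t: np.zeros(1),
            q0=[1.0], v0=[0.0], dt=0.001, n_steps=1000)
print(q[-1, 0], np.cos(10.0))
```

The method is implicit and unconditionally stable for linear systems, at the cost of some numerical damping of high-frequency content.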

To have an assessment of the nature of the signals to be examined, the time-domain signals of six different health states, i.e., unbalanced, with parallel misalignment of \(0.2\), \(0.35\), and 0.4 mm, as well as cracked ones with depths of \(0.2r\) and \(0.32r\), are plotted in Fig. 6. It should be noted that r is the radius of the shaft, and the system’s angular velocity is constant, \(\omega = 100\,{\text{rad/s}}\); the physical properties are listed in Table 1.

Fig. 6 Time-domain signals for unbalanced, misaligned, and cracked rotary systems

Table 1 Physical and operational characteristics of the evaluated systems

From the above plots, it can be noted that

  1. The amplitudes of the cracked systems, “C” and “E,” are the same as those of the unbalanced system (“A”) and the first misaligned system (“B”), but lower than those of the second (“D”) and third (“F”) misaligned rotors. Moreover, the plots of the cracked and unbalanced systems have a similar appearance;

  2. As the severity of the parallel misalignment increases, the vibration amplitude rises;

  3. Misaligned systems are distinguishable from the cracked and unbalanced systems because of the second local peak in each rotation;

  4. The unbalanced system, “A,” and the cracked ones, “C” and “E,” have the same phase angle.

Because the FFT will be used in the next sections for the feature extraction process, Fig. 7 presents the above time signals in the frequency domain; for better visibility, the range of the X-axis has been limited.

Fig. 7 Frequency domain signals for unbalanced, misaligned, and cracked rotary systems

As shown in the diagrams of Fig. 7, only the misaligned systems, “B,” “D,” and “F,” contain the second harmonic component (2×), and its amplitude increases with the intensity of the parallel misalignment (m). In the unbalanced and cracked conditions, i.e., “A,” “C,” and “E,” only the first component (1×) is visible. It should be mentioned that the cracked rotors, “C” and “E,” do contain higher-order harmonic components such as 2×, 5×, and 7×, but these are not visible because they are much smaller than the first harmonic component, 1×.
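The harmonic comparison above can be reproduced on synthetic data with the sketch below. The two test signals are purely illustrative stand-ins (a pure 1× sine and a 1× sine with an added 2× term) and are not outputs of the rotor model; the function name `harmonic_amplitudes`, the Hann window, and the small search band around each harmonic are assumptions of this example.

```python
import numpy as np


def harmonic_amplitudes(x, dt, omega, orders=(1, 2, 3), search_bins=2):
    """Estimate the amplitude of the k-x harmonics from a windowed FFT."""
    x = np.asarray(x, dtype=float)
    window = np.hanning(x.size)
    spectrum = 2.0 * np.abs(np.fft.rfft((x - x.mean()) * window)) / window.sum()
    freqs = np.fft.rfftfreq(x.size, d=dt)
    f0 = omega / (2.0 * np.pi)  # running speed in Hz
    amplitudes = {}
    for k in orders:
        centre = int(np.argmin(np.abs(freqs - k * f0)))
        lo = max(centre - search_bins, 0)
        hi = min(centre + search_bins + 1, freqs.size)
        amplitudes[k] = float(spectrum[lo:hi].max())
    return amplitudes


dt, omega = 0.001, 100.0                 # time step (s) and speed (rad/s)
t = np.arange(0.0, 4.5, dt)
only_1x = np.sin(omega * t)                                  # unbalance-like
with_2x = np.sin(omega * t) + 0.3 * np.sin(2.0 * omega * t)  # misalignment-like
print(harmonic_amplitudes(only_1x, dt, omega))
print(harmonic_amplitudes(with_2x, dt, omega))
```

Since the running speed rarely falls exactly on an FFT bin, the estimated amplitudes are slightly lower than the true ones; the window limits leakage from the dominant 1× peak into the 2× band.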

In the following sections, persistence spectrums are used as the input data for classification with CNNs. The persistence spectrum of a signal is a frequency display that shows the percentage of time that a particular frequency is present in the signal. In power–frequency space, it is a histogram. Color is used to reveal latent frequency characteristics in the signal: the hotter the color, the longer the corresponding frequency persists in the signal. Figure 8 shows the persistence spectrums of the signals represented in Fig. 6.

Fig. 8 Persistence spectrums of the three classes (unbalanced, misaligned, and cracked)
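To make the idea of a histogram in power–frequency space concrete, the following sketch builds a simple persistence spectrum from short-time spectra. It is a simplified stand-in, not the routine used to produce Fig. 8: the segment length, overlap, Hann window, number of power bins, and the function name `persistence_spectrum` are all assumptions of this example.

```python
import numpy as np


def persistence_spectrum(x, fs, seg_len=256, overlap=128, n_power_bins=64):
    """Percentage of time each (power, frequency) cell is occupied."""
    x = np.asarray(x, dtype=float)
    step = seg_len - overlap
    n_seg = 1 + (x.size - seg_len) // step
    window = np.hanning(seg_len)
    segments = np.stack([x[i * step:i * step + seg_len] * window
                         for i in range(n_seg)])
    power = np.abs(np.fft.rfft(segments, axis=1)) ** 2 / np.sum(window**2)
    power_db = 10.0 * np.log10(power + 1e-20)   # small floor avoids log(0)
    freqs = np.fft.rfftfreq(seg_len, d=1.0 / fs)
    edges = np.linspace(power_db.min(), power_db.max(), n_power_bins + 1)
    # One histogram over the power axis for every frequency bin.
    counts = np.stack([np.histogram(power_db[:, k], bins=edges)[0]
                       for k in range(freqs.size)], axis=1)
    return freqs, edges, 100.0 * counts / n_seg


fs = 1000.0                                    # 0.001 s sampling interval
t = np.arange(0.0, 4.5, 1.0 / fs)
freqs, edges, hist = persistence_spectrum(np.sin(100.0 * t), fs)
print(hist.shape)
```

Each column of `hist` sums to 100%, so a stationary component appears as a narrow, highly populated cell, while components that vary over time are spread across several power bins.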

Looking at the earlier set of persistence spectrums, graphs of cracked rotor systems, i.e., “C,” and “E,” are completely different from the other two defective systems due to the longer persistence of frequencies approximately between − 160 and − 80 dB in the vertical axis and from 0.2 to 1 in the horizontal axis. As previously stated, higher-order frequency components exist in the FFT diagrams of such systems, and these components are visible in the persistence spectrums.

Furthermore, the persistence spectrums of the misaligned systems, “B,” “D,” and “F,” can be distinguished from that of the unbalanced system, “A,” by the differences in the area where the main peak occurs (the region from 0 to 0.2 on the horizontal axis). These differences are detected automatically by the designed CNNs and used to classify the three classes of unbalanced, misaligned, and cracked rotor systems. As a result, in the steady state, the persistence spectrum can be introduced as reliable input data for CNNs.

Since a data set is needed for classification, the simulated system has been run under different physical and operating conditions for the three fault cases, i.e., unbalanced, cracked, and misaligned systems. For each case, 220 samples are considered. The vibration signals have been captured during steady-state operation over 4.5 s.

Table 2 shows the scope of change of physical and operational specifications of rotating machinery systems that have been used in the preparation of the data set.

Table 2 Physical and operational change-range of the rotor system

2.3 Classification procedure

In the present study, two methods, supervised machine learning and CNNs, are used to classify unbalanced, misaligned, and cracked rotational systems. In the former one, firstly, a feature vector should be prepared, then the network will be trained with the input features and the corresponding desired output classes. On the other hand, in the latter, the persistence spectrums of the three classes are introduced as the input data, and the deep network will extract the features from the images and will allocate these extracted features for each class. Overall, for both methods, a preprocessing stage is needed.

2.3.1 Supervised machine learning

In this method, the data used to train the network must be labeled beforehand. The method has both benefits and drawbacks. Its main advantage over methods such as CNNs is its low computational complexity, because when the number of features is high, unrelated features can be eliminated by applying optimization methods or statistical testing. Finding and calculating proper features, on the other hand, is labor-intensive, and in many cases the right features can only be discovered through trial and error. As a result, introducing and selecting appropriate features play a key role in the accuracy achieved with this method. The feature extraction and selection procedures used in this paper are briefly explained in the following section.

Feature extraction

Different features can be extracted depending on the nature of the signal to be examined. Desirable features can sometimes be extracted from raw signals, but in many cases, preprocessing stages on the signal are required. As the preprocessing and feature extraction steps, four diverse types of features are employed:

  1. Statistical features; in this category, the average, standard deviation, skewness, and kurtosis of each vibration signal (660 signals in total) are computed;

  2. Frequency features; for this category, the amplitudes and the respective frequencies of the peaks are identified, and the power values corresponding to the estimates contained in the persistence spectrum of each signal are summed;

  3. Time-frequency features; each signal is first decomposed up to level 4 using the Daubechies wavelet of order 10 (db10) as the mother wavelet, and then the mean values of the approximation coefficients at level 4, cA-4, and of the detail coefficients at level 1, cD-1, are calculated;

  4. Uncertainty features; for this purpose, the Shannon entropy is applied. Linear signals obey the superposition principle, while nonlinear ones do not. The Shannon entropy is therefore used as a measure of the predictability of the signals.
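Two of the feature groups listed above, the statistical features and the Shannon entropy, can be sketched as follows. The exact definitions used in this work are not spelled out, so this example assumes population moments (kurtosis equal to 3 for a Gaussian signal) and an entropy estimated from an amplitude histogram with a hypothetical number of bins; the function names are illustrative.

```python
import numpy as np


def statistical_features(x):
    """Mean, standard deviation, skewness, and kurtosis of a signal."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    centred = x - mean
    m2 = np.mean(centred**2)
    return {
        "mean": float(mean),
        "std": float(np.sqrt(m2)),                       # population std
        "skewness": float(np.mean(centred**3) / m2**1.5),
        "kurtosis": float(np.mean(centred**4) / m2**2),  # not excess kurtosis
    }


def shannon_entropy(x, n_bins=32):
    """Shannon entropy (bits) of the amplitude histogram of a signal."""
    counts, _ = np.histogram(x, bins=n_bins)
    p = counts[counts > 0] / counts.sum()
    return float(-np.sum(p * np.log2(p)))


t = np.arange(0.0, 4.5, 0.001)
signal = np.sin(100.0 * t)            # illustrative signal, not model output
features = statistical_features(signal)
features["entropy"] = shannon_entropy(signal)
print(features)
```

A more regular signal concentrates its amplitudes in fewer histogram bins and therefore yields a lower entropy than a noisy or irregular one.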

Table 3 reveals the names of the extracted features as well as the number assigned to them.

Table 3 Extracted features

Feature selection

After calculating the described features, the effectiveness of the features should be evaluated by a test, and the unreliable features should be eliminated from the feature vector. Selecting suitable features can increase the accuracy of the classification procedure; on the other hand, choosing a feature that does not make an adequate difference between two various classes can result in low accuracy.

In this paper, the T test is used to identify features that reduce the accuracy of the classification process. The T test is an inferential statistical method that shows whether or not there is a significant difference between two groups. First, a significance level must be chosen; the test then determines which features fall below that level [40]. In this work, the significance level is set to 0.05 [41]. Fig. 9 shows the result of applying the T test to the features extracted in the previous section. The horizontal axis gives the feature number (from Table 3), and the vertical axis gives the probability that the observed difference occurred by chance. Features below the red dashed line passed the test. Features 1, 8, and 9 did not pass and are therefore removed from the feature vector.
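The selection rule can be sketched as follows. This is a hedged illustration rather than the procedure of [40]: it assumes a two-sample Welch T test applied to one pair of classes, and it approximates the p-value with the normal distribution, which is reasonable only for large samples. The names `welch_t_test` and `select_features` are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def welch_t_test(a, b):
    """Welch's t statistic and an approximate two-sided p-value.

    The p-value uses the normal approximation to the t distribution,
    an assumption that holds only for large sample sizes.
    """
    var_a = variance(a) / len(a)
    var_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(var_a + var_b)
    p = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    return t, p


def select_features(class_a, class_b, alpha=0.05):
    """Return the indices of feature columns whose p-value is below alpha."""
    kept = []
    for j in range(len(class_a[0])):
        col_a = [row[j] for row in class_a]
        col_b = [row[j] for row in class_b]
        _, p = welch_t_test(col_a, col_b)
        if p < alpha:
            kept.append(j)
    return kept


# Hypothetical data: feature 0 differs between classes, feature 1 does not.
rows_a = [[0.1 * i, i % 7] for i in range(50)]
rows_b = [[10.0 + 0.1 * i, i % 7] for i in range(50)]
kept_features = select_features(rows_a, rows_b)
```

In this toy example only feature 0 is kept, since feature 1 has identical values in both classes and therefore a p-value of 1.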

Fig. 9 Results of the T test on extracted features

After these three features have been eliminated, the feature vector for each vibration signal is reduced to seven features. The final feature matrix, which contains the selected features for each sample together with its class label, is passed to the following step.

Classification

Now that the feature vector has been calculated for each signal, four classification algorithms with various presets have been run using the Classification Learner App in MATLAB, namely support vector machine (SVM), k-nearest neighbors (KNN), Naive Bayes, and decision trees. Further information on these methods can be found in [42].
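To make the KNN idea concrete, a minimal nearest-neighbor classifier is sketched below. It is not the MATLAB implementation used in this work. The sketch assumes Euclidean distance and a default of k = 1, which is how the "fine" KNN preset is understood here; the function name and the toy data are hypothetical.

```python
import math


def knn_predict(train_x, train_y, query, k=1):
    """Predict the label of `query` by majority vote among its k nearest
    training points, using Euclidean distance."""
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = {}
    for _, label in dists[:k]:
        votes[label] = votes.get(label, 0) + 1
    # On a tied vote, the label of the closest neighbor wins.
    return max(votes, key=votes.get)


# Hypothetical two-feature training set with three classes (0, 1, 2).
toy_x = [[0, 0], [1, 0], [5, 5], [6, 5], [10, 0], [11, 0]]
toy_y = [0, 0, 1, 1, 2, 2]
toy_prediction = knn_predict(toy_x, toy_y, [5.5, 5.2])
```

In practice the features would normally be standardized before distances are computed, since features with large numeric ranges would otherwise dominate the vote.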

Of the 220 samples in each class, 15 percent have been randomly assigned to the test step; to avoid overfitting, fivefold cross-validation has been applied to the remaining samples. As a result, 187 samples per class have been used in the validation phase and 33 in the test phase. Table 4 presents the accuracy achieved by each classification method.
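The sample counts above can be reproduced with a short sketch of a per-class hold-out split followed by fivefold cross-validation. The function names, the fixed random seed, and the interleaved fold assignment are assumptions for illustration; they do not describe how the Classification Learner App partitions the data internally.

```python
import random


def stratified_split(labels, test_fraction=0.15, seed=0):
    """Hold out `test_fraction` of the samples of each class for testing."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(test_fraction * len(idx))
        test_idx += idx[:n_test]
        train_idx += idx[n_test:]
    return train_idx, test_idx


def k_fold(indices, k=5, seed=0):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    idx = list(indices)
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]


# Three classes with 220 samples each, as in the data set described above.
labels = [c for c in (0, 1, 2) for _ in range(220)]
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 561 99
```

Each of the five folds then holds 112 or 113 of the 561 remaining samples for validation, while the 99 held-out samples are never seen during training.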

Table 4 The achieved accuracy of various machine learning procedures

Comparing the classifiers in Table 4, the fine KNN achieved the highest accuracy: 98.2% in the validation step and 98% in the test stage. On the other hand, Naive Bayes with the Gaussian preset showed the worst performance in both the test and validation phases, with 83.8% and 75.9%, respectively. The confusion matrices of the fine KNN classifier are plotted in Fig. 10 for both the validation and test steps.

Fig. 10 Confusion matrices of the training and testing phases for the fine KNN classifier

The indices 0, 1, and 2 in Fig. 10 denote the imbalanced, misaligned, and cracked classes, respectively. The confusion matrix shows that the fine KNN classifier misclassified 10 samples in the validation step: six samples from class 3 were assigned to class 1, while one sample from class 2 and three samples from class 1 were assigned to class 3.

In the test phase, only two of the 99 samples were misclassified: both were assigned to class 1, although they belonged to classes 2 and 3.
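The reported accuracies can be checked against these error counts. The matrices below are reconstructions from the counts stated in the text, assuming 187 validation and 33 test samples per class; they are not copied from Fig. 10, and the helper name is hypothetical.

```python
def accuracy(confusion):
    """Overall accuracy: trace of the confusion matrix over its total."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Rows: true class 1..3; columns: predicted class 1..3 (reconstructed).
validation_cm = [
    [184, 0, 3],   # three class-1 samples assigned to class 3
    [0, 186, 1],   # one class-2 sample assigned to class 3
    [6, 0, 181],   # six class-3 samples assigned to class 1
]
test_cm = [
    [33, 0, 0],
    [1, 32, 0],    # one class-2 sample assigned to class 1
    [1, 0, 32],    # one class-3 sample assigned to class 1
]

print(round(100 * accuracy(validation_cm), 1))  # 98.2
print(round(100 * accuracy(test_cm), 1))        # 98.0
```

The 10 validation errors out of 561 samples and the 2 test errors out of 99 samples are thus consistent with the accuracies of 98.2% and 98% quoted for the fine KNN classifier.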

The two best-performing classifiers, fine KNN and cubic SVM, took 5.04 s and 6.55 s, respectively, to process the 561 samples of the three classes in the training phase, including validation. The workstation used in this investigation has an Intel(R) Xeon(R) Gold 6248R CPU @ 3.00 GHz (2993 MHz, 24 cores, 48 logical processors) and 192 GB of installed physical memory (RAM). Fine KNN therefore not only demonstrated better accuracy but also required less computation time.

2.3.2 Classification using convolutional neural networks (CNNs)

The convolutional neural network (CNN) is a deep learning method that can extract features from images, associate those features with each class, and then classify images into categories. In this study, a multilayer AlexNet has been built, with the image input as the first layer and the classification layer as the last. The architecture of the CNN is shown in Table 5. It consists of 25 layers, and the persistence spectra of the 660 samples serve as the input images. As with the machine learning classifiers, 85% of the data has been allocated to the training phase.

Table 5 The architecture of the employed AlexNet

With the designed CNN, an accuracy of 99.1% has been achieved in the training stage and 99.0% in the test step. The confusion matrices for the training and testing stages are plotted in Fig. 11.

Fig. 11 Confusion matrices of the training and testing phases for the designed CNN

As Fig. 11 shows, the designed AlexNet misclassified only five samples in the training phase and only one sample in the test phase.
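These error counts agree with the reported accuracies, assuming that the 85%/15% split of the 660 images gives 561 training and 99 test samples, as it does for the machine learning classifiers:

```latex
\text{accuracy}_{\text{train}} = \frac{561 - 5}{561} \approx 0.991,
\qquad
\text{accuracy}_{\text{test}} = \frac{99 - 1}{99} \approx 0.990
```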

With a single GPU (NVIDIA Quadro P2000) as the hardware resource, training the CNN took 173 s, whereas training the same network on a single CPU took about 531 s.

3 Conclusions

The accuracy of two different classification approaches in identifying imbalanced, cracked, and misaligned rotor systems has been compared in this study.

In the first step, unbalance, crack, and misalignment have been simulated in the finite element model of a rotor system. For the unbalanced system, a harmonic force caused by an extra mass on the disk has been applied to the shaft element containing the disk. To model the effect of a transverse crack in the driven shaft, the stiffness matrix of the cracked element has been modified, and the breathing behavior of the crack has been simulated by means of a truncated cosine series. Finally, parallel misalignment has been modeled through additional forces on the bearings.

In the second phase, to create a data set, the physical and operational characteristics of the modeled systems have been varied, and the vibration signals have been calculated numerically for steady-state operation. For each class, 220 samples have been generated.

In the feature extraction and selection steps, four different types of features (statistical, frequency domain, time–frequency domain, and uncertainty) have first been extracted. The T test has then been performed, and the features that did not discriminate adequately between the faulted systems (the different classes) have been eliminated from the feature vector. Moreover, for the CNN method, the persistence spectrum of each signal has been plotted and saved.

Two kinds of classification methods have been employed.

In the first approach, supervised machine learning, 85% of the extracted feature vectors and the indices of the three classes (0, 1, and 2 for the unbalanced, misaligned, and cracked systems, respectively) have been used as the training inputs and the output classes, respectively. Four types of classifiers have been utilized: decision trees, Naive Bayes, KNN, and SVM. The remaining 15% of the samples have been allocated to the test phase. The effectiveness of these classifiers is reported in Table 4. Among them, fine KNN showed the best performance, with 98.2% in the validation step and 98% in the test phase.

In the second approach, a 25-layer AlexNet has been designed, and the persistence spectra have been used as the training material. Part of the success of this method relies on providing a well-tailored collection of images as the training data. The main reason for using persistence spectra is that they can reveal the frequency components of the unbalanced, cracked, and misaligned systems with high resolution. This approach has shown an overall accuracy of nearly 99%.

Compared with the machine learning classifiers, a CNN requires more time and computational power, but with the help of a graphics processing unit (GPU), the run time can be significantly reduced.

The discussed process has some limitations. First, the suggested methods are only applicable to the steady-state operation of damaged rotor systems. Further research is required for non-stationary signals, such as start-up and shutdown signals, because the nature of the signals and the characteristics of the faults differ between the stationary and non-stationary cases. Second, a suitable noise reduction technique should be applied before the feature extraction stage, because real-world signals are contaminated by various levels of noise.