1 Introduction

Reliability, safety, and service continuity are prerequisites in modern electrical systems. Occasionally, these systems may shut down completely or partially because an actuator or a power conversion device fails, thereby jeopardizing continuity of operation. In such systems, a diagnostic tool capable of detecting and identifying the fault is indispensable to ensure proper operation.

Permanent Magnet Synchronous Motors (PMSM) find their place in various applications because of their increased reliability and efficiency [1]. They are often deployed in critical systems such as automotive and aerospace.

Because the rotor magnets provide permanent excitation, a short-circuit fault in the stator is sustained for as long as the machine is rotating. Reliable detection with an acceptable confidence margin is therefore required to prevent the fault from propagating rapidly beyond the machine.

To design effective diagnostic tools in advance while avoiding machine damage, precise modeling of the machine, taking into account the severity and the nature of the fault, is required. In this context, two approaches are often considered to model the Inter-Turn Stator Winding Fault (ITSWF). The first couples the electrical circuit with the equivalent magnetic circuit. This model accounts for the real current distribution in the slots and even for the saturation of the magnetic circuit [2]. However, its implementation is relatively complicated and defining its variables is time-consuming, which makes it impractical to simulate the faulty motor together with the diagnostic unit.

Alternatively, the circuit model can be built in different reference frames, such as the stationary reference frame \((\alpha \beta 0)\) or the rotating reference frame (dq0) [3, 4]. This approach also gives easy access to the motor parameters that determine the severity and location of the winding fault. Since a stator-winding fault changes machine output quantities such as the stator current and the torque, these signals are usually used to extract the fault signature. In reference [5], the negative-sequence component of the current is taken as the fault signature to distinguish the healthy state from the faulty state of the machine. However, it does not indicate which winding is affected.

On the other hand, frequency analysis of the current, such as the Fast Fourier Transform (FFT), has been widely used to identify the fault characteristics and even to locate the fault from the three phase currents. However, the FFT is significantly disturbed by load variations and by the asymmetry of the motor windings. Besides, it exhibits low frequency resolution and poor performance for non-stationary signals [6]. Some of these weaknesses can be overcome using Time-Frequency (TF) analysis algorithms such as the empirical mode decomposition or the Wigner-Ville distribution [7, 8].

The first one allows reaching a balance between the suppression of cross-terms and TF resolution loss. In comparison, the second method does not require an adaptive filter to track motor frequencies, thereby making it insensitive to load variations. Both approaches impose lengthy calculations and are subject to complex interpretation.

Another way to detect the fault under study is through machine-learning algorithms, which have emerged to overcome the shortcomings of the fault diagnosis methods mentioned above. A variety of learning methods can be found in the literature, including the sparse auto-encoder [9], the deep belief network (DBN) [10], and sparse filtering [11].

In this paper, a conventional machine-learning method, namely the Support Vector Machine (SVM), is used. The signal characteristics are extracted using Power Spectral Density (PSD) estimation. Two distinct estimators of the fault characteristics are studied. The first is the Welch method, which belongs to the non-parametric analyses: it determines the spectrum of the signal without depending on a system model. Besides, it reduces spectral leakage, and it is known to have a high frequency resolution. The second is a parametric method, namely Burg’s method, which relies on an autoregressive model whose parameters are taken into account for spectrum estimation [12].

To facilitate the maintenance of the machine after fault occurrence and then choose the appropriate fault-tolerant control strategy, it is necessary to determine the fault severity. For this purpose, the current negative sequence is used as the third predictor of the SVM classifier.

The paper is structured as follows. Section 2 briefly describes the dynamic model of the system under an Inter-Turn Stator Winding Fault. Section 3 presents the proposed method and highlights the current pre-processing and the SVM method; it includes a comparative study of the Welch and Burg methods for fault feature extraction. The simulation results of the proposed fault diagnosis method are discussed in Sect. 4. Section 5 gives general conclusions.

2 PMSM’s dynamic model in \((\alpha \beta )\) reference frame with ITSWF

Typically, the Inter-Turn Stator Winding Fault is caused by an insulation failure between adjacent turns of the same phase. The affected winding is divided into two parts, one healthy and the other defective. As depicted in Fig. 2, this fault introduces a resistance in parallel with the defective part of the winding (a2), through which the current \(i_f\) flows. The ratio between the number of short-circuited turns \(N_f\) and the total number of turns \(N_s\) is denoted by \(\mu \).

The following formula can define the equation of the PMSM stator voltage in terms of fault severity and the resistance associated with the defective winding in the (ABC) stationary reference frame:

$$\begin{aligned} {\left[ v_s\left( t\right) \right] }^{abcf}={\left[ R_s\right] }^f{\left[ i_s\right] }^{abcf}+{\left[ L_s\right] }^f\frac{d}{dt}{\left[ i_s\right] }^{abcf}+{\left[ e_s\right] }^{abcf} \end{aligned}$$
(1)

With:

$$\begin{aligned} {\left[ v_s\right] }^{abcf}={\left[ v_a\ \ v_b\ \ v_c\ \ 0\right] }^T,\quad {\left[ i_s\right] }^{abcf}={\left[ i_a\ \ i_b\ \ i_c\ \ i_f\right] }^T,\quad {\left[ e_s\right] }^{abcf}={\left[ e_a\ \ e_b\ \ e_c\ \ e_f\right] }^T \end{aligned}$$

Where:

$$\begin{aligned} {\left[ R_s\right] }^f=\left[ \begin{array}{cccc} R_s & 0 & 0 & \mu R_s \\ 0 & R_s & 0 & 0 \\ 0 & 0 & R_s & 0 \\ -\mu R_s & 0 & 0 & \mu R_s+r_f \end{array} \right] \end{aligned}$$

and

$$\begin{aligned} {\left[ L_s\right] }^f=\left[ \begin{array}{cccc} L_s & 0 & 0 & \mu L_s-M_{a1a2} \\ 0 & L_s & 0 & -M_{a2b} \\ 0 & 0 & L_s & -M_{a2b} \\ -\mu L_s-M_{a1a2}+M_{a2b} & 0 & 0 & \mu L_s \end{array} \right] \end{aligned}$$

The parameters \(M_{a1a2}\) and \(M_{a2b}\) are, respectively, the mutual inductances between the windings (as2, as1) and between the windings (as2, bs) resulting from the new stationary magnetic field. They can be identified using the Finite Element Method (FEM) [13].

For a machine with one slot per pole and per phase, the axes of the sub-coils as1 and as2 are aligned and are not shifted with respect to the coil axis. In this case, one can assume that:

$$\begin{aligned} {\ M}_{a2c}=M_{a2b}. \end{aligned}$$

To complete the machine model, the electromagnetic torque must be included in the state-space model of the system. It can be expressed as follows:

$$\begin{aligned} T_{em}=\frac{i_ae_a+i_be_b+i_ce_c-i_fe_f}{\Omega } \end{aligned}$$
(2)

For the conceptual clarity obtained with a single set of two stator windings, the Clarke transformation is applied to the electrical Eq. (1). The resulting matrix model of the faulty PMSM is shown below:

$$\begin{aligned} \left[ \begin{array}{c} V_{\alpha } \\ V_{\beta } \\ 0 \end{array} \right] =\left[ \begin{array}{ccc} R_s & 0 & -R^{'}_{a2} \\ 0 & R_s & 0 \\ R^{'}_{a2} & 0 & R^{'}_{f} \end{array} \right] \left[ \begin{array}{c} i_{\alpha } \\ i_{\beta } \\ i_f \end{array} \right] +\left[ \begin{array}{ccc} L_s & 0 & M_{f\alpha } \\ 0 & L_s & 0 \\ M_{f\alpha } & 0 & L_{a2} \end{array} \right] \frac{d}{dt}\left[ \begin{array}{c} i_{\alpha } \\ i_{\beta } \\ i_f \end{array} \right] +\left[ \begin{array}{c} e_{\alpha } \\ e_{\beta } \\ -e_f \end{array} \right] \end{aligned}$$

Where:

$$\begin{aligned} R^{'}_{a2}=\sqrt{\frac{2}{3}}R_{a2},\quad R^{'}_{f}=R_{a2}+r_f \end{aligned}$$

and

$$\begin{aligned} M_{f\alpha }=-\sqrt{\frac{2}{3}}\left( L_{a2}+M_{a1a2}+M_{a2b}\right) . \end{aligned}$$
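As a side illustration of the Clarke transformation step, the sketch below maps three phase quantities to the \((\alpha \beta 0)\) frame. It assumes the power-invariant form of the transformation, which is suggested by the \(\sqrt{2/3}\) factors above but is not stated explicitly in the text; the matrix `T`, the function name `clarke`, and the test angle are illustrative choices.

```python
import numpy as np

# Power-invariant Clarke matrix (assumed form; orthogonal, so T @ T.T = I).
T = np.sqrt(2 / 3) * np.array([
    [1.0, -0.5, -0.5],
    [0.0, np.sqrt(3) / 2, -np.sqrt(3) / 2],
    [1 / np.sqrt(2), 1 / np.sqrt(2), 1 / np.sqrt(2)],
])

def clarke(i_abc):
    """Map phase quantities [ia, ib, ic] to [i_alpha, i_beta, i_0]."""
    return T @ np.asarray(i_abc, dtype=float)

theta = 0.3  # arbitrary electrical angle in rad (illustrative)
# Balanced unit-amplitude currents, phases shifted by 120 degrees.
i_abc = np.cos(theta - np.array([0, 2 * np.pi / 3, 4 * np.pi / 3]))
i_alpha, i_beta, i_0 = clarke(i_abc)
```

For a balanced set the zero-sequence component vanishes and the two remaining components form a rotating vector of amplitude \(\sqrt{3/2}\), which is why only two stator windings are needed in the reduced model.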

Based on this reduced model, the electromagnetic torque can be defined in the \((\alpha \beta )\) reference frame by:

$$\begin{aligned} T_{em}=\frac{i_{\alpha }e_{\alpha }+i_{\beta }e_{\beta }-i_fe_f}{\Omega } \end{aligned}$$
(3)

This model is implemented in the MATLAB/Simulink environment to analyze the behavior of the machine in the presence of the fault. The values of the mutual inductances are given in Table 4. The simulation is carried out under normal and faulty operating conditions. Figure 1a shows the three phase currents when the machine runs in healthy mode, while Fig. 1b shows the stator currents when the fault occurs in phase A.

Fig. 1 Motor output quantities under normal and faulty operating conditions: a stator currents in healthy mode, b stator currents under \(r_f=10\,\Omega \), \(\mu =50\%\), c the electromagnetic torque with \(r_f=1\,\Omega \) and \(\mu =30\%\)

Because of the winding asymmetry, an unbalance appears in the three phase currents, leading to harmonics that grow with fault severity and with decreasing fault resistance. These harmonics are particularly visible in the phase currents, as depicted in Fig. 1b.

From these results, the fault severity can be divided into four regions. In the first one, the system still operates around nominal conditions with small torque oscillations. In the remaining regions, however, the fault current \(i_f\) arises and induces a braking torque that opposes the electromagnetic torque, as depicted in Fig. 1c. In these circumstances, a fault detector becomes indispensable.

3 Fault diagnosis method

The fault diagnosis approach is presented together with the entire system in Fig. 2. The main advantage of this method is that it distinguishes different regions of system performance. The boundary between distinct operating regions is established by calculating the negative sequence of the current, which is expressed as follows:

$$\begin{aligned} i_-=\frac{1}{3}(i_a+a^2.i_b+a.i_c) \end{aligned}$$
(4)

With: \(a=e^{j2\pi /3}.\)
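Equation (4) can be checked numerically on phasors. The sketch below is only an illustration: the balanced set and the 20% amplitude unbalance on phase a are hypothetical values, not data from the simulated machine.

```python
import numpy as np

# Fortescue operator a = exp(j*2*pi/3), as defined in the text.
a = np.exp(2j * np.pi / 3)

def negative_sequence(ia, ib, ic):
    """Negative-sequence phasor, Eq. (4): i_- = (ia + a^2*ib + a*ic) / 3."""
    return (ia + a**2 * ib + a * ic) / 3

# Balanced positive-sequence set (unit amplitude): ib lags ia by 120 degrees.
ia_h, ib_h, ic_h = 1.0 + 0j, a**2, a
i_neg_healthy = negative_sequence(ia_h, ib_h, ic_h)

# Hypothetical unbalance: phase-a amplitude raised by 20 %.
i_neg_faulty = negative_sequence(1.2 * ia_h, ib_h, ic_h)
```

The balanced set gives a negative sequence of zero, while the unbalanced set gives a magnitude of 0.2/3, which is the property that makes this quantity usable as a severity indicator.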

The current negative sequence and the pre-processed stator current data are the inputs to the SVM, which aims to classify the fault according to its severity. More explicitly, the SVM builds a model that assigns each new sample to one of the classes.

Typically, classification is not efficient without pre-processing the input signal. This step makes the classifier easier to train by retaining the data most relevant for accurate and consistent classification results [14]. Among the many available signal pre-processing techniques, PSD estimation has been chosen. The size of the input data is then considerably reduced by a peak-finding function (Findpeaks), which returns a vector containing the peaks of the current PSD. Another standard solution for data reduction is Principal Component Analysis (PCA) [15].

After all of the above steps, the classifier outputs a binary vector of the form [x y]. The output [00] corresponds to the normal state, while [01], [10], and [11] correspond to zone 1, zone 2, and zone 3, respectively.

Fig. 2 Inter-Turn Stator Winding Fault diagnosis methodology

3.1 Current pre-processing

To extract the ITSWF characteristics, we calculated the PSD of the three stator currents. Two separate PSD estimators have been employed, namely the Welch method and the autoregressive method, where the Burg technique is applied to estimate the autoregressive parameters. For more details on the theory behind the Welch method and the Burg autoregressive method, see [16].

In this section, these techniques are compared using simulation in terms of their frequency resolution and their ability to track the fault frequency. The current was sampled at 4 kHz, and for each condition we recorded 131,072 current samples. The results reported here refer to fault severities of 0% and 50% and to fault resistances of 0, 1, and 10 \(\Omega \). For the Welch method, the current segments can be weighted with different window types, such as the Bartlett window. Nevertheless, the latter gave poor results in detecting the magnitude at the fault frequency. For this reason, the Hanning window was chosen for windowing the segments.
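A Welch estimate with a Hanning window can be sketched as follows using SciPy. The sampling frequency and record length are those stated above; the test signal, the segment length `nperseg`, and the 50% overlap are assumptions made for illustration, since the text does not specify them.

```python
import numpy as np
from scipy.signal import welch

fs = 4000.0          # sampling frequency in Hz, as stated in the text
n = 131072           # number of samples per condition, as stated in the text
t = np.arange(n) / fs

# Synthetic stand-in for a phase current: 50 Hz fundamental plus a small
# 150 Hz component (frequencies and amplitudes are illustrative only).
rng = np.random.default_rng(0)
i_a = np.sin(2 * np.pi * 50 * t) + 0.05 * np.sin(2 * np.pi * 150 * t)
i_a += 0.01 * rng.standard_normal(n)

# Welch estimate with Hann-windowed, 50 %-overlapping segments.
# nperseg is an assumed value; it sets the frequency resolution fs / nperseg.
f, pxx = welch(i_a, fs=fs, window="hann", nperseg=4096, noverlap=2048)

f_peak = f[np.argmax(pxx)]   # frequency of the dominant spectral component
```

Shorter segments give a smoother estimate at the cost of frequency resolution, so the segment length is the main parameter to tune when tracking closely spaced fault components.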

Fig. 3 PSD estimation of the current based on the Welch method under different operating conditions

With the Welch method, the most important fault components are recognizable in the results obtained. Even for a low fault severity, a peak of sufficient amplitude appears to identify the fault. Because of the winding asymmetry, the PSD displays a shift of the fault frequencies, which becomes more pronounced with fault severity.

Fig. 4 PSD estimation of the current based on the Burg method under different operating conditions

For the same conditions as for Welch, the PSD of the current was plotted using the Burg method. Using the Akaike criterion, the model order is set to 200 so that all fault features can be correctly estimated.

It was observed that it is nearly impossible to properly quantify the fault frequencies when the model order is lower than, or greatly exceeds, 200. In the first case, only the fundamental component is recognized, whereas if the model order is higher than 200, additional components appear in the frequency spectrum that do not necessarily reflect the fault.
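To make the role of the model order concrete, the following is a minimal NumPy sketch of Burg's recursion and the resulting autoregressive PSD. It is not the authors' implementation: the function names `burg_ar` and `burg_psd`, the test signal, and the low order used here (10 rather than 200) are assumptions chosen to keep the example small.

```python
import numpy as np

def burg_ar(x, order):
    """Estimate AR coefficients a = [1, a1, ..., ap] and the residual
    power with Burg's recursion (forward/backward error minimisation)."""
    x = np.asarray(x, dtype=float)
    f = x[1:].copy()    # forward prediction errors
    b = x[:-1].copy()   # backward prediction errors (delayed by one sample)
    a = np.array([1.0])
    e = np.dot(x, x) / len(x)
    for _ in range(order):
        # Reflection coefficient minimising forward + backward error power.
        k = -2.0 * np.dot(f, b) / (np.dot(f, f) + np.dot(b, b))
        # Levinson-type update of the AR polynomial.
        a = np.append(a, 0.0) + k * np.append(0.0, a[::-1])
        f, b = (f + k * b)[1:], (b + k * f)[:-1]
        e *= 1.0 - k * k
    return a, e

def burg_psd(x, order, fs, nfft=4096):
    """AR spectrum e / (fs * |A(f)|^2) on a one-sided frequency grid
    (two-sided density scaling)."""
    a, e = burg_ar(x, order)
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    return freqs, e / (fs * np.abs(np.fft.rfft(a, nfft)) ** 2)

# Illustrative signal: a 100 Hz tone in white noise, sampled at 1 kHz.
fs = 1000.0
t = np.arange(2000) / fs
rng = np.random.default_rng(1)
x = np.sin(2 * np.pi * 100 * t) + 0.1 * rng.standard_normal(t.size)

a, e = burg_ar(x, order=10)
freqs, pxx = burg_psd(x, order=10, fs=fs)
f_peak = freqs[np.argmax(pxx)]
```

The order fixes the number of poles available to represent spectral peaks, which is consistent with the observation that too low an order resolves only the fundamental while too high an order can introduce extra peaks.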

By analyzing Figs. 3 and 4, it can be deduced that both methods can separate the three regions of system performance mentioned previously. Each region has its own feature signature. Furthermore, nearly all the fault frequency components are more evident with the Welch method, except the second one. The peak at this frequency is precisely identified with the Burg method when the machine runs under the condition \(\mu =0.5\) and \(r_f=10\,\Omega \).

3.2 Fault classification based on the Support Vector Machine

Once the fault features are extracted from the current signal, the classifier is used to distinguish the different regions of system performance. This section explains the selected classifier, the Support Vector Machine (SVM) algorithm.

The SVM is a linear non-probabilistic classification method. It was initially introduced to deal with binary classes and then extended to multi-class problems. With the SVM, the data points of one class are separated from those of the other class by finding the optimal hyperplane, i.e., the one with the largest margin between the two classes. The hyperplane can be expressed as follows:

$$\begin{aligned} d\left( x,w,b\right) =w^Tx+b=\sum ^l_{i=0}{w_ix_i}+b \end{aligned}$$
(5)

Where: w is the vector of weights, x is an input vector, and b represents bias.

When the training data are linearly separable, two parallel hyperplanes that bound the two data categories are determined so that the distance between them is as large as possible. These hyperplanes can be expressed by the following equations:

$$\begin{aligned}&w^Tx+b=1 \quad \text {for the class labeled } 1 \\&w^Tx+b=-1 \quad \text {for the class labeled } -1 \end{aligned}$$

Geometrically, the distance between these two hyperplanes is \(\frac{2}{\left\| w\right\| }\). To maximize this distance, it is necessary to minimize \(\left\| w\right\| \). In addition, data points must be prevented from falling into the margin, which translates into the following constraint:

$$\begin{aligned} y_i\left( w^Tx_i+b\right) \ge 1\quad \text {for all } 1\le i\le l \end{aligned}$$

With: \(y_i=\pm 1.\)

Together, these constraints yield the optimization problem:

Min \({\left\| w\right\| }^2\) subject to \(y_i\left( w^Tx_i+b\right) \ge 1\) for all \(1\le i\le l\)

Classically, this quadratic optimization problem with inequality constraints is solved by finding the saddle point of the Lagrange functional [17]:

$$\begin{aligned} L(w,b,\alpha )=\frac{1}{2}w^Tw-\sum ^l_{i=1}{\alpha }_i\left\{ y_i\left[ w^Tx_i+b\right] -1\right\} , \end{aligned}$$
(6)

where the \({\alpha }_i\) are Lagrange multipliers. The solutions \({\alpha }_{0i}\) of the dual optimization problem determine the parameters \(w_0\) and \(b_0\) of the optimal hyperplane as follows:

$$\begin{aligned}&w_0=\sum ^l_{i=1}{{\alpha }_{0i}y_ix_i} \end{aligned}$$
(7)
$$\begin{aligned}&b_0=\frac{1}{N_{sv}}\ \sum ^{N_{sv}}_{s=1}{(y_s-x^T_sw_0)}=\ \frac{1}{N_{sv}}\ \sum ^{N_{sv}}_{s=1}{(\frac{1}{y_s}-x^T_sw_0)} \end{aligned}$$
(8)

\(N_{sv}\) denotes the number of support vectors.

Once the parameters \(w_0\) and \(b_0\) are determined, the final decision hyperplane and the indicator function \(i_F\) are obtained as given below:

$$\begin{aligned} d\left( x\right) =\sum ^l_{i=0}{w_{0i}x_i}+b_0 \end{aligned}$$
(9)
$$\begin{aligned} i_F=\mathrm {sgn}\left( d\left( x\right) \right) \end{aligned}$$
(10)
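Equations (7) to (10) can be followed on a toy problem small enough to solve by hand. In the sketch below, the two training points and the multipliers (both equal to 1/4, obtained analytically for this particular data set) are illustrative and unrelated to the motor data.

```python
import numpy as np

# Two training points, one per class (illustrative data, not from the paper).
X = np.array([[1.0, 1.0], [-1.0, -1.0]])
y = np.array([1.0, -1.0])

# Dual solution for this toy problem, derived by hand: both points are
# support vectors with alpha = 1/4.
alpha = np.array([0.25, 0.25])

w0 = (alpha * y) @ X                 # Eq. (7)
b0 = np.mean(y - X @ w0)             # Eq. (8), averaged over support vectors
margin = 2.0 / np.linalg.norm(w0)    # distance between the two hyperplanes

def indicator(x):
    """Indicator function i_F = sgn(d(x)), Eqs. (9) and (10)."""
    return np.sign(np.asarray(x, dtype=float) @ w0 + b0)
```

Here the weights come out as (0.5, 0.5) with zero bias, both constraints are active, and the margin equals the distance between the two points, as expected for a hard-margin solution.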

Separating more than two classes involves reducing the multi-class problem to a set of binary classification problems. The one-versus-rest and the one-versus-one schemes are two ways to make this reduction [17].
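The one-versus-one reduction can be sketched as a majority vote over pairwise classifiers. The class labels below are those used in this paper, but the pairwise decision rules are hypothetical stand-ins (nearest centre on a scalar feature) for trained binary SVMs.

```python
from itertools import combinations
from collections import Counter

labels = ["00", "01", "10", "11"]            # class labels used in the text
pairs = list(combinations(labels, 2))        # one binary problem per pair

def ovo_predict(x, binary_classifiers):
    """Majority vote over pairwise classifiers.

    binary_classifiers maps each (label_a, label_b) pair to a function
    returning the winning label for sample x.
    """
    votes = Counter(clf(x) for clf in binary_classifiers.values())
    return votes.most_common(1)[0][0]

# Hypothetical pairwise rules on a scalar feature, for illustration only:
# each class is centred on an integer, and the nearer centre wins.
centres = {"00": 0.0, "01": 1.0, "10": 2.0, "11": 3.0}
classifiers = {
    (p, q): (lambda x, p=p, q=q:
             p if abs(x - centres[p]) <= abs(x - centres[q]) else q)
    for p, q in pairs
}
```

With four classes, this scheme requires six binary classifiers, compared with four for one-versus-rest.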

4 Results and discussion

To verify the efficiency of the proposed fault diagnosis method and to choose the most appropriate PSD technique for the feature extraction process, the classification model is built with the SVM. The one-vs-one multi-class SVM with a linear kernel function is employed in the current work. The classifier input data set is first prepared by estimating the PSD of the current with both the Welch and the Burg estimators. In parallel, the negative sequence of the current is calculated under various scenarios, in normal and faulty operating conditions. Four labels have been assigned, one to each operating state, as shown in Table 1.

Table 1 The label corresponding to each faulty operating condition

Before the learning phase, the stator current signals are analyzed under the following conditions: the measurement time is 0.6 s and the sampling frequency f is 10 kHz. As a result, the number of samples is 6000. Model training requires 182 KB, which is comparable to a previous research work dealing with mechanical faults of the induction machine. A low-pass filter is used to cut off the fundamental frequency at 50 Hz because some fault characteristics have a small amplitude and could be masked by the fundamental component of the current.

To limit the amount of SVM training data while identifying the amplitude and location of the spectral peaks, the Findpeaks function from the signal-processing toolbox has been used.

Using this function and setting the minimum distance between each peak at 100 Hz, the data is reduced from 4100 samples to 250 main elements.
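The peak-based data reduction can be sketched in Python with SciPy's `find_peaks`, used here as a stand-in for the Findpeaks function mentioned above. The synthetic PSD, the frequency step `df`, and the peak positions are assumptions for illustration; only the 100 Hz minimum spacing comes from the text.

```python
import numpy as np
from scipy.signal import find_peaks

# Synthetic PSD on a uniform frequency grid (illustrative values only).
df = 2.0                                   # frequency step in Hz (assumed)
f = np.arange(0.0, 5000.0 + df, df)
psd = np.full_like(f, 1e-6)
for f0, amp in [(150.0, 1.0), (180.0, 0.3), (450.0, 0.5), (1250.0, 0.2)]:
    psd += amp * np.exp(-0.5 * ((f - f0) / 4.0) ** 2)

# find_peaks expects the minimum distance in samples, so convert from Hz.
min_distance_hz = 100.0
distance = int(round(min_distance_hz / df))
idx, _ = find_peaks(psd, distance=distance)

peak_freqs = f[idx]    # locations of the retained spectral peaks
peak_vals = psd[idx]   # corresponding amplitudes
```

The smaller peak at 180 Hz is discarded because it lies within 100 Hz of the larger one at 150 Hz, which illustrates how the minimum-distance setting controls the size of the feature vector.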

Nearly 250 data points are collected from different operating conditions (healthy and defective) under various fault severities and fault resistance values. One hundred fifty data points are used for training and the rest for testing.

Two analyses are performed in this study. The first aims to evaluate the accuracy of the ITSWF diagnostic method using all the data, while the second compares the impact of the PSD method on the classification results.

Analysis 1

According to the parallel coordinate plot, it is evident that the negative sequence predictor separates the classes from each other, as shown in Fig. 5. These results have been obtained under static conditions, which means that the speed and the load torque are considered constant.

It is also clearly visible that in the peak-frequency intervals [1 kHz, 1.5 kHz] and [2.6 kHz, 3.6 kHz], the data can be entirely separated using the SVM with a linear kernel. It is therefore conceivable to dispense with the negative sequence when predicting the fault severity in these intervals. However, it is still necessary to rely on the low-frequency peaks to avoid the effect of noise caused by external conditions.

Fig. 5 The parallel coordinate plot of the standard deviation between different predictors

To assess the effectiveness of the learner, we calculated the classification accuracy as the ratio between the number of correctly classified samples and the total number of classified samples under dynamic conditions. According to the confusion matrix, the 5-fold classification error is 1.5%, which means that all the remaining records were classified correctly.
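The accuracy measure described above, together with the positive predictive value used later, can be computed from a confusion matrix as sketched below. The matrix is hypothetical and does not reproduce the results of this study.

```python
import numpy as np

# Hypothetical 4-class confusion matrix (rows: true class, columns: predicted
# class, ordered "00", "01", "10", "11"). The counts are illustrative only.
cm = np.array([
    [25, 0, 0, 0],
    [0, 24, 1, 0],
    [0, 0, 25, 0],
    [0, 0, 1, 24],
])

accuracy = np.trace(cm) / cm.sum()      # correctly classified / total
error_rate = 1.0 - accuracy
ppv = np.diag(cm) / cm.sum(axis=0)      # positive predictive value per class
```

For this example the accuracy is 0.98, and the positive predictive value of the normal class is 1 because no faulty sample is predicted as normal.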

Analysis 2

In this analysis, we restricted the frequency interval to [2640 Hz, 3650 Hz] using a bandpass filter. As shown in Fig. 6, the minimum distance between peaks is set to 50 Hz so that the dataset is sufficiently large. Moreover, to compare the performance of the AR-based Burg and Welch methods, we eliminate the current negative sequence and the data corresponding to zone 2 in the training phase.

Fig. 6 Classification of the PSD of the stator current: a AR-based Burg, b Welch

Using the Welch method, the fivefold classification error is 2.5%. One of the defective samples belonging to the “10” class is misclassified as belonging to the “11” class. On the other hand, the samples corresponding to the normal state are entirely separated, since the positive predictive value is equal to 100%, as shown in Table 2.

Table 2 Confusion matrix for the testing dataset based on the Welch method

By contrast, when using the AR-model-based Burg method, the misclassification rate increases to 7.3%. According to Table 3, 14.3% of the class ‘10’ samples were misclassified as ‘11’, and 8.3% of the class ‘11’ samples were misclassified as ‘10’. We conclude from these results that the classifier benefits from a careful selection of the feature extraction method. The Welch analysis gives better results in extracting fault features in the high-frequency range.

Table 3 Confusion matrix for the testing dataset based on the Burg estimator

Since the extraction of the fault features is done by estimating the power spectral density, the accuracy of the classification is directly related to the sampling frequency, especially with the Burg estimator. We note that a low sampling frequency makes the identification of fault frequencies impossible. In addition, a relatively high sampling frequency may result in components that do not reflect the fault feature.

By adding the negative-sequence component of the current as an SVM predictor alongside the other predictors, the simulation results show that the different performance regions of the system can be easily identified and classified with an accuracy of 98.5%. This result is obtained when the machine is subjected to dynamic conditions. It exceeds the classification rates obtained with two fault detectors, namely the SVM-PSO and the typical HCNN-softmax [18], which reach accuracy rates of 94.23% and 96.57%, respectively. However, the HCNN-SVM detector achieves more accurate results, with a classification rate of 99.89%.

5 Conclusions

In this paper, a new method for detecting an Inter-Turn Stator Winding Fault in a Permanent Magnet Synchronous Motor is considered. This technique uses a Power Spectral Density estimator to extract the fault characteristics from the three-phase stator currents of the motor and then exploits these characteristics to classify the normal and faulty states of the system. It has the advantage of separating the different performance regions according to the severity of the fault being studied.

In addition to presenting the fault diagnosis methodology, this study highlighted the difference between the autoregressive estimator based on Burg and the Welch method in extracting fault characteristics. Simulation results showed that the Welch method combined with a linear SVM offers better performance than the Burg method for ITSWF detection.

It should be noted that the autoregressive method based on the Burg parameter estimator also allows adequate monitoring of the frequencies characterizing the fault and therefore detects the ITSWF. However, it depends on the model order, so a wrong choice of the model order would lead to erroneous results. As a perspective, this method could be applied to diagnose various faults in the PM synchronous motor powertrains of electric vehicles, in which the motor plays a central role.