Adaptive feature extraction and fault diagnosis for three-phase inverter based on hybrid-CNN models under variable operating conditions

The increasing reliability and availability requirements of power electronic systems have drawn great concern in many industrial applications. Aiming at the difficulty in fault characteristics extraction and fault modes classification of the three-phase full-bridge inverter (TFI) that used as the drive module of brushless DC motor (BLDCM). A hybrid convolutional neural network (HCNN) model consists of one-dimensional CNN (1D-CNN) and two-dimensional CNN (2D-CNN) is proposed in this paper, which can tap more effective spatial feature for TFI fault diagnosis. The frequency spectrum from the three-phase current signal preprocess are applied as the input for 1D-CNN and 2D-CNN to conduct feature extraction, respectively. Then, the feature layers information are combined in the fully connected layer of HCNN. Finally, the performance status of TFI could be identified by the softmax classifier with Adam optimizer. Several groups of experiments have been studied when the BLDCM under different operating conditions. The results show that the fusion features can get a higher degree of discrimination so as to the presented network model also obtains better classification accuracy, which verify the feasibility and superiority to the other networks.


Introduction
Power electronic converters (PECs) are widely used in electric vehicles (EVs), smart grids, renewable energy generation, aerospace and other fields.Considerable manufacturers have been getting a growing awareness of the protection efficiency and maintenance costs of PECs.The reliability and availability are essential to these industrial applications [11,13,25,28], as a fault not only may cause unscheduled shutdown but also even lead to a disastrous accident.Thus, prognostics and health management (PHM) received great attention and promotion in numerous fields [8,10,23,35], which is the development of condition-based maintenance (CBM) that can forecast the occurrence moment of faults and greatly improve the operation and maintenance efficiency.
As the development of new energy electric vehicles, the brushless DC motor (BLDCM) system gets more and more attention, because its reliability relates to the whole vehicles' performance and safety.The brushless DC motor (BLDCM) drive system, as a typical PEC, the faults of that mainly come from the vulnerable components such as power MOS-FET, capacitor, connector, etc.As the core part of the drive system, the power MOSFET is more prone to failure due to its frequent on-off movements and the influence of thermal and electrical overstress.If the failure of the power MOS-FET cannot be detected effectively, it will get an adverse effect on the motor drive system and result in immeasurable economic losses [4,16,20].
The fault types of power MOSFET can be divided into open circuit fault (OCF) and short circuit fault (SCF).Usually, the SCF occurs rapidly, the hardware circuit equipped with the corresponding protection measures, such as connecting a fast fuse in series in the drive circuit, which can convert the SCF into OCF.However, the OCF of the inverter is not easy to be detected, because the circuit can still operate normally over a period of time.If the OCF fault is not fixed in time, it will result in a secondary fault to the power MOSFET even the whole power electronic circuits, owing to the current through other power MOSFETs and components may increase the electrical stress greatly.At the same time, the harmonic interference will also damage the devices under the working mode with fault [34,36].Therefore, the technology of fault diagnosis for power electronic systems can guarantee the reliability and availability which make it more intelligent and safety [33].

Related work
The fault diagnosis method of the inverter is shown in Fig. 1, which is mainly divided into model-based method and datadriven method.The method based on analytical model usually needs to understand the dynamic properties and operation mechanism of the system, then establish an accurate mathematical model.According to the detection of input and output signals, the residual signal is formed, the signal is further analyzed to realize fault diagnosis.Hu et al. [17] used four observers corresponding to four types of opencircuit faults of switch tubes to detect and locate faults.No additional sensors are needed, which is a single-phase PWM rectifier switch tube open-circuit fault diagnosis method based on switch system theory.For the open circuit, fault diagnosis of voltage source inverter (VSI), Chen et al. [9] proposed an online fault diagnosis method based on current observer.By analyzing the state space model of VSI when fault occurs, a state observer is constructed to estimate the current.However, methods based on analytical models rely too much on precise mathematical models.Once the structure of the research object changes, it needs to be remodeled.
The method based on data-driven does not need to know the exact analytical model of the system, which directly analyzes and processes the measured data.It has been widely used in the field of fault diagnosis.Considering that the topology of the inverter is extremely symmetrical and the failure mode will destroy the symmetry of the three-phase current, Hang et al. [6]  With the continuous development of artificial intelligence, more and more machine learning methods are applied to the field of fault diagnosis, as shown in Table 1.CNN is widely used in speech recognition, image processing and image recognition because of its powerful feature extraction ability.CNN does not need to know the specific features of the input, which achieves the encapsulation of feature extraction through learning.The advantages of CNN can solve the problem of low fault diagnosis rate due to the lack of obvious features.Watanabe et al. [2] used a real-time monitoring scanning acoustic microscope to capture images during a  [26], but also obtain the number of images per block that will enter in the training phase [27].The optimized CNN is applied in the field of recognition and image classification.Li et al. [31] proposed an infinite convolutional neural network (FAC-CNN) for adaptive fusion of multi-source data.Multi-source data are captured by an adaptive convolution kernel matching the number of data channels, meanwhile, an irregular convolution kernel is introduced to expand the field of view of FAC-CNN.Zare et al. [32] considered changes in each measurement variable and identified subsystem failures, they used a multi-channel convolutional neural network with multiple parallel local heads.The image composed of the time domain signal obtained from the wind turbine is input into the convolutional neural network.A majority of machinelearning networks are performed in an ideal experimental environment.In the face of changing working conditions, the single machine neural network has problems such as low fault diagnosis accuracy and poor generalization ability.
To solve these problems, a convolutional neural network based on fault feature fusion is proposed, which focuses on mining the deep information of the original fault data to avoid complex feature engineering.After preprocessing, the one-dimensional structure features of the fault samples are extracted by 1D-CNN and the two-dimensional spatial features are extracted by 2D-CNN.The deeper feature information of fault data is generated by fusion in the full connection layer.It can not only avoid the one-dimensional network to get smaller receptive field, but also prevent the two-dimensional network from destroying the original data space structure.The main contributions of this paper can be summarized as follows: 1.A hybrid convolutional neural network structure is proposed, which can adaptively learn features.2. Through PCA visual analysis, the feature extraction effect of HCNN is significantly better than other networks.3. A new index is applied to evaluate the effect of feature extraction.4. HCNN can be competent for fault diagnosis under different working conditions.
The rest of this paper is organized as follows: "Theory of HCNN" introduces the theory of CNN; "Fault diagnosis model based on HCNN" introduces the process of HCNN fault diagnosis.In "Experiment study and analysis", the effectiveness of HCNN is verified by experiments.Finally, conclusions are drawn in "Conclusion".

Convolutional neural network
The traditional convolutional neural network is a typical feedforward neural network, its powerful feature extraction ability makes it widely used in image recognition, speech detection and other fields.Since Lecun et al. proposed LeNet-5 [38], as shown in Fig. 2, the basic structure of the convolutional neural network has been basically determined.The typical convolutional neural network is usually  composed of an input layer, convolutional layer, pooling layer and a fully connected layer.The function of the convolutional layer is to adaptively extract the fault characteristics of the input data while eliminating noise.The appropriate convolution kernel is selected to process the input data in turn.Different convolution kernels represent different features.Usually, the convolutional neural network increases the feature extraction ability of the model by adding different convolution kernels, then obtains multi-layer data with fault features.The convolution formula [19] is: where M j is the dimension of the input data, l is the number of layers of the network, x i l − 1 is the part of input data to be convolved, w1 ij is the weight matrix, b1 j is the bias matrix, f(•) is the activation function [15], the activation function selected by this model is the ReLU function, which mathematical expression is: The pooling layer is also called the downsampling layer.The pooling layer is commonly used as the maximum pooling function and the mean pooling function, the convolution kernel slides through the target data in turn and extracts the maximum or average value of the corresponding area.Therefore, the role of the pooling layer is to extract the main features of the data output from the convolutional layer while reducing its dimensions.This model chooses the maximum pooling function.Its expression is: where βl j is the weight matrix, bl j is the bias matrix, down (•) is the downsampling function. (1) The fully connected layer combines the fault features extracted by the convolutional layer and the pooling layer.The fully connected layer is connected to the last pooling layer, finally, obtains the prediction result by softmax classifier.The expression of the fully connected layer is: where k is the network layer number, y k is the output of the fully connected layer, w k is the weight coefficient, x k−1 is a one-dimensional feature vector, b k is the bias coefficient, f(•) is softmax activation function [14], its expression is: represents the probability value of the ith sample belonging to the jth category.

HCNN model
Traditional CNN has been widely used in image recognition, natural language processing and other fields [5,22,30].Based on its powerful feature extraction capabilities, the improved CNN is applied to TFI fault diagnosis in this paper.Since the fault signal is a one-dimensional spectrum signal, the network structure can choose 1D-CNN.However, under the same convolution kernel, 1D-CNN can obtain fewer receptive fields which has the disadvantage of insufficient data and it is prone to overfitting during the training process.The fault data signal can also be converted into a matrix form as the input of 2D-CNN.However, this method may destroy the spatial correlation in the original sequence, resulting in the loss of fault information, and cannot maximize the advantages of deep learning to automatically learn the characteristics of the original signal [35].Therefore, HCNN is proposed in this paper, its network structure is shown in Fig. 3. (4) , As can be seen from Fig. 3, the data processed by FFT are used as the input of 1D-CNN and 2D-CNN in one-dimensional and two-dimensional forms, respectively.Both networks are alternately connected by two convolutional layers and pooling layers to achieve feature extraction of the fault sample data, the fault feature vectors extracted from the two networks are fused in the full connection layer to generate the fault feature vectors of HCNN.Finally, the softmax classifier is used to complete the fault mode recognition.In the learning process of HCNN, the cross-entropy loss function is selected as the training objective loss function fused with fault features, its mathematical expression is shown in: where I(⋅) is indicator function, (x i , y i ) is the training sam- ple of the model, m is the number of training samples, n is the total number of fault categories.Calculating the partial derivative of formula ( 6), the weight parameters w and b of CNN can be updated layer by layer.
where w ′ and b ′ is the updated weight and bias, w and b is the current weight and bias.η ∈ (0, 1) is the learning rate.The goal of training is to minimize J(w).

Fault mode of TFI
The schematic diagram of the three-phase full-bridge brushless DC motor drive circuit is shown in Fig. 4, T 1 -T 6 are power MOSFETs, D 1 -D 6 are flyback diodes, V in is the DC bus voltage of 48 V.The fault data are obtained from the three-phase winding currents i a , i b , i c .The single open fault mode of six switches is discussed in this paper, including a total of seven failure modes in the normal working state.The failure mode is shown in Table 2.

Fault diagnosis process
The flowchart of fault diagnosis based on HCNN is shown in Fig. 5.The specific steps are as follows: 1. Collecting DC-AC drive circuit A, B, C three-phase current signal i a , i b , i c as fault signal.2. FFT is used to convert the measured current signal into the frequency domain signal of fault signal.The appropriate frequency domain signal is intercepted as the fault feature vector, and the fault feature vector is normalized to obtain the fault data sample, which are divided into training set and testing set.

Experimental platform
The experimental platform is shown in Fig. 6.The fan load is fixed on the BLDCM shaft with a rated speed of 800 rad/ min.In the experiment, the drive signal is controlled by software to simulate the open circuit fault of different power MOSFETs.Similarly, the fan speed is controlled by software to make the BLDCM work under different load conditions.In the experiment, the fan load is set to three situations, the speed is 550 rad/min, 650 rad/min and 750 rad/min, which are recorded as load situations C load1 , C load2 and C load3 .Under each load condition, the phase currents i a , i b , i c are sampled 100 times for each failure mode (including normal working state).The current signal sampling adopts the chip ACS712 to convert it into a voltage signal, the sampling frequency of the data collector is set to 200 kHz, and the length of each data sampling is 100 k sampling points.When the BLDCM is in FC2 failure mode, the phase current i a , i b , i c waveforms are shown in Fig. 7a.FFT is used to analyze the frequency domain, the obtained amplitude spectrum is shown in Fig. 7b.The first 432 frequency points in the current amplitude spectrum of i a , i b , i c are taken as the fault feature vectors.Therefore, the dimension of the fault feature vector is 1296, meanwhile, there are 700 fault sample data in each load situation.The samples are randomly divided into three groups.The ratio of the testing set to the training set is shown in Table 3.

Model creation
The main parameters involved in HCNN are structural parameters and training parameters.The structural parameters directly affect the adaptive feature extraction capabilities of HCNN, including the size and number of convolution kernels, etc.The structure parameters of HCNN are shown in Table 4. Training parameters affect the quality of the network model and directly affect the accuracy of fault diagnosis.Parameters usually include optimizer, optimizer learning rate, batch number, number of iterations, etc.The training parameters of HCNN are shown in Table 5.
The sample dimension of the original data is 1296, the input signal of 1D-CNN is 1 × 1296 and that of 2D-CNN     The fault diagnosis algorithms used in this experiment are all executed in the Pycharm environment, the operating environment is Windows10 (64-bit) system, the central processing unit is Intel Core i7-9750, 16G memory.The compilation environment is python3.6,tensorflow-gpu1.13.2.
Adam optimizer is selected as the network model.According to the experiment in [3], the best classification result can be achieved when the learning rate is 0.001, which is verified in Fig. 8a.The number of iterations is shown in Fig. 8b.After 400 times, there is basically no fluctuation and remains stable.To avoid minor fluctuations, the number of epochs is selected as 500 times in this article.

PCA analysis of fault feature
To analyze the effect of HCNN fault feature extraction, principal component analysis (PCA) was performed on the original data set and the data set extracted by the three networks.The first three principal components of the feature are visualized in a three-dimensional image.The test data set of the experiment uses the fault data sample with the load of C load1 as an example, the result is shown in Fig. 9.
It can be seen from Fig. 9a that the main components of the various failure modes in the original data sample are not clustered together and the scattered points of each failure mode overlap each other, which is messy and difficult to distinguish.It shows that the original fault samples cannot clearly reflect the fault characteristics of the TFI.
Figure 9b, c are PCA visualization scatter plots of 1D-CNN and 2D-CNN extracted features, respectively.It can be seen that the same types of failure modes are clustered together well, but for different failure modes, only FT1 failure modes are clearly separated from other failure modes, indicating that the two methods are only sensitive to obvious failure modes and sensitive to minor failures.There is still overlap between different failure modes, so the feature extraction effect is not ideal.
Figure 9d is the PCA visualization scatter points extracted by HCNN.It can be seen that the same failure modes of the TFI are extremely gathered together, and the feature scatter points between different failure modes are effectively separated, which shows a regular pattern of gathering.Compared with 1D-CNN and 2D-CNN, the feature extraction ability of HCNN is obviously the best.

Distinguishing evaluation of fault characteristics
To evaluate the ability of deep neural network feature extraction, it is assumed that the discrete points of each state mode on the PCA map are surrounded by a minimum characteristic sphere, its center coordinates and radius are determined by the minimum enclosing sphere random increment algorithm.Assume that P j = (a j , b j , c j ) is the center coordinate of each state mode.P ji = (a ji , b ji , c ji ) is the sample point coordinates of each state mode.r j represents the radius of each state mode and j represents the state mode.The following feature indicators are used to quantitatively analyze the features extracted by the above three methods [39]: 1.The radius of the characteristic sphere in each mode (CSR); (9) FSR j = r j j = 1, 2, … , 5 It can be seen from Fig. 10a that HCNN has the largest DCC in the seven failure modes, indicating that the distinction between different failure modes is particularly obvious.The failure mode discrimination ability of 2D-CNN is better than 1D-CNN.
It can be seen from Table 6 that the CSR of HCNN is larger than others, but it must be combined with the indicator DCC of each state mode.Therefore, CSR/DCC is introduced to measure the difference between different state modes and their discrete point distribution.The smaller the index is it shows that the better the fault extraction effect is.From Fig. 10b, it can be seen that the CSR/DCC of HCNN is basically the smallest among the seven failure modes.Only in the FT4 failure mode, the index is slightly higher than the other two networks, indicating that the other two networks are only sensitive to the FT4 failure mode, the generalization ability is weakness.
In summary, the different fault features extracted by HCNN are scattered far, and the fault extraction effect is significantly better than 1D-CNN and 2D-CNN.( 10)

The performance of fault diagnosis methods
The  In Fig. 14, the fault diagnosis accuracy of C load3 is slightly lower than that of C load1 and C load2 , indicating that the fault sample data in the case of C load3 contains a lot of noise, which will affect the classification accuracy of the network.But in the case of Mixed C load , the accuracy of SDAE is 84.29%, while the accuracy of HCNN is 98.57%, which indicates that HCNN has stronger generalization ability.

Conclusion
For the open-circuit fault diagnosis of TFI switches, the HCNN method is used in this paper, which uses 1D-CNN and 2D-CNN to extract fault features at the same time, merges fault features at the fully connected layer to generate HCNN fault features, the softmax is used to classify.By compared with 1D-CNN and 2D-CNN algorithms, the results show as follows: 1.The adaptive fault feature extraction effect of the HCNN algorithm is more obvious and deeper fault feature information can be extracted from the same fault data, which improves the accuracy of fault diagnosis results.2. Under different loads and different sample division ratios, the accuracy of fault diagnosis of HCNN remains above 95.86%.When the sample ratio is case 3, the diagnostic accuracy rate of HCNN under various load conditions is maintained above 98.57%, which is significantly better than the other networks.Reflects the strong generalization ability of HCNN.HCNN model lays a good foundation for fault tolerance and is also beneficial to PHM of power electronic system.
proposed the Mannharden distance to detect and locate open-circuit faults based on the similarity measurement algorithm of the order residual of the current.Aiming at the open-circuit fault of multiple IGBTs in the inverter, Bae et al. [7] studied a fault diagnosis technology based on Park vector-based online monitoring and identification.According to the dwell time of the vector of the threephase current, the open circuit fault of the switch is detected and normalized.Determine the branch open condition based on the average value of the normalized current and perform fault diagnosis.However, the data-driven method requires nuclear principal component analysis of a specific fault signal, which leads to excessive workload.

Fig. 1
Fig. 1 Fault diagnosis methods of inverter

Fig. 4 Table 2
Fig. 4 Schematic diagram of three-phase full-bridge brushless DC motor drive circuit

is 36 ×
36.Both 1D-CNN and 2D-CNN are composed of two convolutional layers, two pooling layers, and a fully connected layer.The convolutional layer and the pooling layer are alternately connected.The convolution layer uses the commonly used 3 × 1 and 3 × 3 convolution kernels, and the step size is set to 1.The pooling layer uses the smaller 2 × 1 and 2 × 2 convolution kernels, and the step size is set to 2. The selection of network model structure parameters in this paper not only expands the receptive field area of model input, increases the amount of information input, but

Fig. 10
Fig. 10 Feature evaluation index of three networks

Fig. 11 3 2.
Fig. 11 Fault diagnosis confusion matrix of HCNN under C load2 performance of HCNN under different load HCNN is used to test the experiment of different sample proportion under three load conditions.The confusion matrix of HCNN fault diagnosis in the case of C load2 is shown in Fig. 11, and the accuracy of other fault diagnosis is shown in Fig. 12.It can be seen from Fig. 12 that the fault diagnosis accuracy rate of HCNN under any conditions has reached over 95.86%.Under mixed load conditions, fault diagnosis accuracy of HCNN is also ideal, indicating that HCNN has a strong generalization ability.When the sample ratio of the training set and the test set changes, the change trend of the accuracy of fault diagnosis in different situations is also the same.When the data sample ratio is CASE 1, the network parameters have not been optimally trained due to fewer training samples, resulting in lower accuracy.As the number of training samples increases, when the proportion of data samples is case 3, the test set is also reduced, meanwhile, the accuracy of fault diagnosis is significantly improved.Fault diagnosis result in different state modes SDAE, 1D-CNN, 2D-CNN and HCNN are used to classify the failure modes of the failure samples under various load conditions, the result is shown in Fig. 14.It can be seen from experiment 4.4.1 that fault diagnosis has the highest diagnostic accuracy under case 3 conditions, so the experiment is carried out under case 3 conditions.The fault diagnosis result based on HCNN is shown in Fig. 13.

Fig. 12
Fig. 12 Fault diagnosis accuracy of HCNN Experiments show that HCNN can complete TFI fault diagnosis under different working conditions, and can accurately locate the fault components.HCNN can also be applied to the field of fault diagnosis where the original signal is one-dimensional signal, such as the voltage signal of transformer fault diagnosis.It is not necessary to know the characteristics of the fault in the original signal.With the powerful learning ability of HCNN, it can adaptively identify different fault modes.The fault diagnosis ability of

Fig. 14
Fig. 14 Fault diagnosis results under different loads

Table 1
Application of machine learning in fault diagnosis

Table 4
Structural parameters of HCNN

Table 5
Training parameters of

Table 6
CSR in each fault state