1 Introduction

As a crucial protection and control apparatus in the power grid, a high-voltage circuit breaker (HVCB) is used to cut or connect the circuit according to the received operating signal. HVCB faults caused by its operating mechanism can lead to severe damage, such as unscheduled downtime and even circuit destruction [1]. The reliability of HVCBs is closely linked to the stability of the electrical supply provided by the power grid, so the maintenance of HVCBs is an important daily task for substations. A traditional scheduled maintenance scheme is widely applied for that purpose, in which the operating mechanism needs to be partially disassembled for fault detection. Previous operating records indicate that this scheduled maintenance can cause unnecessary economic losses and even introduce new defects [2, 3]. Moreover, the relationship between mechanical parameters and mechanical system health is highly nonlinear, so HVCB operating faults caused by internal operating mechanisms are very difficult to formulate analytically and to detect by traditional maintenance methods [4, 5]. According to statistical data, the operating mechanism also contributes to a large portion of HVCB operating failures. Operation rejection caused by “mechanical stuck” (i.e., failing to open or close on command) is responsible for the highest proportion of major failures of HVCBs, at 34% of the overall failures [6]. Although typical on-line system monitors can improve the reliability of the mechanism system, their feasibility is limited by the high hardware cost and construction difficulty [7, 8]. Thus, the diagnosis of potential faults concealed inside operating mechanisms is meaningful in enhancing the stability of the electrical power supply to consumers.

With the advantages of strong self-adaptability, robustness and great fault tolerance, artificial neural networks (ANNs) can store large amounts of precise nonlinear input-output mapping relationships through sufficient training without requiring an explicit mathematical equation [9]. Consequently, ANNs have recently been applied in fault diagnosis [10, 11]. ANN-based fault diagnosis of HVCBs using the electric current signal has recently become popular [12,13,14], and faults such as jamming and idle stroke of the iron core have been diagnosed successfully in this way. Fei et al. [15] noted that, because of slow convergence and a tendency to fall into local optima, the performance of classic ANNs needs to be further optimized. As an alternative solution, support vector machines (SVMs) overcome these disadvantages of ANNs. Moreover, combined with other intelligent algorithms, SVMs can achieve fault diagnosis with high accuracy from limited fault samples [16]. For instance, an optimized genetic algorithm-based SVM (GA-SVM) model was proposed [17, 18]. Through this approach, operating mechanism faults, such as base screw looseness and buffer spring failure, were successfully diagnosed. Compared with the normal SVM model, the optimized model achieved higher diagnostic accuracy. An optimized maintenance scheme based on least-squares SVM was also proposed to predict the distribution of defects using statistical defect data collected from large quantities of HVCBs [19]. The vibration signal is closely related to the modal parameters of the mechanical structure and therefore directly contains more information about the health status of the operating mechanism than the electrical signal [20, 21]. Besides, based on various feature extraction and analysis methods [22, 23], vibration signals have been successfully utilized in fault diagnosis of rotating machinery such as gearboxes [24] and combustion engines [25].
These valuable studies provide some guidance for fault diagnosis of switching mechanisms such as high-voltage circuit breakers. One of the key points of fault diagnosis is the eigenvalue extraction of the fault signal. Wavelet packet decomposition (WPD) can achieve multi-scale information refinement by scaling and translating its basis function. Therefore, the fault eigenvalue in the vibration signal can be effectively extracted through WPD to form an input vector for the fault diagnosis model [26, 27].

The abovementioned studies have successfully diagnosed some faults caused by the buffer spring, electromagnet and base screw of HVCBs. For an electromagnet, its iron core might be stuck without effective lubrication, which eventually causes the action rejection fault of an HVCB. Faults caused by buffer spring deterioration and base screw looseness can cause loud noises and abnormal vibrations, which dramatically damage the operational accuracy and reduce the service life of a high-voltage circuit breaker. All of these works are valuable in improving the reliability of HVCBs and the overall electric power system. However, for the faults caused by some internal operating mechanism components (e.g., cam, spring and revolute joints), relevant fault diagnosis work is still absent due to the shortage of signal samples.

The purpose of this study is to propose a particle swarm optimization-support vector machine (PSO-SVM) fault diagnosis model to diagnose these operating mechanism faults. The Gaussian radial basis function is selected as the kernel function of the SVM. The particle swarm optimization (PSO) algorithm is used to determine the kernel parameter and penalty parameter because they are closely related to the accuracy of the fault diagnosis model. Vibration signals generated during the operation experiment are considered for fault diagnosis. A total of 150 groups of vibration signals under different operation conditions are collected from HVCB experiments. These vibration signals are analysed via WPD. Then, the energy of each frequency band is extracted as the fault eigenvalue to form the input vector for the training and testing of the proposed fault diagnosis model. In addition, comparative fault diagnosis experiments are performed with the normal SVM model and a particle swarm optimization-back-propagation neural network (PSO-BP) model. The comparative study results suggest that the PSO-SVM in the paper has better efficiency and higher diagnosis accuracy than the other models; thus, the PSO-SVM model can overcome the limitations of traditional maintenance and eventually improve the reliability of HVCBs.

2 Signal Eigenvalue Extraction

2.1 Description of the Operation Experiment

Compared with hydraulic and pneumatic operating mechanisms, spring operating mechanisms are more compact and lower cost, and they do not leak oil or gas. Thus, spring operating mechanisms have been widely applied in HVCBs at voltage levels between 10 and 35 kV. ZN12 is a common new kind of indoor three-phase HVCB equipped with a spring operating mechanism. According to a previous operating report, a large number of operating mechanism faults appear during the service period. Figure 1 shows the main structure and two typical operating mechanism faults, over-travel and mechanical stuck, in ZN12 and the vibration signal collection system in this paper.

Figure 1 HVCB structure and operation experiment

The over-travel fault of the mechanical system is quite common in the service period of an HVCB. Once an over-travel fault occurs, the HVCB cannot be reset for the opening operation to cut the fault circuit in the power grid. In addition, mechanical stuck faults generally occur in the revolute joints of an articulated mechanism. Under normal conditions, reasonable clearance and good lubrication in the revolute joints ensure the mobility of the mechanism system. However, during long-term service, the lubrication might deteriorate. In addition, the clearance size of revolute joints could also change with the inevitable wear during operation. All these factors can lead to mechanical stuck faults. Operating mechanism faults in spring operating mechanisms are highly nonlinear and hard to detect by scheduled maintenance schemes.

Fault signal samples are indispensable for training an intelligent diagnosis model. However, the operating data of HVCBs collected from the power grid are characterized by a low density of valuable signal data, and most of the recordings are normal signal samples. It is difficult to obtain a sufficient number of fault signals to train and establish a fault diagnosis model. The fault signals adopted in previous studies of HVCB fault diagnosis are insufficient and scarcely involve the faults caused by the internal operating mechanisms. Thus, an indoor operation experiment is first conducted in this study. By deteriorating the lubrication and components of the mechanism, the normal condition and four types of operating mechanism fault conditions (cam offset, spring creep, over-travel and mechanical stuck) are created. The operation of an HVCB is a transient shock process (the spring energy is released instantaneously, and then the cam impacts the roller to drive the whole operating mechanism). For accurate vibration measurements, the accelerometer should be selected based on the maximum vibration burst that can be expected for the mechanism.

A vibration collection system from Brüel & Kjær is adopted in this paper. The system mainly includes a CCLD/IEPE accelerometer type 8339 (measuring range of ±20000 g (g = 9.8 m/s²), sensitivity of 0.25 mV/g, upper frequency limit of 20 kHz, and non-linearity of 0.3%) and a vibration recorder type 3053 (25.6 kHz sampling frequency over a time period of 100 ms per operation process). In this paper, the vibration signals of a high-voltage circuit breaker type ZN12 are collected under five different operation conditions (normal condition, cam offset, spring creep, over-travel, and mechanical stuck). Thirty groups of vibration signals are collected per operation condition; hence, a total of 150 recordings are collected.

2.2 Wavelet Packet Decomposition

Compared with the traditional fast Fourier transform, WPD is a self-adaptive, multi-scale signal processing method whose window size varies with the signal frequency. Therefore, both good time resolution in the high-frequency band and good frequency resolution in the low-frequency band can be achieved by WPD [27]. After decomposing the vibration signal through WPD, the detail signals in different frequency bands can be obtained for further eigenvalue extraction. The basic theory of WPD is outlined as follows [28]:

$$ \left\{ {\begin{array}{*{20}c} {u_{{2^{n} ,k}} (t) = \sqrt 2 \sum\limits_{k} {h(k)u(t)} ,} \\ {u_{{2^{n} - 1,k}} (t) = \sqrt 2 \sum\limits_{k} {g(k)u(t)} ,} \\ \end{array} } \right. $$
(1)

where k is the layer of decomposition; h(k) and g(k) are the high-pass and low-pass filters, respectively; u(t) is the original signal; and \( u_{{2^{n} }} (t) \) and \( u_{{2^{n} - 1}} (t) \) are the signals that have passed the high-pass and low-pass filters, respectively. The vibration signal is decomposed into two frequency sub-bands at each decomposition layer. Each sub-band is then passed to the next layer, eventually forming a binary decomposition tree, as shown in Figure 2.

Figure 2 Three-layer wavelet packet decomposition

The number of decomposition layers is vital for the signal resolution of the WPD method. Too few decomposition layers give a coarse resolution that responds poorly to the signal, especially when the signal fluctuations are sparsely distributed. Meanwhile, too many decomposition layers exponentially increase the computation and can cause large calculation errors. As a compromise, the WPD method with three decomposition layers is used in this study. Figure 3 shows one of the original vibration signals collected from the operation experiment and its three-layer WPD result.
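As a minimal sketch of the three-layer decomposition described above, the following Python code splits a signal into 2³ = 8 sub-bands with a binary filter tree. The source does not state which wavelet basis is used, so the Haar filter pair here is an assumption chosen for brevity, the synthetic decaying sine only stands in for a measured vibration signal, and the function names are illustrative.

```python
import math


def haar_split(x):
    """One decomposition step: return the (low-pass, high-pass) halves of x."""
    s = 1.0 / math.sqrt(2.0)
    half = len(x) // 2
    low = [(x[2 * i] + x[2 * i + 1]) * s for i in range(half)]
    high = [(x[2 * i] - x[2 * i + 1]) * s for i in range(half)]
    return low, high


def wavelet_packet(x, layers=3):
    """Full binary-tree decomposition; returns 2**layers coefficient bands.

    Bands are returned in natural tree order (not sorted by frequency).
    """
    if len(x) % (2 ** layers) != 0:
        raise ValueError("signal length must be divisible by 2**layers")
    nodes = [list(x)]
    for _ in range(layers):
        next_nodes = []
        for node in nodes:
            low, high = haar_split(node)
            next_nodes.extend([low, high])
        nodes = next_nodes
    return nodes


# Synthetic stand-in signal: 25.6 kHz sampling for 100 ms gives 2560 samples.
n_samples = 2560
signal = [
    math.exp(-t / 500.0) * math.sin(2.0 * math.pi * 0.05 * t)
    for t in range(n_samples)
]
bands = wavelet_packet(signal, layers=3)
```

With a 25.6 kHz sampling frequency the analysable range is 0 to 12.8 kHz, so each of the eight bands nominally covers 1.6 kHz. Because the Haar step is orthonormal, the total energy of the eight bands equals that of the input signal.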

Figure 3 Vibration signal and relative detail signal

2.3 Vibration Energy Distribution

Figure 3 shows that the original vibration signal was decomposed into eight frequency bands. There are already some differences in each detail signal. In general, the energy distribution of a vibration signal would be altered in different frequency bands in the occurrence of a mechanical fault. Therefore, the energy of each frequency band is selected as the signal eigenvalue for fault diagnosis. The energy of each band is calculated and normalized as follows [17]:

$$ \left\{ {\begin{array}{*{20}l} {E_{i} = \int_{{t_{0} }}^{{t_{i} }} {\left| {A(t)} \right|^{2} {\text{d}}t} ,} \hfill \\ {e_{i} = \frac{{E_{i} }}{{\sum\limits_{j = 1}^{8} {E_{j} } }},} \hfill \\ \end{array} } \right. $$
(2)

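where i is the index of the frequency band (i = 1, 2, …, 8); Ei is the energy of band i and ei is its normalized value; t0 and ti represent the beginning and ending time of the signal, respectively; and A(t) denotes the amplitude of the band signal extracted by the Hilbert technique [29].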
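A discrete version of Eq. (2) can be sketched as follows. This is only an illustration: it sums the squared band samples directly instead of first extracting the Hilbert amplitude A(t), and the function name and example values are hypothetical.

```python
def band_energy_ratios(bands, dt=1.0):
    """Discrete form of Eq. (2): per-band energy normalized by the total energy.

    bands: list of sample lists, one per frequency band.
    dt: sampling interval; it cancels in the normalized ratios.
    """
    energies = [sum(v * v for v in band) * dt for band in bands]
    total = sum(energies)
    if total == 0.0:
        raise ValueError("all bands have zero energy")
    return [e / total for e in energies]


# Hypothetical eight-band example: two bands carry equal energy (25 each).
example_bands = [[3.0, 4.0], [5.0, 0.0]] + [[0.0, 0.0] for _ in range(6)]
ratios = band_energy_ratios(example_bands)
```

The resulting eight ratios sum to one and form the input vector of the diagnosis model.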

As an example, one vibration signal for each operation condition of the HVCB is extracted for analysis. First, the original vibration signal is decomposed through the 3-layer WPD method to obtain its eight-band detail signals. Then, the WPD energy of the different frequency bands is calculated and normalized with Eq. (2). Figure 4 shows the WPD energy distribution of these five vibration signals. The energy distribution over the frequency bands changes with the operation condition. However, it is still too difficult to diagnose faults directly from these distributions. Thus, a novel PSO-SVM is proposed for fault diagnosis, which uses these WPD energies as the eigenvalue of the operating mechanism condition.

Figure 4 Energy distribution of different conditions

3 Principle of the PSO-SVM Method

3.1 Theory of SVMs

Based on statistical learning theory, a learning algorithm named SVM was proposed by Vapnik and co-authors in the 1990s [30]. The basic idea of SVMs is that a non-linear problem becomes linearly separable in a higher dimensional feature space through some non-linear mapping function. An example of the classification of two types of data in feature space is presented in Figure 5, in which the blue and white circles represent negative and positive data, respectively. During the classification, the SVM searches for a hyper-plane between the two types of data in the feature space. For higher classification accuracy, the distance between the hyper-plane and the nearest data points of the two types is maximized. These nearest data points, marked by the dotted circles in Figure 5, are taken as support vectors to define the margin.

Figure 5 Classification of two types using SVM

The hyper-plane is settled in the middle of the two margins. The support vectors established from the original data samples contain sufficient information to define the hyper-plane. The other data points can be discarded after the support vectors are confirmed. Therefore, high classification accuracy can be achieved by an SVM with few original data. According to previous descriptions [15,16,17,18], the hyper-plane in high dimension feature space follows the form in Eq. (3):

$$ f(x) = \omega \cdot \varphi (x) + b, $$
(3)

where \( \varphi (x) \) is the so-called mapping function. The given data are transformed into the high dimensional feature space through this non-linear mapping function. In Eq. (3), ω represents the weight vector, b denotes the bias term, and ω and b are applied to define the separating hyper-plane between the two kinds of data. A data set \( T = \{ x_{i} ,y_{i} \}_{i = 1}^{m} \) is introduced, where xi represents the data samples for SVM training, \( y_{i} \in \{ - 1,1\} \) denotes the identification of different types, and m is the number of data samples. The separating hyper-plane should be restricted under the following constraints:

$$ y_{i} (\omega \cdot \varphi (x_{i} ) + b) \ge 1 - \zeta_{i} ,\;\;\;i = 1,2,3, \ldots ,m. $$
(4)

A positive slack coefficient \( \zeta_{i} \) is introduced to measure the distance between the margin and a vector xi that lies on the wrong side. Moreover, the error penalty c implements a trade-off between the confidence interval and the empirical risk. Finally, the corresponding constrained optimization can be converted to a minimization problem:

$$ \left\{ {\begin{array}{*{20}c} {\hbox{min} :\frac{1}{2}||\omega ||^{2} + c\sum\limits_{i = 1}^{m} {\zeta_{i} } ,} \\ {{\text{s}}.{\text{t}}.\;y_{i} (\omega^{\text{T}} \varphi (x_{i} ) + b) \ge 1 - \zeta_{i} ,\;\zeta_{i} \ge 0.} \\ \end{array} } \right. $$
(5)

Eq. (5) is a typical convex quadratic programming problem. According to the Kuhn-Tucker conditions, the calculation can be simplified to an equivalent Lagrangian dual problem by introducing the Lagrangian multipliers ai:

$$ \hbox{min} :L(\omega ,b,a) = \frac{1}{2}\left\| \omega \right\|^{2} + \sum\limits_{i = 1}^{m} {a_{i} } - \sum\limits_{i = 1}^{m} {a_{i} y_{i} (\omega \cdot \varphi (x_{i} ) + b)} . $$
(6)

For the solution of the dual problem, the partial derivatives of L with respect to ω and b are obtained. Since the partial derivatives are zero at the minimum point, the following equations can be obtained:

$$ \left\{ {\begin{array}{*{20}c} {\omega = \sum\limits_{i = 1}^{m} {a_{i} y_{i} \varphi (x_{i} )} ,} \\ {\sum\limits_{i = 1}^{m} {a_{i} y_{i} } = 0.} \\ \end{array} } \right. $$
(7)

By combining Eqs. (6) and (7), we can get the final optimization problem:

$$ \left\{ {\begin{array}{*{20}l} {\hbox{max} :Q(a) = \sum\limits_{i = 1}^{m} {a_{i} } - \frac{1}{2}\sum\limits_{i = 1}^{m} {\sum\limits_{j = 1}^{m} {a_{i} a_{j} y_{i} y_{j} \varphi (x_{i} ) \cdot \varphi (x_{j} )} } ,} \hfill \\ {{\text{s}}.{\text{t}}.\;\sum\limits_{i = 1}^{m} {a_{i} y_{i} } = 0,\;a_{i} \ge 0,} \hfill \\ {K(x_{i} ,x_{j} ) = \varphi (x_{i} ) \cdot \varphi (x_{j} ),} \hfill \\ \end{array} } \right. $$
(8)

where K(xi, xj) represents the kernel function introduced in the input space. With the xi that correspond to non-zero ai taken as the support vectors, the classifier is defined as follows:

$$ f(x) = {\text{sign}}\left( {\sum\limits_{i = 1}^{m} {a_{i} y_{i} K(x_{i} ,x)} + b} \right). $$
(9)

According to a previous study [15], SVM established by the Gaussian radial basis function shows excellent nonlinear classification ability. Thus, the Gaussian radial basis kernel function [17] is adopted to compute in the original input space through Eq. (10):

$$ K(x_{i} ,x) = \exp \left( { - \frac{{\left\| {x - x_{i} } \right\|^{2} }}{{2\delta^{2} }}} \right). $$
(10)
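The kernel of Eq. (10) and the classifier of Eq. (9) can be sketched in a few lines of Python. The support vectors, multipliers and bias below are hypothetical values chosen only to exercise the functions; in practice they come from solving Eq. (8).

```python
import math


def rbf_kernel(xi, x, delta):
    """Gaussian radial basis kernel, Eq. (10)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, x))
    return math.exp(-sq_dist / (2.0 * delta ** 2))


def svm_decision(x, support_vectors, alphas, labels, b, delta):
    """Sign of the kernel expansion, Eq. (9); returns +1 or -1."""
    s = sum(
        a * y * rbf_kernel(sv, x, delta)
        for sv, a, y in zip(support_vectors, alphas, labels)
    )
    return 1 if s + b >= 0.0 else -1


# Hypothetical two-support-vector model (values are illustrative only).
svs = [(0.0, 0.0), (2.0, 2.0)]
alphas = [1.0, 1.0]
labels = [1, -1]
pred_near_first = svm_decision((0.1, 0.0), svs, alphas, labels, b=0.0, delta=1.0)
pred_near_second = svm_decision((1.9, 2.0), svs, alphas, labels, b=0.0, delta=1.0)
```

A small δ makes the kernel decay quickly with distance, so each support vector influences only its immediate neighbourhood; a large δ smooths the decision boundary.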

3.2 Theory of PSO

The kernel parameter δ and error penalty c are significant for the classification accuracy of the SVM model. Thus, they are determined in advance by the PSO algorithm in this study. The basic idea of the PSO algorithm is that the optimization process can be treated as searching for optimal parameters in a feature space through iterative calculations. Each particle has a fitness value, determined by the fitness function, and a velocity vector that decides its flying direction and distance. During each iteration, the particles update themselves through two best positions: one found by the particle itself, called Pbest, and the other found by the whole particle swarm, called gbest. Each particle updates its velocity and position after an iteration step through Eq. (11) and Eq. (12) [9]:

$$ \begin{aligned} v_{i,d}^{k} & = w*v_{i,d}^{k - 1} + c_{1} *{\text{rand}}()*(P_{best\;i,d}^{k - 1} - x_{i,d}^{k - 1} ) \\ & \quad + c_{2} *{\text{rand}}()*(g_{best\;d}^{k - 1} - x_{i,d}^{k - 1} ), \\ \end{aligned} $$
(11)
$$ x_{i,d}^{k} = x_{i,d}^{k - 1} + v_{i,d}^{k} , $$
(12)

where k is the number of iterations; i represents the serial number of the particle; d denotes the dimension index of the feature space (since the position of a particle corresponds to the kernel parameter δ and error penalty c in the SVM, a two-dimensional feature space is set up); \( v_{i,d}^{k} \) represents the velocity of particle i in dimension d after k iterations; \( x_{i,d}^{k} \) denotes the position of particle i; \( P_{best\;i,d}^{k - 1} \) is the best position found so far by particle i; \( g_{best\;d}^{k - 1} \) represents the best position found by the whole particle swarm; c1 and c2 are the learning factors (cognitive and social, respectively), and generally c1 = c2 = 2; rand() is a random number uniformly distributed in [0, 1]; and w is the inertia weighting coefficient. After each iteration, the particle best and the global best positions are updated through Eq. (13) and Eq. (14) [9]:

$$ P_{best\;i,d}^{k} = \left\{ {\begin{array}{*{20}l} {P_{best\;i,d}^{k - 1} ,\quad f(P_{best\;i}^{k - 1} ) < f(x_{i}^{k} ),} \\ {x_{i,d}^{k} ,\quad \quad \quad {\text{otherwise,}}} \\ \end{array} } \right. $$
(13)
$$ g_{best}^{k} = \mathop {\arg \min }\limits_{{P_{best\;i}^{k} }} \{ f(P_{best\;i}^{k} )\,|\,i = 1,2, \cdots ,n\} , $$
(14)

where n is the number of particles and f represents the fitness function used to determine the accuracy of the classifier. The fitness function is given as Eq. (15):

$$ f={\frac{1}{m}} \sum\limits_{i=1}^{m} {{(y(i)-y_{i} )}^{2}}, $$
(15)

where m stands for the number of data samples, y(i) is the simulation output of the ith data identification by SVM, and yi denotes the real identification of data.
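The update rules of Eqs. (11) to (14) can be sketched as a small minimizer. This is an illustration under stated assumptions rather than the configuration used in the experiments: the inertia weight w = 0.4, the box bounds with position clamping, the iteration count and the quadratic toy fitness are all hypothetical choices, and in the diagnosis model the fitness would be Eq. (15) evaluated for a position (c, δ).

```python
import random


def pso_minimize(fitness, bounds, n_particles=40, n_iter=200,
                 w=0.4, c1=2.0, c2=2.0, seed=0):
    """Minimize fitness(position) over a box with a basic particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_f = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    history = [gbest_f]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                # Velocity update, Eq. (11).
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                # Position update, Eq. (12), clamped to the search box (assumption).
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            f = fitness(pos[i])
            if f < pbest_f[i]:
                # Particle best update, Eq. (13).
                pbest[i], pbest_f[i] = pos[i][:], f
        # Global best update, Eq. (14), once per iteration.
        g = min(range(n_particles), key=lambda i: pbest_f[i])
        if pbest_f[g] < gbest_f:
            gbest, gbest_f = pbest[g][:], pbest_f[g]
        history.append(gbest_f)
    return gbest, gbest_f, history


# Toy fitness with a known minimum at (1, -2), standing in for Eq. (15).
def toy_fitness(p):
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


best_pos, best_f, history = pso_minimize(toy_fitness, [(-5.0, 5.0), (-5.0, 5.0)])
```

The recorded global-best fitness never increases from one iteration to the next, because gbest is replaced only when a strictly better particle best is found.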

3.3 PSO-SVM Fault Diagnosis Model

Since an SVM is essentially a binary classifier, a one-layer SVM model can only classify two kinds of operation conditions of HVCBs. To classify the normal condition and four types of operating mechanism faults, a four-layer SVM classifier, shown in Figure 6, is constructed as the fault diagnosis model.
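One common way to build a multi-class model from binary SVMs is a cascade in which each layer separates one condition from the remaining ones, so four layers suffice for five conditions. The sketch below illustrates that idea only; the order in which the layers peel off the conditions is an assumption, and the threshold functions are hypothetical stand-ins for trained SVMs.

```python
def cascade_predict(x, layers, fallback):
    """Run binary classifiers in sequence; the first positive layer decides.

    layers: list of (classifier, condition_name); a classifier returns +1 or -1.
    fallback: condition assigned when every layer answers -1.
    """
    for classify, condition in layers:
        if classify(x) == 1:
            return condition
    return fallback


# Hypothetical stand-ins for four trained binary SVMs (illustrative thresholds).
layers = [
    (lambda x: 1 if x[0] < 1.0 else -1, "normal"),
    (lambda x: 1 if x[0] < 2.0 else -1, "cam offset"),
    (lambda x: 1 if x[0] < 3.0 else -1, "spring creep"),
    (lambda x: 1 if x[0] < 4.0 else -1, "over-travel"),
]

result_a = cascade_predict([0.5], layers, fallback="mechanical stuck")
result_b = cascade_predict([2.5], layers, fallback="mechanical stuck")
result_c = cascade_predict([9.0], layers, fallback="mechanical stuck")
```

Each layer has its own parameter pair (c, δ), which is why the PSO search is run once per layer.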

Figure 6 Four-layer SVM classifier

The obtained vibration signals are divided into two sets: 100 groups are training samples, and the others are testing samples. The optimal parameter combination (c, δ) of each SVM is predetermined by the PSO algorithm with the training samples. The effectiveness of the multi-layer SVM classifier is confirmed by the testing samples. The establishment of the PSO-SVM fault diagnosis model is described in Figure 7.
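The 100/50 division can be sketched as a per-condition split of the 30 recordings into 20 training and 10 testing groups. How individual recordings were actually assigned is not stated, so the seeded random shuffle and the placeholder sample identifiers below are assumptions for illustration.

```python
import random

CONDITIONS = ["normal", "cam offset", "spring creep", "over-travel", "mechanical stuck"]


def stratified_split(samples_by_condition, n_train_per_condition=20, seed=0):
    """Split each condition's samples into training and testing parts."""
    rng = random.Random(seed)
    train, test = [], []
    for condition, samples in samples_by_condition.items():
        shuffled = list(samples)
        rng.shuffle(shuffled)
        train += [(s, condition) for s in shuffled[:n_train_per_condition]]
        test += [(s, condition) for s in shuffled[n_train_per_condition:]]
    return train, test


# Placeholder identifiers standing in for the 30 recordings per condition.
data = {c: [f"{c}-{k}" for k in range(30)] for c in CONDITIONS}
train_set, test_set = stratified_split(data)
```

Keeping the same number of samples per condition in both sets prevents any single condition from dominating the fitness value during the parameter search.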

Figure 7 Flow chart of the PSO-SVM fault diagnosis model

The main steps in Figure 7 can be outlined as follows:

  • Step 1: Initialize the particle swarm: Initialize the particle swarm parameters, including the particle number, learning factors, weighting coefficient, particle position and particle velocity.

  • Step 2: Encode the SVM parameters: Encode the kernel parameter δ and error penalty c of SVM as the position of the particles. Herein, two-dimensional feature space is considered.

  • Step 3: Train the SVM model: Train the SVM model with the 100 groups of training samples. The parameter pairs (c, δ) change along with the flying of particles.

  • Step 4: Assess the fitness value: Calculate and assess the fitness value of the particle with Eq. (15). The fitness value is used to evaluate the validity of the fault diagnosis model with the current parameter combination δ and c. A smaller fitness value indicates higher fault diagnosis accuracy.

  • Step 5: Judge the stop condition: Terminate the iteration process and obtain the optimal parameters of the SVM if the desired accuracy is reached. Otherwise, proceed to the iterative calculation.

  • Steps 6 and 7: Update the parameters: Update the particle best Pbest with Eq. (13) and the global best gbest with Eq. (14). Update the particle velocity with Eq. (11) and the particle position with Eq. (12).

  • Step 8: Validate SVM model: Validate the SVM model with the obtained optimal parameter combinations δ and c with the 50 groups of testing samples.
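Steps 3 and 4 amount to evaluating one particle: train a binary SVM with the particle's (c, δ) and score it with Eq. (15). The sketch below illustrates this under loud assumptions: the trainer is a simplified bias-free soft-margin SVM solved by dual coordinate ascent, swapped in for brevity and not the solver used in the study, and the two-cluster toy data and function names are hypothetical.

```python
import math


def rbf(u, v, delta):
    """Gaussian radial basis kernel, Eq. (10)."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(u, v)) / (2.0 * delta ** 2))


def train_kernel_svm(xs, ys, c, delta, epochs=50):
    """Simplified trainer: bias-free dual coordinate ascent with 0 <= a_i <= c."""
    m = len(xs)
    gram = [[rbf(xs[i], xs[j], delta) for j in range(m)] for i in range(m)]
    alphas = [0.0] * m
    for _ in range(epochs):
        for i in range(m):
            f_i = sum(alphas[j] * ys[j] * gram[i][j] for j in range(m))
            # Coordinate-wise Newton step on the dual objective, clipped to [0, c].
            step = (1.0 - ys[i] * f_i) / gram[i][i]
            alphas[i] = min(c, max(0.0, alphas[i] + step))
    return alphas


def predict(x, xs, ys, alphas, delta):
    """Sign of the kernel expansion (Eq. (9) with b = 0 in this simplification)."""
    s = sum(a * y * rbf(xi, x, delta) for a, y, xi in zip(alphas, ys, xs))
    return 1 if s >= 0.0 else -1


def particle_fitness(position, train, validate):
    """Eq. (15): mean squared label error for one particle position (c, delta)."""
    c, delta = position
    xs, ys = train
    alphas = train_kernel_svm(xs, ys, c, delta)
    vx, vy = validate
    errors = [(predict(x, xs, ys, alphas, delta) - y) ** 2 for x, y in zip(vx, vy)]
    return sum(errors) / len(vy)


# Hypothetical toy data: two well-separated clusters labelled +1 and -1.
train_x = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
train_y = [1, 1, 1, -1, -1, -1]
check_x = [(0.5, 0.5), (5.5, 5.5)]
check_y = [1, -1]
fit = particle_fitness((10.0, 1.0), (train_x, train_y), (check_x, check_y))
```

A fitness of zero means every checked sample received its true label; the PSO loop would call a function of this kind once per particle and per iteration, which explains why the PSO-SVM needs more training time than a single SVM with fixed parameters.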

The optimal combinations of the penalty parameter and kernel parameter (c, δ) of each layer SVM classifier are listed in Table 1.

Table 1 Optimal parameters of each SVM classifier

4 Fault Diagnosis Results and Analysis

4.1 Fault Diagnosis by PSO-SVM

The final fault diagnosis result of the proposed PSO-SVM with the optimal parameters in Table 1 for the 50 groups of testing samples is illustrated in Figure 8.

Table 2 Comparative diagnosis result
Figure 8 Fault diagnosis result of PSO-SVM

The horizontal axis represents the serial number of the testing samples, and the vertical axis denotes the operation condition, which includes the normal condition and four fault conditions: cam offset, spring creep, over-travel and mechanical stuck. The symbol “o” is the real operation condition, and the symbol “*” represents the diagnosed operation condition. Overlapping of the two symbols at the same horizontal axis value represents a correct diagnosis, whereas separation of the two symbols indicates an incorrect diagnosis. The results show that 48 of the 50 groups of testing samples (96%) are diagnosed correctly by the PSO-SVM. In particular, four kinds of operation conditions, namely normal condition, cam offset, over-travel and mechanical stuck, are diagnosed with 100% accuracy. The incorrectly identified cases can be explained from the operation process of the type ZN12 HVCB. In the initial stage of operation, the spring releases instantaneously, and then the cam impacts the roller to drive the whole operating mechanism. The energy conversion from the elastic potential energy in the spring to mechanical energy in the operating mechanism is completed in this transient shock process. The cam and spring act as the drive unit for the operating mechanism. Spring creep can cause an initial lack of elastic potential energy, thereby decreasing the mechanical energy of the operating mechanism. Similarly, cam offset could lower the energy utilization efficiency and decrease the mechanical energy converted from elastic potential energy. Both phenomena can delay the operation of the operating mechanism and produce similar vibration responses, which makes these two faults more difficult to distinguish than other operating mechanism faults. Therefore, other diagnosis models and signal processing methods might be investigated to diagnose these faults in the future.

4.2 Comparative Fault Diagnosis Experiment

To show the fault diagnosis performance of the PSO-SVM, a fault diagnosis model based on a four-layer normal SVM with a random parameter combination (c = 0.25, g = 0.44) is established. In addition, a PSO-BP neural network with an 8-dimensional input layer, 10-dimensional hidden layer, and 7-dimensional output layer is also set up with the same training samples. To demonstrate the effectiveness of the SVM, the PSO-BP fault diagnosis model has the same particle swarm parameters as those of the PSO-SVM model, including the particle number (40), learning factors (c1 = c2 = 2), and the maximum number of iterations (500). Comparative fault diagnosis results for the same testing samples mentioned above are listed in Table 2.

From Table 2, it is clear that the PSO-SVM has higher total accuracy and shorter training time than the PSO-BP neural network. Due to the added process of particle swarm optimization, the training time of the PSO-SVM is longer than that of the normal SVM. However, the total accuracy improves dramatically from 80% to 96%, which outweighs the training time increment. This finding indicates that the penalty parameter c and kernel parameter δ are significant for the diagnostic accuracy and that the optimal combination (c, δ) can be determined by the PSO algorithm to improve the fault diagnosis accuracy. Switching from the PSO-BP to the PSO-SVM improves the total diagnosis accuracy from 88% to 96% and decreases the model training time from 4093 s to 107 s. Additionally, from the standpoint of diagnostic accuracy, the PSO-SVM model performs at least as well as the other two models for every operation condition. Four kinds of operation conditions are diagnosed with 100% accuracy by the PSO-SVM. For the PSO-BP model, only the mechanical stuck condition is diagnosed with 100% accuracy, and no operation condition reaches 100% accuracy with the normal SVM. The longer training time of the PSO-BP arises because its model structure is more complicated than that of the PSO-SVM. Taking the topological structure of the PSO-BP in this paper as an example, the number of network weight variables is 8 × 10 + 10 × 7 = 150, and the number of network threshold variables is 10 + 7 = 17. Therefore, the total number of network variables is 150 + 17 = 167. Unlike the SVM, the diagnostic accuracy of the BP model depends heavily on the number of samples. Twenty groups of training samples per HVCB operation condition are still not sufficient to train the BP model, which further leads to lower diagnostic accuracy. These findings indicate that the PSO-SVM is a powerful tool for fault diagnosis problems with a small number of samples.
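The variable count and the total accuracy quoted above can be checked with a few lines of Python; the helper name is illustrative, and the inputs are the layer sizes and sample counts stated in the text.

```python
def bp_variable_count(layer_sizes):
    """Count weights and thresholds of a fully connected feed-forward network."""
    weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    thresholds = sum(layer_sizes[1:])
    return weights, thresholds, weights + thresholds


# 8-10-7 topology of the PSO-BP network described in the text.
weights, thresholds, total = bp_variable_count([8, 10, 7])

# 48 of the 50 testing samples are diagnosed correctly by the PSO-SVM.
accuracy = 48 / 50
```

By contrast, the PSO search for each SVM layer covers only the two parameters (c, δ), which is consistent with the much shorter training time reported for the PSO-SVM.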

5 Conclusions

A fault diagnosis model based on WPD energy and the PSO-SVM is proposed in this study to diagnose HVCB operating mechanism faults using vibration signals from operation experiments. The main content can be summarized as follows:

  1. The WPD energy in each frequency band of the vibration signal from the operation experiment is extracted and taken as the fault eigenvalue to form the input vector for the fault diagnosis model.

  2. A four-layer SVM classifier is employed to perform multi-class HVCB operating mechanism fault diagnosis. The PSO algorithm is applied to search for the optimal kernel parameter δ and penalty parameter c for the SVM.

  3. Comparative fault diagnosis experiments among the normal SVM, PSO-SVM and PSO-BP indicate that the PSO-SVM has better diagnostic performance for HVCB operating mechanism faults.

  4. The fault diagnosis model based on the PSO-SVM can overcome the limitations of traditional scheduled maintenance and has the best overall diagnosis performance among the compared models. It will, in turn, enhance the reliability of the power grid.