1 Introduction

To enhance the consumption of large-scale new energy and the energy supply of remote load centres, ultra-high voltage direct current (UHVDC) transmission has developed rapidly owing to its advantage in allocating energy across large regions [1]. However, once serious faults or large disturbances such as DC blocking and commutation failure occur in a weak receiving-end power grid, the balance of active and reactive power is severely disturbed. The resulting shortage of dynamic reactive power compensation causes system voltage depression and unstable power grid operation [2, 3].

To address the shortage of dynamic reactive power compensation in the UHVDC receiving-end system, the fast dynamic reactive power response and strong overload capability of the synchronous condenser (SC) can be used to reduce the probability of system voltage drop and commutation failure and to improve the anti-disturbance ability of the power grid [4,5,6]. Establishing a reactive power output model of the SC in the UHVDC converter station therefore has theoretical significance and practical value for research on its reactive power regulation ability.

Presently, SC modelling methods include mathematical analysis [7] and experimental data fitting [8]. With the mathematical analysis method, higher model accuracy requires more equivalent windings, so the order of the model's differential equations increases and the computational burden grows considerably. The experimental data fitting method starts from test data and fits curves through the measured values. Because field test conditions are limited, data cannot be obtained for all working conditions, and the practical operating conditions of the power grid differ from the test conditions, so the modelling accuracy is low.

It is assumed that the leakage reactance of each winding end of the stator and rotor of the SC is constant and that the hysteresis effect of the core is negligible. Based on Maxwell’s equations, the time-step finite element equation of the SC is then derived, and the time-step finite element model of the SC is obtained [9].

The SC parameter identification is the basis of constructing its mathematical model. Usually, a field test of the SC is carried out to obtain the relevant parameters. The authors in Ref. [10] used the online static time domain response test method for parameter identification, but the SC is a large-capacity device, and it is difficult to test the parameters online.

The above modelling methods have not considered the modelling and analysis of the SC in the UHVDC converter station. To further study the reactive power output modelling method of the SC in the UHVDC converter station, a simulation model of the SC in the UHVDC converter station is established in the PSCAD/EMTDC simulation software. The SC model is an electromagnetic transient model [11, 12].

Machine learning has been applied to the power estimation of integrated circuits [13, 14]. Because a convolutional neural network (CNN) can effectively extract data features and reduce interfering information [15], the CNN is applied to SC reactive power modelling. CNN research and development has mainly focused on improving the network structure and increasing the network depth. CNN models have been designed with various network structures, but high accuracy requires a deeper and wider network, which leads to problems such as network degradation and overfitting [16].

Long short-term memory (LSTM) can mine information from data and effectively alleviates the vanishing gradient problem [17], but its generalization ability is reduced when the training data set is not large enough [18]. The CNN-LSTM model combines the advantages of both: the CNN extracts the data features, and the LSTM learns the feature information [19, 20]. BiLSTM propagates both forwards and backwards, overcoming the insufficient information mining of a one-way LSTM neural network. The BiLSTM network learns long-term dependencies in sequences to compensate for the shortcomings of the CNN [21].

The CNN-BiLSTM is a combination of CNN and bidirectional long short-term memory (BiLSTM), and the CNN model and BiLSTM model are good at processing grid spatial data and time series data, respectively. BiLSTM is composed of two LSTM networks in reverse parallel, which can simultaneously mine time series information from the past and the future. By combining CNN and BiLSTM in series, a CNN-BiLSTM deep learning model that is good at mining spatiotemporal features can be obtained [22].

Skip connections are a classical structure commonly used in residual networks. Their main function is to alleviate the vanishing and exploding gradient problems. Skip connections are also used in other neural networks, such as CNNs, to fuse different data features; the output features of the network are fused through the skip connections [23].

In addition, the selection of hyperparameters directly affects the prediction accuracy of the model, but searching for the optimal hyperparameter values by manual trial and error is time-consuming, so an optimization algorithm is needed [24]. At present, scholars use various optimization algorithms to search for the optimum hyperparameter values. For example, the pelican optimization algorithm (POA) is applied to seek the optimal values of three hyperparameters that affect the prediction accuracy of the LSTM: the maximum number of iterations, the initial learning rate, and the number of neurons in the hidden layer [25]. The weights and bias values of the LSTM training network are optimized by a genetic algorithm (GA) [26]. The optimum initial parameters of the current model are found by the web search algorithm [27]. The Bayesian optimization algorithm (BOA) is applied to optimize the initial hyperparameters of the deep learning network to enhance the accuracy of the time series prediction model, and it performs favourably in adjusting the hyperparameters of deep learning networks [28,29,30].

This paper adopts an interlaced superposition CNN-BiLSTM to build the reactive power output model of the SC in the UHVDC converter station. It differs from the conventional CNN-BiLSTM in that the CNN is constructed by interlaced superposition of convolution units with two different structures. In particular, the branch channels of the two kinds of convolution units are connected by a convolution layer and by a direct skip connection, respectively.

The simulation results of the reactive power regulation of the SC in the UHVDC converter station, obtained with PSCAD/EMTDC, are used as the training and testing sampled data. The BOA is applied to optimize the hyperparameters of the interlaced superposition CNN-BiLSTM model. According to the performance evaluation indices, the reactive power output model of the SC in the UHVDC converter station based on the interlaced superposition CNN-BiLSTM and its best performance index values are obtained. The prediction results indicate that hyperparameter optimization improves the accuracy and fitting performance of this model.

2 Architecture of Interlaced Superposition CNN-BiLSTM Model

2.1 Interlaced Superposition CNN Model

The interlaced superposition CNN model is constructed by interlaced superposition of convolution units with two different structures, which not only increases the network depth but also passes the gradient through fast channels to avoid the vanishing gradient. The main branch of each convolution unit is composed of two two-dimensional convolution layers, two batch normalization layers, two activation layers, an addition layer, and a maximum pooling layer. As shown in Fig. 1, the branch channel of convolution units 1, 3, and 5 is connected to the addition layer through two-dimensional convolution layer 3 and batch normalization layer 3. As shown in Fig. 2, the branch channel of convolution units 2, 4, and 6 is directly connected to the addition layer.

Fig. 1 Structure diagram of convolution units 1, 3, and 5

Fig. 2 Structure diagram of convolution units 2, 4, and 6

To accelerate the training of the interlaced superposition CNN model and reduce its sensitivity to initialization, a batch normalization layer is placed after each convolution layer. The batch normalization layer normalizes each input channel across a mini-batch. No dropout layer is added.

An appropriate activation function must be selected for the activation layer. The gradient of the exponential linear unit (ELU) is nonzero for all negative inputs, which helps to alleviate the vanishing gradient problem. The rectified linear unit (ReLU), the leaky rectified linear unit (LeakyReLU), and the ELU were each applied to the interlaced superposition CNN-BiLSTM model. The comparative analysis of model performance shows that the precision of the model is highest with the ELU; therefore, the ELU is selected as the activation function of the model established in this paper.
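The difference between the three candidate activation functions for negative inputs can be illustrated with a minimal sketch. The parameter values `alpha=1.0` and `slope=0.01` are assumptions chosen for illustration; the paper does not state the values it used.

```python
import math

def relu(x):
    """Rectified linear unit: zero output for negative inputs."""
    return x if x > 0 else 0.0

def leaky_relu(x, slope=0.01):
    """Leaky ReLU: small linear response for negative inputs (slope is assumed)."""
    return x if x > 0 else slope * x

def elu(x, alpha=1.0):
    """Exponential linear unit (alpha is assumed)."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def relu_grad(x):
    """Derivative of ReLU: exactly zero for negative inputs."""
    return 1.0 if x > 0 else 0.0

def elu_grad(x, alpha=1.0):
    """Derivative of ELU: alpha * exp(x) > 0 for every negative input."""
    return 1.0 if x > 0 else alpha * math.exp(x)
```

For a negative input such as `x = -2.0`, `relu_grad` returns zero while `elu_grad` returns a small positive value, so a gradient can still flow through the ELU layer.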

The addition layer adds the output of the branch channel to that of the main branch through the fast channel. Finally, the size of maximum pooling layer 1 is 2 × 2, and its stride is 2. The pooling layer provides translation invariance, accelerates the calculation, and helps to prevent overfitting.

2.2 BiLSTM Model

Through training, the LSTM learns which information should be memorized and which forgotten, but it cannot encode information from back to front. The BiLSTM not only retains the ability of the LSTM to capture long-term dependencies but also combines context information from both directions for prediction.

The LSTM is expressed as follows.

Forget gate

$$ A_{t} = \sigma_{\text{g}} (W_{\text{a}} x_{t} + R_{\text{a}} S_{t - 1} + b_{\text{a}} ) $$
(1)

Input gate

$$ I_{t} = \sigma_{\text{g}} (W_{\text{i}} x_{t} + R_{\text{i}} S_{t - 1} + b_{\text{i}} ) $$
(2)

Output gate

$$ O_{t} = \sigma_{\text{g}} (W_{\text{e}} x_{t} + R_{\text{e}} S_{t - 1} + b_{\text{e}} ) $$
(3)

Candidate unit

$$ g_{t} = \sigma_{\text{c}} (W_{\text{g}} x_{t} + R_{\text{g}} S_{t - 1} + b_{\text{g}} ) $$
(4)

Cell state

$$ C_{t} = C_{t - 1} A_{t} + I_{t} g_{t} $$
(5)

where Wa, Wi, We, and Wg are the input weights of the forget gate, input gate, output gate, and candidate unit, respectively; Ra, Ri, Re, and Rg are the corresponding recurrent weights; ba, bi, be, and bg are the corresponding biases; σg is the gate activation function, namely the sigmoid function; σc is the state activation function, namely the tanh function; xt is the input value at time t; and St − 1 is the hidden state at time t − 1.
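Equations (1) to (5) can be traced in a minimal sketch of a single LSTM time step. The function names and the `params` layout are hypothetical. The hidden-state update in the last step is not given in the equations above; the sketch assumes the standard LSTM form, in which the new hidden state is the output gate multiplied elementwise by the tanh of the cell state.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, x, R, s, b):
    """Return W x + R s + b for list-of-lists matrices and list vectors."""
    return [
        sum(w * xi for w, xi in zip(w_row, x))
        + sum(r * si for r, si in zip(r_row, s))
        + b_k
        for w_row, r_row, b_k in zip(W, R, b)
    ]

def gate(p, x_t, s_prev, act):
    """Apply one gate with parameters p = (W, R, b) and activation act."""
    W, R, b = p
    return [act(z) for z in affine(W, x_t, R, s_prev, b)]

def lstm_step(x_t, s_prev, c_prev, params):
    """One LSTM time step; params maps "a", "i", "e", "g" to (W, R, b)."""
    A = gate(params["a"], x_t, s_prev, sigmoid)    # Eq. (1), forget gate
    I = gate(params["i"], x_t, s_prev, sigmoid)    # Eq. (2), input gate
    O = gate(params["e"], x_t, s_prev, sigmoid)    # Eq. (3), output gate
    g = gate(params["g"], x_t, s_prev, math.tanh)  # Eq. (4), candidate unit
    # Eq. (5), elementwise cell-state update
    c_t = [c * a + i * gg for c, a, i, gg in zip(c_prev, A, I, g)]
    # Assumption: standard LSTM hidden state, not stated in Eqs. (1)-(5)
    s_t = [o * math.tanh(c) for o, c in zip(O, c_t)]
    return s_t, c_t
```

With all weights and biases set to zero, every gate evaluates to 0.5 and the candidate unit to 0, so the cell state is simply halved at each step, which gives a quick sanity check of the update rule.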

The BiLSTM is composed of a forward LSTM and a backward LSTM. The inputs x1, x2, …, xn are fed to the forward LSTM in forward order and to the backward LSTM in reverse order for feature extraction, as shown in Fig. 3. The two output feature vectors are concatenated to form the final feature expression, so that the feature data obtained at time t contain both past and future information.

Fig. 3 BiLSTM structure diagram
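The bidirectional wiring described above can be sketched independently of the cell itself. In the following minimal example, `run_direction` and `bidirectional` are hypothetical helper names, and a simple running-sum step stands in for the LSTM cell so that the flow of past and future information is easy to follow; it is not the cell used in the paper.

```python
def run_direction(step, inputs, state):
    """Run a recurrent step function over a sequence and collect the outputs."""
    outputs = []
    for x in inputs:
        out, state = step(x, state)
        outputs.append(out)
    return outputs

def bidirectional(step_fwd, step_bwd, inputs, init_fwd, init_bwd):
    """Concatenate forward-order and reverse-order features at each time step."""
    fwd = run_direction(step_fwd, inputs, init_fwd)
    # Process the reversed sequence, then flip the outputs back to time order
    bwd = run_direction(step_bwd, inputs[::-1], init_bwd)[::-1]
    return [f + b for f, b in zip(fwd, bwd)]

def running_sum_step(x, state):
    """Toy stand-in for an LSTM cell: the state accumulates the inputs seen so far."""
    new_state = state + x
    return [new_state], new_state

features = bidirectional(running_sum_step, running_sum_step, [1, 2, 3], 0, 0)
print(features)  # [[1, 6], [3, 5], [6, 3]]
```

At each time step, the first entry summarizes the inputs up to that step and the second entry summarizes the inputs from that step to the end of the sequence, which mirrors how the concatenated BiLSTM feature carries both past and future information.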

2.3 Structure Diagram of the Interlaced Superposition CNN-BiLSTM Model

As shown in Fig. 4, the interlaced superposition CNN model is constructed by alternately stacking convolution units 1, 3, and 5 with convolution units 2, 4, and 6, and it is then combined with the BiLSTM model to form the interlaced superposition CNN-BiLSTM model; its structural parameters are shown in Table 1. Each convolution unit contains two-dimensional convolution layers, batch normalization layers, ELU layers, and a max pooling layer, and passes the gradient through a fast channel.

Fig. 4 Structure diagram of the interlaced superposition CNN-BiLSTM model

Table 1 Architecture parameters of the interlaced superposition CNN-BiLSTM model

The interlaced superposition CNN-BiLSTM model includes 67 layers. The main branches are linked in sequence, and the input sampled data are loaded into the model through the sequence input layer. The sequence folding layer allows the convolution operations to be applied independently to each time step of the sample data for feature extraction. Convolution units 1 to 6 then automatically extract effective features from the input sampled data to form the feature vector.

Convolution unit 1: the filter size of convolution layer 1 and convolution layer 2 is 2 × 1, the depth is 64, and the strides are 2 and 1, respectively. The filter size of convolution layer 3 is 1 × 1, the depth is 64, and the stride is 2.

Convolution unit 2: the filter size of convolution layer 1 and convolution layer 2 is 2 × 1, the depth is 64, and the stride is 1.

Convolution unit 3: the filter size of convolution layer 1 and convolution layer 2 is 2 × 1, the depth is 128, and the strides are 2 and 1, respectively. The filter size of convolution layer 3 is 1 × 1, the depth is 128, and the stride is 2.

Convolution unit 4: the filter size of convolution layer 1 and convolution layer 2 is 2 × 1, the depth is 128, and the stride is 1.

Convolution unit 5: the filter size of convolution layer 1 and convolution layer 2 is 2 × 1, the depth is 256, and the strides are 2 and 1, respectively. The filter size of convolution layer 3 is 1 × 1, the depth is 256, and the stride is 2.

Convolution unit 6: the filter size of convolution layer 1 and convolution layer 2 is 2 × 1, the depth is 256, and the strides are 2 and 1, respectively.

In each of convolution units 1 to 6, the size of maximum pooling layer 1 is 2 × 2, and its stride is 2. Zero padding is used, and the convolution kernels perform the convolution operation along the time axis.

The convolution unit 6 is followed by a convolution layer, a batch normalization layer, and an ELU layer. The filter size of the convolution layer is 2 × 1, the depth is 256, and the stride is 1.

The activation layer is followed by a maximum pooling layer and an average pooling layer, both of which are connected to the addition layer. Each pooling layer has a size of 3 × 3 and a stride of 1. The addition layer is connected to the sequence unfolding layer, which restores the sequence structure removed by the sequence folding layer.

The sequence folding layer is directly connected to the sequence unfolding layer through a fast channel, namely a skip connection. A flatten layer is placed after the sequence unfolding layer so that the BiLSTM layer can learn from the resulting vector sequence. The BiLSTM layer is followed by a fully connected layer with an output size of 1. The final layer is the regression output layer.

3 Construction Method for the Interlaced Superposition CNN-BiLSTM Model

The PSCAD/EMTDC simulation software is used to build the simulation model of the UHVDC system with the SC. The excitation voltage and excitation current of the SC are used as the inputs of the training and testing sampled data, and the reactive power of the SC is used as the output. The reactive power output model of the SC in the UHVDC converter station is established by combining the interlaced superposition CNN model with the BiLSTM model. The overall modelling flow is shown in Fig. 5.

Fig. 5 Flow chart of the reactive power output modelling of the SC in the UHVDC converter station based on interlaced superposition CNN-BiLSTM

First, the simulation model of the SC in the UHVDC system is established in PSCAD/EMTDC, and the simulation results of the reactive power regulation of the SC in the UHVDC converter station are used as the training and testing sampled data. The excitation voltage and excitation current of the SC are taken as the input sampled data, and the SC reactive power output is taken as the output sampled data. Then, the training and testing samples are pre-processed by normalization; that is, the sample values are mapped to the interval [− 1, 1].

The normalized sample data are loaded into the interlaced superposition CNN model to extract data features. The information processed by the interlaced superposition CNN model is passed to the BiLSTM model, where the data features are learned. The RMSE of the interlaced superposition CNN-BiLSTM model is taken as the objective function, and the BOA is applied to optimize the hyperparameters of the model. The model is then trained and tested. If the maximum number of BOA iterations has not been reached, the hyperparameter optimization continues.

Then, the performance indices of the interlaced superposition CNN-BiLSTM model are evaluated to determine whether the required performance is reached. After comparative analysis, the test results with the best performance indices are selected to obtain the reactive power output model of the SC in the UHVDC converter station based on the interlaced superposition CNN-BiLSTM and its optimal performance index values.

3.1 Selection of Input and Output Signals

To maintain the voltage stability of the UHVDC system, the excitation voltage and excitation current of the SC are changed so that it operates under three working conditions (late-phase operation, no-load operation, and leading-phase operation), and the reactive power output of the SC in the UHVDC converter station is obtained under each condition. The reactive power output of the SC is a vital indicator for assessing its operating performance, and the reactive power regulation of the SC is an important means of maintaining busbar voltage stability in UHVDC systems.

Therefore, in the reactive power output model based on interlaced superposition CNN-BiLSTM, the excitation voltage and excitation current of the SC are selected as the input sampled data, and the reactive power output of the SC in the UHVDC converter station is selected as the output sampled data.

3.2 Pre-processing of Sampled Data

Because the sampled data selected in this paper differ in units and orders of magnitude, the training and testing sample data must be pre-processed. The selected training and testing sampled data are normalized to the range [Gmin, Gmax] = [− 1, 1] as follows:

$$ G = \frac{{(G_{\max } - G_{\min } )(R - R_{\min } )}}{{R_{\max } - R_{\min } }} + G_{\min } $$
(6)

where R is the selected sample data, Rmax and Rmin are the maximum and minimum values of the selected sample data, respectively, G is the normalized value of the selected sample data, and Gmax and Gmin are the upper and lower limits of the normalized interval [Gmin, Gmax], respectively.
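Equation (6) can be illustrated with a minimal sketch. The function names are hypothetical, and the sample values in the usage note are invented for illustration rather than taken from Table 4. An inverse mapping is included because predictions made in the normalized range have to be mapped back to physical units before errors in MVar can be reported.

```python
def minmax_scale(values, g_min=-1.0, g_max=1.0):
    """Map values linearly onto [g_min, g_max] following Eq. (6)."""
    r_min, r_max = min(values), max(values)
    if r_max == r_min:
        raise ValueError("all samples are equal; Eq. (6) is undefined")
    scale = (g_max - g_min) / (r_max - r_min)
    return [scale * (r - r_min) + g_min for r in values]

def minmax_invert(scaled, r_min, r_max, g_min=-1.0, g_max=1.0):
    """Map normalized values back to the original range [r_min, r_max]."""
    scale = (r_max - r_min) / (g_max - g_min)
    return [scale * (g - g_min) + r_min for g in scaled]
```

For example, the hypothetical samples `[10.0, 15.0, 20.0]` are mapped to approximately `[-1.0, 0.0, 1.0]`, and `minmax_invert` with `r_min=10.0` and `r_max=20.0` recovers the original values.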

3.3 Hyperparameter Selection and Optimization

Compared with the root mean square propagation (RMSProp) algorithm and the stochastic gradient descent (SGD) algorithm, the adaptive moment estimation (Adam) optimization algorithm gives better output results, so Adam is used to update the network weights.

A gradient threshold is set to keep training stable at higher learning rates and in the presence of outliers. If the gradient exceeds the threshold during an update, it is clipped to the threshold to prevent gradient explosion. In this paper, the gradient threshold is 1.
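A minimal sketch of gradient clipping with a threshold of 1 is given below. The paper does not state whether the threshold is applied to the norm of the gradient or to each component, so clipping by the L2 norm is an assumption made here for illustration, and the function name is hypothetical.

```python
import math

def clip_by_l2_norm(grad, threshold=1.0):
    """Rescale grad so that its L2 norm does not exceed threshold.

    Assumption: norm-based clipping; the source only says the gradient
    is restricted when it exceeds the threshold.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= threshold:
        return list(grad)
    return [g * threshold / norm for g in grad]
```

A gradient such as `[3.0, 4.0]`, whose norm is 5, is rescaled to approximately `[0.6, 0.8]`, while a gradient whose norm is already below the threshold is returned unchanged. Rescaling preserves the direction of the update and only limits its size.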

The number of samples per batch (batch size) must also be set. If the batch size is too small, training may be difficult to converge; if it is too large, training easily falls into a local optimum. The adopted approach is to start from a baseline of 128 and either double the batch size upwards or halve it downwards. The prediction results show that halving gives better results, and the appropriate batch size is selected accordingly.

The BOA is applied to search for the optimum values of three hyperparameters of the interlaced superposition CNN-BiLSTM model: the initial learning rate, the number of neurons in the hidden layer, and the L2 regularization coefficient. The BOA builds a surrogate function from the past evaluations of the objective function and uses it to search for the minimum objective function value. Each iteration updates the Gaussian process model, uses the model to select a new group of hyperparameters, and evaluates the objective function with the new hyperparameters. After the final iteration, the hyperparameters that gave the minimum objective function value over all iterations are selected as the optimal values.
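The iteration described above (update the Gaussian process, select new hyperparameters, evaluate the objective) can be sketched for a single hyperparameter scaled to the unit interval. This is a simplified illustration under several assumptions that are not taken from the paper: a squared-exponential kernel with a fixed length scale, the expected improvement acquisition function, a grid search over candidates, and a hypothetical objective supplied by the caller. The paper optimizes three hyperparameters jointly and uses the RMSE of the trained network as the objective.

```python
import math

def rbf(a, b, length=0.2):
    """Squared-exponential kernel on the unit interval (length scale assumed)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def gp_posterior(xs, ys, x_new, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new) for a in xs]
    mean = sum(k * w for k, w in zip(k_star, solve(K, ys)))
    var = rbf(x_new, x_new) - sum(k * v for k, v in zip(k_star, solve(K, k_star)))
    return mean, math.sqrt(max(var, 1e-12))

def expected_improvement(mean, std, best):
    """Expected improvement over the best observed value (minimization)."""
    z = (best - mean) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best - mean) * cdf + std * pdf

def bayes_opt(objective, n_init=3, n_iter=10, n_grid=101):
    """Minimize objective(u) for u in [0, 1] with a GP surrogate."""
    grid = [i / (n_grid - 1) for i in range(n_grid)]
    xs = [(i + 0.5) / n_init for i in range(n_init)]  # evenly spread start points
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        y_mean = sum(ys) / len(ys)
        centred = [y - y_mean for y in ys]  # centre targets for the zero-mean GP
        best = min(centred)

        def score(x):
            mean, std = gp_posterior(xs, centred, x)
            return expected_improvement(mean, std, best)

        x_next = max((x for x in grid if x not in xs), key=score)
        xs.append(x_next)
        ys.append(objective(x_next))
    i_best = min(range(len(ys)), key=ys.__getitem__)
    return xs[i_best], ys[i_best]
```

In practice the unit-interval value would be mapped to the actual search range inside the objective, for example onto a logarithmic scale for the initial learning rate, and the objective would train the network and return its RMSE. A toy call such as `bayes_opt(lambda u: (u - 0.3) ** 2)` returns the best point found and its objective value after ten iterations.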

3.4 Model Performance Evaluation Indicators

The performance indicators used to evaluate the reactive power output model of the SC in the UHVDC converter station based on interlaced superposition CNN-BiLSTM are as follows:

Mean square error (MSE):

$$ {\text{MSE}} = \frac{1}{N}\sum\limits_{i = 1}^{N} {\left( {T_{i} - \hat{T}_{i} } \right)}^{2} $$
(7)

Root mean square error (RMSE):

$$ {\text{RMSE}} = \sqrt {\frac{1}{N}\sum\limits_{i = 1}^{N} {\left( {T_{i} - \hat{T}_{i} } \right)}^{2} } $$
(8)

Normalized root mean square error (NRMSE):

$$ {\text{NRMSE}} = \frac{1}{{\overline{T}}}\sqrt {\frac{1}{N}\sum\limits_{i = 1}^{N} {\left( {T_{i} - \hat{T}_{i} } \right)}^{2} } $$
(9)

The smaller the MSE, RMSE, and NRMSE, the better the performance of the model.

Mean absolute error (MAE):

$$ {\text{MAE}} = \frac{1}{N}\sum\limits_{i = 1}^{N} {\left| {T_{i} - \hat{T}_{i} } \right|} $$
(10)

The smaller the MAE, the lower the error of the model.

Mean absolute percentage error (MAPE):

$$ {\text{MAPE}} = \frac{1}{N}\sum\limits_{i = 1}^{N} {\frac{{\left| {T_{i} - \hat{T}_{i} } \right|}}{{T_{i} }}} \times 100\% $$
(11)

The smaller the MAPE, the higher the accuracy and the better the fitting performance of the model.

Determination coefficient (R2):

$$ R^{2} = 1 - \frac{{\sum\nolimits_{i = 1}^{N} {\left( {T_{i} - \hat{T}_{i} } \right)^{2} } }}{{\sum\nolimits_{i = 1}^{N} {\left( {T_{i} - \overline{T}} \right)^{2} } }} $$
(12)

Corrected determination coefficient (R2_adjusted):

$$ R^{2} \_{\text{adjusted}} = 1 - \frac{{(1 - R^{2} )(N - 1)}}{N - p - 1} $$
(13)

The closer R2 and R2_adjusted are to 1, the better the fitting effect of the model. Here, N is the number of testing sampled data, \(\hat{T}_{i}\) and \(T_{i}\) are the predicted value and the sampled value of the ith testing sampled data, respectively, \(\overline{T}\) is the mean value of the testing sampled data, and p is the number of input features of the sampled data.
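The seven indicators in Eqs. (7) to (13) can be computed together, as in the following minimal sketch. The function name is hypothetical, and the values in the usage note are invented for illustration and are not taken from the tables of this paper.

```python
import math

def regression_metrics(t_true, t_pred, p):
    """Return the indicators of Eqs. (7)-(13) for sampled and predicted values.

    p is the number of input features (two in this paper: excitation
    voltage and excitation current).
    """
    n = len(t_true)
    t_mean = sum(t_true) / n
    sq_err = [(t - y) ** 2 for t, y in zip(t_true, t_pred)]
    abs_err = [abs(t - y) for t, y in zip(t_true, t_pred)]
    mse = sum(sq_err) / n                                  # Eq. (7)
    rmse = math.sqrt(mse)                                  # Eq. (8)
    nrmse = rmse / t_mean                                  # Eq. (9)
    mae = sum(abs_err) / n                                 # Eq. (10)
    # Eq. (11) divides by T_i as written, so zero or negative samples need care
    mape = 100.0 * sum(e / t for e, t in zip(abs_err, t_true)) / n
    ss_tot = sum((t - t_mean) ** 2 for t in t_true)
    r2 = 1.0 - sum(sq_err) / ss_tot                        # Eq. (12)
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)      # Eq. (13)
    return {"MSE": mse, "RMSE": rmse, "NRMSE": nrmse, "MAE": mae,
            "MAPE": mape, "R2": r2, "R2_adjusted": r2_adj}
```

For the hypothetical samples `[100, 200, 300, 400]` with predictions `[110, 190, 310, 390]` and `p = 2`, every error has a magnitude of 10, so the MSE is 100, the RMSE and MAE are 10, the NRMSE is 0.04, R2 is 0.992, and R2_adjusted is 0.976. Because Eq. (13) divides by N − p − 1, the adjusted coefficient penalizes a small test set more strongly than R2 does.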

4 Results and Analysis

Based on a domestic ± 800 kV/10,000 MW UHVDC project with a TTS-300-2 type 300 MVA double water-cooled non-salient pole SC, a simulation model of the ± 800 kV/10,000 MW UHVDC system with the SC was built in the PSCAD/EMTDC simulation software, as shown in Fig. 6. The main parameters of the ± 800 kV/10,000 MW UHVDC system are shown in Table 2, and the main performance parameters of the SC are shown in Table 3.

Fig. 6 Simulation model of the ± 800 kV/10,000 MW UHVDC system with the SC

Table 2 Main parameters of the ± 800 kV/10,000 MW UHVDC system
Table 3 Main parameters of the SC

In Fig. 6, the excitation voltage Uf of the SC is changed by the excitation system of the SC to obtain the reactive power output of the SC in the UHVDC converter station under three different operating conditions: late-phase operation, no-load operation, and leading-phase operation. The excitation voltage, excitation current, and reactive power of the SC under these three operating conditions are used as the training and testing sampled data, as shown in Table 4. The excitation voltage and excitation current of the SC are used as input samples, and the reactive power of the SC is used as the output sample. The sampled data with serial numbers 2, 6, 11, 14, 20, 26, 30, 36, and 42 in Table 4 are taken as the testing sampled data, and the remaining data are taken as the training sampled data.

Table 4 Training samples and testing samples

First, the excitation voltage and excitation current of the SC in the ± 800 kV/10,000 MW UHVDC converter station under the three operating conditions of late-phase operation, no-load operation, and leading-phase operation are taken as the input samples, and its reactive power is taken as the output sample. The input and output sampled data are pre-processed and then loaded into the interlaced superposition CNN-BiLSTM model.

The gradient threshold is set at 1, the learning rate drop period is set at 225, the learning rate drop factor is set at 0.5, and the batch size is set at 64.

The maximum number of iterations of the BOA is set at 10. The optimization range of the initial learning rate is [5 × 10⁻³, 1], that of the number of hidden units is [50, 150], and that of the L2 regularization coefficient is [1 × 10⁻¹⁰, 1 × 10⁻²]. The optimized hyperparameters are an initial learning rate of 0.0056, 50 hidden units, and an L2 regularization coefficient of 1.0437 × 10⁻¹⁰.

The BOA is used to optimize the reactive power output model of the SC in the UHVDC converter station based on interlaced superposition CNN-BiLSTM. The range and results of optimization are shown in Table 5.

Table 5 Optimization results and search range of hyperparameters

The hyperparameter optimization process of the SC in the UHVDC converter station reactive power output model based on the interlaced superposition CNN-BiLSTM is shown in Fig. 7.

Fig. 7 Hyperparameter searching process

In Fig. 7, from the third iteration onwards, the current calculated minimum objective function value and the current estimated minimum objective function value both decrease rapidly and substantially. The training samples are used to train the interlaced superposition CNN-BiLSTM model, and the resulting RMSE of the reactive power output is taken as the objective function. The results show that optimizing the hyperparameters with the BOA enhances the convergence speed and accuracy.

The network is updated with the best hyperparameters found by the optimization, and the generalization ability of the trained network is verified with the testing samples. The MSE, RMSE, NRMSE, MAE, MAPE, R2, and R2_adjusted performance indicators are used to evaluate the testing results. The best reactive power output model of the SC in the UHVDC converter station based on the interlaced superposition CNN-BiLSTM is obtained through a comparative study.

4.1 Verification of the Generalization Ability of the SC Reactive Power Output Model Based on Interlaced Superposition CNN-BiLSTM

To verify the generalization capability of the reactive power output model of the SC in the UHVDC converter station based on the interlaced superposition CNN-BiLSTM, the absolute error between the sampled and predicted values is examined, as shown in Table 6.

Table 6 The generalization ability verification of the interlaced superposition CNN-BiLSTM model

Table 6 shows that the absolute error between the sampled and predicted reactive power output values of the interlaced superposition CNN-BiLSTM model ranges from 0.0049 to 0.1801 MVar. The absolute error is low, and its overall fluctuation range is small. The results indicate that the reactive power output model of the SC in the UHVDC converter station based on interlaced superposition CNN-BiLSTM has high precision and favourable generalization capability.

Figures 8 and 9 show the output regression fitting diagram and the fitting effect diagram of the reactive power output model of the SC in the UHVDC converter station based on the interlaced superposition CNN-BiLSTM.

Fig. 8 Regression fitting diagram of the reactive power output model of the SC in the UHVDC converter station based on interlaced superposition CNN-BiLSTM

Fig. 9 The fitting effect graph of the interlaced superposition CNN-BiLSTM model

In Figs. 8 and 9, the error between the sampled and predicted reactive power output values of the SC in the UHVDC converter station is low; therefore, the reactive power output model of the SC in the UHVDC converter station based on the interlaced superposition CNN-BiLSTM has satisfactory output regression fitting properties.

4.2 Performance Comparison of the SC Reactive Power Output Model Based on the Interlaced Superposition CNN-BiLSTM with and Without Hyperparameter Optimization

The performance of the SC reactive power output model based on the interlaced superposition CNN-BiLSTM with and without hyperparameter optimization is compared and analysed (see Table 7). The comparison of the predicted value errors of the SC reactive power output model based on the interlaced superposition CNN-BiLSTM with and without hyperparameter optimization is shown in Fig. 10.

Table 7 Performance comparison of the SC reactive power output model based on interlaced superposition CNN-BiLSTM with and without hyperparameter optimization

Table 7 shows that the MSE, RMSE, NRMSE, MAE, and MAPE of the reactive power output model of the SC in the UHVDC converter station based on the interlaced superposition CNN-BiLSTM with hyperparameter optimization are smaller than those without hyperparameter optimization.

The R2 and R2_adjusted of the model with hyperparameter optimization are closer to 1 than those without hyperparameter optimization.

In Fig. 10, the error of the predicted SC reactive power output with hyperparameter optimization lies within ± 0.2 MVar, which is narrower than the ± 1.5 MVar range without hyperparameter optimization, and its overall fluctuation is small. The prediction results indicate that the reactive power output model of the SC in the UHVDC converter station based on the interlaced superposition CNN-BiLSTM with hyperparameter optimization achieves better output regression fitting properties and higher precision.

Fig. 10 Predicted value error of the SC reactive power output model based on the interlaced superposition CNN-BiLSTM with and without hyperparameter optimization

4.3 Performance Comparison of the SC Reactive Power Output of Model 1, Model 2, …, Model 6 and the Interlaced Superposition CNN-BiLSTM Model

To further verify that the network structure of the reactive power output model of the SC in the UHVDC converter station based on interlaced superposition CNN-BiLSTM improves the generalization performance, the model is compared with Model 1, Model 2, …, and Model 6 in terms of the MSE, RMSE, NRMSE, MAE, MAPE, R2, and R2_adjusted performance indicators, as shown in Table 8. The comparison of the reactive power output errors of the SC in the UHVDC converter station for Model 1, Model 2, …, Model 6, and the interlaced superposition CNN-BiLSTM model is shown in Fig. 11.

Table 8 Performance comparison of the SC reactive power output of Model 1, Model 2, …, Model 6, and the interlaced superposition CNN-BiLSTM model
Fig. 11 Predicted value error of the SC reactive power output of Model 1, Model 2, …, Model 6, and the interlaced superposition CNN-BiLSTM model

Model 1 is a CNN model composed of six convolution units with the same structure. Each of the six convolution units of Model 1 adopts the structure of convolution units 1, 3, and 5 of the interlaced superposition CNN-BiLSTM model.

Model 2 is a CNN model composed of six convolution units with the same structure. Each of the six convolution units of Model 2 adopts the structure of convolution units 2, 4, and 6 of the interlaced superposition CNN-BiLSTM model.

Model 3 is composed of Model 1 and the BiLSTM model. Model 4 is composed of Model 2 and the BiLSTM model.

Model 5 is the interlaced superposition CNN model. Model 6 is composed of the interlaced superposition CNN model and the LSTM model.

The structural parameters of Model 1, Model 2, …, Model 6 are the same as those of the interlaced superposition CNN-BiLSTM model. The hyperparameters of Model 1, Model 2, Model 3, and the interlaced superposition CNN-BiLSTM model are optimized by the BOA.
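The composition of the seven compared models can be summarised in a small Python sketch, which makes it easier to see which structural element each comparison isolates. The class `ModelSpec`, the labels "A" (structure of convolution units 1, 3, and 5) and "B" (structure of convolution units 2, 4, and 6), and the helper `differing_fields` are hypothetical names introduced only for illustration. No layer parameters are implied.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ModelSpec:
    name: str
    # Structure label of each of the six convolution units, in order.
    # "A" = structure of units 1, 3, 5; "B" = structure of units 2, 4, 6.
    conv_units: Tuple[str, ...]
    # Recurrent part appended after the CNN: None, "LSTM" or "BiLSTM".
    recurrent: Optional[str]


UNIFORM_A = ("A",) * 6       # six identical units of the first structure
UNIFORM_B = ("B",) * 6       # six identical units of the second structure
INTERLACED = ("A", "B") * 3  # the two structures alternate: A, B, A, B, A, B

MODELS = [
    ModelSpec("Model 1", UNIFORM_A, None),
    ModelSpec("Model 2", UNIFORM_B, None),
    ModelSpec("Model 3", UNIFORM_A, "BiLSTM"),
    ModelSpec("Model 4", UNIFORM_B, "BiLSTM"),
    ModelSpec("Model 5", INTERLACED, None),
    ModelSpec("Model 6", INTERLACED, "LSTM"),
    ModelSpec("Interlaced superposition CNN-BiLSTM", INTERLACED, "BiLSTM"),
]


def differing_fields(a: ModelSpec, b: ModelSpec):
    """Return the structural fields in which two model specifications differ."""
    return [f for f in ("conv_units", "recurrent") if getattr(a, f) != getattr(b, f)]


if __name__ == "__main__":
    proposed = MODELS[-1]
    for spec in MODELS[:-1]:
        print(spec.name, "differs from the proposed model in:",
              differing_fields(spec, proposed))
```

Read this way, Model 5 and Model 6 each differ from the proposed model only in the recurrent part, Model 3 and Model 4 differ only in the arrangement of the convolution units, and Model 1 and Model 2 differ in both.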

Table 8 shows that the MSE, RMSE, NRMSE, MAE, and MAPE of the reactive power output model of the SC in the UHVDC converter station based on the interlaced superposition CNN-BiLSTM are smaller than those of Model 1, Model 2, …, and Model 6. R2 and R2_adjusted of the model based on the interlaced superposition CNN-BiLSTM are closer to 1 than those of Model 1, Model 2, …, and Model 6.

The MSE, RMSE, NRMSE, MAE, and MAPE of Model 5 are smaller than those of Model 1 and Model 2. The results show that constructing Model 5 by interlaced superposition of convolution units with two different structures effectively enhances the model accuracy and generalization ability.

The MSE, RMSE, NRMSE, MAE, and MAPE of Model 3 and Model 4 are smaller than those of Model 1 and Model 2. The results show that the CNN-BiLSTM model further improves model accuracy and generalization ability compared with the CNN model.

Relative to the interlaced superposition CNN-LSTM combination in Model 6, the interlaced superposition CNN-BiLSTM model combines the interlaced superposition CNN model with the BiLSTM model, which enhances its accuracy and generalization ability.

In Fig. 11, the SC reactive power output predicted value error range is ± 0.2 MVar for the interlaced superposition CNN-BiLSTM model, − 0.1 to 0.7 MVar for Model 1 and Model 2, − 0.5 to 0.6 MVar for Model 3 and Model 4, − 0.2 to 0.3 MVar for Model 5, and − 0.2 to 0.4 MVar for Model 6.

Compared with Model 1, Model 2, …, and Model 6, the interlaced superposition CNN-BiLSTM model not only has a lower SC reactive power output error but also a smaller overall fluctuation range, which indicates higher accuracy and generalization ability.
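The error ranges quoted from Fig. 11 correspond to the smallest and largest prediction residuals over the test samples. A minimal Python sketch of this calculation follows. The sign convention (predicted minus actual) and the function names `error_range` and `fluctuation_width` are assumptions made for illustration, and the numbers in the usage example are synthetic, not values from the paper.

```python
def error_range(y_true, y_pred):
    """Return (lowest, highest) prediction error, in the unit of the inputs.

    Assumption: error = predicted value - actual value.
    """
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    errors = [p - t for t, p in zip(y_true, y_pred)]
    return min(errors), max(errors)


def fluctuation_width(y_true, y_pred):
    """Return the width of the error band (highest minus lowest error)."""
    low, high = error_range(y_true, y_pred)
    return high - low


if __name__ == "__main__":
    # Synthetic reactive power values in MVar, for illustration only.
    actual = [10.0, 20.0, 30.0]
    predicted = [10.5, 19.75, 30.0]
    low, high = error_range(actual, predicted)
    print(f"error range: {low} to {high} MVar")
    print(f"fluctuation width: {fluctuation_width(actual, predicted)} MVar")
```

A model whose band is both narrow and centred near zero, as reported for the interlaced superposition CNN-BiLSTM model, has small errors with little fluctuation across samples.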

5 Conclusion

In this paper, the reactive power output model of the SC in the UHVDC converter station based on interlaced superposition CNN-BiLSTM is proposed.

One novelty of the interlaced superposition CNN-BiLSTM is that the interlaced superposition CNN, composed of convolution units with two different structures, is built to avoid vanishing gradients and overfitting. In particular, the branch channels of the convolution units with the two different structures are connected by a convolution layer and a skip connection, respectively. The other novelty is the combination of the interlaced superposition CNN and the BiLSTM, which further improves the model accuracy and generalization ability.

Through hyperparameter optimization by the BOA, the RMSE of the reactive power output model of the SC in the UHVDC converter station based on the interlaced superposition CNN-BiLSTM is reduced by 57.91%. Compared with six models of different structures, the reactive power output model of an SC in a UHVDC converter station based on interlaced superposition CNN-BiLSTM performs well, with a low RMSE of 0.126750 and a high R2 of 0.999999.

Because the application of an SC in a UHVDC converter station is a newly developing technology, it has so far been difficult to obtain data for the various actual operating conditions of an SC in a UHVDC converter station. Therefore, the sample data are obtained by establishing a simulation model of the ± 800 kV/10,000 MW UHVDC system with the SC in PSCAD. In future work, as more UHVDC projects come into operation, data for the various actual operating conditions of an SC in a UHVDC converter station can be obtained through long-term accumulation.