1 Introduction

State estimation (SE) is a fundamental module in energy management systems (EMSs) and its key task is to provide estimates of state variables which are as accurate as possible. For distribution networks, a robust and efficient distribution state estimator assists in integrated operation with distributed energy resources, assures power quality levels, and improves the reliability of a power system [1]. A number of distribution system state estimation (DSSE) methodologies have been proposed based on different state variables, treatments for load data and bad data, and measurements. Weighted least squares (WLS) estimators form the basis of the most popular methods. To suppress the influence of bad data, some robust estimators, such as least median of squares (LMS) estimators [2], weighted least absolute value (WLAV) estimators [3, 4], and M-estimators [5], have been proposed. To mitigate the effect of leverage points and improve the applicability of WLAV, a WLAV estimation with optimal transformations (WLAV-OT) that systematically solves the problem of computing rotation angles and scaling factors was proposed [6]. Most robust estimators consider the measurement residual as a whole and minimize the value of a penalty function of residuals. However, the residuals usually consist of two parts: observation noise and abnormally large measurement errors caused by bad data, which obey different distributions. The errors caused by bad data are normally sparse. Therefore, based on the theory of compressed sensing (CS), some sparse recovery models [7, 8] considering this point have been proposed to detect bad data. An L1-relaxation (L1-R) model constraining the sparse vector by the L1-norm instead of the L0-norm, which is used to denote the number of non-zero values in a sparse vector, was proposed [8]. Relaxing the L0-norm problem to the L1-norm problem is a common practice in CS [9]. The effectiveness of the L1-R model has been demonstrated both mathematically and practically.
However, it has been reported that the relaxation process often leads to sub-optimal solutions [10]. To handle this issue mathematically, a multi-stage convex relaxation (Capped-L1) method was proposed [10]. We first introduced this method for SE in a transmission power system in our previous work [11], and found that the Capped-L1 method has an advantage in precision but not in computational speed, because iterations are required and because the optimization problem contains both the L1-norm and the L2-norm, which makes it harder to solve than WLS or WLAV. Indeed, it is easier to solve a pure quadratic optimization problem than a mixed problem of square values and absolute values [12]. In this paper, the original Capped-L1 model is transformed into a quadratic optimization model, which significantly improves the efficiency of the revised Capped-L1 (R-Capped-L1) model.

Recently, to improve the monitoring of distribution system operating conditions, some utilities have started installing phasor measurement units (PMUs) at the distribution level. PMUs provide measurements of voltage phasors and current phasors at high reporting rates. Combining measurements from both remote terminal units (RTUs) and PMUs can improve the ability to perform state forecasting for the distribution network [13]. The inclusion of branch current and voltage measurements helps to improve the precision but also introduces additional components to the solution procedure [14]. A novel branch current-based SE has been proposed [15], with a two-stage solution. In [16], the active and reactive power measurements were transformed to linear complex current measurements based on estimated phase angle and voltage. References [17] and [18] formulated the multi-source measurement SE problem by extending the state variables. Reference [19] combines the estimates independently obtained from supervisory control and data acquisition (SCADA)-based and PMU-based estimators based on multisensor data fusion theory. They all showed good performance in handling branch current measurements. However, owing to the repeated factorization of the Jacobian matrix, these methods suffer from a heavy computational burden. To resolve this, a fast decoupled power flow (FDPF) method for distribution networks, based on choosing a complex base voltage and adjusting the R/X ratio, was also proposed [20, 21]. Its efficiency has been demonstrated in a large number of case studies. The computational speed of DSSE is expected to improve when this method is introduced.

In this paper, we introduce a robust Capped-L1 model into DSSE. To reduce the computational burden, we transform the original model to a quadratic form and apply a novel fast decoupled state estimation (FDSE) method to formulate the three-phase SE problem with hybrid measurements. The branch current measurements are formulated as the branch active power and reactive power losses, allowing them to be incorporated into the FDSE model. Thus, the contributions of this paper can be summarized as follows:

  1) The Capped-L1 model, which has a strong capacity to suppress bad data, is introduced for DSSE for the first time.

  2) A novel three-phase fast decoupled state estimation model with hybrid measurements for DSSE is adopted.

  3) A transformation strategy from Capped-L1 to R-Capped-L1 is applied and the computational efficiency is significantly improved.

2 Proposed state estimation model

This section first reviews the formulation of the sparse L1-R model for state estimation that has been proposed in previous work [8]. Based on this formulation, the proposed Capped-L1 model and the revised model are introduced.

2.1 Sparse L1-R model

When bad data are present in the measurements, the relationship between the measurements and the state variables can be represented as [7]:

$$ {\varvec{z}} = {\varvec{h}} ({\varvec{y}} ) + {\varvec{o}}+ {\varvec{e}}{\kern 1pt} $$
(1)

where z is the raw measurement vector; h(y) is a set of measurement functions; o is the vector of errors corresponding to bad data; e is the noise vector.

Generally, the elements of e follow random Gaussian distributions and are independent in most cases; the error vector o is a sparse vector with few non-zero values.

Mathematically, the L0-norm of a vector represents the number of its non-zero values. Therefore, the error vector can be constrained with the L0-norm and the SE model can be formulated as an optimization problem:

$$ \left\{ \begin{aligned} &\hbox{min} \;J({\varvec{y}} ,{\varvec{o}}) = \left\| {{\varvec{z}} - {\varvec{h}} ({\varvec{y}} )- {\varvec{o}}} \right\|_{2} \\ &{\text{s}} . {\text{t}} .\;\;\left\| {\varvec{o}} \right\|_{0} \le \varepsilon \hfill \\ \end{aligned} \right. $$
(2)

where \( \varepsilon \) is a positive small number related to the proportion of bad data.

Given a reasonable \( \varepsilon \), we can obtain relatively accurate estimates for state variables by solving (2). However, there are two difficulties in solving this problem: ① the value of parameter \( \varepsilon \) is difficult to specify because the proportion of bad data is not normally known; ② L0-norm minimization has been proved to be an NP-hard problem that cannot be solved efficiently [9].

For ①, according to the duality theory, the solution to (2) corresponds to the solution of the problem (3) for some Lagrange dual variable \( \lambda \ge 0 \):

$$ \hbox{min} \quad J({\varvec{y}} ,{\varvec{o}}) = \left\| {{\varvec{z}} - {\varvec{h}} ({\varvec{y}} )- {\varvec{o}}} \right\|_{2} { + }\lambda \,\left\| {\varvec{o}} \right\|_{0} $$
(3)

For ②, according to the theory of CS [9], L1-norm minimization helps obtain sparse solutions. Thereby, the L0-norm in (3) can be relaxed to an L1-norm problem.

With this step, the original non-convex sparse minimization problem becomes the convex optimization problem (4), which can be solved efficiently:

$$ \hbox{min} \quad J({\varvec{y}} ,{\varvec{o}}) = \left\| {{\varvec{z}} - {\varvec{h}} ({\varvec{y}} )- {\varvec{o}}} \right\|_{2} { + }\lambda \,\left\| {\varvec{o}} \right\|_{1} $$
(4)

The SE model above is exactly the L1-R model previously proposed in [8], which is a typical sparse recovery model.
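The structure of (4) can be illustrated with a minimal sketch. It rests on several assumptions that are not part of the original formulation: a single scalar state with \( h(y) = y \), a squared L2-norm data-fit term, and a simple alternating minimization scheme in which the sparse error update is soft-thresholding. The function names and the numerical values are hypothetical.

```python
def soft_threshold(r, t):
    """Shrink r toward zero by t; values with |r| <= t become exactly 0."""
    if r > t:
        return r - t
    if r < -t:
        return r + t
    return 0.0


def l1_relaxed_estimate(z, lam, iters=200):
    """Alternating minimization of sum((z_i - y - o_i)^2) + lam * sum(|o_i|).

    Toy setting (assumption): one scalar state y with h(y) = y.
    """
    o = [0.0] * len(z)
    y = 0.0
    for _ in range(iters):
        # y-step: least squares on the measurements corrected by o
        y = sum(zi - oi for zi, oi in zip(z, o)) / len(z)
        # o-step: the closed-form minimizer is soft-thresholding at lam / 2
        o = [soft_threshold(zi - y, lam / 2.0) for zi in z]
    return y, o


z = [1.0, 1.01, 0.99, 1.0, 5.0]  # hypothetical data; the last entry is bad
y_hat, o_hat = l1_relaxed_estimate(z, lam=0.5)
print(y_hat, o_hat)
```

In this toy case only the last component of the recovered error vector is non-zero, but the state estimate remains slightly biased toward the bad measurement, which is the kind of sub-optimality that motivates the Capped-L1 model in the next subsection.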

2.2 Capped-L1 model and transformation strategy

Convex relaxation such as L1-R indeed solves the L0-norm minimization problem efficiently under some conditions. However, it often leads to a sub-optimal solution in reality. To obtain better solutions than the L1-R model, a new model Capped-L1 was proposed [10] and we introduce this model into the DSSE problem to handle the issues with a sparse error vector. The Capped-L1 model can be formulated as:

$$ \hbox{min} \quad J({\varvec{y}}^{ (l)} ,{\varvec{o}}^{ (l)} ) = \left\| {{\varvec{z}} - {\varvec{h}} ({\varvec{y}}^{ (l)} )- {\varvec{o}}^{ (l)} } \right\|_{2} + \lambda \sum\limits_{i} {c_{i}^{(l)} \left| {o_{i}^{ (l)} } \right|} $$
(5)

where \( c_{i}^{(l + 1)} = I(\left| {o_{i}^{ (l)} } \right| \le \alpha^{(l)} ) \) is a relaxation parameter, \( I( \cdot ) \) is a step function with a value of 0 or 1, l is the iteration index, and \( \alpha^{(l)} \) is a threshold value changing with the decision variable \( o_{i}^{(l)} \). For this problem, if \( \left| {o_{i}^{(l)} } \right| \le \alpha^{(l)} \), the value of \( c_{i}^{(l + 1)} \) will be 1; otherwise, the value will be 0.

Solving the problem requires iterations, as shown in Fig. 1.

Fig. 1 Iterative procedure of Capped-L1

In the iteration procedure, \( f(\varvec{o}^{(l)} ) \) is a function of \( \varvec{o}^{(l)} \) whose value meets \( \varvec{o}_{\hbox{min} }^{(l)} \le f(\varvec{o}^{(l)} ) \le \varvec{o}_{\hbox{max} }^{(l)} \). Generally, the function can be chosen as the average value or the median of vector \( \varvec{o}^{(l)} \). The threshold \( \alpha^{(l)} \) changes in each iteration. \( \varepsilon \) is a small threshold value used to determine whether or not the iteration has converged.

Observing the iteration procedure, it can be seen that each iteration solves a convex optimization problem. Once the sparse error vector \( \varvec{o}^{(l)} \) is obtained from the last iteration, the method uses it to adjust the relaxation parameter \( \varvec{c}^{(l + 1)} \) and solves a new optimization problem to obtain a better solution. This process continues until the error vectors of two adjacent iterations become sufficiently close.
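The multi-stage procedure described above can be sketched on a toy problem. This is an illustration under stated assumptions rather than the implementation used in the paper: a single scalar state with \( h(y) = y \), a squared L2-norm data-fit term, an inner solver based on alternating minimization, and \( f(\varvec{o}) \) taken as the mean of the absolute values of the error vector. All function names and numbers are hypothetical.

```python
def soft_threshold(r, t):
    """Shrink r toward zero by t; with t = 0 the value is returned unchanged."""
    if r > t:
        return r - t
    if r < -t:
        return r + t
    return 0.0


def solve_stage(z, lam, c, iters=200):
    """Minimize sum((z_i - y - o_i)^2) + lam * sum(c_i * |o_i|) for fixed c."""
    o = [0.0] * len(z)
    y = 0.0
    for _ in range(iters):
        y = sum(zi - oi for zi, oi in zip(z, o)) / len(z)
        o = [soft_threshold(zi - y, lam * ci / 2.0) for zi, ci in zip(z, c)]
    return y, o


def capped_l1_estimate(z, lam, eps=1e-8, max_stages=20):
    """Multi-stage relaxation: re-solve with updated weights c until o settles."""
    c = [1.0] * len(z)  # the first stage is the plain L1-relaxed problem
    o_prev = None
    for stage in range(1, max_stages + 1):
        y, o = solve_stage(z, lam, c)
        if o_prev is not None and max(abs(a - b) for a, b in zip(o, o_prev)) < eps:
            break
        # Assumed threshold rule: alpha is the mean of |o_i|
        alpha = sum(abs(oi) for oi in o) / len(o)
        # Components above the threshold are no longer penalized (c_i = 0)
        c = [1.0 if abs(oi) <= alpha else 0.0 for oi in o]
        o_prev = o
    return y, o, stage


z = [1.0, 1.01, 0.99, 1.0, 5.0]  # hypothetical data; the last entry is bad
y_hat, o_hat, stages = capped_l1_estimate(z, lam=0.5)
print(y_hat, o_hat, stages)
```

Once the large component is identified and its penalty is switched off, the bad measurement no longer pulls the estimate, so the bias of the first (L1-relaxed) stage is removed in this example.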

However, the Capped-L1 model (5) is an extremely nonlinear model with a fairly heavy computational burden. To increase the calculation efficiency, the following transformation strategy can be adopted:

$$ \left\{ \begin{aligned} \hbox{min} \quad J({\varvec{y}}^{(l)} ,{\varvec{o}}^{(l)} ) = \left\| {{\varvec{z}} - {\varvec{h}} ({\varvec{y}}^{(l)} )- {\varvec{o}}^{(l)} } \right\|_{2} + \lambda \sum\limits_{i} {c_{i}^{(l)} \left( {a_{i}^{(l)} + b_{i}^{(l)} } \right)} \hfill \\ {\text{s}} . {\text{t}} .\;\;a_{i}^{(l)} \ge 0 \hfill \\ \;\;\;\;\;b_{i}^{(l)} \ge 0 \hfill \\ \;\;\;\;\;a_{i}^{(l)} - b_{i}^{(l)} - o_{i}^{(l)} = 0 \hfill \\ \end{aligned} \right. $$
(6)

where \( a_{i}^{(l)} \) and \( b_{i}^{(l)} \) represent auxiliary variables sharing the same dimension with \( o_{i}^{(l)} \).

This strategy transforming the optimization problem (5) to (6) is actually relaxing the absolute value, and the relaxation method has been proved to be valid mathematically [22]. Solutions to (6) are exactly the same as those to (5); thereby, the precisions of the two Capped-L1 models are identical. However, through the relaxation process, the computational speed of the Capped-L1 model will be significantly improved because each iteration of (6) is actually a quadratic optimization problem, which is relatively easily solved. The iteration procedure for solving (6) is the same as that of (5), but each iteration saves a lot of time thanks to the linearization.
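A short worked argument may help to see why (6) reproduces (5) for any component with \( c_{i}^{(l)} = 1 \); the parameter t below is introduced only for this illustration. Every feasible pair of auxiliary variables can be written with a common offset \( t \ge 0 \):

$$ a_{i}^{(l)} = \max (o_{i}^{(l)} ,0) + t,\quad b_{i}^{(l)} = \max ( - o_{i}^{(l)} ,0) + t,\quad t \ge 0 $$

so that

$$ a_{i}^{(l)} - b_{i}^{(l)} = o_{i}^{(l)} ,\quad a_{i}^{(l)} + b_{i}^{(l)} = \left| {o_{i}^{(l)} } \right| + 2t \ge \left| {o_{i}^{(l)} } \right| $$

Because the objective penalizes \( a_{i}^{(l)} + b_{i}^{(l)} \), the minimum is attained at \( t = 0 \), where the penalty equals \( \left| {o_{i}^{(l)} } \right| \). For example, with a hypothetical value \( o_{i}^{(l)} = - 0.3 \), the feasible pairs are \( (t,\;0.3 + t) \) and the smallest sum is 0.3. When \( c_{i}^{(l)} = 0 \), the pair is not unique, but \( o_{i}^{(l)} \) and the objective value are unaffected.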

By adopting this R-Capped-L1 model, a robust SE model with a lighter computational burden can be developed.

3 Fast decoupled model for three-phase distribution networks

The fast decoupled method has been applied in transmission networks for a long time and its efficiency has been proved by substantial practice. However, because of the large R/X ratio in distribution networks, the “decoupled” idea fails for almost all feeders. Recently, to handle this problem, a fast decoupled algorithm via complex per unit (pu) normalization for distribution networks has been proposed [20]. For readability, a brief review of the complex pu normalization algorithm is given in Appendix A. In this section, we will introduce a novel three-phase fast decoupled model to estimate state variables for unbalanced three-phase distribution networks. To take the branch current measurements into consideration, the branch active power and reactive power losses are employed. Using the fast decoupled method, the calculation efficiency of the R-Capped-L1 sparse recovery model will be further improved, enabling good handling of the branch current measurements.
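The effect of a complex base on the R/X ratio can be illustrated with a small sketch. It assumes that the normalization amounts to rotating each branch impedance by a common base angle; the exact definition and sign convention are those of [20] and Appendix A and are not reproduced here. The impedance value, the angle, and the function name are hypothetical.

```python
import cmath
import math


def rotate_impedance(z, phi_deg):
    """Multiply a complex impedance by exp(j*phi); its magnitude is unchanged."""
    return z * cmath.exp(1j * math.radians(phi_deg))


z_branch = complex(1.0, 0.5)              # hypothetical branch with R/X = 2
z_rot = rotate_impedance(z_branch, 45.0)  # hypothetical base angle of 45 degrees

ratio_before = z_branch.real / z_branch.imag
ratio_after = z_rot.real / z_rot.imag
print(ratio_before, ratio_after)
```

The rotation leaves the impedance magnitude untouched and only redistributes it between the resistive and reactive parts, which is why a suitable base angle can bring the normalized R/X into the range where decoupling is reasonable.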

In the DSSE problem, the measurement vector has to be extended as (7) and the state variable vector can be represented as (8):

$$ {\varvec{z}} = \left[ {U_{i}^{\text{abc}} ,\theta_{i}^{\text{abc}} ,P_{ij}^{\text{abc}} ,P_{ji}^{\text{abc}} ,Q_{ij}^{\text{abc}} ,Q_{ji}^{\text{abc}} ,P_{i}^{\text{abc}} ,Q_{i}^{\text{abc}} ,I_{ij}^{\text{abc}} } \right] $$
(7)
$$ {\varvec{y}}{ = }\left[ {U_{i}^{\text{abc}} ,\theta_{i}^{\text{abc}} ,P_{ij}^{\text{abc}} ,P_{ji}^{\text{abc}} ,Q_{ij}^{\text{abc}} ,Q_{ji}^{\text{abc}} } \right] $$
(8)

where U is the bus voltage; θ is the phase angle; P is the active power of branches or injections; Q is the reactive power of branches or injections; I is the magnitude of branch current; the superscript abc denotes the three phases of the variables; the subscript ij denotes that the variable flows from bus i to bus j; the subscript i denotes bus i.

The measurement function h(y) relating z and y includes:

  1) Three-phase real and reactive power measurements of the branch:

$$ \left[ \begin{aligned} (P_{ij}^{\text{a}} )^{m} \hfill \\ (P_{ij}^{\text{b}} )^{m} \hfill \\ (P_{ij}^{\text{c}} )^{m} \hfill \\ \end{aligned} \right] = \left[ \begin{aligned} P_{ij}^{\text{a}} \hfill \\ P_{ij}^{\text{b}} \hfill \\ P_{ij}^{\text{c}} \hfill \\ \end{aligned} \right] + \left[ \begin{aligned} o_{{P_{ij}^{\text{a}} }} \hfill \\ o_{{P_{ij}^{\text{b}} }} \hfill \\ o_{{P_{ij}^{\text{c}} }} \hfill \\ \end{aligned} \right] + \left[ \begin{aligned} e_{{P_{ij}^{\text{a}} }} \hfill \\ e_{{P_{ij}^{\text{b}} }} \hfill \\ e_{{P_{ij}^{\text{c}} }} \hfill \\ \end{aligned} \right] $$
(9)
$$ \left[ \begin{aligned} (Q_{ij}^{\text{a}} )^{m} \hfill \\ (Q_{ij}^{\text{b}} )^{m} \hfill \\ (Q_{ij}^{\text{c}} )^{m} \hfill \\ \end{aligned} \right] = \left[ \begin{aligned} Q_{ij}^{\text{a}} \hfill \\ Q_{ij}^{\text{b}} \hfill \\ Q_{ij}^{\text{c}} \hfill \\ \end{aligned} \right] + \left[ \begin{aligned} o_{{Q_{ij}^{\text{a}} }} \hfill \\ o_{{Q_{ij}^{\text{b}} }} \hfill \\ o_{{Q_{ij}^{\text{c}} }} \hfill \\ \end{aligned} \right] + \left[ \begin{aligned} e_{{Q_{ij}^{\text{a}} }} \hfill \\ e_{{Q_{ij}^{\text{b}} }} \hfill \\ e_{{Q_{ij}^{\text{c}} }} \hfill \\ \end{aligned} \right] $$
(10)

where the superscript m denotes the measurement quantities.

  2) Three-phase injection power measurements:

$$ \left[ \begin{aligned} (P_{i}^{\text{a}} )^{m} \hfill \\ (P_{i}^{\text{b}} )^{m} \hfill \\ (P_{i}^{\text{c}} )^{m} \hfill \\ \end{aligned} \right] = \sum\limits_{j \in i} {\left[ \begin{aligned} P_{ij}^{\text{a}} \hfill \\ P_{ij}^{\text{b}} \hfill \\ P_{ij}^{\text{c}} \hfill \\ \end{aligned} \right]} + \left[ \begin{aligned} o_{{P_{i}^{\text{a}} }} \hfill \\ o_{{P_{i}^{\text{b}} }} \hfill \\ o_{{P_{i}^{\text{c}} }} \hfill \\ \end{aligned} \right] + \left[ \begin{aligned} e_{{P_{i}^{\text{a}} }} \hfill \\ e_{{P_{i}^{\text{b}} }} \hfill \\ e_{{P_{i}^{\text{c}} }} \hfill \\ \end{aligned} \right] $$
(11)
$$ \left[ \begin{aligned} (Q_{i}^{\text{a}} )^{m} \hfill \\ (Q_{i}^{\text{b}} )^{m} \hfill \\ (Q_{i}^{\text{c}} )^{m} \hfill \\ \end{aligned} \right] = \sum\limits_{j \in i} {\left[ \begin{aligned} Q_{ij}^{\text{a}} \hfill \\ Q_{ij}^{\text{b}} \hfill \\ Q_{ij}^{\text{c}} \hfill \\ \end{aligned} \right] + \left[ \begin{aligned} o_{{Q_{i}^{\text{a}} }} \hfill \\ o_{{Q_{i}^{\text{b}} }} \hfill \\ o_{{Q_{i}^{\text{c}} }} \hfill \\ \end{aligned} \right] + \left[ \begin{aligned} e_{{Q_{i}^{\text{a}} }} \hfill \\ e_{{Q_{i}^{\text{b}} }} \hfill \\ e_{{Q_{i}^{\text{c}} }} \hfill \\ \end{aligned} \right]} $$
(12)

where \( j \in i \) indicates that j is connected to i.

  3) Three-phase branch current measurements:

$$ \begin{aligned} \sum\limits_{{\phi \in \{ {\text{a,b,c}}\} }} {I_{ij}^{\varphi } } I_{ij}^{\phi } (\cos (\theta_{ij}^{\phi } - \theta_{ij}^{\varphi } )r_{ij}^{\phi \varphi } + \sin (\theta_{ij}^{\phi } - \theta_{ij}^{\varphi } )x_{ij}^{\phi \varphi } ) \hfill \\ \quad \quad \quad = P_{ij}^{\varphi } + P_{ji}^{\varphi } + o_{{P_{ij,loss}^{\varphi } }} + e_{{P_{ij,loss}^{\varphi } }} = P_{ij,loss}^{\varphi } \hfill \\ \end{aligned} $$
(13)
$$ \begin{aligned} \sum\limits_{{\phi \in \{ {\text{a,b,c}}\} }} {I_{ij}^{\varphi } } I_{ij}^{\phi } (\cos (\theta_{ij}^{\phi } - \theta_{ij}^{\varphi } )x_{ij}^{\phi \varphi } - \sin (\theta_{ij}^{\phi } - \theta_{ij}^{\varphi } )r_{ij}^{\phi \varphi } ) \hfill \\ \quad \quad \quad = Q_{ij}^{\varphi } + Q_{ji}^{\varphi } + o_{{Q_{ij,loss}^{\varphi } }} + e_{{Q_{ij,loss}^{\varphi } }} = Q_{ij,loss}^{\varphi } \quad \quad \hfill \\ \end{aligned} $$
(14)

where \( \phi \) and \( \varphi \) represent the phases; \( r_{ij}^{\phi \varphi } \) is the mutual resistance of branch ij between phase \( \varphi \) and phase \( \phi \); \( x_{ij}^{\phi \varphi } \) is the mutual reactance of branch ij between phase \( \varphi \) and phase \( \phi \).

  4) Three-phase bus voltage and phase angle measurements:

$$ \left[ \begin{aligned} (U_{i}^{\text{a}} )^{m} \hfill \\ (U_{i}^{\text{b}} )^{m} \hfill \\ (U_{i}^{\text{c}} )^{m} \hfill \\ \end{aligned} \right] = \left[ \begin{aligned} U_{i}^{\text{a}} \hfill \\ U_{i}^{\text{b}} \hfill \\ U_{i}^{\text{c}} \hfill \\ \end{aligned} \right] + \left[ \begin{aligned} o_{{U_{i}^{\text{a}} }} \hfill \\ o_{{U_{i}^{\text{b}} }} \hfill \\ o_{{U_{i}^{\text{c}} }} \hfill \\ \end{aligned} \right] + \left[ \begin{aligned} e_{{U_{i}^{\text{a}} }} \hfill \\ e_{{U_{i}^{\text{b}} }} \hfill \\ e_{{U_{i}^{\text{c}} }} \hfill \\ \end{aligned} \right] $$
(15)
$$ \left[ \begin{aligned} (\theta_{i}^{\text{a}} )^{m} \hfill \\ (\theta_{i}^{\text{b}} )^{m} \hfill \\ (\theta_{i}^{\text{c}} )^{m} \hfill \\ \end{aligned} \right] = \left[ \begin{aligned} \theta_{i}^{\text{a}} \hfill \\ \theta_{i}^{\text{b}} \hfill \\ \theta_{i}^{\text{c}} \hfill \\ \end{aligned} \right] + \left[ \begin{aligned} o_{{\theta_{i}^{\text{a}} }} \hfill \\ o_{{\theta_{i}^{\text{b}} }} \hfill \\ o_{{\theta_{i}^{\text{c}} }} \hfill \\ \end{aligned} \right] + \left[ \begin{aligned} e_{{\theta_{i}^{\text{a}} }} \hfill \\ e_{{\theta_{i}^{\text{b}} }} \hfill \\ e_{{\theta_{i}^{\text{c}} }} \hfill \\ \end{aligned} \right] $$
(16)

  5) Pseudo-measurements formulating network constraints:

$$ \begin{aligned} \frac{{P_{ij}^{\varphi } }}{{U_{i}^{\varphi } }} = \sum\limits_{{\phi \in \{ {\text{a,b,c}}\} }} {U_{j}^{\phi } } (\cos (\theta_{i}^{\varphi } - \theta_{j}^{\phi } )g_{ij}^{\varphi \phi } + \sin (\theta_{i}^{\varphi } - \theta_{j}^{\phi } )b_{ij}^{\varphi \phi } ) \hfill \\ \quad \qquad - \sum\limits_{{\phi \in \{ {\text{a,b,c}}\} }} {U_{i}^{\phi } } (\cos (\theta_{i}^{\varphi } - \theta_{i}^{\phi } )g_{ij}^{\varphi \phi } + \sin (\theta_{i}^{\varphi } - \theta_{i}^{\phi } )b_{ij}^{\varphi \phi } ) \hfill \\ \end{aligned} $$
(17)
$$ \begin{aligned} \frac{{Q_{ij}^{\varphi } }}{{U_{i}^{\varphi } }} = \sum\limits_{{\phi \in \{ {\text{a,b,c}}\} }} {U_{j}^{\phi } } (\sin(\theta_{i}^{\varphi } - \theta_{j}^{\phi } )g_{ij}^{\varphi \phi } - \cos (\theta_{i}^{\varphi } - \theta_{j}^{\phi } )b_{ij}^{\varphi \phi } ) \hfill \\ \qquad \quad - \sum\limits_{{\phi \in \{ {\text{a,b,c}}\} }} {U_{i}^{\phi } } (\sin(\theta_{i}^{\varphi } - \theta_{i}^{\phi } )g_{ij}^{\varphi \phi } - \cos (\theta_{i}^{\varphi } - \theta_{i}^{\phi } )b_{ij}^{\varphi \phi } ) \hfill \\ \end{aligned} $$
(18)

where \( g_{ij}^{\varphi \phi } \) is the mutual conductance of branch ij between phase \( \varphi \) and phase \( \phi \); \( b_{ij}^{\varphi \phi } \) is the mutual susceptance of branch ij between phase \( \varphi \) and phase \( \phi \).

The fast decoupled method can improve the efficiency of this DSSE problem. Fast decoupled state estimation depends on the PQ decoupled formulation of the measurement equations. Clearly, the real and reactive power measurement (9) and (10), the injection power measurement (11) and (12), the branch current measurement (13) and (14), and the bus voltage and phase angle measurement (15) and (16) can all be expressed in a PQ decoupled formulation.

The network constraints (17) and (18) involve both voltage U and phase angle \( \theta \). As reviewed above, in a distribution network, the normalized R/X can be adjusted by choosing an appropriate complex base value, so the original R/X value does not need to be small for the proposed formulation to apply. Once the normalized R/X is small, U has little impact on the active power, \( \theta \) has little impact on the reactive power, and \( U_{j}^{\phi } \approx 1 \). Hence, the first-order differential of the pseudo-measurement equations can be formulated to have PQ decoupled properties as (19) and (20).

$$ \frac{{\Delta P_{ij}^{\varphi } }}{{U_{i}^{\varphi } }} = \sum\limits_{{\phi \in \{ {\text{a,b,c}}\} }} {(b_{ij}^{\varphi \phi } { \cos }(\theta_{i}^{\varphi } - \theta_{j}^{\phi } ) - g_{ij}^{\varphi \phi } \sin (\theta_{i}^{\varphi } - \theta_{j}^{\phi } ))(\Delta \theta_{i}^{\phi } - \Delta \theta_{j}^{\phi } )} $$
(19)
$$ \frac{{\Delta Q_{ij}^{\varphi } }}{{U_{i}^{\varphi } }} = \sum\limits_{{\phi \in \{ {\text{a,b,c}}\} }} {(g_{ij}^{\varphi \phi } \sin (\theta_{i}^{\varphi } - \theta_{j}^{\phi } ) - b_{ij}^{\varphi \phi } { \cos }(\theta_{i}^{\varphi } - \theta_{j}^{\phi } ))(\Delta U_{i}^{\phi } - \Delta U_{j}^{\phi } )} $$
(20)

In addition, in a three-phase distribution system, some other approximations listed in (21) can be made.

$$ \left\{ \begin{aligned} &\theta^{\text{a}} - \theta^{\text{b}} \approx 120^{\text{o}} \quad \hfill \\ &\theta^{\text{a}} - \theta^{\text{c}} \approx - 120^{\text{o}} \hfill \\ &\theta^{\text{b}} - \theta^{\text{c}} \approx - 240^{\text{o}} \hfill \\ \end{aligned} \right. $$
(21)

As a result, (19) becomes (22) and (20) becomes (23):

$$ \left[ \begin{aligned} \frac{{\Delta P_{ij}^{\text{a}} }}{{U_{i}^{\text{a}} }} \hfill \\ \frac{{\Delta P_{ij}^{\text{b}} }}{{U_{i}^{\text{b}} }} \hfill \\ \frac{{\Delta P_{ij}^{\text{c}} }}{{U_{i}^{\text{c}} }} \hfill \\ \end{aligned} \right] = [\varvec{A}\quad - \varvec{A}]\left[ \begin{aligned} \Delta \theta_{i}^{\text{a}} \hfill \\ \Delta \theta_{i}^{\text{b}} \hfill \\ \Delta \theta_{i}^{\text{c}} \hfill \\ \Delta \theta_{j}^{\text{a}} \hfill \\ \Delta \theta_{j}^{\text{b}} \hfill \\ \Delta \theta_{j}^{\text{c}} \hfill \\ \end{aligned} \right] $$
(22)
$$ \left[ \begin{aligned} \frac{{\Delta Q_{ij}^{\text{a}} }}{{U_{i}^{\text{a}} }} \hfill \\ \frac{{\Delta Q_{ij}^{\text{b}} }}{{U_{i}^{\text{b}} }} \hfill \\ \frac{{\Delta Q_{ij}^{\text{c}} }}{{U_{i}^{\text{c}} }} \hfill \\ \end{aligned} \right] = [ - \varvec{A}\quad \varvec{A}]\left[ \begin{aligned} \Delta U_{i}^{\text{a}} \hfill \\ \Delta U_{i}^{\text{b}} \hfill \\ \Delta U_{i}^{\text{c}} \hfill \\ \Delta U_{j}^{\text{a}} \hfill \\ \Delta U_{j}^{\text{b}} \hfill \\ \Delta U_{j}^{\text{c}} \hfill \\ \end{aligned} \right] $$
(23)

where A in (22) and (23) is a constant matrix of size \( 3 \times 3 \):

$$ \varvec{A} = \left[ \begin{array}{ccc} b_{ij}^{\text{aa}} & - \frac{1}{2}b_{ij}^{\text{ab}} - \frac{\sqrt 3 }{2}g_{ij}^{\text{ab}} & - \frac{1}{2}b_{ij}^{\text{ac}} + \frac{\sqrt 3 }{2}g_{ij}^{\text{ac}} \\ - \frac{1}{2}b_{ij}^{\text{ab}} + \frac{\sqrt 3 }{2}g_{ij}^{\text{ab}} & b_{ij}^{\text{bb}} & - \frac{1}{2}b_{ij}^{\text{bc}} - \frac{\sqrt 3 }{2}g_{ij}^{\text{bc}} \\ - \frac{1}{2}b_{ij}^{\text{ac}} - \frac{\sqrt 3 }{2}g_{ij}^{\text{ac}} & - \frac{1}{2}b_{ij}^{\text{bc}} + \frac{\sqrt 3 }{2}g_{ij}^{\text{bc}} & b_{ij}^{\text{cc}} \end{array} \right] $$
(24)
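The entries of A follow from substituting the angle approximations (21) into the coefficient \( b_{ij}^{\varphi \phi } \cos ( \cdot ) - g_{ij}^{\varphi \phi } \sin ( \cdot ) \) of (19). The sketch below checks this numerically. It assumes nominal phase angles of 0°, −120°, and 120° for phases a, b, and c, and negligible angle difference between the two ends of a branch; the conductance and susceptance values and the function name are hypothetical.

```python
import math

# Nominal phase angles for a, b, c, consistent with the approximations in (21)
PHASE_ANGLE_DEG = [0.0, -120.0, 120.0]


def build_A(g, b):
    """Build the constant 3x3 matrix with entries b*cos(d) - g*sin(d),
    where d is the nominal angle difference between the row and column phases."""
    A = [[0.0] * 3 for _ in range(3)]
    for p in range(3):
        for q in range(3):
            d = math.radians(PHASE_ANGLE_DEG[p] - PHASE_ANGLE_DEG[q])
            A[p][q] = b[p][q] * math.cos(d) - g[p][q] * math.sin(d)
    return A


# Hypothetical symmetric branch conductance and susceptance matrices (pu)
g = [[1.0, 0.2, 0.3], [0.2, 1.1, 0.4], [0.3, 0.4, 1.2]]
b = [[-2.0, 0.5, 0.6], [0.5, -2.1, 0.7], [0.6, 0.7, -2.2]]

A = build_A(g, b)
for row in A:
    print(row)
```

Because the angles are fixed at their nominal values, the matrix depends only on the branch parameters and can be formed once before the iterations start.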

Finally, all of the elements in the Jacobian matrix of h(y) are constant. It is important to note that \( P_{ij}^{\text{abc}} ,P_{ji}^{\text{abc}} ,Q_{ij}^{\text{abc}} ,Q_{ji}^{\text{abc}} ,P_{i}^{\text{abc}} ,Q_{i}^{\text{abc}} \), and \( I_{ij}^{\text{abc}} \) do not depend on the bus voltage U and phase angle \( \theta \) in our model, so the Jacobian matrix can be divided into three blocks arranged in diagonal form. Therefore, the calculation procedure of the proposed FDSE is mainly composed of the following three steps.

Step 1: Estimates of branch active and reactive powers are first calculated as:

$$ \left[ \begin{aligned} (P_{ij}^{\text{abc}} )^{m} \hfill \\ (P_{ji}^{\text{abc}} )^{m} \hfill \\ (Q_{ij}^{\text{abc}} )^{m} \hfill \\ (Q_{ji}^{\text{abc}} )^{m} \hfill \\ (P_{i}^{\text{abc}} )^{m} \hfill \\ (Q_{i}^{\text{abc}} )^{m} \hfill \\ (P_{ij\_loss}^{\text{abc}} )^{m} \hfill \\ (Q_{ij\_loss}^{\text{abc}} )^{m} \hfill \\ \end{aligned} \right] = \varvec{C}\left[ \begin{aligned} P_{ij}^{\text{abc}} \hfill \\ P_{ji}^{\text{abc}} \hfill \\ Q_{ij}^{\text{abc}} \hfill \\ Q_{ji}^{\text{abc}} \hfill \\ \end{aligned} \right] $$
(25)

where C is the constant measurement Jacobian matrix for branch active and reactive power measurements and current measurements.

Then, the estimates of U and \( \theta \) can be obtained by conducting the fast decoupled iteration process.

Step 2: The phase angles are first corrected by:

$$ \left[ \begin{aligned} \Delta (\theta_{i}^{\text{abc}} )^{m} \hfill \\ \frac{{\Delta P_{ij}^{\text{abc}} }}{{U_{i}^{\text{abc}} }} \hfill \\ \end{aligned} \right] = \varvec{B}_{1} \Delta \theta_{i}^{\text{abc}} $$
(26)

Step 3: The voltages are corrected by:

$$ \left[ \begin{aligned} \Delta (U_{i}^{\text{abc}} )^{m} \hfill \\ \frac{{\Delta Q_{ij}^{\text{abc}} }}{{U_{i}^{\text{abc}} }} \hfill \\ \end{aligned} \right] = \varvec{B}_{2} \Delta U_{i}^{\text{abc}} $$
(27)

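where \( \varvec{B}_{1} \) is the constant measurement Jacobian matrix for phase angle measurements and pseudo-active power measurements, and \( \varvec{B}_{2} \) is the constant measurement Jacobian matrix for voltage measurements and pseudo-reactive power measurements.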

This procedure continues until convergence. The details of the iteration procedure are similar to those reported previously [1].

In Step 1, the constant Jacobian matrix C relates the active and reactive power state variables to active and reactive power measurements. The elements of C are determined by (9)–(14). C is relatively independent and the relationship between measurements and state variables is linear. Thus, there is no need to calculate iteratively. \( P_{ij}^{\text{abc}} ,P_{ji}^{\text{abc}} ,Q_{ij}^{\text{abc}} ,Q_{ji}^{\text{abc}} ,P_{i}^{\text{abc}} ,Q_{i}^{\text{abc}} \) and \( I_{ij}^{\text{abc}} \) can be estimated directly in one calculation.
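The one-shot nature of the first step can be seen in a minimal single-phase sketch. It makes several simplifying assumptions: the state contains only \( P_{ij} \) and \( P_{ji} \) of one branch, the measurements are the two branch flows and the branch loss \( P_{ij} + P_{ji} \), the sparse error vector is omitted, and an unweighted least-squares solve stands in for the robust estimator. The names and values are hypothetical.

```python
def solve_2x2(m, rhs):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    x0 = (rhs[0] * m[1][1] - m[0][1] * rhs[1]) / det
    x1 = (m[0][0] * rhs[1] - rhs[0] * m[1][0]) / det
    return [x0, x1]


def estimate_branch_flows(z, C):
    """One-shot least squares: solve (C^T C) y = C^T z, no iteration needed."""
    n = len(C[0])
    ctc = [[sum(row[i] * row[j] for row in C) for j in range(n)] for i in range(n)]
    ctz = [sum(row[i] * zi for row, zi in zip(C, z)) for i in range(n)]
    return solve_2x2(ctc, ctz)


# Rows: P_ij measurement, P_ji measurement, loss measurement P_ij + P_ji
C = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
z = [1.00, -0.95, 0.05]  # hypothetical, mutually consistent measurements (pu)

y_hat = estimate_branch_flows(z, C)
print(y_hat)
```

Since C is constant and the model is linear, the loss measurement derived from the branch current simply adds one more row to the system, without introducing any iteration.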

In Step 2, the constant Jacobian matrix \( \varvec{B}_{1} \) relates the phase angle state variables to phase angle measurements and pseudo-active power measurements. \( \varvec{B}_{1} \) is formed from the Jacobian matrix \( [\varvec{A}, - \varvec{A}] \) shown in (22), which represents the relationship between the increments of pseudo-active power measurements and the increments of phase angle state variables, and identity matrices, which represent the relationship between the increments of phase angle measurements and the increments of phase angle state variables.

In Step 3, the constant Jacobian matrix \( \varvec{B}_{2} \) relates the voltage state variables to voltage measurements and pseudo-reactive power measurements. \( \varvec{B}_{2} \) is formed from the Jacobian matrix \( [ - \varvec{A},\varvec{A}] \) shown in (23), which represents the relationship between the increments of pseudo-reactive power measurements and the increments of voltage state variables, and identity matrices, which represent the relationship between the increments of voltage measurements and the increments of voltage state variables.

4 Numerical tests

To verify the effectiveness of the R-Capped-L1 SE model with fast decoupled solutions in a three-phase distribution network, the method was programmed in MATLAB and tested on three distribution networks: an IEEE 33-bus distribution network, an IEEE 123-bus distribution network, and a 615-bus distribution network formed by connecting five IEEE 123-bus systems.

The standard deviation of the errors corresponding to bad data o is 50 times that of the noises e, and the standard deviation of noises e is given by:

$$ \sigma = 0.001 \times \left| {z_{mean} } \right| $$
(28)

where \( z_{mean} \) is the mean of the per-unit measurements.
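The measurement generation described above can be sketched as follows. The sketch assumes that both the noise and the bad-data errors are zero-mean Gaussian and that bad data locations are drawn uniformly at random; the function name, the seed, and the true values are hypothetical, and the paper's own test code is in MATLAB.

```python
import random


def make_measurements(z_true, bad_ratio, seed=0):
    """Add Gaussian noise with sigma from (28) and sparse bad-data errors
    whose standard deviation is 50 times sigma."""
    rng = random.Random(seed)
    n = len(z_true)
    sigma = 0.001 * abs(sum(z_true) / n)   # (28): 0.001 * |mean of z|
    e = [rng.gauss(0.0, sigma) for _ in range(n)]
    # Pick the bad-data locations at random (assumption: uniform sampling)
    bad_idx = rng.sample(range(n), round(bad_ratio * n))
    o = [0.0] * n
    for i in bad_idx:
        o[i] = rng.gauss(0.0, 50.0 * sigma)
    z = [t + oi + ei for t, oi, ei in zip(z_true, o, e)]
    return z, o, e, sigma


z_true = [1.0 + 0.01 * i for i in range(50)]  # hypothetical per-unit values
z, o, e, sigma = make_measurements(z_true, bad_ratio=0.06)
print(sigma, sum(1 for oi in o if oi != 0.0))
```

With 50 measurements and a bad data proportion of 0.06, three entries of the error vector are non-zero, which matches the sparsity assumption behind the L0-norm constraint in (2).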

We measure the estimation error by \( R = \left\| {\hat{y} - y_{true} } \right\|_{2} \), where \( \hat{y} \) is the estimated value and \( y_{true} \) is the true value, and set \( \alpha^{(l)} \) in Fig. 1 according to (29), which is a reasonable choice for the simulations.

$$ \alpha^{(l)} = f(\varvec{o}^{(l)} ) = mean(\varvec{o}^{(l)} ) $$
(29)

To keep the tests close to a practical scenario, 30% of nodes and branches were selected randomly for PMU installation, and the voltages, phase angles, and currents on these nodes and branches are part of the measurement vector.

To perform sufficient analysis on the R-Capped-L1 model and the fast decoupled method for the three-phase distribution network, we conducted case studies from four aspects:

  1) Impact of the dual parameter \( \lambda \) on the accuracy of the R-Capped-L1 model.

  2) Efficiency comparison of FDSE with the Newton-based SE (NBSE) in [16], and of R-Capped-L1 with Capped-L1.

  3) Performance comparison with the traditional WLS model, robust WLAV model, and sparse recovery model L1-R.

  4) Impact of imbalance in the three-phase loads.

In these cases, the WLAV model used for comparison is a traditional robust model that performs well in suppressing bad data. We compared our sparse recovery Capped-L1 model with this widely accepted robust model to verify the efficiency and precision of our model. The formulation of the WLAV model is as follows:

$$ \hbox{min} \quad J(\varvec{y}) = \sum\limits_{i} {\left| {w_{i} (z_{i} - h_{i} (\varvec{y}))} \right|} $$
(30)

where \( w_{i} \) is the weight associated with the ith measurement.

For convenience, we simply considered the unweighted situation and ignored the differences in precision between measurements. Thus, the WLAV model can be simply described as:

$$ \hbox{min} \quad J(\varvec{y}) = \left\| {\varvec{z} - \varvec{h}(\varvec{y})} \right\|_{1} $$
(31)

Similarly, the weights in the WLS model were ignored and the unweighted model was adopted.

Comparisons between these two unweighted models and the sparse recovery models will be discussed next.
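The different sensitivity of the two unweighted baselines to bad data can be seen in the simplest possible case. The sketch assumes a single scalar state measured directly, \( h(y) = y \), for which the unweighted WLS estimate is the sample mean and the unweighted WLAV estimate (31) is the sample median; the data are hypothetical.

```python
import statistics

# Hypothetical repeated measurements of one quantity; the last entry is bad data
z = [1.0, 1.01, 0.99, 1.0, 5.0]

# Unweighted least squares for h(y) = y: the minimizer is the mean
y_wls = statistics.mean(z)

# Unweighted least absolute value for h(y) = y: the minimizer is the median
y_wlav = statistics.median(z)

print(y_wls, y_wlav)
```

A single gross error shifts the least-squares estimate substantially while the least-absolute-value estimate stays at the bulk of the data, which is why WLAV is used as the robust reference in the comparisons.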

4.1 Impact of dual parameter on accuracy of R-Capped-L1 model

Similar to other Lagrangian optimization problems, \( \lambda \) is a crucial parameter that largely influences the precision of the solutions. To maximize the precision of the sparse recovery SE models and examine the impact of the dual parameter, we scanned \( \lambda \) across a range for the L1-R and R-Capped-L1 models (the latter gives the same solutions as Capped-L1). The optimal value of \( \lambda \) for each test case was determined by case study, which is important for the application of sparse recovery models. Table 1 shows the optimal value of \( \lambda \) for the three test feeders in cases of bad data proportions (BD) of 0.06 and 0.1.

Table 1 Optimal value of λ for tested cases

Figures 2, 3 and 4 show the estimation errors against \( \lambda \) for the R-Capped-L1 model, the L1-R model, and the robust WLAV model. Because WLAV does not depend on \( \lambda \), its trendline is horizontal. In contrast, both the R-Capped-L1 and the L1-R models are sensitive to the dual parameter \( \lambda \) and their trendlines have extreme values. At these extreme points, the SE precision of the R-Capped-L1 model is the highest and that of the WLAV model is the lowest. Therefore, in the following case studies, the optimal values of \( \lambda \) were applied to test the performance of the sparse recovery SE models.

Fig. 2 Estimation error versus λ in IEEE 33-bus system

Fig. 3 Estimation error versus λ in IEEE 123-bus system

Fig. 4 Estimation error versus λ in 615-bus system
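The scan over \( \lambda \) described in this subsection can be sketched on a toy problem. The sketch assumes a generic L1-relaxed objective of the form \( \sum_{i} (y_{i} - x - o_{i})^{2} + \lambda \sum_{i} |o_{i}| \) with a scalar state measured directly, which is far simpler than the three-phase feeders tested here. The data, the grid of \( \lambda \) values, and all function names are illustrative assumptions. As in a simulation study, the true state is known, so the estimation error can be evaluated for each \( \lambda \).

```python
def soft_threshold(r, t):
    """Proximal operator of t*|.|: shrink r toward zero by t."""
    if r > t:
        return r - t
    if r < -t:
        return r + t
    return 0.0


def l1_relaxed_estimate(y, lam, n_iter=200):
    """Minimize sum((y_i - x - o_i)**2) + lam * sum(|o_i|) over x and o
    by alternating exact minimization (toy model: h(x) = x)."""
    x = sorted(y)[len(y) // 2]  # start from a median-like value
    o = [0.0] * len(y)
    for _ in range(n_iter):
        o = [soft_threshold(yi - x, lam / 2.0) for yi in y]  # o-step
        x = sum(yi - oi for yi, oi in zip(y, o)) / len(y)    # x-step
    return x, o


def scan_lambda(y, x_true, lambdas):
    """Return the lambda with the smallest estimation error, and all errors."""
    errors = {lam: abs(l1_relaxed_estimate(y, lam)[0] - x_true) for lam in lambdas}
    best = min(errors, key=errors.get)
    return best, errors


# Synthetic data: true state 1.0, small deterministic "noise", three bad data.
X_TRUE = 1.0
y = [X_TRUE + 0.01 * (((7 * i) % 11) - 5) / 5 for i in range(50)]
for i in (3, 17, 31):
    y[i] += 0.5

best_lam, errors = scan_lambda(y, X_TRUE, [0.005, 0.01, 0.02, 0.05, 0.1, 0.5, 2.0])
for lam in sorted(errors):
    print(f"lambda={lam:<6} error={errors[lam]:.5f}")
print("best lambda:", best_lam)
```

For a very large \( \lambda \) the sparse vector is forced to zero and the toy estimate reduces to the plain average of the measurements, which is why the error curve is not flat in \( \lambda \).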

4.2 Efficiency comparison

To reduce the calculation time of the Capped-L1 model, the fast decoupled method for three-phase DSSE was adopted and a transformation strategy was applied. To verify the efficiency of the FDSE method on the R-Capped-L1 model, Tables 2, 3 and 4 compare the computational speed of the four SE methods. The bad data proportion was 0.06 for all simulations. The calculation times were measured as CPU time. All results in the tables are averages over 20 trials in which the locations of bad data, the errors, and the noises were randomized.

Table 2 Performance of two SE models using NB or FD methods in IEEE 33-bus system
Table 3 Performance of two SE models using NB or FD methods in IEEE 123-bus system
Table 4 Performance of two SE models using NB or FD methods in 615-bus system

From the results, we can conclude that the proposed FDSE is more efficient. On the one hand, although the Jacobian matrix of the FDSE is enlarged by introducing the additional state variables \( (P_{ij}^{\text{abc}} ,P_{ji}^{\text{abc}} ,Q_{ij}^{\text{abc}} ,Q_{ji}^{\text{abc}} ) \), this matrix is very sparse, and the relation between these variables and the measurements is linear and can be evaluated without iteration. On the other hand, although NBSE requires fewer iterations, its Jacobian matrix has to be reformulated at every iteration, whereas for FDSE it needs to be formulated only once, at the start. Thus, the FDSE method retains its efficiency advantage.

Additionally, the results confirm the validity of the transformation strategy in (6). Transforming the original Capped-L1 model into a formulation without absolute values decreases the computational burden.
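As an illustration of how an absolute value can be removed from such a problem, consider a single residual \( r \) and a single sparse component \( o \). Equation (6) is not reproduced in this section, so what follows is a generic, well-known reformulation and not necessarily the one used in (6). The symbols \( r \), \( o^{+} \) and \( o^{-} \) are introduced only for this example, and \( \lambda > 0 \) is assumed. The component \( o \) is split into two non-negative parts:

$$ o = o^{+} - o^{-} ,\quad o^{+} \ge 0,\quad o^{-} \ge 0 $$

$$ \min_{o} \; (r - o)^{2} + \lambda \left| o \right| \quad \Longleftrightarrow \quad \min_{o^{+} ,\,o^{-} \ge 0} \; (r - o^{+} + o^{-} )^{2} + \lambda (o^{+} + o^{-} ) $$

At the optimum at most one of \( o^{+} \) and \( o^{-} \) is non-zero, because reducing both by the same amount leaves the quadratic term unchanged and lowers the penalty. Hence \( o^{+} + o^{-} = \left| o \right| \) there, and the right-hand problem is a quadratic program with simple bound constraints and no absolute value.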

It is important to note that the calculation errors of the four methods are very close because all the methods satisfy the same constraints and solve the same sparse recovery model (the solution of Capped-L1 is the same as that of R-Capped-L1).
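The timing protocol described above (CPU time averaged over 20 trials with randomized bad data locations, errors, and noises) can be sketched as follows. This is a minimal harness under stated assumptions, not the test code used for Tables 2, 3 and 4. The functions `make_trial` and `average_cpu_time`, the noise level, the error size, and the scalar toy measurements are all hypothetical, and any estimator that accepts a list of measurements can be passed in.

```python
import random
import time


def make_trial(n_meas, bd_proportion, rng, noise_std=0.01, error_size=0.5):
    """Generate one randomized trial: noisy measurements of a true value 1.0,
    with a given proportion of them corrupted by a large error."""
    n_bad = round(bd_proportion * n_meas)
    bad_idx = set(rng.sample(range(n_meas), n_bad))  # random bad data locations
    y = []
    for i in range(n_meas):
        value = 1.0 + rng.gauss(0.0, noise_std)      # random observation noise
        if i in bad_idx:
            value += rng.choice([-1.0, 1.0]) * error_size  # random error sign
        y.append(value)
    return y, bad_idx


def average_cpu_time(estimator, n_trials=20, n_meas=100, bd_proportion=0.06, seed=0):
    """Average CPU time of estimator(y) over n_trials randomized trials."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        y, _ = make_trial(n_meas, bd_proportion, rng)
        start = time.process_time()   # CPU time, not wall-clock time
        estimator(y)
        total += time.process_time() - start
    return total / n_trials


# Example: time a trivial median-like "estimator" as a stand-in.
print(average_cpu_time(lambda y: sorted(y)[len(y) // 2]))
```

Generating the trial data outside the timed region keeps the randomization cost out of the reported CPU time.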

4.3 Performance comparison with WLS, WLAV and L1-R models

In this section, the performance of the R-Capped-L1 model (with fast decoupled solutions) is compared with that of the traditional WLS model, the WLAV model, and the L1-R model. The dual parameter \( \lambda \) was set to its optimal value as shown in Table 1 and all results are averages over 20 trials. Tables 5, 6 and 7 show the estimation errors, iterations, and CPU times of each model for the three distribution systems. Table 8 lists a specific example that compares the sparse errors of bad data recovered by the proposed R-Capped-L1 model, the traditional WLS model, the WLAV model, and the L1-R model with the pre-set bad data values. Two points should be noted. First, since the WLS and WLAV models cannot distinguish between noises and errors, the values listed in the table are, for convenience of comparison, the sums of the two. Second, since the number of sparse errors in the tested systems is very large even when the bad data proportion is 0.06, only the recovered errors and noises of the bus voltage and phase angle measurements with larger amplitudes are listed.

Table 5 Performance of four SE models in IEEE 33-bus system
Table 6 Performance of four SE models in IEEE 123-bus system
Table 7 Performance of four SE models in 615-bus system
Table 8 Sum of recovered sparse errors and noises in IEEE 33-bus system compared with different pre-set values

The results show that the R-Capped-L1 model has the highest precision and efficiency in all three cases, but it also requires the largest number of iterations because solving an R-Capped-L1 model is a multi-stage procedure in which each stage solves a convex optimization problem exactly (each stage is similar to the L1-R model). Although the number of iterations for solving the R-Capped-L1 model is larger, its computational speed is still the fastest because of the application of the FDSE method and the transformation strategy in (6). Furthermore, as the BD increases from 0.06 to 0.1, all SE models except WLS still perform well in suppressing bad data, and the R-Capped-L1 model is more robust than the other models. Note that the estimation errors here differ between models because different models are adopted.
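The multi-stage character of the Capped-L1 approach can be illustrated on a toy problem. The sketch below assumes a capped penalty of the form \( \lambda \sum_{i} \min (|o_{i}|, \theta ) \) and the usual multi-stage convex relaxation, in which each stage solves a weighted L1 problem and entries whose magnitude has reached the cap \( \theta \) are no longer penalized in the next stage. The scalar measurement model \( h(x) = x \), the values of \( \lambda \) and \( \theta \), the data, and the function names are assumptions made for this example. It is not the three-phase FDSE implementation evaluated in the tables.

```python
def soft_threshold(r, t):
    """Proximal operator of t*|.|: shrink r toward zero by t."""
    if r > t:
        return r - t
    if r < -t:
        return r + t
    return 0.0


def weighted_l1_stage(y, lams, x0, n_iter=200):
    """One convex stage: minimize sum((y_i - x - o_i)**2) + sum(lams_i * |o_i|)
    by alternating exact minimization over o and x."""
    x = x0
    o = [0.0] * len(y)
    for _ in range(n_iter):
        o = [soft_threshold(yi - x, li / 2.0) for yi, li in zip(y, lams)]
        x = sum(yi - oi for yi, oi in zip(y, o)) / len(y)
    return x, o


def capped_l1_estimate(y, lam, theta, n_stages=4):
    """Multi-stage convex relaxation for the capped-L1 penalty (toy version)."""
    lams = [lam] * len(y)          # stage 1 is the plain L1-relaxed problem
    x = sorted(y)[len(y) // 2]     # start from a median-like value
    history = []
    for _ in range(n_stages):
        x, o = weighted_l1_stage(y, lams, x)
        history.append(x)
        # Entries already identified as large errors are no longer penalized.
        lams = [lam if abs(oi) < theta else 0.0 for oi in o]
    return x, o, history


# Synthetic data: true state 1.0, small deterministic "noise", three bad data.
X_TRUE = 1.0
y = [X_TRUE + 0.01 * (((7 * i) % 11) - 5) / 5 for i in range(50)]
for i in (3, 17, 31):
    y[i] += 0.5

x_hat, o_hat, history = capped_l1_estimate(y, lam=0.05, theta=0.1)
print("error after stage 1 (L1-relaxed):", abs(history[0] - X_TRUE))
print("error after final stage:         ", abs(history[-1] - X_TRUE))
print("flagged bad data:", [i for i, oi in enumerate(o_hat) if abs(oi) >= 0.1])
```

In this toy case the first stage still leaves a small bias from the penalized bad data, and the later stages remove it at the cost of extra iterations, which mirrors the trade-off between precision and iteration count noted above.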

4.4 Impact of imbalance in three-phase loads

Finally, we consider the impact of imbalance in the three-phase loads. Starting from the original IEEE 33-bus system, the load ratio among the phases was multiplied by a coefficient to explore the effect of three-phase imbalance on the number of iterations of the FDSE method. Table 9 lists the number of iterations and the CPU time of FDSE and NBSE for the R-Capped-L1 model. The bad data proportion was 0.1.

Table 9 Iterations and CPU times for R-Capped-L1 model of three-phase imbalances in IEEE 33-bus system

As shown in Table 9, the numbers of iterations of both FDSE and NBSE for R-Capped-L1 are almost constant as the load ratio changes. The imbalance in the three-phase distribution system therefore has little impact on the iteration process. A similar conclusion has been reported elsewhere [20]: the number of iterations for FDPF depends to some degree on R/X and the power factor, but not on the imbalance in the loads.

5 Conclusion

In this paper, Capped-L1 was first introduced into DSSE and a revision, denoted R-Capped-L1, was proposed to improve the computational efficiency. In addition, a novel three-phase FDSE model for distribution networks was adopted, which also accelerates the solution procedure for R-Capped-L1. Numerical tests were conducted to analyze the performance of the proposed R-Capped-L1 model, and it was shown to have advantages in both computational efficiency and the suppression of bad data.

Furthermore, the transformation strategy shown in (6) can be applied to the L1-R model to accelerate the calculation process, and the R-Capped-L1 model with fast decoupled solutions can also be used to solve the SE problem for transmission networks. The optimal placement of PMUs to enhance the accuracy of state estimation is another interesting topic but is beyond the scope of this paper. Spatial decomposition into multi-area SE also merits further investigation. These issues are left for future work.