Introduction: Resilience in Power System Coordination & Control

Recently, the academic and industrial literature has coalesced around an enhanced vision of the electric power grid that is intelligent, responsive, dynamic, adaptive and flexible [3, 4, 51, 59, 87, 98]. One particularly emphasized “smart-grid” property is resilience, in which healthy regions of the grid continue to operate while disrupted and perturbed regions bring themselves back to normal operation. This is a cyber-physical grand challenge [48–50]. “Future power systems” require a fundamental evolution in the physical structure of today’s power grid with a corresponding change in the grid’s many layers of control and optimization algorithms [31]. Naturally, these must be considered holistically to achieve the end goal of system resilience.

The earliest work on resilient or self-healing power grids [1, 2] was envisioned for the entire power grid. Although this view remains applicable [3, 4, 51], much of the current resilient power systems literature has focused on the emerging concept of microgrids as a leading technology [11, 15, 19, 39, 44, 97, 100]. These microgrids are defined as electric power systems that have distributed renewable and thermal energy generation as well as conventional and dispatchable loads, and that can operate while connected to or disconnected from the main power grid [12, 40, 41, 65]. The high penetration of renewable energy resources introduces new dynamics into these microgrids at all timescales [5, 17]. Furthermore, the introduction of dispatchable energy resources on the demand side suggests an explosion in the number of active devices which require control and coordination [4, 31, 62, 75, 94].

Naturally, a wide array of microgrid literature has emerged to address their control and optimization. Traditional power grid operation and control is a hierarchical structure with three layers [37, 96] that spans multiple power grid timescales. These include a primary and a secondary control and a tertiary dispatch. Many microgrid control and optimization developments have drawn from this traditional hierarchical approach with customizations to account for the unique features found in microgrids [12, 40, 41, 65]. Generally speaking, each of these layers has typically been addressed individually despite their interdependence. More recent work instead advocates a holistic enterprise control approach [31, 69, 70] where all three layers are simultaneously synthesized, analyzed and simulated.

The microgrid control and optimization developments mentioned above have generally been centralized in nature, and thus offer limited resilience when microgrids must connect and disconnect while some of them are perturbed or disrupted. Furthermore, it is important to consider how multiple microgrids will interact with each other as “peer” regions [56]. Similarly, recent work has advocated resilient control systems [79, 80] built upon open, distributed, and interoperable architectures [48, 49, 89] of the power grid as an integrated cyber-physical system. Multi-agent systems have often been proposed as a key enabling technology for such resilient control [14, 58, 77, 92]. The most recent work in this regard is consonant with an enterprise control approach and suggests a hierarchy of agents that address power system management, coordination, and real-time execution control [79, 80].

And while multi-agent systems are often proposed as a key enabling technology to achieve resilient future power systems, their application in the power system domain has often been for other purposes. Several recent reviews show that multi-agent systems research in the power systems domain is well-established [18, 36, 66, 67, 83, 92]. While their original application was often for power system market simulation [85], they have also been used in the context of power system stability control [71]. And yet, the prevailing intention behind these developments is the decentralization of a particular decision-making/control algorithm rather than the development of resilience as a system property. While the former is necessary for the latter, it is far from sufficient.

Contribution

The contribution of this paper builds upon an earlier version [26] of this work and is two-fold. First, it seeks to identify a set of multi-agent system design principles for resilient coordination and control of power systems. In this regard, the paper builds upon the existing literature on autonomous and multi-agent systems and focuses specifically on the design principles that can bring about greater resilience in power systems. Second, the paper assesses the adherence of existing MAS implementations in the power systems domain to these design principles. This serves to clarify where future research efforts can best be directed.

Paper Outline

To these ends, the paper is organized as follows. “Background: Axiomatic Design Model for Resilient Power Systems” section presents as background an Axiomatic Design [88] model which was used in the development of resilience measures [27–29] for large flexible engineering systems (LFESs). “MAS Design Principles for Resilience in Power Systems” section then uses the model to distill a set of MAS design principles that facilitate greater power system resilience. “Adherence of existing MAS implementations to design principles” section then assesses the adherence of some recent MAS implementations with respect to these design principles. The paper is brought to a conclusion in “Conclusions & Future Work” section.

Background: Axiomatic Design Model for Resilient Power Systems

The MAS design principles for resilient coordination and control of power systems rest upon an Axiomatic Design Model for LFESs which has recently been used to develop a set of quantitative resilience measures [27–29].

Definition 1

LFES [88]: an engineering system with many functional requirements (i.e., system processes) that not only evolve over time, but also can be fulfilled by one or more design parameters (i.e., system resources).

Note that the scope of LFES spans multiple engineering application domains including production, power, water, and transportation systems [21, 23, 24, 27–30, 63, 91]. Furthermore, the choice of Axiomatic Design Theory rests on the realization that traditional graph theoretic methods only include an explicit description of system form and neglect system function [27–29], thus hindering the study of resilience where both system function and form change. While the full description of resilience measures is not feasible here, the underlying Axiomatic Design model [27–30] for LFES is included in order to provide a common foundation upon which the remainder of the discussion is based. The interested reader is referred to previous works [21, 23–25, 27–30, 63, 91] for further discussion and illustrative examples of how this Axiomatic Design model has been applied to resilience and reconfigurability measurement. The second half of this section serves to ground the Axiomatic Design model, as a concise description of system structure, in traditional power systems models as descriptions of power system behavior.

Introduction to Axiomatic Design for Large Flexible Engineering Systems

At its foundation, Axiomatic Design for Large Flexible Engineering Systems is built upon a mapping of system processes (P) to system resources (R) [21, 23, 24, 27–30, 88, 91]. Here, it is understood that both the processes and resources are cyber-physical. The system processes include a physical activity (e.g. power generation, transmission and consumption) with its associated cyber-activities that consist of the enterprise coordination and control. Similarly, as is common in multi-agent system research [8, 57], the resources include physical entities (e.g. power plants, lines, and loads) with their associated informatic entities (i.e., agents).

The mapping from system processes (P) to system resources (R) arises from the Independence Axiom [88] which requires that any given process not require more than one resource for its completion. That said, in LFESs, any process can potentially be completed by any resource and any resource can potentially complete any process (on its own). The associated mapping is described in terms of a design equation

$$\begin{aligned} P=J_S\odot R \end{aligned}$$
(1)

where \(J_S\) is a binary matrix called a LFES “knowledge base”, and \(\odot \) is “matrix boolean multiplication”.

Definition 2

LFES Knowledge Base [21, 23, 24, 27–30, 91]: A binary matrix \(J_S\) of size \(\sigma (P)\times \sigma (R)\) whose element \(J_S(w,v)\in \{0,1\}\) is equal to one when action \(e_{wv}\in E_S\) exists (where \(\sigma ()\) gives the size of a set).

Here, the term “action” is drawn from SysML [33] where it is used within activity diagrams. Consequently, the system knowledge base itself forms a bipartite graph which maps the set of system processes to their resources. Each individual mapping represents the existence of a system capability. The system processes and resources may be defined at any level of abstraction and axiomatic design encourages functional and physical decomposition with successive stages of engineering design.

Essential to the development of the model is the specialization of these system processes and resources. The resources \(R=M \cup B \cup H\) may be classified into transforming resources \(M=\{m_1\ldots m_{\sigma (M)}\}\), independent buffers \(B=\{b_1\ldots b_{\sigma (B)}\}\), and transporting resources \(H=\{h_1\ldots h_{\sigma (H)}\}\) [21, 23, 24, 27–30]. The set of buffers \(B_S=M \cup B\) is also introduced for later simplicity. These resources R may also be logically aggregated into a set of aggregated resources \(\bar{R}\) by means of an aggregation matrix A and operator \(\circledast \) [21, 30].

$$\begin{aligned} \bar{R}=A\circledast R \end{aligned}$$
(2)

The high level system processes are formally classified into three varieties: transformation, transportation and holding processes.

Definition 3

Transformation Process [21, 23, 24, 27–30]: A resource-independent, technology-independent process \(p_{\mu j} \in P_\mu =\{p_{\mu 1}\ldots p_{\mu \sigma (P_\mu )}\}\) that transforms an artifact from one form into another.

Definition 4

Holding Process [21, 23, 24, 27–30]: A transportation-independent process \(p_{\varphi g} \in P_\varphi \) that holds artifacts during the transportation from one buffer to another.

Definition 5

Transportation Process [21, 23, 24, 27–30]: A resource-independent process \(p_{\eta u}\in P_\eta =\{p_{\eta 1}\ldots p_{\eta \sigma (P_\eta )}\}\) that transports artifacts from one buffer \(b_{sy_1}\) to \(b_{sy_2}\). There are \(\sigma ^2(B_S)\) such processes of which \(\sigma (B_S)\) are “null” processes where no motion occurs. Furthermore, the convention of indices

$$\begin{aligned} u=\sigma (B_S)(y_1-1) + y_2 \end{aligned}$$
(3)

is adopted.
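As a minimal sketch of the index convention in Eq. 3, the following Python maps an origin/destination buffer pair to its transportation process index and back. The function names and the choice of five buffers are illustrative assumptions; the indices are 1-based as in the text.

```python
def process_index(y1: int, y2: int, n_buffers: int) -> int:
    """Eq. 3: u = sigma(B_S) * (y1 - 1) + y2, with 1-based y1 and y2."""
    if not (1 <= y1 <= n_buffers and 1 <= y2 <= n_buffers):
        raise ValueError("buffer indices must lie in 1..n_buffers")
    return n_buffers * (y1 - 1) + y2


def buffer_pair(u: int, n_buffers: int) -> tuple[int, int]:
    """Inverse of Eq. 3: recover (origin y1, destination y2) from u."""
    if not 1 <= u <= n_buffers ** 2:
        raise ValueError("u must lie in 1..n_buffers**2")
    y1, y2 = divmod(u - 1, n_buffers)
    return y1 + 1, y2 + 1


n = 5  # assumed number of buffers in B_S
all_u = [process_index(y1, y2, n)
         for y1 in range(1, n + 1) for y2 in range(1, n + 1)]
# "Null" processes have the same origin and destination buffer.
null_u = [process_index(y, y, n) for y in range(1, n + 1)]
```

The enumeration covers all \(\sigma ^2(B_S)\) indices exactly once, and the null processes fall on every \((\sigma (B_S)+1)\)th index.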

It is important to note for later discussion that the convention stated in Eq. 3 implies a directed bipartite graph between the set of independent buffers and the transportation processes whose incidence-in matrix \(M_{H^-}\) and incidence-out matrix \(M_{H^+}\) are given by:

$$\begin{aligned} M_{H^-}= & {} \sum _{y_1=1}^{\sigma (B)}e_{y_1}^{\sigma (B)}[e_{y_1}^{\sigma (B)}\otimes \mathbf {1}^{\sigma (B)}]^T\end{aligned}$$
(4)
$$\begin{aligned} M_{H^+}= & {} \sum _{y_2=1}^{\sigma (B)}e_{y_2}^{\sigma (B)}[ \mathbf {1}^{\sigma (B)}\otimes e_{y_2}^{\sigma (B)}]^T \end{aligned}$$
(5)

where \(\mathbf {1}^n\) is a column ones vector of predefined length n, \(e_{i}^{n}\) is the ith elementary basis vector, and \(\otimes \) is the Kronecker product. Consequently, a generalized transportation process incidence matrix \(M_H\) becomes:

$$\begin{aligned} M_H=M_{H^+}-M_{H^-} \end{aligned}$$
(6)
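A minimal NumPy sketch of Eqs. 4 to 6 follows. It assumes that the sum over origins \(y_1\) yields the incidence-in matrix \(M_{H^-}\) and the sum over destinations \(y_2\) yields the incidence-out matrix \(M_{H^+}\). The function name is illustrative, and indices are 0-based in code.

```python
import numpy as np


def transport_incidence(n: int):
    """Return (M_minus, M_plus, M_H) for n buffers and n**2 transport processes."""
    ones = np.ones(n, dtype=int)
    eye = np.eye(n, dtype=int)
    M_minus = np.zeros((n, n * n), dtype=int)
    M_plus = np.zeros((n, n * n), dtype=int)
    for y in range(n):
        e_y = eye[y]
        # e_y (x) 1 marks every process whose origin is buffer y.
        M_minus += np.outer(e_y, np.kron(e_y, ones))
        # 1 (x) e_y marks every process whose destination is buffer y.
        M_plus += np.outer(e_y, np.kron(ones, e_y))
    return M_minus, M_plus, M_plus - M_minus


M_minus, M_plus, M_H = transport_incidence(3)
```

Each column of `M_H` holds a -1 at the origin buffer and a +1 at the destination buffer, and the columns of the null processes are identically zero.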

The LFES knowledge base, \(J_S\), can be reconstructed straightforwardly from smaller knowledge bases that individually address transformation and transportation processes: \(P_\mu =J_M\odot M\) and \(P_\eta =J_H\odot R\). \(J_S\) then becomes [30]

$$\begin{aligned} J_S= \left[ \frac{J_M|\mathbf {0}}{J_{H}}\right] \end{aligned}$$
(7)

Axiomatic Design for LFES distinguishes between the existence and the availability of system capabilities. This is managed by a scleronomic (i.e., sequence-independent) constraints matrix.

Definition 6

LFES Scleronomic Constraints Matrix [21, 23, 24, 27–30]: A binary matrix \(K_S\) of size \(\sigma (P)\times \sigma (R)\) whose element \(K_S(w,v)\in \{0,1\}\) is equal to one when a constraint eliminates action \(e_{wv}\) from the action set.

Consequently, a measure of sequence-independent structural degrees of freedom (DOF) is introduced to measure the number of available system capabilities.

Definition 7

LFES Sequence-Independent Structural Degrees of Freedom [21, 23, 24, 27–30]: The set of independent actions \({\fancyscript{E}}_S\) that completely defines the available processes in a LFES. Their number is given by:

$$\begin{aligned} \textit{DOF}_S=\sigma ({\fancyscript{E}}_S)= & {} \sum _w^{\sigma (P)}\sum _v^{\sigma (R)}\left[ J_S\ominus K_S \right] (w,v)\end{aligned}$$
(8)
$$\begin{aligned}= & {} \langle J_S,\bar{K}_S \rangle _F=tr(J_S^T\bar{K}_S) \end{aligned}$$
(9)
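The count in Eqs. 8 and 9 can be illustrated with a short NumPy sketch. It assumes that \(\ominus \) denotes Boolean subtraction (an action is kept when it exists and is not constrained) and that \(\bar{K}_S\) is the element-wise complement of \(K_S\). Neither is stated explicitly in this section, and the two matrices below are hypothetical.

```python
import numpy as np

# Hypothetical knowledge base and constraints (3 processes x 4 resources).
J_S = np.array([[1, 0, 0, 1],
                [0, 1, 0, 0],
                [0, 0, 1, 1]], dtype=bool)
K_S = np.array([[0, 0, 0, 1],
                [0, 0, 0, 0],
                [0, 1, 0, 0]], dtype=bool)

# Assumed meaning of J_S (-) K_S: the action exists AND is not eliminated.
available = J_S & ~K_S

# Eq. 8: double sum over all elements.
dof_sum = int(available.sum())

# Eq. 9: Frobenius inner product <J_S, not(K_S)> written as a trace.
dof_trace = int(np.trace(J_S.astype(int).T @ (~K_S).astype(int)))
```

Under these assumptions both expressions agree. Note that a constraint on an action that does not exist in \(J_S\) leaves the count unchanged.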

As has been shown in previous work [24, 27–29], it is often useful to vectorize \(J_S\) and \(K_S\). The shorthand \(()^V\) is used to replace vec(). Furthermore, a projection operator \(\mathbb {P}\) may be introduced to project the vectorized knowledge base onto a ones vector and thereby eliminate sparsity: \(\mathbb {P}(J_S\ominus K_S)^V=\mathbf {1}^{\sigma ({\fancyscript{E}}_S)}\). While solutions for \(\mathbb {P}\) are not unique, this work chooses:

$$\begin{aligned} \mathbb {P}=\bigg [e_{\psi _1}^{\sigma ({\fancyscript{E}}_S)}, \ldots , e_{\psi _{\sigma ({\fancyscript{E}}_S)}}^{\sigma ({\fancyscript{E}}_S)}\bigg ] \end{aligned}$$
(10)

where \(e_{\psi _i}^{\sigma ({\fancyscript{E}}_S)}\) is the \(\psi _i^{th}\) elementary row vector corresponding to the first up to the last structural degree of freedom.
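One way to realize such a projection operator is sketched below. It assumes that \(()^V\) stacks columns (column-major order) and that \(\ominus \) is Boolean subtraction. The function name and the small matrices are hypothetical.

```python
import numpy as np


def projection_operator(J_S, K_S):
    """Build P whose rows are elementary vectors selecting the available actions."""
    vec = (J_S & ~K_S).flatten(order="F")   # column-major vec(), i.e. ()^V
    psi = np.flatnonzero(vec)               # positions psi_1 ... psi_DOF
    P = np.zeros((psi.size, vec.size), dtype=int)
    P[np.arange(psi.size), psi] = 1         # row i is the psi_i-th elementary row vector
    return P, vec.astype(int)


J_S = np.array([[1, 0], [0, 1], [1, 1]], dtype=bool)   # hypothetical
K_S = np.array([[0, 0], [0, 0], [1, 0]], dtype=bool)   # hypothetical
P, v = projection_operator(J_S, K_S)
ones_vector = P @ v
```

Here `P` has one row per structural degree of freedom, so `P @ v` is a ones vector whose length equals the number of available actions.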

The resilience measures for LFESs (mentioned at the beginning of this section) recognized that system capabilities needed to be addressed as sequences rather than individually. For this reason, they introduced a rheonomic (i.e., sequence-dependent) knowledge base and constraints matrix.

Definition 8

LFES Rheonomic Knowledge Base [24, 27–29]: A square binary matrix \(J_\rho \) of size \(\sigma (P)\sigma (R)\times \sigma (P)\sigma (R)\) whose element \(J_\rho (\psi _1,\psi _2)\in \{0,1\}\) is equal to one when string \(z_{\psi _1 \psi _2}=e_{w_1v_1}e_{w_2v_2} \in Z\) exists. It may be calculated directly as

$$\begin{aligned} J_\rho =\left[ J_S\ominus {K}_S\right] ^V\left[ J_S\ominus {K}_S\right] ^{VT} \end{aligned}$$
(11)

Definition 9

LFES Rheonomic Constraints Matrix \(K_{\rho }\) [24, 27–29]: A square binary constraints matrix of size \(\sigma (P)\sigma (R)\times \sigma (P)\sigma (R)\) whose elements \(K_\rho (\psi _1,\psi _2)\in \{0,1\}\) are equal to one when string \(z_{\psi _1 \psi _2}=e_{w_1v_1}e_{w_2v_2} \in Z\) is eliminated and where \(\psi =\sigma (P)(v-1)+w\).

Previous work has calculated \(K_\rho \) and has shown that it must be non-zero so as to account, at a minimum, for basic rules of continuity. The destination/location of one structural degree of freedom must occur at the origin/location of the subsequent one [21, 23, 24, 27–30, 91]. Consequently, a new measure for sequence-dependent capabilities of the LFES can be defined.

Definition 10

LFES Sequence-Dependent Structural Degrees of Freedom [21, 23, 24, 27–30]: The set of independent pairs of actions \(z_{\psi _1\psi _2}=e_{w_1v_1}e_{w_2v_2} \in Z\) of length 2 that completely describe the system language. The number is given by:

$$\begin{aligned} \textit{DOF}_{\rho }=\sigma ({\fancyscript{Z}})= & {} \sum _{\psi _1}^{\sigma (E_S)}\sum _{\psi _2}^{\sigma (E_S)}[J_\rho \ominus K_\rho ](\psi _1,\psi _2)\end{aligned}$$
(12)
$$\begin{aligned}= & {} \sum _{\psi _1}^{\sigma (E_S)}\sum _{\psi _2}^{\sigma (E_S)}[A_\rho ](\psi _1,\psi _2) \end{aligned}$$
(13)
Table 1 Processes & resources in a power grid as a LFES [27–29]

Note that from a resilience measurement perspective, where graph theory is commonly applied, \(A_\rho \) is an adjacency matrix with nodes as each individual structural degree of freedom [27–29]. However, unlike traditional applications of graph theory, the axiomatic design model described is a complete and yet concise description of system structure.
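The construction of \(J_\rho \) in Eq. 11 and the count in Eqs. 12 and 13 can be sketched as follows, again assuming that \(\ominus \) is Boolean subtraction. The availability vector and the two eliminated sequences are hypothetical and merely stand in for continuity constraints.

```python
import numpy as np

# Hypothetical vectorized availability (J_S (-) K_S)^V for four candidate actions.
avail_vec = np.array([1, 0, 1, 1], dtype=bool)

# Eq. 11: the outer product marks every ordered pair of available actions.
J_rho = np.outer(avail_vec, avail_vec).astype(bool)

# Hypothetical rheonomic constraints: eliminate the sequences 0 -> 3 and 2 -> 0.
K_rho = np.zeros_like(J_rho)
K_rho[0, 3] = True
K_rho[2, 0] = True

# Adjacency matrix over structural degrees of freedom and its count (Eqs. 12-13).
A_rho = J_rho & ~K_rho
dof_rho = int(A_rho.sum())
```

Read as a graph, `A_rho[i, j]` being true means that action `j` may directly follow action `i`.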

Definition 11

System Structure [74] (p. 26): the parts of a system and the relationships amongst them. It is described in terms of

  • A list of all components (i.e., resources) that comprise it.

  • What portion of the total system behavior (i.e., processes) is carried out by each component (i.e., resources).

  • How the components (i.e., resources) are interconnected.

Therefore, structural changes in a system that occur as a result of a disruption or resilient recovery operation can be expressed in terms of the axiomatic design model [27–30].

$$\begin{aligned}&A_\rho \rightarrow A'_\rho \end{aligned}$$
(14)
$$\begin{aligned}&(J_S,K_S,K_\rho ) \rightarrow (J_S',K_S',K_\rho ') \end{aligned}$$
(15)

Linking Axiomatic Design to Traditional Power Systems Models

The Axiomatic Design model presented in the previous subsection applies to both the physical as well as the cyber structure of a power system. As has been discussed extensively in the literature, life cycle properties such as reconfigurability and resilience depend primarily on a complete description of system structure rather than system behavior [25, 27–29]. Therefore, the discussion presented in the previous subsection is sufficient to address the cyber-layer and distill the MAS design principles for resilience in “MAS Design Principles for Resilience in Power Systems” section. However, in order to tailor the discussion specifically for the power systems domain, the behavior of the physical layer of the power system is also discussed.

Fig. 1 Two-bus power system with generation, storage and load

As mentioned previously, the Axiomatic Design model, unlike traditional graph theory, provides a complete description of system structure. Traditional graph theory, with its nodes and edges, is commonly applied in the power systems field. Nodes represent buses and edges represent lines. In Axiomatic Design, however, system processes and resources must both be defined.

Example 1

Table 1 provides examples of transformation and transportation processes as well as the three types of system resources in the power system domain. Holding processes are often introduced to differentiate between two transportation processes between an origin and a destination. In power grids, they can be used to differentiate transmission lines of different voltage levels; they are neglected for the remainder of the paper. Instead, the common power systems assumption of per unit normalization is applied.

This generic description of system processes and resources takes on greater meaning in the context of an instantiated power system.

Example 2

Consider the two-bus power system operating at a single voltage of 33 kV shown in Fig. 1. M\(=\){Gen1, Load1}. B\(=\){Battery, Bus1, Bus2}. H\(=\){GenLine, LoadLine, BessLine, BusLine1-2}. Note that it is important to include the lead lines to the generator, load and battery as would be done in a transient stability analysis [37]. P\(_\mu =\){Inject Power, Withdraw Power}. Transportation processes are defined between all possible pairs of buffers \(B_S\). The transformation and transportation knowledge bases are then formed. \(J_M=[1,0;0,1]\). The number of transformation degrees of freedom \(\sigma ({\fancyscript{E}}_M)=2\).

(16)

\(J_H^T\) is given horizontal lines to distinguish between the three types of resources M, B, and H and may be rewritten as \(J_H=[J_{MH}\ J_{BH}\ J_{HH}]\). The vertical lines in Eq. 16 distinguish between processes with different origins. The number of storage degrees of freedom \(\sigma ({\fancyscript{E}}_{BH})=3\). In total, the buffers account for \(\sigma ({\fancyscript{E}}_{BS})=5\) degrees of freedom. Finally, the number of (non-null) transportation degrees of freedom \(\sigma ({\fancyscript{E}}_H)=4\times 2=8\). A careful look at the two knowledge bases shows that all transforming resources (i.e., generators & loads) and independent buffers (i.e., storages & substations) are capable of realizing exactly one process (i.e., inject, withdraw, or store power). Meanwhile, the transporting resources can realize exactly two: transportation to and from a given pair of buffers.
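The counts in Example 2 can be reproduced with a short Python sketch. Because Fig. 1 and Eq. 16 are not reproduced here, the endpoints assigned to each line in `line_ends` are an assumption made purely for illustration. Only the resource names and the resulting counts come from the example.

```python
M = ["Gen1", "Load1"]                                   # transforming resources
B = ["Battery", "Bus1", "Bus2"]                         # independent buffers
H = ["GenLine", "LoadLine", "BessLine", "BusLine1-2"]   # transporting resources
B_S = M + B                                             # all buffers, B_S = M u B

# ASSUMED line endpoints (the figure is not available in this text).
line_ends = {
    "GenLine": ("Gen1", "Bus1"),
    "LoadLine": ("Bus2", "Load1"),
    "BessLine": ("Battery", "Bus1"),
    "BusLine1-2": ("Bus1", "Bus2"),
}

# Transportation processes are defined between all ordered pairs of buffers.
processes = [(o, d) for o in B_S for d in B_S]
null_processes = [(o, d) for (o, d) in processes if o == d]

# Each line realizes transport in both directions between its two end buffers.
transport_dofs = [(line, o, d)
                  for line, (a, b) in line_ends.items()
                  for (o, d) in ((a, b), (b, a))]

# J_M from the example: rows = (Inject, Withdraw), columns = (Gen1, Load1).
J_M = [[1, 0], [0, 1]]
transformation_dofs = sum(map(sum, J_M))
```

Whatever the actual topology of Fig. 1, four lines with two directions each give the eight non-null transportation degrees of freedom stated in the example.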

In order to further ground the background discussion, the link between the Axiomatic Design structural model and traditional power systems behavioral models is established. To that effect, each structural degree of freedom \(\psi \) must be described by a “device model” consisting of dynamic state variables \(\mathbf x _\psi \), algebraic state variables \(\mathbf w _\psi \), internal parameters \(\kappa _\psi \), differential equations \(f_\psi \) and algebraic equations \(g_\psi \) [68]. The specific details for a given device model depend on the chosen type of technical analysis. Consider the cases of AC power flow analysis and transient stability analysis.

Example 3

Power Flow Analysis: Power flow analysis is relevant to the study of resilience in power systems because of its repeated use in N-1 contingency analysis [96]. The derivation of the power flow analysis equations from the Axiomatic Design model is done in five steps:

  1. Construct a device model for each degree of freedom

  2. Construct a transportation degree of freedom admittance matrix

  3. Construct a transportation degree of freedom incidence matrix

  4. Construct a bus admittance matrix

  5. Construct the power flow analysis equations from Kirchhoff’s Current Law.

First, three different types of device models are required. For structural degrees of freedom that inject & withdraw power \({\fancyscript{E}}_M\),

$$\begin{aligned}&x_\psi =\emptyset \nonumber \\&w_\psi =\{P_{E\psi }, Q_\psi , v_\psi , \theta _\psi \}\nonumber \\&\kappa _\psi = \emptyset \nonumber \\&f_\psi = \emptyset \nonumber \\&g_\psi = \emptyset \end{aligned}$$
(17)

where \(\{P_{E\psi }, Q_\psi , v_\psi , \theta _\psi \}\) represent the active power injection, the reactive power injection, voltage magnitude, and voltage angle respectively (measured across the structural degree of freedom). For structural degrees of freedom that store power \({\fancyscript{E}}_{BS}\),

$$\begin{aligned}&x_\psi = S_\psi \nonumber \\&w_\psi =\{P_{E\psi }, Q_\psi , v_\psi , \theta _\psi \}\nonumber \\&\kappa _\psi = \{ \underline{S}_\psi , \overline{S}_\psi , \alpha _\psi \} \nonumber \\&f_\psi = S_\psi [k+1]=S_\psi [k] + (1-\alpha _\psi )P_{E\psi }(t_k-t_{k-1}) \nonumber \\&g_\psi = \emptyset \end{aligned}$$
(18)

where \(\underline{S}_\psi , \overline{S}_\psi \) are the storage minimum and maximum capacities respectively, and \(\alpha _\psi \) is a percentage loss factor. For structural degrees of freedom that transport power \({\fancyscript{E}}_{BH}\),

$$\begin{aligned}&x_\psi =\emptyset \nonumber \\&w_\psi =\{P_{E\psi }, Q_\psi , v_\psi , \theta _\psi \} \nonumber \\&\kappa _\psi = \{ y_\psi \} \nonumber \\&f_\psi = \emptyset \nonumber \\&g_\psi = P_{E\psi }+ jQ_\psi =(v_\psi \angle \theta _\psi )y_\psi ^*(v_\psi \angle \theta _\psi )^* \end{aligned}$$
(19)
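The storage and transport device models above can be sketched in plain Python. The class and function names are illustrative, and raising an error when a storage limit is exceeded is an assumed way of handling \(\underline{S}_\psi \) and \(\overline{S}_\psi \); the text only lists them as parameters.

```python
import cmath
from dataclasses import dataclass


@dataclass
class StorageDevice:
    level: float   # dynamic state S_psi
    s_min: float   # minimum storage capacity
    s_max: float   # maximum storage capacity
    alpha: float   # loss factor, assumed to satisfy 0 <= alpha < 1

    def step(self, p_e: float, dt: float) -> float:
        """One update of Eq. 18: S[k+1] = S[k] + (1 - alpha) * P * dt."""
        new_level = self.level + (1.0 - self.alpha) * p_e * dt
        if not self.s_min <= new_level <= self.s_max:
            raise ValueError("storage limits violated")  # assumed handling
        self.level = new_level
        return new_level


def line_complex_power(v: float, theta: float, y: complex) -> complex:
    """Eq. 19: P + jQ = (v angle theta) * conj(y) * conj(v angle theta)."""
    phasor = cmath.rect(v, theta)
    return phasor * y.conjugate() * phasor.conjugate()


battery = StorageDevice(level=2.0, s_min=0.0, s_max=5.0, alpha=0.1)
level_after = battery.step(p_e=1.0, dt=0.5)   # 2.0 + 0.9 * 1.0 * 0.5
s_line = line_complex_power(v=1.0, theta=0.3, y=2 - 4j)
```

Because the same phasor appears twice in Eq. 19, the result reduces to the squared voltage magnitude times the conjugate admittance, independent of the angle.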

Second, a transportation degree of freedom admittance matrix is constructed with all of the admittances of the structural degrees of freedom that transport power.

$$\begin{aligned} {\fancyscript{Y}}=diag(y_{\psi _1}, \ldots , y_{\psi _{\sigma (2H)}}) \end{aligned}$$
(20)

\({\fancyscript{Y}}\) is similar to the traditional concept of a line admittance matrix \({\fancyscript{Y}}_H\) in power systems engineering [53]. However, while \({\fancyscript{Y}}_H\) is of size \(\sigma (H)\times \sigma (H), {\fancyscript{Y}}\) is of size \(\sigma (2H)\times \sigma (2H)\) noting that each line \(h \in H\) actually has two transportation degrees of freedom; one for each direction between a given pair of buses. Thus, Axiomatic Design for LFES mathematically supports directed graphs or lines which exhibit different admittances depending on the direction of the flowing current. In traditional power flow analysis, each line’s two degrees of freedom are assumed to have the same admittance.

Third, a transportation degree of freedom incidence matrix \(M_{\fancyscript{E}}\) is constructed from Eqs. 4 and 5.

$$\begin{aligned} M_{\fancyscript{E}}=M_{{\fancyscript{E}}^+}-M_{{\fancyscript{E}}^-} \end{aligned}$$
(21)

where

$$\begin{aligned}&M_{{\fancyscript{E}}^-}=\sum _{y1=1}^{\sigma (B_S)}e_{y1}^{\sigma (B_S)}\left[ \mathbb {P}\left( e_{y1}^{\sigma (B_S)}\otimes 1\!\!1^{\sigma (B_S)} \otimes 1\!\!1^{\sigma (H)T}\right) \right] ^{VT} \end{aligned}$$
(22)
$$\begin{aligned}&M_{{\fancyscript{E}}^+}=\sum _{y2=1}^{\sigma (B_S)}e_{y2}^{\sigma (B_S)}\left[ \mathbb {P}\left( 1\!\!1^{\sigma (B_S)}\otimes e_{y2}^{\sigma (B_S)}\otimes 1\!\!1^{\sigma (H)T} \right) \right] ^{VT} \end{aligned}$$
(23)

Note, that the projection operator \(\mathbb {P}\) contains the transportation degree of freedom information from \(J_H\) and \(K_H\).

Fourth, the bus admittance matrix \(\mathbf {Y}\) is calculated [53].

$$\begin{aligned} \mathbf {Y}=M_{\fancyscript{E}}{\fancyscript{Y}}M_{\fancyscript{E}}^T \end{aligned}$$
(24)

As expected, its size is \(\sigma (B_S)\times \sigma (B_S)\) or equivalently \(\sigma ({\fancyscript{E}}_{BS})\times \sigma ({\fancyscript{E}}_{BS})\). The latter expression is useful so as to create vectors for active power injection \(\mathbf {P}_E=[P_{E\psi _{1}}\ldots P_{E\psi _{\sigma (B_S)}}]\), reactive power injection \(\mathbf {Q}=[Q_{\psi _{1}}\ldots Q_{\psi _{\sigma (B_S)}}]\), and complex voltage \(\mathbf {V}=[v_{\psi _{1}}\angle \theta _{\psi _{1}} \ldots v_{\psi _{\sigma (B_S)}}\angle \theta _{\psi _{\sigma (B_S)}}]\).

As a final step, the power flow equations follow straightforwardly from Kirchhoff’s Current Law [53, 68].

$$\begin{aligned} \mathbf {P}_E+j\mathbf {Q}=diag(\mathbf {V})\mathbf {Y}^*\mathbf {V}^* \end{aligned}$$
(25)

This example shows that the relatively abstract representation of system structure provided by the Axiomatic Design model is entirely consistent with a traditional power flow analysis model.
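Steps 2 to 5 can be illustrated numerically with NumPy. To keep the sketch unambiguous, it uses the traditional line admittance matrix \({\fancyscript{Y}}_H\) with one column per line rather than the per-direction matrix \({\fancyscript{Y}}\). The two-bus system, the line impedance and the voltage phasors are all hypothetical values in per unit.

```python
import numpy as np

# Hypothetical two-bus, one-line system in per unit.
y_line = 1.0 / (0.01 + 0.1j)       # series admittance of the single line
Y_H = np.diag([y_line])            # traditional line admittance matrix
A = np.array([[1.0], [-1.0]])      # bus-by-line incidence: line runs bus 0 -> bus 1

# Bus admittance matrix built from incidence and line admittances.
Y_bus = A @ Y_H @ A.T

# Assumed voltage phasors at the two buses.
V = np.array([1.00 * np.exp(1j * 0.0),
              0.98 * np.exp(-1j * 0.05)])

# Eq. 25: P_E + jQ = diag(V) Y* V*
S = np.diag(V) @ np.conj(Y_bus) @ np.conj(V)
P_E, Q = S.real, S.imag
```

Summing the injections over both buses leaves only the power dissipated in the line, which provides a quick sanity check on the sign conventions.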

Example 4

Transient Stability Model: Transient stability analysis is relevant to the study of resilience in power systems because it is used to study grid stability in the event of resource (i.e., generator, line or load) failure. The derivation of this model from the Axiomatic Design model follows the same steps as in Example 3, but also adds a set of differential equations \(f_\psi \) and their associated parameters.

Consider the case where the structural degrees of freedom associated with inject power take on the device model of a simple damped synchronous generator [37].

$$\begin{aligned}&x_\psi = \{\theta _\psi , \dot{\theta }_\psi \} \nonumber \\&w_\psi =\{P_{E\psi }, Q_\psi , v_\psi \}\nonumber \\&\kappa _\psi = \{{\fancyscript{H}}_\psi , {D}_\psi , P_{M\psi }\} \nonumber \\&f_\psi = \ddot{\theta }_\psi = \frac{\omega _0}{2{\fancyscript{H}}_\psi }\left( P_{M\psi }-P_{E\psi } -D_\psi \dot{\theta }_\psi \right) \nonumber \\&g_\psi = \emptyset \end{aligned}$$
(26)

where \(\theta _\psi \) now becomes a dynamic state variable, \(\dot{\theta }_\psi \) is the generator’s shaft speed, \({\fancyscript{H}}_\psi \) is its inertia, \({D}_\psi \) is its damping constant, \(P_{M\psi }\) is its mechanical power setpoint and \(\omega _0\) is the grid’s nominal frequency. The remaining device models are assumed to be static and are left unchanged.
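The generator dynamics can be sketched with a forward-Euler integration in plain Python. The sketch reads the differential equation as the standard second-order swing equation, so that its right-hand side is the angular acceleration. It also holds \(P_{E\psi }\) constant, whereas the full model couples it to the network. The parameter values and the 60 Hz nominal frequency are assumptions.

```python
import math


def swing_acceleration(omega, H, D, P_M, P_E, omega_0):
    """Angular acceleration: omega_0 / (2H) * (P_M - P_E - D * omega)."""
    return omega_0 / (2.0 * H) * (P_M - P_E - D * omega)


def simulate(theta, omega, H, D, P_M, P_E, omega_0, dt, steps):
    """Forward-Euler integration of the two dynamic states (theta, theta_dot)."""
    for _ in range(steps):
        accel = swing_acceleration(omega, H, D, P_M, P_E, omega_0)
        theta, omega = theta + dt * omega, omega + dt * accel
    return theta, omega


omega_0 = 2.0 * math.pi * 60.0   # assumed nominal frequency in rad/s

# Balanced case: mechanical and electrical power match, so the state does not move.
steady = simulate(0.2, 0.0, H=5.0, D=0.02, P_M=0.8, P_E=0.8,
                  omega_0=omega_0, dt=1e-3, steps=1000)

# Imbalance: a mechanical power surplus accelerates the shaft until damping balances it.
disturbed = simulate(0.2, 0.0, H=5.0, D=0.02, P_M=0.9, P_E=0.8,
                     omega_0=omega_0, dt=1e-3, steps=1000)
```

With a constant imbalance the speed deviation rises monotonically toward \((P_{M\psi }-P_{E\psi })/D_\psi \), which is why the damping term matters for a bounded response.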

From there, the remainder of the transient stability model is derived as is commonly established in the literature [37]. The active & reactive power injections are converted into shunt admittances and the Kron reduction formula is applied to Eq. 25 so that it becomes

$$\begin{aligned} {\mathbf {P}}_{Ered}+j {\mathbf {Q}}_{red}=diag({\mathbf {V}}_{red}){\mathbf {Y}}_{red}^{*}{\mathbf {V}}_{red}^{*} \end{aligned}$$
(27)

where \({\mathbf {P}}_{Ered}, {\mathbf {Q}}_{red}, {\mathbf {V}}_{red}\) and \({\mathbf {Y}}_{red}\) are all resized to the number of structural degrees of freedom associated with injecting power (by synchronous generator). This allows the algebraic Eq. 27 to couple the dynamics of the synchronous generators \(f_\psi \) via \(P_{E\psi }\) and \(\theta _\psi \). The extension of the Axiomatic Design model to a transient stability power system model shows how the power system structure can be incrementally detailed as the associated analysis requires.
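The Kron reduction step can be sketched in NumPy using the standard Schur-complement form, which assumes that the eliminated buses carry no current injection (consistent with first converting injections into shunt admittances). The function name and the three-bus chain are hypothetical.

```python
import numpy as np


def kron_reduce(Y, keep):
    """Eliminate every bus not listed in `keep` from the bus admittance matrix Y."""
    keep = np.asarray(keep)
    drop = np.setdiff1d(np.arange(Y.shape[0]), keep)
    Y_kk = Y[np.ix_(keep, keep)]
    Y_kd = Y[np.ix_(keep, drop)]
    Y_dk = Y[np.ix_(drop, keep)]
    Y_dd = Y[np.ix_(drop, drop)]
    # Schur complement: Y_kk - Y_kd * inv(Y_dd) * Y_dk
    return Y_kk - Y_kd @ np.linalg.solve(Y_dd, Y_dk)


# Hypothetical chain: bus 0 --y1-- bus 2 --y2-- bus 1, with bus 2 to be eliminated.
y1, y2 = 2.0 - 6.0j, 1.0 - 3.0j
Y = np.array([[y1, 0, -y1],
              [0, y2, -y2],
              [-y1, -y2, y1 + y2]])
Y_red = kron_reduce(Y, keep=[0, 1])
y_eq = y1 * y2 / (y1 + y2)   # series combination of the two branches
```

For this chain the reduced matrix is that of a single equivalent branch between the two retained buses, as circuit theory predicts for two admittances in series.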

MAS Design Principles for Resilience in Power Systems

In this section, a set of multi-agent system design principles for resilience in power systems is distilled from the Axiomatic Design for LFES. The discussion in the introduction showed that resilient coordination and control of future power systems must ultimately recognize that the structure of the physical power grid will be in a regular state of change allowing generators, loads, lines, and even whole microgrids to connect and disconnect as is necessary in an interoperable fashion. Consequently, the dynamics of the physical power grid and its associated enterprise control will also change. The background section described an Axiomatic Design Model for LFES which has been recently used to develop a set of quantitative resilience measures. It was later linked to traditional models of the physical power system like power flow analysis and transient stability. This same Axiomatic Design model is now applied to the MAS cyber-layer with the understanding that any multi-agent system that is implemented as a control system to achieve that resilience must manage changes in both system structure and dynamics. On this basis, this work proposes two sets of multi-agent system design principles: (1) for a change of system structure and (2) for a change of system dynamics. These principles are primarily intended to pertain to the multi-agent system architecture rather than the corresponding coordination and control algorithms. To support each design principle, a counter-example rationale is provided where the consequences of breaking the principle are described.

Design Principles for a Change of System Structure

With the Axiomatic Design Model for LFES, a number of design principles are distilled to account for changes in system structure.

Principle 1

Application of Independence Axiom: The agent architecture must be explicitly described in terms of the power system’s structural degrees of freedom.

Counter Example 1

Because the flow of power can be described as sequences of individual structural degrees of freedom, it is logical to describe the agents in terms of these same structural degrees of freedom. Consider if an arbitrary structural degree of freedom \(\psi \) were not included in the agent architecture. In such a case, the multi-agent system would neither be aware of the associated physical power grid activity nor be able to control it individually. In such a way, structural degrees of freedom are the quantitative equivalent of agent semantic ontologies [35].

Principle 2

Existence of Physical Agents: As a decision-making/control system, the multi-agent system must maintain a 1-to-1 relationship with the structural degrees of freedom that exist in the power system.

Counter Example 2

Reconsider Example 2 such that the agent architecture only includes the five structural degrees of freedom associated with energy management (i.e., inject, withdraw and store power). In such a case, it would be difficult to devise a multi-agent system in which the corresponding resources were aware of the resources to which they were physically connected. In the event that the power grid divided into separate areas, they could potentially be managing energy without knowing to which area they belong. Nevertheless, many multi-agent system developments found in the literature do not fulfill Principle 2 because they focus on the decentralization of an existing decision-making/control algorithm. If such a decision-making/control algorithm does not involve all the structural degrees of freedom, then the associated multi-agent system will likely only be a subset of the multi-agent system required for resilient operation. For example, an agent-based approach to solving the unit commitment or economic dispatch problem [86] would not require a description of the power grid topology and its associated structural degrees of freedom.
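One possible way to check Principle 2 in software is sketched below in plain Python. The representation of a structural degree of freedom as a (process, resource) pair follows the knowledge base definition, but the function names, the agent names and the fragment of Example 2 used as data are all hypothetical.

```python
def coverage_report(structural_dofs, agent_assignments):
    """Compare the DOFs that exist physically with the DOFs that agents claim.

    structural_dofs: set of (process, resource) pairs present in the grid.
    agent_assignments: dict mapping agent name -> (process, resource) pair.
    """
    claimed = list(agent_assignments.values())
    return {
        "unrepresented": structural_dofs - set(claimed),
        "nonexistent": set(claimed) - structural_dofs,
        "duplicated": {d for d in claimed if claimed.count(d) > 1},
    }


def is_one_to_one(structural_dofs, agent_assignments):
    """True only if every DOF has exactly one agent and no agent is spurious."""
    return not any(coverage_report(structural_dofs, agent_assignments).values())


# Hypothetical fragment of Example 2: energy-management DOFs plus one line.
dofs = {
    ("Inject Power", "Gen1"),
    ("Withdraw Power", "Load1"),
    ("Store Power", "Battery"),
    ("Transport Bus1->Bus2", "BusLine1-2"),
    ("Transport Bus2->Bus1", "BusLine1-2"),
}
energy_only_agents = {
    "gen_agent": ("Inject Power", "Gen1"),
    "load_agent": ("Withdraw Power", "Load1"),
    "bess_agent": ("Store Power", "Battery"),
}
report = coverage_report(dofs, energy_only_agents)
```

In this fragment the energy-management agents leave both transportation degrees of freedom unrepresented, which mirrors the situation described in the counter example.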

Principle 3

Functional Heterogeneity: The structural degrees of freedom within the agent architecture must respect the heterogeneity of capabilities found within the physical power system be they stochastic or deterministic processes or their various types: transformation (i.e., generation, and consumption) or transportation (i.e., transmission & distribution).

Counter Example 3

Reconsider the case of the battery in Example 2. If the associated physical agent were no different than any other agent, then it would not be aware of its distinguishing device model features; namely the minimum and maximum storage capacity. Similarly, if the physical agent associated with the generator believed it to be a thermal unit when indeed it was a wind turbine, then it might seek to be dispatched in an energy-management negotiation when in fact its generated power is an exogenous input. Therefore, the differences between these system processes must be reflected in the LFES knowledge base and its associated structural degrees of freedom.
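As a rough illustration of functional heterogeneity, the following Python sketch gives each type of physical agent its own device model. The class names, attribute names and the `offer` method are hypothetical; the sign convention (positive for discharge, negative for charge) and the one-hour default interval are likewise assumptions made only for this example.

```python
from dataclasses import dataclass


@dataclass
class ThermalGeneratorAgent:
    """Deterministic, dispatchable resource (hypothetical model)."""
    name: str
    p_min_mw: float
    p_max_mw: float

    def offer(self, requested_mw: float) -> float:
        # Dispatchable: follow the request within the unit's limits.
        return min(max(requested_mw, self.p_min_mw), self.p_max_mw)


@dataclass
class WindGeneratorAgent:
    """Stochastic resource whose output is an exogenous input."""
    name: str
    forecast_mw: float

    def offer(self, requested_mw: float) -> float:
        # The request cannot change what the wind delivers.
        return self.forecast_mw


@dataclass
class BatteryAgent:
    """Storage resource with minimum and maximum storage capacity."""
    name: str
    energy_mwh: float
    e_min_mwh: float
    e_max_mwh: float

    def offer(self, requested_mw: float, hours: float = 1.0) -> float:
        # Positive values discharge, negative values charge (assumed).
        max_discharge_mw = (self.energy_mwh - self.e_min_mwh) / hours
        max_charge_mw = (self.e_max_mwh - self.energy_mwh) / hours
        return max(-max_charge_mw, min(requested_mw, max_discharge_mw))


thermal = ThermalGeneratorAgent("Gen1", p_min_mw=10.0, p_max_mw=100.0)
wind = WindGeneratorAgent("Wind1", forecast_mw=35.0)
battery = BatteryAgent("Battery1", energy_mwh=20.0,
                       e_min_mwh=5.0, e_max_mwh=50.0)

# The same 150 MW request yields three different, device-specific offers.
offers = [agent.offer(150.0) for agent in (thermal, wind, battery)]
charge_offer = battery.offer(-40.0)
```

If all three resources shared a single generic agent model, the wind unit would appear dispatchable and the battery would ignore its storage limits.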

Principle 4

Physical Aggregation: The agent architecture must reflect the physical aggregation of the objects that it represents.

Counter Example 4

The agents must also have a level of aggregation that mimics that of the physical entities that they represent. Reconsider Example 2 as a two-area transmission system. In such a case, the load serves as an abstraction of the net load drawn by a full distribution system consisting of many power system resources. In Axiomatic Design, such an aggregated resource would be described by Eq. 2. If the agent architecture did not represent the transmission system load as an aggregation of distribution system resources, then the fine-grain decision-making of the distribution system could not be included in the agent architecture without replacing the transmission load with a complete model of the distribution system resources. Note that while the presence of aggregation in the MAS architecture does require information exchange, it does not require hierarchical decision-making.
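A minimal Python sketch of this idea follows. It assumes hypothetical `ResourceAgent` and `AggregateLoadAgent` classes and invented distribution-level resources; it is not a rendering of Eq. 2, only an illustration of an aggregate that exchanges information with its members without deciding on their behalf.

```python
class ResourceAgent:
    """A single distribution-level resource (hypothetical)."""

    def __init__(self, name, net_load_mw):
        self.name = name
        self.net_load_mw = net_load_mw  # negative values denote generation


class AggregateLoadAgent:
    """Transmission-level view of a whole distribution system."""

    def __init__(self, name, members):
        self.name = name
        self.members = list(members)

    @property
    def net_load_mw(self):
        # Information exchange only: the aggregate reads member values.
        # It does not dictate them, so decision-making stays local.
        return sum(member.net_load_mw for member in self.members)


feeder = ResourceAgent("FeederLoad", 4.0)
rooftop_pv = ResourceAgent("RooftopPV", -1.5)
ev_charger = ResourceAgent("EVCharger", 0.5)
load = AggregateLoadAgent("Load1", [feeder, rooftop_pv, ev_charger])

net_before = load.net_load_mw

# A local decision at the distribution level is reflected upstream.
rooftop_pv.net_load_mw = -2.0
net_after = load.net_load_mw
```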

Principle 5

Availability: The agent architecture must explicitly model the potential for sequence-independent constraints that impede the availability of any given structural degree of freedom.

Counter Example 5

Next, the agent architecture must distinguish between the existence and availability of its capabilities. This principle is essential for resilient operation where any given resource can be brought online or taken offline. Consider the failure of an arbitrary structural degree of freedom \(\psi \) in Example 2 modeled as \(K_S^V(\psi )=1\). If the agent architecture did not model this constraint, it would not be aware of the failure. Consequently, it would not be able to take a resilient recovery operation.
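The distinction between existence and availability can be sketched in a few lines of Python. The representation below, in which structural degrees of freedom are (process, resource) pairs held in sets, is an assumption made for illustration; the resource names and the helper functions `available_dofs` and `report_failure` are hypothetical.

```python
# Existence: the structural degrees of freedom that the system possesses.
existing_dofs = {
    ("inject power", "Gen1"),
    ("transport power", "GenLine"),
    ("withdraw power", "Load1"),
}

# Availability constraints: degrees of freedom that exist but are
# currently constrained (the set analogue of a constraint equal to 1).
constrained_dofs = set()


def available_dofs(existing, constrained):
    """Return the degrees of freedom that exist and are unconstrained."""
    return existing - constrained


def report_failure(constrained, dof):
    """Record a constraint without deleting the degree of freedom."""
    constrained.add(dof)


count_before = len(available_dofs(existing_dofs, constrained_dofs))
report_failure(constrained_dofs, ("transport power", "GenLine"))
count_after = len(available_dofs(existing_dofs, constrained_dofs))
```

Keeping the two sets separate means that a repaired resource can be restored by removing its constraint, with no change to the record of what exists.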

Principle 6

Interaction: The agent architecture must contain agent interactions along the minimal set of physical sequence-dependent constraints (i.e., nearest neighbor interactions).

Counter Example 6

The existence of sequence-dependent constraints in the physical power grid suggests the need for the same amongst the agents. Reconsider Example 2: if the generator’s agent did not interact with the “GenLine” agent, it would not know of their relative proximity. In such a case, the generator could continue to inject power even if the “GenLine” agent were to fail.
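A minimal Python sketch of nearest-neighbor interaction is given here. The classes `PhysicalAgent` and `GeneratorAgent`, the `connect` and `fail` methods, and the rule of setting the injection to zero are all assumptions chosen for illustration; they stand in for whatever recovery logic a real implementation would use.

```python
class PhysicalAgent:
    """Agent tied to one physical resource (hypothetical base class)."""

    def __init__(self, name):
        self.name = name
        self.available = True
        self.neighbours = []

    def connect(self, other):
        # Interactions mirror physical adjacency and nothing else.
        self.neighbours.append(other)
        other.neighbours.append(self)

    def fail(self):
        self.available = False
        for neighbour in self.neighbours:
            neighbour.on_neighbour_failed(self)

    def on_neighbour_failed(self, neighbour):
        pass  # default: no reaction


class GeneratorAgent(PhysicalAgent):
    def __init__(self, name, setpoint_mw):
        super().__init__(name)
        self.setpoint_mw = setpoint_mw

    def on_neighbour_failed(self, neighbour):
        # Stop injecting once no adjacent line remains available.
        if not any(n.available for n in self.neighbours):
            self.setpoint_mw = 0.0


gen = GeneratorAgent("Gen1", setpoint_mw=50.0)
gen_line = PhysicalAgent("GenLine")
gen.connect(gen_line)

gen_line.fail()
```

Without the `connect` call, the generator's agent would never receive the failure notice and would keep its 50 MW setpoint.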

Principle 7

Maximum Reconfiguration Potential: Aside from the minimal set of physical sequence-dependent constraints, the agent architecture should avoid introducing any further agent interactions (which may impose further constraints).

Counter Example 7

Adding agent interactions beyond the ones on the physical power grid is likely to introduce additional, perhaps unnecessary, constraints. Reconsider Example 2 such that the generator’s agent communicates with another arbitrary agent whose physical resource is not physically attached. In the event that this arbitrary agent were to fail, then the generator’s agent may also malfunction despite being physically independent.

Principle 8

Scope of Physical Agents: Agents’ scope and boundaries should be aligned with their corresponding physical resources and their associated structural degrees of freedom.

Counter Example 8

The concept of physical agency is well established and directly supports resilience. Reconsider Example 2 where a hypothetical centralized agent is introduced that manages the four structural degrees of freedom associated with Bus 1, Bus 2, and BusLine1-2. In the event that “BusLine1-2” fails, the physical power grid can continue to operate as two autonomous power system areas. Meanwhile, this centralized agent pertains to both areas; albeit unnecessarily. The computing hardware supporting this agent may have failed with “BusLine1-2” leading to the failure of 4 DOFs and not just 2. If it is situated on either of the two buses, it would still need to communicate with both despite their independence. Consequently, another failure would fail both power system areas despite their autonomy. Principle 8 ensures that when a reconfiguration process occurs (i.e., addition, modification or removal of a structural degree of freedom), it does so simultaneously on the physical resource as well as on the corresponding agent. Previous reconfigurability measurement work has shown that in many cases misaligned informatic entities such as centralized controllers lead to greater coupling of structural degrees of freedom [22, 25]; thus hindering ease of reconfiguration. Recent work in power system state estimation has recognized the challenge of gathering geographically dispersed measurements from a variable power grid topology; thus motivating recent developments in distributed state estimation [38].

Principle 9

Encapsulation: Power system information should be placed in the agent corresponding to the physical entity that it describes.

Counter Example 9

Principle 9 recognizes that information is more often used locally rather than remotely and thus encourages greater encapsulation and modularity. Reconsider Example 2 such that the generator’s agent is the only agent to know the admittance of the GenLine. In such a case, the GenLine would have to query the Gen1 agent every time it needed to calculate its power flow. The proper function of both agents would then depend on each other more than necessary.
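The following Python sketch illustrates encapsulation by storing the line parameter inside the line's own agent. It assumes a lossless DC power-flow approximation, in which the flow equals the line susceptance times the angle difference across the line; that approximation, the per-unit values, and the names `BusAgent`, `LineAgent` and `GenBus` are illustrative assumptions not taken from Example 2.

```python
class BusAgent:
    """Holds the state that describes the bus itself (hypothetical)."""

    def __init__(self, name, angle_rad):
        self.name = name
        self.angle_rad = angle_rad


class LineAgent:
    """Holds the parameters that describe the line itself."""

    def __init__(self, name, from_bus, to_bus, susceptance_pu):
        self.name = name
        self.from_bus = from_bus
        self.to_bus = to_bus
        self.susceptance_pu = susceptance_pu  # stored locally, not in Gen1

    def power_flow_pu(self):
        # DC approximation (assumed): P = b * (theta_from - theta_to).
        # Only the two physically adjacent buses are consulted.
        angle_diff = self.from_bus.angle_rad - self.to_bus.angle_rad
        return self.susceptance_pu * angle_diff


gen_bus = BusAgent("GenBus", angle_rad=0.05)
bus_1 = BusAgent("Bus1", angle_rad=0.0)
gen_line = LineAgent("GenLine", gen_bus, bus_1, susceptance_pu=10.0)

flow_pu = gen_line.power_flow_pu()
```

Because the susceptance lives in `LineAgent`, the line can compute its flow even if the generator's agent is unavailable.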

Principle 10

Interoperability: Agent-to-agent interactions should be described by well-known interoperability standards.

Counter Example 10

Principle 10 encourages the use of multi-agent system standards such as FIPA [78] and IEC 61499 [93]. Consider two arbitrary communicating agents: without an interoperability standard, the communication syntax of one could not be understood by the other.
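The value of a shared message syntax can be sketched in Python as follows. The field names loosely follow FIPA ACL message parameters and the performatives are a small subset of FIPA communicative acts, but the `AclMessage` class, the JSON serialization and the ontology name are assumptions made for illustration; this is not a FIPA-compliant encoding or any library's API.

```python
import json
from dataclasses import asdict, dataclass

# A small subset of FIPA communicative acts, used here as the shared
# vocabulary that both agents agree on in advance.
PERFORMATIVES = {"inform", "request", "agree", "refuse", "failure"}


@dataclass
class AclMessage:
    performative: str
    sender: str
    receiver: str
    content: str
    ontology: str = "power-grid-sdof"  # hypothetical ontology name


def encode(message):
    """Serialize a message; JSON is an illustrative choice only."""
    if message.performative not in PERFORMATIVES:
        raise ValueError(f"unknown performative: {message.performative}")
    return json.dumps(asdict(message), sort_keys=True)


def decode(text):
    """Parse a message and reject anything outside the shared syntax."""
    fields = json.loads(text)
    if fields.get("performative") not in PERFORMATIVES:
        raise ValueError("message does not follow the agreed syntax")
    return AclMessage(**fields)


sent = AclMessage("inform", "GenLine", "Gen1", "available=false")
received = decode(encode(sent))

# A message from an agent using a private syntax is rejected.
try:
    decode(json.dumps({"type": "LINE_DOWN", "src": "GenLine"}))
    rejected = False
except ValueError:
    rejected = True
```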

Design Principles for a Change of System Dynamics

In addition to the design principles for a change of system structure, it is necessary to identify the same for a change of system dynamics taking into consideration the full set of power grid enterprise control activities.

Principle 11

Scope of Physical System Model & Decision Making: The physical system model must describe the physical system behavior at all time scales for which resilient decision-making/control is required. These time scales are described by characteristic frequencies for continuous dynamics and characteristic times for discrete (pseudo-steady-state) processes.

Principle 11 recognizes that the multi-agent system is part of a larger cyber-physical system. Therefore, it will either have a virtual model of the physical system or it will connect to such a model during the engineering design and testing. In either case, such a model must be rich enough to include all of the physical phenomena relevant to resilient operation. For example, the unit commitment problem must account for startup/shutdown times and load/generator ramp rates [37]. Meanwhile, dynamic reconfiguration of multiple microgrids implies a full transient-stability model of the power grid [37].

Principle 12

Temporal Scope of Execution Agent/Real-time Controller: The characteristic frequencies in the physical system model must be controlled by at least one execution agent/real-time controller capable of making decisions 5x faster than the fastest characteristic frequency.

Principle 12 also implies two types of agents: those responsible for executing real-time dynamics and those responsible for pseudo-steady-state coordination. This is consistent with recent works on resilient control systems [79, 80]. To avoid mathematical convolution, the Nyquist sampling theorem requires that real-time execution agents/controllers operate significantly faster than the dynamics that they control [73]. In theory, the sampling rate must be at least 2x faster; however, in industrial practice this number is increased to 5-10x. This principle can impose a strict real-time requirement. In the case of the transient stability model presented in Example 4, the time scales associated with such characteristic frequencies can be on the order of 100 ms [37].
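The arithmetic behind Principle 12 can be sketched in Python. The function name `min_execution_rate_hz` is hypothetical, and the conversion of a 100 ms time scale into a 10 Hz characteristic frequency is an assumption made only to produce a concrete number.

```python
def min_execution_rate_hz(fastest_characteristic_hz, margin=5.0):
    """Minimum decision rate for an execution agent/real-time controller.

    `margin` is the ratio between the controller rate and the fastest
    characteristic frequency: 2x is the theoretical minimum, while
    5-10x reflects industrial practice.
    """
    if margin < 2.0:
        raise ValueError("margin is below the theoretical 2x minimum")
    return margin * fastest_characteristic_hz


# Assumption: a 100 ms time scale is treated as a 10 Hz frequency.
characteristic_time_s = 0.100
fastest_hz = 1.0 / characteristic_time_s

rate_hz = min_execution_rate_hz(fastest_hz)  # 5x margin
max_period_ms = 1000.0 / rate_hz             # longest allowed cycle time
```

Under these assumptions the execution agent would need to complete a decision cycle roughly every 20 ms, which indicates why the principle can impose a strict real-time requirement.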

Principle 13

Temporal Scope of Coordination Agent: A coordination agent may not take decisions any faster than 5x slower than the slowest characteristic frequency in the physical system model.

Principle 13 is also based upon the avoidance of mathematical convolution. Consider the linearization of the transient stability model presented in Example 4 around an equilibrium point \((\mathbf {x}_0,\mathbf {w}_0)\). The dynamic state equations would then follow a state space model.

$$\begin{aligned} {\varDelta }\dot{{\mathbf {x}}}=\mathbf {A}{\varDelta }\mathbf {x}+\mathbf {B}{\varDelta }\mathbf {u} \end{aligned}$$
(28)

The unforced time domain solution is given by [34]

$$\begin{aligned} \mathbf {x}(t)=e^{\mathbf {A}(t-\tau )}\mathbf {x}(\tau ) \end{aligned}$$
(29)

where the eigenvalues of \(\mathbf {A}\), \(\lambda _1,\ldots ,\lambda _n\), ordered from smallest to largest, represent the system poles. For a stable system, the exponential decay \(e^{{\text {Re}}(\lambda _1)t}\) comes within 1 % of its horizontal asymptote after \(5/|{\text {Re}}(\lambda _1)|\) [72]. Therefore, Principle 13 ensures that the coordination agents only take decisions once the underlying physical model has reached steady-state. Furthermore, dynamic instability can arise if Principle 13 is violated.
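The settling-time argument can be checked numerically with a short Python sketch. The 2x2 state matrix below is invented for illustration and is not the linearized model of Example 4; the function names are hypothetical, and the closed-form eigenvalue formula applies only to 2x2 matrices.

```python
import cmath
import math


def eigenvalues_2x2(a):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    (a11, a12), (a21, a22) = a
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0


def min_coordination_period_s(a, time_constants=5.0):
    """Shortest decision period allowed for a coordination agent."""
    poles = eigenvalues_2x2(a)
    if any(pole.real >= 0.0 for pole in poles):
        raise ValueError("system is not asymptotically stable")
    # The slowest pole has the smallest decay rate |Re(lambda)|.
    slowest_decay_rate = min(abs(pole.real) for pole in poles)
    return time_constants / slowest_decay_rate


# Hypothetical state matrix with poles at -0.5 and -4.0 (units of 1/s).
A = [[-0.5, 0.0],
     [1.0, -4.0]]

period_s = min_coordination_period_s(A)

# Fraction of the slowest mode's initial deviation left after that time.
residual = math.exp(-5.0)
```

For this example the slowest pole sets a 10 s minimum decision period, after which less than 1 % of the slowest mode's initial deviation remains.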

Principle 14

Equivalence of Agent Hierarchy & Time Scale Separation: If the physical system model has two or more characteristic frequencies or times that are (mathematically proven or practically assumed to be) independent then the associated agent may be divided into an equal number of hierarchical agents each responsible for decision-making/control for the associated characteristic frequency or time.

Principle 14 recognizes that different power system phenomena either are, or can be assumed to be, effectively decoupled in time and the agent hierarchy can be designed accordingly. For example, unit commitment and economic dispatch problems are usually time scale separated [37]. Additionally, small-signal stability dynamics are often categorized as intra-area and inter-area dynamics [54].

This section has used the axiomatic design model presented in the previous section to distill fourteen multi-agent system design principles for resilient coordination and control of power systems. The first ten design principles were necessary to address changes in system structure and correspond to various aspects of the Axiomatic Design model for LFESs described in the “Background: Axiomatic Design Model for Resilient Power Systems” section. The next four design principles were necessary to address changes in system behavior at the various timescales found within power systems. While these fourteen principles are necessary, they are not sufficient, for two main reasons. First, the design principles described here are based upon static rather than dynamic measures of resilience. Although many authors have identified the need for such dynamic measures, the literature has yet to produce them [7, 9, 10, 32, 45, 64, 76, 90, 95]. Therefore, it is likely that the design principles described here will be expanded as the system resilience literature develops further. Second, the resilient control of power grids remains very much an open area of research. Formal results on the synchronization of power systems [6], control over networked communication systems [42, 43, 46, 84], and consensus of multi-agent systems [46] still require dedicated effort. In this context, the design principles presented here are best interpreted as those pertaining to the multi-agent system architecture rather than the corresponding coordination and control algorithms that make up the multi-agent system behavior.

Adherence of Existing MAS Implementations to Design Principles

With these multi-agent system design principles identified, the discussion can turn to evaluating the existing multi-agent system power grid literature. The application of multi-agent systems in the power systems domain is well-established [18, 36, 66, 67, 83, 92]. Originally, multi-agent systems were intended as a tool for the design and simulation of power system market operation [85]. However, in recent years, MAS implementations are increasingly intended for real-time coordination and control. Therefore, this evaluation focuses on the latter category and specifically includes works that (1) were published after 2010, (2) included a control system composed of multiple agents, and (3) demonstrated closed-loop control of a simulation model or physical hardware. This led to the inclusion of Refs. [13, 16, 20, 47, 52, 55, 60, 61, 81, 82, 99]. Figure 2 shows the results of the assessment where green, yellow and red correspond to full, partial and non-adherence to the MAS design principles. Although the assessment is conducted at a fairly high level, this is entirely consistent with Axiomatic Design, which states that high-level design decisions cannot be fixed by detailed design decisions made thereafter [88]. The main themes and conclusions of Figure 2 are summarized below.

Fig. 2: Adherence of existing MAS implementations to design principles

The results of the assessment suggest that MAS development for power grids has been primarily intended as the decentralization of a particular decision-making/control algorithm rather than the development of resilience as a system property. The most common of these decisions may be broadly categorized as either energy management or fault location, isolation, and supply restoration (FLISR). The former often neglected the power grid topology, while the latter often neglected some type of energy resource. Furthermore, most of the works did not strictly adhere to the principles of physical agency. These observations naturally meant that the availability of all physical resources was often partial. Only the work of Rivera et al. [81, 82] fully adhered to Principles 1, 2, 3, 5 and 8. The literature as a whole was found to be weak with respect to physical aggregation (Principle 4). Either aggregation was not addressed, or it led to centralized decision-making algorithms. In the latter case, this consequently leads to additional agent-to-agent interactions and compromised encapsulation (Principles 7 and 9). The literature as a whole was also found to be weak with respect to physical nearest-neighbour interactions (Principle 6). A MAS implementation that does not fully describe the system’s structural degrees of freedom will naturally neglect the interactions between them. That said, one nearly universal strength of the literature was its utilization of interoperability standards such as FIPA-compliant agents, IEC 61499, and IEC 61850 (Principle 10).

The multi-agent system implementations considered in the assessment were generally well suited to changes in power system dynamics at the various time-scales of enterprise control. While all considered works included either a physical grid simulation model or physical hardware, some did not describe the specifics of the implementation leading to questions of their suitability (Principle 11). Almost all works addressed coordination decisions as a pseudo-steady-state process (Principle 13) while others addressed power grid dynamics with real-time execution agents/controllers (Principle 12). For those implementations that considered both time scales, an agent hierarchy composed of at least two layers consequently emerged (Principle 14).

Conclusions & Future Work

This paper has identified a set of multi-agent system design principles for the resilient coordination and control of future power systems. To that effect, it drew upon an Axiomatic Design model for LFESs that has been used in the development of resilience measures. The newly identified MAS design principles were then used to evaluate the adherence of some recent MAS power grid implementations. The results of the assessment suggest that MAS development for power grids has been primarily intended as the decentralization of a particular decision-making/control algorithm rather than the development of resilience as a system property. While the former is necessary for the latter, it is far from sufficient.

Future extensions of this work can proceed in two directions. First, the set of design principles themselves can be extended so that they support both dynamic and static resilience. While four principles have been included here to address changes in system dynamics, it is likely that more principles will emerge from promising areas such as synchronization of power systems [6], control over networked communication systems [42, 43, 46, 84], and consensus of multi-agent systems [46]. Second, the design principles can be applied to achieve greater resilience in MAS implementations applied in the power grid domain.