1 Introduction

In the last few decades, the world has become increasingly connected. This has sparked significant interest in complex networks, smart grids, distributed systems, transportation networks, biological networks, and networked multi-agent systems, see, e.g., [2, 10, 28]. Widely studied topics in networked systems are the problems of consensus and synchronization, see [19, 20, 27, 30]. Other important subjects in the theory of networked systems are flocking, formation control, sensor placement, and controllability of networks, see, e.g., [8, 9, 11, 12, 24, 29, 34].

Analysis and controller design for large-scale complex networks can become computationally very expensive, especially for problems whose complexity scales as a power of the number of nodes in the network. In order to tackle this problem, there is a need for methods and procedures that approximate the original networks by smaller, less complex ones.

Direct application of established model reduction techniques, such as balanced truncation, Hankel-norm approximation, and Krylov subspace methods, see, e.g., [1, 3], to the dynamical models of networked systems generally leads to a collapse of the network structure, as well as the loss of important properties such as consensus or synchrony.

Model reduction techniques specifically for networked multi-agent systems with first-order agents have been proposed in [6, 15, 16, 22]. Extensions to second-order agents have been considered in [7, 14] and to more general higher-order agents in [4, 17, 23, 25]. Some of these methods are based on clustering nodes in the network. With clustering, the idea is to partition the set of nodes in the network graph into disjoint sets called clusters, and to associate with each cluster a single, new, node in the reduced network, thus reducing the number of nodes and connections and the complexity of the network topology. For a review on clustering in data mining see, e.g., [18].

In [26], model reduction by clustering was put in the context of model order reduction by Petrov–Galerkin projection. The results in [26] provide explicit expressions for the \(\mathcal {H}_2\) model reduction error if a leader–follower network with single integrator agent dynamics is clustered using an almost equitable partition of the graph. In the present paper, our aim is to generalize and extend the results in [26] to networks where the agent dynamics is given by an arbitrary multivariable input–state–output system. We also aim at finding explicit formulas and a priori upper bounds for the model reduction error measured in the \(\mathcal {H}_\infty \)-norm. Finally, we will consider the problem of clustering a network according to arbitrary, not necessarily almost equitable, graph partitions. The main contributions of this paper are the following:

  1.

    We derive an a priori upper bound for the \(\mathcal {H}_2\) model reduction error for the case that the agents are represented by an arbitrary input–state–output system.

  2.

    We extend the results in [26] for single integrator dynamics by giving an explicit expression for the \(\mathcal {H}_\infty \) model reduction error in terms of properties of the given graph partition.

  3.

    We establish an a priori upper bound for the \(\mathcal {H}_\infty \) model reduction error for the case that the agents are represented by an arbitrary but symmetric input–state–output system.

  4.

    We establish some preliminary results on the model reduction error in case of clustering using an arbitrary, possibly non almost equitable, partition.

The outline of this paper is as follows. In Sect. 2, we introduce some notation and discuss some elementary facts about computing the \(\mathcal {H}_2\)- and \(\mathcal {H}_\infty \)-norms of stable transfer functions that are needed later on in this paper. In Sect. 3, we formulate our problem of model reduction of leader–follower multi-agent networks. Section 4 reviews some theory on graph partitions and model reduction by clustering and relates this method to Petrov–Galerkin projection of the original network. Preservation of synchronization is also discussed there. In Sect. 5, we provide a priori error bounds on the \(\mathcal {H}_2\) model reduction error for networks with arbitrary agent dynamics, clustered using almost equitable partitions. In Sect. 6, we complement these results by providing upper bounds on the \(\mathcal {H}_\infty \) model reduction error. In Sect. 7, the problem of clustering networks according to general partitions is considered, and the first steps toward a priori error bounds on both the \(\mathcal {H}_2\) and \(\mathcal {H}_\infty \) model reduction errors are made. Numerical examples for which we compare the actual errors with the a priori bounds established in this paper are presented in Sect. 8. Finally, Sect. 9 provides some conclusions. To enhance readability, some of the more technical proofs in this paper have been moved to the “Appendix.”

2 Preliminaries

In this section we briefly introduce some notation and discuss some basic facts on finite-dimensional linear systems. The trace of a square matrix A is denoted by \({{\mathrm{tr}}}(A)\). The largest singular value of a matrix A is denoted by \(\sigma _{\max }(A)\). For given real numbers \(\alpha _1, \alpha _2, \ldots , \alpha _k\), we denote by \({{\mathrm{diag}}}(\alpha _1, \alpha _2, \ldots , \alpha _k)\) the \(k \times k\) diagonal matrix with the \(\alpha _i\)’s on the diagonal. For square matrices \(A_1, A_2, \ldots , A_k\), we use \({{\mathrm{diag}}}(A_1, A_2, \ldots , A_k)\) to denote the block diagonal matrix with the \(A_i\)’s as diagonal blocks. For a given matrix A, let \(A^+\) denote its Moore–Penrose pseudoinverse.

Consider the input–state–output system

$$\begin{aligned} \begin{aligned} {\dot{x}}&= A x + B u, \\ y&= C x, \end{aligned} \end{aligned}$$
(1)

with \(x \in \mathbb {R}^n\), \(u \in \mathbb {R}^m\), \(y \in \mathbb {R}^p\), and transfer function \(S(s) = C {(s I - A)}^{-1} B\). If S has all its poles in the open left half complex plane, then its \(\mathcal {H}_2\)-norm is defined by

$$\begin{aligned} \Vert S \Vert _{\mathcal {H}_2} := {\left( \frac{1}{2 \pi } \int _{-\infty }^{\infty } {{\mathrm{tr}}}\left( S {(i \omega )}^* S (i \omega ) \right) \mathrm {d} \omega \right) }^{\frac{1}{2}}. \end{aligned}$$

If A is Hurwitz, then the \(\mathcal {H}_2\)-norm can be computed as

$$\begin{aligned} \Vert S \Vert _{\mathcal {H}_2}^2 = {{\mathrm{tr}}}(B^T X B), \end{aligned}$$

where X is the unique positive semi-definite solution of the Lyapunov equation

$$\begin{aligned} A^T X + X A + C^T C = 0. \end{aligned}$$
(2)
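
As a numerical sanity check, this computation can be sketched in a few lines of plain NumPy (the function name and example system are ours): solve the Lyapunov equation (2) by vectorization and return \(\sqrt{{{\mathrm{tr}}}(B^T X B)}\).

```python
import numpy as np

def h2_norm(A, B, C):
    """H2-norm of S(s) = C (sI - A)^{-1} B for Hurwitz A:
    solve A^T X + X A + C^T C = 0, then ||S||_H2 = sqrt(tr(B^T X B))."""
    n = A.shape[0]
    # vec(A^T X + X A) = (I (x) A^T + A^T (x) I) vec(X), column-major stacking
    K = np.kron(np.eye(n), A.T) + np.kron(A.T, np.eye(n))
    X = np.linalg.solve(K, -(C.T @ C).flatten(order="F")).reshape((n, n), order="F")
    return float(np.sqrt(np.trace(B.T @ X @ B)))

# scalar example (ours): S(s) = 1/(s + 1) has H2-norm 1/sqrt(2)
A = np.array([[-1.0]]); B = np.array([[1.0]]); C = np.array([[1.0]])
```

For the scalar example, the Gramian is \(X = 1/2\), so the norm is \(1/\sqrt{2}\), matching the frequency-domain definition above.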

For the purposes of this paper, we also need to deal with the situation when A is not Hurwitz. Let \({\mathcal {X}}_+(A)\) denote the unstable subspace of A, i.e., the direct sum of the generalized eigenspaces of A corresponding to its eigenvalues in the closed right half plane. We state the following proposition:

Proposition 1

Assume that \(\mathcal {X}_+(A) \subset \ker {C}\). Then, the Lyapunov equation (2) has at least one positive semi-definite solution. Among all positive semi-definite solutions, there is exactly one solution, say X, with the property \(\mathcal {X}_+(A) \subset \ker {X}\). For this particular solution X, we have \(\Vert S \Vert _{\mathcal {H}_2}^2 = {{\mathrm{tr}}}(B^T X B)\).

A proof of this result can be found in “Appendix A”.

If S has all its poles in the open left half plane, then its \(\mathcal {H}_\infty \)-norm is defined by

$$\begin{aligned} \Vert S \Vert _{\mathcal {H}_\infty } := \sup _{\omega \in \mathbb {R}} \sigma _{\max }\left( S(i \omega ) \right) . \end{aligned}$$

We will now deal with computing the \(\mathcal {H}_{\infty }\)-norm. The following lemma is a generalization of Lemma 4 in [16]. For a proof, we refer to “Appendix B.”

Lemma 1

Consider the system (1). Assume that its transfer function S has all its poles in the open left half plane. If there exists \(X \in \mathbb {R}^{p \times {}p}\) such that \(X = X^T\) and \(C A = X C\), then \(\Vert S \Vert _{\mathcal {H}_\infty } = \sigma _{\max }(S(0))\).

Continuing our effort to compute the \(\mathcal {H}_{\infty }\)-norm, we now formulate a lemma that will be instrumental in evaluating a transfer function at the origin. Recall that for a given matrix A, its Moore–Penrose inverse is denoted by \(A^+\).

Lemma 2

Consider the system (1). If A is symmetric and \(\ker {A} \subset \ker {C}\), then 0 is not a pole of the transfer function S and we have \(S(0) = -C A^+ B\).

This result is proven in “Appendix C.”

To conclude this section, we briefly review the model reduction technique known as Petrov–Galerkin projection (see also [1]).

Definition 1

Consider the system (1). Let \(W, V \in \mathbb {R}^{n \times {}r}\), with \(r < n\), such that \(W^T V = I\). The matrix \(V W^T\) is then a projector, called a Petrov–Galerkin projector. The reduced order system

$$\begin{aligned} {\dot{\hat{x}}}&= W^T A V \hat{x}+ W^T B u, \\ \hat{y}&= C V \hat{x}, \end{aligned}$$

with \(\hat{x}\in \mathbb {R}^r\) is called the Petrov–Galerkin projection of the original system (1).

3 Problem formulation

We consider networks of diffusively coupled linear subsystems. These subsystems, called agents, have identical dynamics; however, a selected subset of the agents, called the leaders, also receives an input from outside the network. The remaining agents are called followers. The network consists of N agents, indexed by i, so \(i \in \mathcal {V}:= \{1, 2, \ldots , N\}\). The subset \(\mathcal {V}_{\mathrm {L}}\subset \mathcal {V}\) is the index set of the leaders, more explicitly \(\mathcal {V}_{\mathrm {L}}= \{v_1, v_2, \ldots , v_m\}\). The followers are indexed by \(\mathcal {V}_{\mathrm {F}}:= \mathcal {V}{\setminus } \mathcal {V}_{\mathrm {L}}\). More specifically, the leaders are represented by the finite-dimensional linear system

$$\begin{aligned} {\dot{x}}_i = A x_i + B \sum _{j = 1}^N a_{ij} (x_j - x_i) + E u_\ell , \quad i \in \mathcal {V}_{\mathrm {L}},\ i = v_\ell , \end{aligned}$$

whereas the followers have dynamics

$$\begin{aligned} {\dot{x}}_i = A x_i + B \sum _{j = 1}^N a_{ij} (x_j - x_i), \quad i \in \mathcal {V}_{\mathrm {F}}. \end{aligned}$$

The weights \(a_{ij} \ge 0\) represent the coupling strengths of the diffusive coupling between the agents. In this paper, we assume that \(a_{ij} = a_{ji}\) for all \(i, j \in \mathcal {V}\). Also, \(a_{ii} = 0\) for all \(i \in \mathcal {V}\). Furthermore, \(x_i \in \mathbb {R}^n\) is the state of agent i, and \(u_\ell \in \mathbb {R}^r\) is the external input to the leader \(v_\ell \). Finally, \(A \in \mathbb {R}^{n \times n}\), \(B\in \mathbb {R}^{n \times n}\), and \(E \in \mathbb {R}^{n \times r}\) are real matrices. It is customary to represent the interaction between the agents by the graph \(\mathcal {G}\) with node set \(\mathcal {V}= \{1, 2, \ldots , N\}\) and adjacency matrix \(\mathcal {A}= (a_{ij})\). In the setup of this paper, this graph is undirected, reflecting the assumption that \(\mathcal {A}\) is symmetric. The Laplacian matrix \(L \in \mathbb {R}^{N \times N}\) of the graph \(\mathcal {G}\) is defined as

$$\begin{aligned} L_{ij} = {\left\{ \begin{array}{ll} d_i &{} \text {if } i = j, \\ -a_{ij} &{} \text {if } i \ne j, \end{array}\right. } \end{aligned}$$

with \(d_i = \sum _{j=1}^N a_{ij}\).
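
For concreteness, the Laplacian construction can be sketched as follows (plain NumPy; names ours); for an undirected graph, L is symmetric and its rows sum to zero.

```python
import numpy as np

def laplacian(Adj):
    """Graph Laplacian: L_ii = d_i = sum_j a_ij and L_ij = -a_ij for i != j."""
    return np.diag(Adj.sum(axis=1)) - Adj

# unweighted path graph on 4 nodes (example, ours)
Adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
L = laplacian(Adj)
```
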

Recall that the set of leader nodes is \(\mathcal {V}_{\mathrm {L}}= \{v_1, v_2, \ldots , v_m\}\), and define the matrix \(M \in \mathbb {R}^{N \times m}\) as

$$\begin{aligned} M_{i \ell } = {\left\{ \begin{array}{ll} 1 &{} \text {if } i = v_\ell , \\ 0 &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

Denote \(x = {{\mathrm{col}}}(x_1, x_2, \ldots , x_N)\) and \(u = {{\mathrm{col}}}(u_1, u_2, \ldots , u_m)\). The total network is then represented by

$$\begin{aligned} {\dot{x}} = (I_N \otimes A - L \otimes B) x + (M \otimes E) u. \end{aligned}$$
(3)
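
The Kronecker structure of (3) is straightforward to assemble numerically; a minimal sketch (agent dynamics, graph, and names are ours). Note that on a state of the form \(x = \mathbb {1}_N \otimes v\) the diffusive term drops out, since \(L \mathbb {1}_N = 0\).

```python
import numpy as np

N, n = 4, 2
A = np.array([[0.0, 1.0], [-2.0, -1.0]])      # agent dynamics (example, ours)
B = np.eye(n)                                  # diffusive coupling matrix
E = np.array([[0.0], [1.0]])                   # leader input matrix
Adj = np.array([[0, 1, 0, 0], [1, 0, 1, 0],
                [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
L = np.diag(Adj.sum(axis=1)) - Adj             # Laplacian of a path graph
M = np.zeros((N, 1)); M[0, 0] = 1.0            # node 1 is the only leader

A_net = np.kron(np.eye(N), A) - np.kron(L, B)  # I_N (x) A - L (x) B
B_net = np.kron(M, E)                          # M (x) E
```
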

The goal of this paper is to find a reduced order networked system, whose dynamics is a good approximation of the networked system (3). Following [26], the idea to obtain such an approximation is to cluster groups of agents in the network, and to treat each of the resulting clusters as a node in a new, reduced order, network. The reduced order network will again be a leader–follower network, and by the clustering procedure, essential interconnection features of the network will be preserved. We will also require that the synchronization properties of the network are preserved after reduction. We assume that the original network is synchronized, meaning that if the external inputs satisfy \(u_\ell = 0\) for \(\ell = 1, 2, \ldots , m\), then for all \(i, j \in \mathcal {V}\), we have

$$\begin{aligned} x_i(t) - x_j(t) \rightarrow 0 \end{aligned}$$

as \(t \rightarrow \infty \). We impose that the reduction procedure preserves this property. In this paper, a standing assumption will be that the graph \(\mathcal {G}\) of the original network is connected. This is equivalent to the condition that 0 is a simple eigenvalue of the Laplacian L, see [21, Theorem 2.8]. In this case, the network reaches synchronization if and only if \((L \otimes I_n) x(t) \rightarrow 0\) as \(t \rightarrow \infty \).

In order to be able to compare the original network (3) with its reduced order approximation and to make statements about the approximation error, we need a notion of distance between the networks. One way to obtain such a notion is to introduce an output associated with the network (3). By doing this, both the original network and its approximation become input–output systems, and we can compare them by looking at the difference of their transfer functions. As a measure for the disagreement between the states of the agents in (3), we choose \(y = (L \otimes I_n) x\) as the output of the original network. Indeed, this output y can be considered a measure of the disagreement in the network, in the sense that y(t) is small if and only if the network is close to being synchronized. Thus, with the original system (3) we now identify the input–state–output system:

$$\begin{aligned} \begin{aligned} {\dot{x}}&= (I_N \otimes A - L \otimes B) x + (M \otimes E) u, \\ y&= (L \otimes I_n) x. \end{aligned} \end{aligned}$$
(4)

The state space dimension of (4) is equal to nN, its number of inputs equals mr, and its number of outputs equals nN.

In this paper, we will use clustering to obtain a reduced order network, i.e., a network with a reduced number of agents, as an approximation of the original network (4).

4 Graph partitions and reduction by clustering

We consider networks whose interaction topologies are represented by weighted graphs \(\mathcal {G}\) with node set \(\mathcal {V}\). The graph of the original network (3) is undirected; however, our reduction procedure will lead to networks on directed graphs. As before, the adjacency matrix of the graph \(\mathcal {G}\) is the matrix \(\mathcal {A}= (a_{ij})\), where \(a_{ij} \ge 0\) is the weight of the arc from node j to node i. As noted before, the graph is undirected if and only if \(\mathcal {A}\) is symmetric.

A nonempty subset \(C \subset \mathcal {V}\) is called a cell or cluster of \(\mathcal {V}\). A partition of a graph is defined as follows.

Definition 2

Let \(\mathcal {G}\) be an undirected graph. A partition \(\pi = \{C_1, C_2, \ldots , C_k\}\) of \(\mathcal {V}\) is a collection of cells such that \(\mathcal {V}= \bigcup _{i = 1}^k C_i\) and \(C_i \cap C_j = \emptyset \) whenever \(i \ne j\). When we say that \(\pi \) is a partition of \(\mathcal {G}\), we mean that \(\pi \) is a partition of the vertex set \(\mathcal {V}\) of \(\mathcal {G}\). Nodes i and j are called cellmates in \(\pi \) if they belong to the same cell of \(\pi \). The characteristic vector of a cell \(C \subset \mathcal {V}\) is the N-dimensional column vector p(C) defined as

$$\begin{aligned} p_i(C) = {\left\{ \begin{array}{ll} 1 &{} \text {if } i \in C, \\ 0 &{} \text {otherwise}, \end{array}\right. } \end{aligned}$$

where \(p_i(C)\) is the ith entry of p(C). The characteristic matrix of the partition \(\pi = \{C_1, C_2, \ldots , C_k\}\) is defined as the \(N \times k\) matrix

$$\begin{aligned} P(\pi ) = \begin{pmatrix} p(C_1)&\quad p(C_2)&\quad \cdots&\quad p(C_k) \end{pmatrix}. \end{aligned}$$

For a given partition \(\pi = \{C_1, C_2, \ldots , C_k\}\), consider the cells \(C_p\) and \(C_q\) with \(p \ne q\). For any given node \(j \in C_q\), we define its degree with respect to \(C_p\) as the sum of the weights of all arcs from j to \(i \in C_p\), i.e., the number

$$\begin{aligned} d_{pq}(j) := \sum _{i \in C_p} a_{ij}. \end{aligned}$$

Next, we will construct a reduced order approximation of (4) by clustering the agents in the network using a partition of \(\mathcal {G}\). Let \(\pi \) be a partition of \(\mathcal {G}\), and let \(P := P(\pi )\) be its characteristic matrix. Extending the main idea in [26], we take as reduced order system the Petrov–Galerkin projection of the original system (4), with the following choice for the matrices V and W:

$$\begin{aligned} V = P \otimes I_n, \qquad W = P {(P^T P)}^{-1} \otimes I_n. \end{aligned}$$

The dynamics of the resulting reduced order model is then given by

$$\begin{aligned} \begin{aligned} {\dot{\hat{x}}}&= (I_k \otimes A - \hat{L}\otimes B) \hat{x}+ (\hat{M}\otimes E) u, \\ \hat{y}&= (L P \otimes I_n) \hat{x}, \end{aligned} \end{aligned}$$
(5)

where

$$\begin{aligned} \hat{L}:= {(P^T P)}^{-1} P^T L P \qquad \text {and} \qquad \hat{M}:= {(P^T P)}^{-1} P^T M. \end{aligned}$$

It can be seen by inspection that the matrix \(\hat{L}\) is the Laplacian of a weighted directed graph with node set \(\{1, 2, \ldots , k\}\), with k equal to the number of clusters in the partition \(\pi \), and adjacency matrix \({\hat{\mathcal {A}}} = ({\hat{a}}_{pq})\), with

$$\begin{aligned} {\hat{a}}_{pq} = \frac{1}{{|}C_p{|}} \sum _{j \in C_q} d_{pq}(j), \end{aligned}$$

where \(d_{pq}(j)\) is the degree of \(j \in C_q\) with respect to \(C_p\), and \({|}C_p{|}\) the cardinality of \(C_p\). In other words: in the reduced graph, the edge from node q to node p is obtained by summing over all \(j \in C_q\) the weights of all edges to \(i \in C_p\) and dividing this sum by the cardinality of \(C_p\). The row sums of \(\hat{L}\) are indeed equal to zero: since \(P \mathbb {1}_k = \mathbb {1}_N\) and \(L \mathbb {1}_N = 0\), we have \(\hat{L}\mathbb {1}_k = 0\). The matrix \(\hat{M}\in \mathbb {R}^{k \times m}\) satisfies

$$\begin{aligned} \hat{M}_{pj} = {\left\{ \begin{array}{ll} \frac{1}{{|}C_p{|}} &{} \text {if } v_j \in C_p, \\ 0 &{} \text {otherwise}, \end{array}\right. } \end{aligned}$$

where \(v_1, v_2, \ldots , v_m\) are the leader nodes, \(p = 1, 2, \ldots , k\), and \(j = 1, 2, \ldots , m\).
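
As a sketch of this construction (plain NumPy; names ours, and the explicit projection formulas \(\hat{L} = {(P^T P)}^{-1} P^T L P\) and \(\hat{M} = {(P^T P)}^{-1} P^T M\) are our reading of the Petrov–Galerkin choice above), the following builds the reduced pair from a partition of a path graph and confirms the zero row sums and the \(\frac{1}{{|}C_p{|}}\) leader weights:

```python
import numpy as np

def char_matrix(partition, N):
    """Characteristic matrix P(pi): cells given as 0-based index sets."""
    P = np.zeros((N, len(partition)))
    for c, cell in enumerate(partition):
        P[list(cell), c] = 1.0
    return P

# path graph on 4 nodes, leader = node 2, partition {{1}, {2,3}, {4}} (1-based in the text)
Adj = np.array([[0, 1, 0, 0], [1, 0, 1, 0],
                [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
L = np.diag(Adj.sum(axis=1)) - Adj
M = np.zeros((4, 1)); M[1, 0] = 1.0
P = char_matrix([{0}, {1, 2}, {3}], N=4)

Wt = np.linalg.inv(P.T @ P) @ P.T   # W^T, so that W^T P = I_k
L_hat = Wt @ L @ P                  # reduced Laplacian
M_hat = Wt @ M                      # reduced leader matrix
```

Here the leader sits in a cell of cardinality 2, so the corresponding entry of \(\hat{M}\) is 1/2.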

Clearly, the state space dimension of the reduced order network (5) is equal to nk, whereas the dimensions mr and nN of the input and output have remained unchanged. Thus, we can investigate the error between the original and reduced order network by looking at the difference of their transfer functions. In the sequel, we will investigate both the \(\mathcal {H}_2\)-norm as well as the \(\mathcal {H}_{\infty }\)-norm of this difference.

Before doing this, we will now first study the question whether our reduction procedure preserves synchronization. It is important to note that since, by assumption, the original undirected graph is connected, it has a directed spanning tree. It is easily verified that this property is preserved by our clustering procedure. Then, since the property of having a directed spanning tree is equivalent to 0 being a simple eigenvalue of the Laplacian (see [21, Proposition 3.8]), the reduced order Laplacian \(\hat{L}\) again has 0 as a simple eigenvalue.

Now assume that the original network (4) is synchronized. It is well known, see, e.g., [33], that this is equivalent to the condition that for each nonzero eigenvalue \(\lambda \) of the Laplacian L the matrix \(A - \lambda B\) is Hurwitz. Thus, synchronization is preserved if and only if for each nonzero eigenvalue \(\hat{\lambda }\) of the reduced order Laplacian \(\hat{L}\) the matrix \(A - \hat{\lambda }B\) is Hurwitz.

Unfortunately, in general, \(A - \lambda B\) being Hurwitz for all nonzero \(\lambda \in \sigma (L)\) does not imply that \(A - \hat{\lambda }B\) is Hurwitz for all nonzero \(\hat{\lambda } \in \sigma (\hat{L})\). An exception is the “single integrator” case \(A = 0\) and \(B = 1\), where this condition is trivially satisfied, so in this special case synchronization is preserved. Also, if we restrict ourselves to a special type of graph partitions, namely almost equitable partitions, then synchronization turns out to be preserved. We will review this type of partition now.

Again, let \(\mathcal {G}\) be a weighted, undirected graph, and let \(\pi = \{C_1, C_2, \ldots , C_k\}\) be a partition of \(\mathcal {G}\). Given two clusters \(C_p\) and \(C_q\) with \(p \ne q\), and a given node \(j \in C_q\), recall that \(d_{pq}(j)\) denotes its degree with respect to \(C_p\). We call the partition \(\pi \) an almost equitable partition (in short: an AEP) if for each pq with \(p \ne q\), the degree \(d_{pq}(j)\) is independent of \(j \in C_q\), i.e., \(d_{pq}(j_1) = d_{pq}(j_2)\) for all \(j_1, j_2 \in C_q\). We refer to Fig. 1 for an example of a graph with an AEP.

Fig. 1 A graph from [26] for which the partition \(\{\{1, 2, 3, 4\}, \{5, 6\}, \{7\}, \{8\}, \{9, 10\}\}\) is almost equitable

It is a well-known fact (see [5]) that \(\pi \) is an AEP if and only if the image of its characteristic matrix is invariant under the Laplacian.

Lemma 3

Consider the weighted undirected graph \(\mathcal {G}\) with Laplacian matrix L. Let \(\pi \) be a partition of \(\mathcal {G}\) with characteristic matrix \(P := P(\pi )\). Then, \(\pi \) is an AEP if and only if \(L {{\mathrm{im}}}P \subset {{\mathrm{im}}}P\).

As an immediate consequence, the reduced Laplacian \(\hat{L}\) resulting from an AEP satisfies \(L P = P \hat{L}\). Indeed, since \({{\mathrm{im}}}P\) is L-invariant we have \(L P = P X\) for some matrix X. Obviously, we must then have \(X = {(P^T P)}^{-1} P^T L P = \hat{L}\). From this, it follows that \(\sigma (\hat{L}) \subset \sigma (L)\). It then readily follows that synchronization is preserved if we cluster according to an AEP:

Theorem 1

Assume that the network (4) is synchronized. Let \(\pi \) be an AEP. Then, the reduced order network (5) obtained by clustering according to \(\pi \) is synchronized.

To the best of our knowledge, there is no known polynomial-time algorithm for finding nontrivial AEPs of a given graph, where by “trivial AEPs” we mean the coarsest and the finest partitions (\(\{\mathcal {V}\}\) and \(\{\{i\} : i \in \mathcal {V}\}\)). There is a polynomial-time algorithm for finding the coarsest AEP which is finer than a given partition (see [35]), but there is no guarantee that it will find a nontrivial AEP. Furthermore, it is not clear whether a given graph has any nontrivial AEPs at all. On the other hand, a graph can have many AEPs, e.g., every partition of a complete unweighted graph is an AEP. Because of this, in Sect. 7 we consider extensions of our results in Sects. 5 and 6, which are based on AEPs, to arbitrary partitions.
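
Lemma 3's characterization gives a simple numerical AEP test (sketch; names ours): \(\pi \) is an AEP if and only if \({{\mathrm{rank}}}\begin{pmatrix} P&L P \end{pmatrix} = {{\mathrm{rank}}}\, P\). As noted above, every partition of a complete unweighted graph passes the test, while, e.g., the partition \(\{\{1\}, \{2, 3\}, \{4\}\}\) of a path graph fails it.

```python
import numpy as np

def laplacian(Adj):
    return np.diag(Adj.sum(axis=1)) - Adj

def is_aep(L, P, tol=1e-9):
    """Lemma 3: pi is an AEP iff L im(P) is contained in im(P),
    i.e. appending the columns of L P to P does not raise the rank."""
    return np.linalg.matrix_rank(np.hstack([P, L @ P]), tol=tol) == \
           np.linalg.matrix_rank(P, tol=tol)

K4 = laplacian(np.ones((4, 4)) - np.eye(4))          # complete graph K4
path = laplacian(np.array([[0, 1, 0, 0], [1, 0, 1, 0],
                           [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float))
P_a = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)              # {{1,2},{3,4}}
P_b = np.array([[1, 0, 0], [0, 1, 0], [0, 1, 0], [0, 0, 1]], dtype=float)  # {{1},{2,3},{4}}
```

On the path graph, node 2 has degree 1 toward cell \(\{1\}\) while its cellmate 3 has degree 0, so \(d_{pq}(j)\) is not constant on the cell and the rank test correctly rejects the partition.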

5 \(\varvec{\mathcal {H}_2}\)-error bounds

In this section, we will formulate the first main theorem of this paper. The theorem gives an a priori upper bound for the \({\mathcal {H}}_2\)-norm of the approximation error in the case that we cluster according to an AEP. After formulating the theorem, in the remainder of this section we will establish a proof. The proof will use a sequence of separate lemmas, whose proofs can be found in “Appendix.”

Before stating the theorem, we will now first discuss some important ingredients. Let S and \(\hat{S}\) denote the transfer functions of the original (4) and reduced order network (5), respectively. We will measure the approximation error by the \(\mathcal {H}_2\)-norm of these transfer functions. An important role will be played by the \(N - 1\) auxiliary input–state–output systems

$$\begin{aligned} \begin{aligned} {\dot{x}}&= (A - \lambda B) x + E d, \\ z&= \lambda x, \end{aligned} \end{aligned}$$
(6)

where \(\lambda \) ranges over the \(N - 1\) nonzero eigenvalues of the Laplacian L. Let \(S_{\lambda }(s) = \lambda {(sI - A + \lambda B)}^{-1} E\) be the transfer matrices of these systems. We assume that the original network (4) is synchronized, so that all of the \(A - \lambda B\) are Hurwitz. Let \(\Vert S_{\lambda } \Vert _{\mathcal {H}_2}\) denote the \(\mathcal {H}_2\)-norm of \(S_{\lambda }\). Recall that the set of leader nodes is \(\mathcal {V}_{\mathrm {L}}= \{v_1, v_2, \ldots , v_m\}\). Node \(v_i\) will be called leader i. This leader is an element of cluster \(C_{k_i}\) for some \(k_i \in \{1, 2, \ldots , k\}\). We now have the following theorem:

Theorem 2

Assume that the network (4) is synchronized. Let \(\pi \) be an AEP of the graph \(\mathcal {G}\). The absolute approximation error when clustering \(\mathcal {G}\) according to \(\pi \) then satisfies

$$\begin{aligned} \Vert S - \hat{S}\Vert _{\mathcal {H}_2} \le {\left( S_{\max , \mathcal {H}_2} \sum _{i = 1}^{m} \left( 1 - \frac{1}{{|}C_{k_i}{|}} \right) \right) }^{\frac{1}{2}}, \end{aligned}$$

where \(C_{k_i}\) is the set of cellmates of leader i, and

$$\begin{aligned} S_{\max , \mathcal {H}_2} := \max \left\{ \Vert S_{\lambda } \Vert _{\mathcal {H}_2}^2 \mid \lambda \in \sigma (L) {\setminus } \sigma (\hat{L}) \right\} . \end{aligned}$$

Furthermore, the relative approximation error satisfies

$$\begin{aligned} \frac{\Vert S - \hat{S}\Vert _{\mathcal {H}_2}}{\Vert S \Vert _{\mathcal {H}_2}} \le {\left( \frac{N S_{\max , \mathcal {H}_2}}{(N - 1)\, m\, S_{\min , \mathcal {H}_2}} \sum _{i = 1}^{m} \left( 1 - \frac{1}{{|}C_{k_i}{|}} \right) \right) }^{\frac{1}{2}}, \end{aligned}$$

where

$$\begin{aligned} S_{\min , \mathcal {H}_2} := \min \left\{ \Vert S_{\lambda } \Vert _{\mathcal {H}_2}^2 \mid \lambda \in \sigma (L),\ \lambda \ne 0 \right\} . \end{aligned}$$

Remark 1

We see that, with fixed number of agents and fixed number of leaders, the approximation error is equal to 0 if in each cluster that contains a leader, the leader is the only node in that cluster. In general, the upper bound increases if the numbers of cellmates of the leaders increase. The upper bound also depends multiplicatively on the maximal \(\mathcal {H}_2\)-norm of the auxiliary systems (6) over all Laplacian eigenvalues in the complement of the spectrum of the reduced Laplacian \(\hat{L}\). The relative error in addition depends on the minimal \(\mathcal {H}_2\)-norm of the auxiliary systems (6) over all nonzero eigenvalues of the Laplacian L.
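
The bound can be probed numerically for single integrator agents; the sketch below (all names ours, and the bound is our reading of Theorem 2 as \(\sqrt{S_{\max , \mathcal {H}_2} \sum _i (1 - 1/{|}C_{k_i}{|})}\)) clusters the complete graph \(K_6\) by the AEP \(\{\{1, 2\}, \{3, 4\}, \{5, 6\}\}\) with node 1 as the single leader. Here \(\sigma (L) {\setminus } \sigma (\hat{L}) = \{6\}\), so by Remark 2 \(S_{\max , \mathcal {H}_2} = 3\), and the leader has one cellmate. The error norm is computed from an observability Gramian of the stacked error system; the marginally stable consensus modes lie in the kernel of the output map (cf. Proposition 1), and a least-squares Lyapunov solve picks out the solution vanishing on them.

```python
import numpy as np

def h2_sq(A, B, C):
    """Squared H2-norm via the observability Gramian; lstsq selects the
    solution vanishing on the (unobservable) marginally stable modes."""
    n = A.shape[0]
    K = np.kron(np.eye(n), A.T) + np.kron(A.T, np.eye(n))
    x = np.linalg.lstsq(K, -(C.T @ C).flatten(order="F"), rcond=None)[0]
    return float(np.trace(B.T @ x.reshape((n, n), order="F") @ B))

N, k = 6, 3
L = np.diag([5.0] * N) - (np.ones((N, N)) - np.eye(N))   # Laplacian of K6
M = np.zeros((N, 1)); M[0, 0] = 1.0                      # leader = node 1
P = np.kron(np.eye(k), np.ones((2, 1)))                  # AEP {{1,2},{3,4},{5,6}}
Wt = np.linalg.inv(P.T @ P) @ P.T
L_hat, M_hat = Wt @ L @ P, Wt @ M

# error system S - S_hat: stacked state, output y - y_hat
Ae = np.block([[-L, np.zeros((N, k))], [np.zeros((k, N)), -L_hat]])
Be = np.vstack([M, M_hat])
Ce = np.hstack([L, -L @ P])

err_sq = h2_sq(Ae, Be, Ce)
bound_sq = 3.0 * (1.0 - 1.0 / 2.0)   # S_max = 6/2 (Remark 2), one leader with one cellmate
```

In this highly symmetric example the computed squared error coincides with the squared bound (both 3/2), so the bound appears tight here.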

Remark 2

For the special case that the agents are single integrators (so \(n = 1\), \(A = 0\), \(B = 1\), and \(E = 1\)) it is easily seen that \(S_{\max , \mathcal {H}_2} = \frac{1}{2} \max \{ \lambda \mid \lambda \in \sigma (L) {\setminus } \sigma (\hat{L}) \}\) and \(S_{\min , \mathcal {H}_2} = \frac{1}{2} \min \{ \lambda \mid \lambda \in \sigma (L),\ \lambda \ne 0\}\). Thus, in the single integrator case the corresponding a priori upper bounds explicitly involve the Laplacian eigenvalues. As already noted in Sect. 1, the single integrator case was also studied in [26] for the slightly different setup that the output equation in the original network (4) is taken as \(y = (W^\frac{1}{2}R^T \otimes I_n) x\) instead of \(y = (L \otimes I_n) x\). Here, R is the incidence matrix of the graph and W the diagonal matrix with the edge weights on the diagonal (in other words, \(L = R W R^T\)). It was shown in [26] that in that case the absolute and relative approximation errors even admit the explicit formulas

and

In the remainder of this section, we will establish a proof of Theorem 2. Being rather technical, most of the proofs will be deferred to “Appendix.” As a first step, we establish the following lemma (see also [26], where only the single integrator case was treated):

Lemma 4

Let \(\pi \) be an AEP of the graph \(\mathcal {G}\). The approximation error when clustering \(\mathcal {G}\) according to \(\pi \) then satisfies

$$\begin{aligned} \Vert S - \hat{S}\Vert _{\mathcal {H}_2}^2 = \Vert S \Vert _{\mathcal {H}_2}^2 - \Vert \hat{S}\Vert _{\mathcal {H}_2}^2. \end{aligned}$$

Proof

See “Appendix D.” \(\square \)

Recall that, since \(\pi \) is an AEP, we have \(\sigma (\hat{L}) \subset \sigma (L)\). Label the eigenvalues of L as \(0, \lambda _2, \lambda _3, \ldots , \lambda _N\) in such a way that \(0, \lambda _2, \lambda _3, \ldots , \lambda _k\) are the eigenvalues of \(\hat{L}\). Also, without loss of generality, we assume that \(\pi \) is regularly formed, i.e., all ones in each of the columns of \(P(\pi )\) are consecutive. One can always relabel the agents in the graph in such a way that this is achieved. For simplicity, we again denote \(P(\pi )\) by P. Consider now the symmetric matrix

$$\begin{aligned} \bar{L}:= {(P^T P)}^{-\frac{1}{2}} P^T L P {(P^T P)}^{-\frac{1}{2}}. \end{aligned}$$
(7)

Note that the eigenvalues of \(\bar{L}\) and \(\hat{L}\) coincide. Let \(\hat{U}\) be an orthogonal matrix that diagonalizes \(\bar{L}\). We then have

$$\begin{aligned} \hat{U}^T \bar{L}\hat{U}= {{\mathrm{diag}}}(0, \lambda _2, \ldots , \lambda _k) =: \hat{\varLambda }. \end{aligned}$$
(8)

Next, take \(U_1 := P {(P^T P)}^{-\frac{1}{2}} \hat{U}\). The columns of \(U_1\) form an orthonormal set:

$$\begin{aligned} U_1^T U_1 = \hat{U}^T {(P^T P)}^{-\frac{1}{2}} P^T P {(P^T P)}^{-\frac{1}{2}} \hat{U}= \hat{U}^T \hat{U}= I_k. \end{aligned}$$

Furthermore, we have that

$$\begin{aligned} U_1^T L U_1 = \hat{U}^T \bar{L}\hat{U}= \hat{\varLambda }. \end{aligned}$$

Now choose \(U_2\) such that \(U = \begin{pmatrix} U_1&\quad U_2 \end{pmatrix}\) is an orthogonal matrix and

$$\begin{aligned} \varLambda := U^T L U = \begin{pmatrix} \hat{\varLambda }&{}\quad 0 \\ 0 &{}\quad \bar{\varLambda }\end{pmatrix}, \end{aligned}$$
(9)

where \(\bar{\varLambda }= {{\mathrm{diag}}}(\lambda _{k + 1}, \ldots , \lambda _N)\). It is easily verified that the first column of \(U_1\), and thus the first column of U, is given by \(\frac{1}{\sqrt{N}} \mathbb {1}_N\), where \(\mathbb {1}_N\) is the N-vector of 1’s, a fact that we will use in the remainder of this paper.
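The construction above can be checked numerically on a small example (sketch; names ours, and the formulas \(\bar{L} = {(P^T P)}^{-1/2} P^T L P {(P^T P)}^{-1/2}\) and \(U_1 = P {(P^T P)}^{-1/2} \hat{U}\) are our reading of the construction): for \(K_4\) with the AEP \(\{\{1, 2\}, \{3, 4\}\}\), the columns of \(U_1\) come out orthonormal, \(U_1^T L U_1 = \hat{\varLambda } = {{\mathrm{diag}}}(0, 4)\), and the first column of \(U_1\) is \(\pm \frac{1}{\sqrt{N}} \mathbb {1}_N\).

```python
import numpy as np

L = np.diag([3.0] * 4) - (np.ones((4, 4)) - np.eye(4))   # Laplacian of K4
P = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)

S_half = np.diag(1.0 / np.sqrt(np.diag(P.T @ P)))   # (P^T P)^{-1/2}
L_bar = S_half @ P.T @ L @ P @ S_half               # symmetric, same spectrum as L_hat
lam, U_hat = np.linalg.eigh(L_bar)                  # eigh sorts ascending, so lam[0] = 0
U1 = P @ S_half @ U_hat
```
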

Using the above, we will now first establish explicit formulas for the \(\mathcal {H}_2\)-norms of S and \(\hat{S}\) separately. The following lemma gives a formula for the \(\mathcal {H}_2\)-norm of the original transfer function S:

Lemma 5

Let U be as in (9). For \(i = 2, \ldots , N\), let \(X_i\) be the observability Gramian of the auxiliary system \((A - \lambda _i B, E, \lambda _i I)\) in (6), i.e., the unique solution of the Lyapunov equation \({(A - \lambda _i B)}^T X_i + X_i (A - \lambda _i B) + \lambda _i^2 I = 0\). Then, the \(\mathcal {H}_2\)-norm of S is given by:

$$\begin{aligned} \Vert S \Vert _{\mathcal {H}_2}^2 = \sum _{i = 2}^{N} u_i^T M M^T u_i \, {{\mathrm{tr}}}(E^T X_i E), \end{aligned}$$
(10)

where \(u_i\) denotes the ith column of U.

Proof

See “Appendix E.” \(\square \)

We proceed with finding a formula for the \(\mathcal {H}_2\)-norm for the reduced system. This will be dealt with in the following lemma:

Lemma 6

Let \( \hat{U}\) be as in (8) above. For \(i = 2, \ldots , k\), let \(X_i\) be the observability Gramian of the auxiliary system \((A - \lambda _i B, E, \lambda _i I)\) in (6), i.e., the unique solution of the Lyapunov equation \({(A - \lambda _i B)}^T X_i + X_i (A - \lambda _i B) + \lambda _i^2 I = 0\). Then, the \(\mathcal {H}_2\)-norm of \(\hat{S}\) is given by:

$$\begin{aligned} \Vert \hat{S}\Vert _{\mathcal {H}_2}^2 = \sum _{i = 2}^{k} u_i^T M M^T u_i \, {{\mathrm{tr}}}(E^T X_i E), \end{aligned}$$
(11)

where \(u_i\) denotes the ith column of \(U_1\).

Proof

See “Appendix F.” \(\square \)

We will now combine the previous lemmas and give a proof of Theorem 2.

Proof of Theorem 2

Using Lemma 4, and formulas (10) and (11), we compute

$$\begin{aligned} \Vert S - \hat{S}\Vert _{\mathcal {H}_2}^2 = \Vert S \Vert _{\mathcal {H}_2}^2 - \Vert \hat{S}\Vert _{\mathcal {H}_2}^2 = \sum _{j = k + 1}^{N} u_j^T M M^T u_j \, {{\mathrm{tr}}}(E^T X_j E), \end{aligned}$$
(12)

where the second equality follows from the fact that the first k columns of U coincide with the columns of \(U_1\), so that the terms with index \(i = 2, \ldots , k\) in (10) and (11) cancel.

Next, observe that (12) can be rewritten as

$$\begin{aligned} \Vert S - \hat{S}\Vert _{\mathcal {H}_2}^2 = \sum _{j = k + 1}^{N} \Vert S_{\lambda _j} \Vert _{\mathcal {H}_2}^2 \, u_j^T M M^T u_j, \end{aligned}$$

where \(S_{\lambda _j}\) for \(j = k + 1, \ldots , N\) is the transfer function of the auxiliary system (6). An upper bound for this expression is given by

$$\begin{aligned} \Vert S - \hat{S}\Vert _{\mathcal {H}_2}^2 \le S_{\max , \mathcal {H}_2} \sum _{j = k + 1}^{N} u_j^T M M^T u_j = S_{\max , \mathcal {H}_2} \, {{\mathrm{tr}}}(M^T U_2 U_2^T M), \end{aligned}$$

where \(S_{\max , \mathcal {H}_2} = \max \{ \Vert S_{\lambda } \Vert _{\mathcal {H}_2}^2 \mid \lambda \in \sigma (L) {\setminus } \sigma (\hat{L}) \}\). Furthermore, we have

$$\begin{aligned} U_2 U_2^T = I_N - U_1 U_1^T = I_N - P {(P^T P)}^{-1} P^T. \end{aligned}$$

Since, by assumption, the partition \(\pi \) is regularly formed, \(P {(P^T P)}^{-1} P^T\) is a block diagonal matrix of the form

$$\begin{aligned} P {(P^T P)}^{-1} P^T = {{\mathrm{diag}}}(P_1, P_2, \ldots , P_k). \end{aligned}$$

It is easily verified that each \(P_i\) is a \({|}C_i{|} \times {|}C_i{|}\) matrix whose elements are all equal to \(\frac{1}{{|}C_i{|}}\). The matrix \(M M^T\) is a diagonal matrix whose diagonal entries are either 0 or 1. We then have that the ith column of \(P {(P^T P)}^{-1} P^T M M^T\) is either equal to the ith column of \(P {(P^T P)}^{-1} P^T\) if agent i is a leader, or zero otherwise. It then follows that the ith diagonal element of \(P {(P^T P)}^{-1} P^T M M^T\) is either zero, or equal to \(\frac{1}{{|}C_{k_i}{|}}\) if i is part of the leader set, where \(C_{k_i}\) is the cell containing agent i. Hence, we have

$$\begin{aligned} {{\mathrm{tr}}}(M^T P {(P^T P)}^{-1} P^T M) = \sum _{i = 1}^{m} \frac{1}{{|}C_{k_i}{|}}, \end{aligned}$$

and consequently,

$$\begin{aligned} {{\mathrm{tr}}}(M^T U_2 U_2^T M) = {{\mathrm{tr}}}(M^T M) - {{\mathrm{tr}}}(M^T P {(P^T P)}^{-1} P^T M) = \sum _{i = 1}^{m} \left( 1 - \frac{1}{{|}C_{k_i}{|}} \right) . \end{aligned}$$

In conclusion, we have

$$\begin{aligned} \Vert S - \hat{S}\Vert _{\mathcal {H}_2}^2 \le S_{\max , \mathcal {H}_2} \sum _{i = 1}^{m} \left( 1 - \frac{1}{{|}C_{k_i}{|}} \right) , \end{aligned}$$

which completes the proof of the first part of the theorem.

We now prove the statement about the relative error. For this, we will establish a lower bound for \(\Vert S \Vert _{\mathcal {H}_2}\). By (10), we have

$$\begin{aligned} \Vert S \Vert _{\mathcal {H}_2}^2 = \sum _{i = 2}^{N} u_i^T M M^T u_i \, {{\mathrm{tr}}}(E^T X_i E) \ge S_{\min , \mathcal {H}_2} \sum _{i = 2}^{N} u_i^T M M^T u_i. \end{aligned}$$
(13)

The first column of U spans the eigenspace corresponding to the eigenvalue 0 of L and hence must be equal to \(u_1 = \frac{1}{\sqrt{N}} \mathbb {1}_N\). Let \(\bar{U}\) be such that \(U = \begin{pmatrix} u_1&\quad \bar{U}\end{pmatrix}\). It is then easily verified using (13) that

$$\begin{aligned} \Vert S \Vert _{\mathcal {H}_2}^2 \ge S_{\min , \mathcal {H}_2} \, {{\mathrm{tr}}}(M^T \bar{U}\bar{U}^T M) = S_{\min , \mathcal {H}_2} \, {{\mathrm{tr}}}\left( M^T \left( I_N - \tfrac{1}{N} \mathbb {1}_N \mathbb {1}_N^T \right) M \right) . \end{aligned}$$

Finally, since

$$\begin{aligned} {{\mathrm{tr}}}\left( M^T \left( I_N - \tfrac{1}{N} \mathbb {1}_N \mathbb {1}_N^T \right) M \right) = m - \frac{m}{N} = \frac{m (N - 1)}{N}, \end{aligned}$$

we obtain that \(\Vert S \Vert _{\mathcal {H}_2}^2 \ge S_{\min , \mathcal {H}_2} \frac{m (N - 1)}{N}\). This then yields the upper bound for the relative error as claimed. \(\square \)

Remark 3

Note that by our labeling of the eigenvalues of L, in the formulation of Theorem 2, we have that \(\sigma (L) {\setminus } \sigma (\hat{L})\) is equal to \(\{\lambda _{k + 1}, \ldots , \lambda _N\}\) used in the proof. We stress that this should not be confused with the notation often used in the literature, where the \(\lambda _i\)s are labeled in increasing order.

6 \(\varvec{\mathcal {H}_\infty }\)-error bounds

Whereas in the previous section we studied a priori upper bounds for the approximation error in terms of the \(\mathcal {H}_2\)-norm, the present section aims at expressing the approximation error in terms of the \(\mathcal {H}_\infty \)-norm. This section consists of two subsections. In the first subsection, we consider the special case that the agent dynamics is a single integrator system. Here, we obtain an explicit formula for the \(\mathcal {H}_\infty \)-norm of the error. In the second subsection, we find an upper bound for the \(\mathcal {H}_\infty \)-error for symmetric systems.

6.1 The single integrator case

Here, we consider the special case that the agent dynamics is a single integrator system. In this case, we have \(A = 0\), \(B = 1\), and \(E = 1\) and the original system (4) reduces to

$$\begin{aligned} \begin{aligned} {\dot{x}}&= -L x + M u, \\ y&= L x. \end{aligned} \end{aligned}$$
(14)

The state space dimension of (14) is then simply N, the number of agents. For a given partition \(\pi = \{C_1, C_2, \ldots , C_k\}\), the reduced system (5) is now given by

$$\begin{aligned} \begin{aligned} {\dot{\hat{x}}}&= -\hat{L}\hat{x}+ \hat{M}u, \\ \hat{y}&= L P \hat{x}, \end{aligned} \end{aligned}$$

where \(P = P(\pi )\) is again the characteristic matrix of \(\pi \) and \(\hat{x}\in \mathbb {R}^k\). The transfer functions S and \(\hat{S}\), of the original and reduced system, respectively, are given by

The first main result of this section is the following explicit formula for the \(\mathcal {H}_\infty \)-model reduction error. It complements the formula for the \(\mathcal {H}_2\)-error obtained in [26] (see also Remark 2):

Theorem 3

Let \(\pi \) be an AEP of the graph \(\mathcal {G}\). If the network with single integrator agent dynamics is clustered according to \(\pi \), then the \(\mathcal {H}_\infty \)-error is given by

where, for some \(k_i \in \{1, 2, \ldots , k\}\), \(C_{k_i}\) is the set of cellmates of leader i. Furthermore, , hence the relative and absolute \(\mathcal {H}_\infty \)-errors coincide.

Remark 4

We see that the \(\mathcal {H}_\infty \)-error lies in the interval [0, 1]. The error is maximal (\(= 1\)) if and only if two or more leader nodes occupy one and the same cell. The error is minimal (\(= 0\)) if and only if each leader node occupies a different cell, and is the only node in this cell. In general, the error increases if the number of cellmates of the leaders increases.
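Remark 4 is easy to check numerically on a toy example. The sketch below uses the complete graph \(K_4\) (an illustrative choice, not taken from the paper); every partition of \(K_4\) is almost equitable, since each node is adjacent to all others. With the AEP \(\{\{1\}, \{2\}, \{3, 4\}\}\) and leader set \(\{1, 2\}\), each leader is the sole member of its cell, so the \(\mathcal {H}_\infty \)-error should vanish:

```python
import numpy as np

# Toy example (not from the paper): the complete graph K4.
N = 4
L = N * np.eye(N) - np.ones((N, N))          # Laplacian of K4

# Partition {{1}, {2}, {3, 4}} with leader set {1, 2}: each leader is the
# sole member of its cell, so Remark 4 predicts zero H-infinity error.
P = np.array([[1., 0., 0.],
              [0., 1., 0.],
              [0., 0., 1.],
              [0., 0., 1.]])
M = np.eye(N)[:, [0, 1]]

PtP_inv = np.linalg.inv(P.T @ P)
Lhat = PtP_inv @ P.T @ L @ P                 # reduced Laplacian
Mhat = PtP_inv @ P.T @ M

def error_gain(w):
    """Largest singular value of S(iw) - Shat(iw) for system (14)."""
    s = 1j * w
    S = L @ np.linalg.solve(s * np.eye(N) + L, M)
    Sh = L @ P @ np.linalg.solve(s * np.eye(3) + Lhat, Mhat)
    return np.linalg.svd(S - Sh, compute_uv=False)[0]

# Sweep frequencies; the error gain stays at rounding-error level.
max_err = max(error_gain(w) for w in np.logspace(-2, 2, 50))
```

Indeed, since each leader occupies a singleton cell, the columns of M lie in \({{\mathrm{im}}}P\), and the error transfer function vanishes identically.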

Proof of Theorem 3

To simplify notation, denote \(\varDelta (s) = S(s) - \hat{S}(s)\). Note that both S and \(\hat{S}\) have all poles in the open left half plane. We now first show that, since \(\pi \) is an AEP, we have

(15)

First note that , where the symmetric matrix \(\bar{L}\) is given by (7). Thus, a state space representation for the error system is given by

(16)

Next, we show that (15) holds by applying Lemma 1 to system (16). Indeed, with \(X = -L\), we have

and from Lemma 1 it then immediately follows that . To compute , we apply Lemma 2 to system (16). First, it is easily verified that

By applying Lemma 2 we then obtain

(17)

Recall that \(\hat{U}\) in (8) is an orthogonal matrix that diagonalizes \(\bar{L}\) and that . Then, \(\bar{L}^+ = \hat{U}\hat{\varLambda }^+ \hat{U}^T\). Thus, we have

Next, we compute

$$\begin{aligned} \begin{aligned} L L^+&= U \varLambda U^T U \varLambda ^+ U^T \\&= U \varLambda \varLambda ^+ U^T \\&= I_N - \frac{1}{N} \mathbb {1}_N \mathbb {1}_N^T, \end{aligned} \end{aligned}$$
(18)

where the last equality follows from the fact that the first column of U is \(\frac{1}{\sqrt{N}} \mathbb {1}_N\). Now observe that

(19)

Combining (18) and (19) with (17), we obtain

From (15), we then have that the \(\mathcal {H}_\infty \)-error is given by

(20)

All that is left is to compute the minimal eigenvalue of . Again, let \(\{v_1, v_2, \ldots , v_m\}\) be the set of leaders and note that M satisfies

$$\begin{aligned} M = \begin{pmatrix} e_{v_1}&\quad e_{v_2}&\quad \cdots&\quad e_{v_m} \end{pmatrix}. \end{aligned}$$

Again, without loss of generality, assume that \(\pi \) is regularly formed. Then, the matrix is block diagonal where each diagonal block \(P_i\) is a \({|}C_i{|} \times {|}C_i{|}\) matrix whose entries are all \(\frac{1}{{|}C_i{|}}\). Let \(k_i \in \{1, 2, \ldots , k\}\) be such that \(v_i \in C_{k_i}\). If all the leaders are in different cells, then

and so

(21)

Now suppose that two leaders \(v_i\) and \(v_j\) are cellmates. Then, we have

which together with implies

(22)

From (20), (21), and (22), we find the absolute \(\mathcal {H}_\infty \)-error. To find the relative \(\mathcal {H}_\infty \)-error, we compute by applying Lemmas 1 and 2 to the original system (14). Combined with (18), this results in the \(\mathcal {H}_\infty \)-norm of the original system:

This completes the proof. \(\square \)

6.2 The general case with symmetric agent dynamics

In this subsection, we return to the general case that the agent dynamics is given by an arbitrary multivariable input–state–output system. Thus, the original and reduced networks are again given by (4) and (5), respectively. As in the proof of Theorem 3, we will rely heavily on Lemma 2 to compute the \(\mathcal {H}_\infty \)-error. Since Lemma 2 relies on a symmetry argument, we will need to assume that the matrices A and B are both symmetric, which will be a standing assumption in the remainder of this section.

We will now establish an a priori upper bound for the \(\mathcal {H}_\infty \)-norm of the approximation error in the case that we cluster according to an AEP. Again, an important role is played by the \(N - 1\) auxiliary systems (6) with \(\lambda \) ranging over the nonzero eigenvalues of the Laplacian L. Again, let \(S_{\lambda }(s) = \lambda {(sI - A + \lambda B)}^{-1}E\) be their transfer functions. We assume that the original network (4) is synchronized, so that all of the \(A - \lambda B\) are Hurwitz. We again use S, \(\hat{S}\), and \(\varDelta \) to denote the relevant transfer functions.
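Both the synchronization assumption and the quantities entering Theorem 4 can be evaluated numerically. The sketch below uses illustrative symmetric matrices A, B and a 4-node path graph (assumptions of this sketch, not the paper's example): it checks that \(A - \lambda B\) is Hurwitz for every nonzero Laplacian eigenvalue \(\lambda \), and estimates \(\Vert S_\lambda \Vert _{\mathcal {H}_\infty }\) by a crude frequency sweep:

```python
import numpy as np

# Illustrative symmetric agent dynamics (assumed for this sketch).
A = np.array([[-1.0, 0.5],
              [ 0.5, -2.0]])
B = np.eye(2)
E = np.eye(2)

# Laplacian of a 4-node path graph.
L = np.array([[ 1., -1.,  0.,  0.],
              [-1.,  2., -1.,  0.],
              [ 0., -1.,  2., -1.],
              [ 0.,  0., -1.,  1.]])

nonzero = [lam for lam in np.linalg.eigvalsh(L) if lam > 1e-10]

# Synchronization: A - lambda*B must be Hurwitz for each nonzero lambda.
hurwitz = all(np.linalg.eigvals(A - lam * B).real.max() < 0 for lam in nonzero)

def hinf_norm(lam, ws=np.logspace(-3, 3, 400)):
    """Frequency-sweep estimate of the H-infinity norm of S_lambda."""
    return max(np.linalg.svd(lam * np.linalg.solve(
        1j * w * np.eye(2) - A + lam * B, E), compute_uv=False)[0] for w in ws)

# Maximizing over all nonzero eigenvalues gives an upper estimate of the
# quantity S_max in (23), which maximizes over sigma(L) \ sigma(Lhat) only.
S_max = max(hinf_norm(lam) for lam in nonzero)
```

A frequency sweep only lower-bounds the true \(\mathcal {H}_\infty \)-norm of each \(S_\lambda \); an exact computation would use a bisection algorithm on the Hamiltonian matrix.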

The following is the second main theorem of this section:

Theorem 4

Assume the network (4) is synchronized and that A and B are symmetric matrices. Let \(\pi \) be an AEP of the graph \(\mathcal {G}\). The \(\mathcal {H}_\infty \)-error when clustering \(\mathcal {G}\) according to \(\pi \) then satisfies

and

where

(23)

and

(24)

with \(S_\lambda \) the transfer functions of the auxiliary systems (6).

Remark 5

The absolute \(\mathcal {H}_\infty \)-error thus lies in the interval \([0, S_{\max , \mathcal {H}_\infty }]\) with \(S_{\max , \mathcal {H}_\infty }\) the maximum over the \(\mathcal {H}_\infty \)-norms of the transfer functions \(S_{\lambda }\) with \(\lambda \in \sigma (L) {\setminus } \sigma (\hat{L})\). The error is minimal (\(= 0\)) if each leader node occupies a different cell, and is the only node in this cell. In general, the upper bound increases if the number of cellmates of the leaders increases.

Proof of Theorem 4

First note that the transfer function \(\hat{S}\) of the reduced network (5) is equal to

(25)

with the symmetric matrix \(\bar{L}\) given by (7). Analogous to the proof of Theorem 3, we first apply Lemma 1 to the error system

with transfer function \(\varDelta \). Take \(X = I_N \otimes A - L \otimes B\). We then have

From Lemma 1, we thus obtain that

In the proof of Lemma 4, it was shown that

$$\begin{aligned} \hat{S}{(-s)}^T \varDelta (s) = \hat{S}{(-s)}^T (S(s) - \hat{S}(s)) = 0. \end{aligned}$$

Since all transfer functions involved are stable, in particular this holds for \(s = 0\). We then have that \(\hat{S}{(0)}^T (S(0) - \hat{S}(0)) = 0\), i.e., \(\hat{S}{(0)}^T S(0) = \hat{S}{(0)}^T \hat{S}(0)\). By transposing, we also have \({S(0)}^T \hat{S}(0) = \hat{S}{(0)}^T \hat{S}(0)\). Therefore,

By applying Lemma 2 to system (4), we obtain

(26)

where \(S_{\lambda }\) is again given by (6). Recall that and . Now apply Lemma 2 to the transfer function (25) of the system (5):

Combining the two expressions above, it immediately follows that

By taking \(S_{\max , \mathcal {H}_\infty }\) as defined by (23) it then follows that

Continuing as in the proof of Theorem 3, we find an upper bound for the \(\mathcal {H}_\infty \)-error:

To compute an upper bound for the relative \(\mathcal {H}_\infty \)-error, we bound the \(\mathcal {H}_\infty \)-norm of system (4) from below. Again, let \(\bar{U}\) be such that \(U = \begin{pmatrix} u_1&\bar{U}\end{pmatrix}\) and let \(S_{\min , \mathcal {H}_\infty }\) be as defined by (24). From (26) it now follows that

Again using Lemma 2, we find a lower bound to the \(\mathcal {H}_\infty \)-norm of S:

which concludes the proof of the theorem. \(\square \)

7 Toward a priori error bounds for general graph partitions

Up to now, we have only dealt with establishing error bounds for network reduction by clustering using almost equitable partitions of the network graph. Of course, we would also like to obtain error bounds for arbitrary partitions that are not necessarily almost equitable. In this section, we present some ideas to address this more general problem. We first study the single integrator case and subsequently turn to the general case.

7.1 The single integrator case

Consider the multi-agent network

$$\begin{aligned} \begin{aligned} {\dot{x}}&= -L x + M u, \\ y&= L x. \end{aligned} \end{aligned}$$
(27)

As before, assume that the underlying graph \(\mathcal {G}\) is connected. The network is then synchronized. Let \(\pi = \{C_1, C_2, \ldots , C_k\}\) be a graph partition, not necessarily an AEP, and let \(P = P(\pi ) \in {\mathbb {R}}^{N \times k}\) be its characteristic matrix. As before, the reduced order network is taken to be the Petrov–Galerkin projection of (27) and is represented by

$$\begin{aligned} \begin{aligned} {\dot{\hat{x}}}&= -\hat{L}\hat{x}+ \hat{M}u, \\ \hat{y}&= L P \hat{x}. \end{aligned} \end{aligned}$$
(28)

Again, let S and \(\hat{S}\) be the transfer functions of (27) and (28), respectively. We will address the problem of obtaining a priori upper bounds for and . We will pursue the following idea: as a first step we will approximate the original Laplacian matrix L (of the original network graph \(\mathcal {G}\)) by a new Laplacian matrix, denoted by \(L_\mathrm {AEP}\) (corresponding to a “nearby” graph \(\mathcal {G}_\mathrm {AEP}\)) such that the given partition \(\pi \) is an AEP for this new graph \(\mathcal {G}_\mathrm {AEP}\). This new graph \(\mathcal {G}_\mathrm {AEP}\) defines a new multi-agent system with transfer function \(S_\mathrm {AEP}(s) = L_\mathrm {AEP}{(s I_N + L_\mathrm {AEP})}^{-1} M\). The reduced order network of \(S_\mathrm {AEP}\) (using the AEP \(\pi \)) has transfer function . Then, using the triangle inequality, both for \(p = 2\) and \(p = \infty \), we have

(29)

The idea is to obtain a priori upper bounds for all three terms in (29). We first propose an approximating Laplacian matrix \(L_\mathrm {AEP}\), and subsequently study the problems of establishing upper bounds for the three terms in (29) separately.

For a given matrix M, let \(\Vert M \Vert _F\) denote its Frobenius norm. In the following, denote \(\mathcal {P}:= P {(P^T P)}^{-1} P^T\). Note that \(\mathcal {P}\) is the orthogonal projector onto \({{\mathrm{im}}}{P}\). As an approximation of L, we compute the unique solution to the convex optimization problem

(30)

In other words, we want to compute a positive semi-definite matrix \(L_\mathrm {AEP}\) with row sums equal to zero, and with the property that \({{\mathrm{im}}}P\) is invariant under \(L_\mathrm {AEP}\) (equivalently, that the given partition \(\pi \) is an AEP for the new graph). We will show that such an \(L_\mathrm {AEP}\) may correspond to an undirected graph with negative edge weights. However, since it is constrained to be positive semi-definite, the results of Sects. 4, 5, and 6 of this paper remain valid.

Theorem 5

The matrix \(L_\mathrm {AEP}:= \mathcal {P}L \mathcal {P}+ (I_N - \mathcal {P}) L (I_N - \mathcal {P})\) is the unique solution to the convex optimization problem (30). If L corresponds to a connected graph, then, in fact, \(\ker {L_\mathrm {AEP}} = {{\mathrm{im}}}{\mathbb {1}_N}\).

Proof

Clearly, \(L_\mathrm {AEP}\) is symmetric and positive semi-definite since L is. Also, \((I_N - \mathcal {P}) L_\mathrm {AEP}P = 0\) since \((I_N - \mathcal {P}) P = 0\). It is also obvious that \(L_\mathrm {AEP}\mathbb {1}_N = 0\) since \(\mathcal {P}\mathbb {1}_N = \mathbb {1}_N\). We now show that \(L_\mathrm {AEP}\) uniquely minimizes the distance to L. Let X satisfy the constraints and define \(\varDelta = L_\mathrm {AEP}- X\). Then, we have

It can be verified that \(L - L_\mathrm {AEP}= (I_N - \mathcal {P}) L \mathcal {P}+ \mathcal {P}L (I_N - \mathcal {P})\). Thus,

Now, since both X and \(L_\mathrm {AEP}\) satisfy the first constraint, we have \((I_N - \mathcal {P}) \varDelta \mathcal {P}= 0\). Using this we have

Also,

Thus, we obtain

from which it follows that \(\Vert L - X \Vert _F\) is minimal if and only if \(\varDelta = 0\), equivalently, \(X = L_\mathrm {AEP}\).

To prove the second statement, let \(x \in \ker {L_\mathrm {AEP}}\), so \(x^T L_\mathrm {AEP}x = 0\). Then, both \(x^T \mathcal {P}L \mathcal {P}x = 0\) and \(x^T (I_N - \mathcal {P}) L (I_N - \mathcal {P}) x = 0\). This clearly implies \(L \mathcal {P}x = 0\) and \(L (I_N - \mathcal {P}) x = 0\). Since L corresponds to a connected graph, we must have \(\mathcal {P}x \in {{\mathrm{im}}}\mathbb {1}_N\) and \((I_N - \mathcal {P}) x \in {{\mathrm{im}}}\mathbb {1}_N\). We conclude that \(x \in {{\mathrm{im}}}\mathbb {1}_N\), as desired. \(\square \)

As announced above, \(L_\mathrm {AEP}\) may have positive off-diagonal elements, corresponding to a graph with some of its edge weights being negative. For example, for

$$\begin{aligned} L&= \begin{pmatrix} 1 &{}\quad -1 &{}\quad 0 &{}\quad 0 &{}\quad 0 \\ -1 &{}\quad 2 &{}\quad -1 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad -1 &{}\quad 2 &{}\quad -1 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad -1 &{}\quad 2 &{}\quad -1 \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad -1 &{}\quad 1 \end{pmatrix}, \quad P = \begin{pmatrix} 1 &{}\quad 0 \\ 1 &{}\quad 0 \\ 1 &{}\quad 0 \\ 0 &{}\quad 1 \\ 0 &{}\quad 1 \end{pmatrix}, \end{aligned}$$

we have

$$\begin{aligned} L_\mathrm {AEP}&= \begin{pmatrix} \frac{11}{9} &{}\quad -\frac{7}{9} &{}\quad -\frac{1}{9} &{}\quad 0 &{}\quad -\frac{1}{3} \\ -\frac{7}{9} &{}\quad \frac{20}{9} &{}\quad -\frac{10}{9} &{}\quad 0 &{}\quad -\frac{1}{3} \\ -\frac{1}{9} &{}\quad -\frac{10}{9} &{}\quad \frac{14}{9} &{}\quad -\frac{1}{2}&{}\quad \frac{1}{6} \\ 0 &{}\quad 0 &{}\quad -\frac{1}{2}&{}\quad \frac{3}{2} &{}\quad -1 \\ -\frac{1}{3} &{}\quad -\frac{1}{3} &{}\quad \frac{1}{6} &{}\quad -1 &{}\quad \frac{3}{2} \end{pmatrix}, \end{aligned}$$

so the edge between nodes 3 and 5 has a negative weight. Figure 2 shows the graphs corresponding to L and \(L_\mathrm {AEP}\). Although \(L_\mathrm {AEP}\) is not necessarily a Laplacian matrix with only nonpositive off-diagonal elements, it has all the properties we associate with a Laplacian matrix. Specifically, it can be checked that all results in this paper remain valid, since they only depend on the symmetric positive semi-definiteness of the Laplacian matrix.
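The optimal approximation of Theorem 5 is a one-line computation. The following sketch reproduces the matrix \(L_\mathrm {AEP}\) above from L and P:

```python
import numpy as np

# Path graph on 5 vertices and the partition {{1, 2, 3}, {4, 5}}.
L = np.array([[ 1., -1.,  0.,  0.,  0.],
              [-1.,  2., -1.,  0.,  0.],
              [ 0., -1.,  2., -1.,  0.],
              [ 0.,  0., -1.,  2., -1.],
              [ 0.,  0.,  0., -1.,  1.]])
P = np.array([[1., 0.],
              [1., 0.],
              [1., 0.],
              [0., 1.],
              [0., 1.]])

# Orthogonal projector onto im P, and L_AEP as in Theorem 5.
Pr = P @ np.linalg.inv(P.T @ P) @ P.T
I5 = np.eye(5)
L_AEP = Pr @ L @ Pr + (I5 - Pr) @ L @ (I5 - Pr)

# The positive off-diagonal entry in position (3, 5) is the one that
# corresponds to the negative edge weight; it equals 1/6.
entry_35 = L_AEP[2, 4]
```

One can also confirm directly that \(L_\mathrm {AEP}\) is symmetric, has zero row sums, and leaves \({{\mathrm{im}}}P\) invariant.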

Fig. 2
figure 2

A path graph on 5 vertices and its closest graph such that the partition \(\{\{1, 2, 3\}, \{4, 5\}\}\) is almost equitable

Using the approximating Laplacian \(L_\mathrm {AEP}= \mathcal {P}L \mathcal {P}+ (I_N - \mathcal {P}) L (I_N - \mathcal {P})\) as above, we will now deal with establishing upper bounds for the three terms in (29). We start off with the middle term in (29).

According to Remark 2, for \(p = 2\) this term has an upper bound depending on the maximal \(\lambda \in \sigma (L_\mathrm {AEP}) {\setminus } \sigma (\hat{L}_\mathrm {AEP})\), and on the number of cellmates of the leaders with respect to the partitioning \(\pi \). For \(p = \infty \), in Theorem 3 this term was expressed in terms of the maximal number of cellmates with respect to the partitioning \(\pi \) (noting that it is equal to 1 in case two or more leaders share the same cell).

Next, we will take a look at the first and third term in (29), i.e., and . Let us denote \(\varDelta L = L - L_\mathrm {AEP}\). We find

Thus, both for \(p = 2\) and \(p = \infty \), we have

(31)

It is also easily seen that and . Therefore,

Since, finally, \({(L P - P \hat{L})}^T (L P - P \hat{L}) = P^T {(\varDelta L)}^2 P\), for \(p = 2\) and \(p = \infty \), we obtain

(32)

Thus, both in (31) and (32) the upper bound involves the difference \(\varDelta L = L - L_\mathrm {AEP}\) between the original Laplacian and its optimal approximation in the set of Laplacian matrices for which the given partition \(\pi \) is an AEP. In a sense, the difference \(\varDelta L\) measures how far \(\pi \) is away from being an AEP for the original graph \(\mathcal {G}\). Obviously, \(\varDelta L = 0\) if and only if \(\pi \) is an AEP for \(\mathcal {G}\). In that case only the middle term in (29) is present.
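As an illustration, continuing the 5-node path-graph example above, \(\varDelta L\) and its norms can be computed directly; the sketch below also checks the identity \(L - L_\mathrm {AEP}= (I_N - \mathcal {P}) L \mathcal {P}+ \mathcal {P}L (I_N - \mathcal {P})\) used in the proof of Theorem 5:

```python
import numpy as np

# Path graph on 5 vertices; partition {{1, 2, 3}, {4, 5}} (not an AEP).
L = np.array([[ 1., -1.,  0.,  0.,  0.],
              [-1.,  2., -1.,  0.,  0.],
              [ 0., -1.,  2., -1.,  0.],
              [ 0.,  0., -1.,  2., -1.],
              [ 0.,  0.,  0., -1.,  1.]])
P = np.array([[1., 0.], [1., 0.], [1., 0.], [0., 1.], [0., 1.]])

I5 = np.eye(5)
Pr = P @ np.linalg.inv(P.T @ P) @ P.T
L_AEP = Pr @ L @ Pr + (I5 - Pr) @ L @ (I5 - Pr)
dL = L - L_AEP

# Cross-term identity for Delta L.
cross = (I5 - Pr) @ L @ Pr + Pr @ L @ (I5 - Pr)

# Spectral norm of Delta L: the quantity driving the bounds (31) and (32);
# it is nonzero here, since the partition is not an AEP for the path graph.
dL_norm = np.linalg.norm(dL, 2)
```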

7.2 The general case

In this final subsection, we put forward some ideas to deal with the case that the agent dynamics is a general linear input–state–output system and the given graph partition \(\pi \), with characteristic matrix P, is not almost equitable. In this case, the original network is given by (4) and the reduced network by (5). Their transfer functions are S and \(\hat{S}\), respectively. Let \(L_\mathrm {AEP}\) and \(\hat{L}_\mathrm {AEP}\) be as in the previous subsection and let

$$\begin{aligned} S_\mathrm {AEP}(s) = (L_\mathrm {AEP}\otimes I_n) {(s I - I_N \otimes A + L_\mathrm {AEP}\otimes B)}^{-1} (M \otimes E) \end{aligned}$$

and

As before, we assume that (4) is synchronized, so S is stable. However, since the partition \(\pi \) is no longer assumed to be an AEP, the reduced transfer function \(\hat{S}\) need no longer be stable, and neither need \(S_\mathrm {AEP}\) and \(\hat{S}_\mathrm {AEP}\). We now study under what conditions these transfer functions are stable. First note that \(\hat{S}\) is stable if and only if \(A - {\hat{\lambda }}B\) is Hurwitz for all nonzero eigenvalues \({\hat{\lambda }}\) of \(\hat{L}\). Moreover, \(S_\mathrm {AEP}\) and \(\hat{S}_\mathrm {AEP}\) are stable if and only if \(A - \lambda B\) is Hurwitz for all nonzero eigenvalues \(\lambda \) of \(L_\mathrm {AEP}\). In the following, let \(\lambda _\mathrm {min}(L)\) and \(\lambda _\mathrm {max}(L)\) denote the smallest nonzero and the largest eigenvalue of L, respectively. We have the following lemma about the location of the nonzero eigenvalues of \(\hat{L}\) and \(L_\mathrm {AEP}\):

Lemma 7

All nonzero eigenvalues of \(\hat{L}\) and of \(L_\mathrm {AEP}\) lie in the closed interval \([\lambda _\mathrm {min}(L), \lambda _\mathrm {max}(L)]\).

Proof

The claim about the eigenvalues of \(\hat{L}\) follows from the interlacing property (see, e.g., [13]). Next, note that \(\mathcal {P}= Q_1 Q_1^T\), with \(Q_1 = P {(P^T P)}^{-\frac{1}{2}}\). Since the columns of \(Q_1\) are orthonormal, there exists a matrix \(Q_2 \in \mathbb {R}^{N \times (N - r)}\) such that \(\begin{pmatrix} Q_1&\quad Q_2 \end{pmatrix}\) is an orthogonal matrix. Then, we have \(I_N - \mathcal {P}= Q_2 Q_2^T\) and we find

$$\begin{aligned} L_\mathrm {AEP}&= \mathcal {P}L \mathcal {P}+ (I_N - \mathcal {P}) L (I_N - \mathcal {P}) \\&= Q_1 Q_1^T L Q_1 Q_1^T + Q_2 Q_2^T L Q_2 Q_2^T \\&= \begin{pmatrix} Q_1&\quad Q_2 \end{pmatrix} \begin{pmatrix} Q_1^T L Q_1 &{}\quad 0 \\ 0 &{}\quad Q_2^T L Q_2 \end{pmatrix} \begin{pmatrix} Q_1^T \\ Q_2^T \end{pmatrix}. \end{aligned}$$

It follows that \(\sigma (L_\mathrm {AEP}) = \sigma (Q_1^T L Q_1) \mathop {\cup } \sigma (Q_2^T L Q_2)\). By the interlacing property, the eigenvalues of both \(Q_1^T L Q_1\) and \(Q_2^T L Q_2\) interlace the eigenvalues of L, so in particular all eigenvalues \(\lambda \) of \(L_\mathrm {AEP}\) satisfy \(\lambda \le \lambda _\mathrm {max}(L)\). In order to prove the lower bound, note that \(Q_1^T L Q_1\) is similar to \(\hat{L}\), whose nonzero eigenvalues we already know to lie between the nonzero eigenvalues of L. As for the eigenvalues of \(Q_2^T L Q_2\), note that \(\mathbb {1}^T Q_2 = 0\) and \({||}Q_2 x{||}_2 = {||}x{||}_2\) for all x. Thus, we find

$$\begin{aligned} \min _{{||}x{||}_2 = 1} x^T Q_2^T L Q_2 x \ge \min _{\begin{array}{c} \mathbb {1}^T y = 0 \\ {||}y{||}_2 = 1 \end{array}} y^T L y. \end{aligned}$$

Therefore, the smallest eigenvalue of \(Q_2^T L Q_2\) is at least as large as the smallest positive eigenvalue of L. We conclude that indeed \(\lambda \ge \lambda _\mathrm {min}(L)\) for all nonzero eigenvalues \(\lambda \) of \(L_\mathrm {AEP}\). \(\square \)
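Lemma 7 is easy to confirm numerically, e.g., for the 5-node path graph and the partition \(\{\{1, 2, 3\}, \{4, 5\}\}\) from the earlier example:

```python
import numpy as np

# Path graph on 5 vertices and partition {{1, 2, 3}, {4, 5}}.
L = np.array([[ 1., -1.,  0.,  0.,  0.],
              [-1.,  2., -1.,  0.,  0.],
              [ 0., -1.,  2., -1.,  0.],
              [ 0.,  0., -1.,  2., -1.],
              [ 0.,  0.,  0., -1.,  1.]])
P = np.array([[1., 0.], [1., 0.], [1., 0.], [0., 1.], [0., 1.]])

PtP_inv = np.linalg.inv(P.T @ P)
Lhat = PtP_inv @ P.T @ L @ P                  # reduced Laplacian
Pr = P @ PtP_inv @ P.T
I5 = np.eye(5)
L_AEP = Pr @ L @ Pr + (I5 - Pr) @ L @ (I5 - Pr)

eigs_L = np.sort(np.linalg.eigvalsh(L))
lam_min, lam_max = eigs_L[1], eigs_L[-1]      # smallest nonzero, largest

def nonzero_eigs(X):
    return [e.real for e in np.linalg.eigvals(X) if abs(e) > 1e-10]

# All nonzero eigenvalues of Lhat and L_AEP lie in [lam_min, lam_max].
in_interval = all(
    lam_min - 1e-10 <= e <= lam_max + 1e-10
    for X in (Lhat, L_AEP) for e in nonzero_eigs(X))
```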

Using this lemma, we see that a sufficient condition for \(\hat{S}\), \(S_\mathrm {AEP}\), and \(\hat{S}_\mathrm {AEP}\) to be stable is that for each \(\lambda \in [\lambda _\mathrm {min}(L),\lambda _\mathrm {max}(L)]\), the strict Lyapunov inequality

$$\begin{aligned} (A - \lambda B) X + X {(A - \lambda B)}^T < 0 \end{aligned}$$

has a positive definite solution X. This sufficient condition can be checked by verifying solvability of a single linear matrix inequality, whose size does not depend on the number of agents, see [31]. After having checked this, it would then remain to establish upper bounds for the first and third term in (29). This can be done in an analogous way as in the previous subsection. Specifically, it can be shown that for \(p = 2\) and \(p = \infty \) we have

and

8 Numerical examples

To illustrate the error bounds we have established in this paper, consider the graph with 10 nodes taken from [26], as shown in Fig. 1. Its Laplacian matrix is

with spectrum (rounded to three significant digits)

$$\begin{aligned} \sigma (L) \approx \{0,\ 1,\ 1.08,\ 4.14,\ 5,\ 6.7,\ 8.36,\ 16.1,\ 28.2,\ 33.5\}. \end{aligned}$$

First, we illustrate the \(\mathcal {H}_2\) and \(\mathcal {H}_\infty \) error bounds from Theorems 2 and 4. We take \(\pi = \{\{1, 2, 3, 4\}, \{5, 6\}, \{7\}, \{8\}, \{9, 10\}\}\) and

Note that, indeed, \(\pi \) is an AEP. Also, in order to satisfy the assumptions of Theorem 4, we have taken A and B symmetric. Note that \(A - \lambda B\) is Hurwitz for all nonzero eigenvalues \(\lambda \) of the Laplacian matrix L. Therefore, the multi-agent system is synchronized. It remains to choose the set of leaders \(\mathcal {V}_{\mathrm {L}}\). For demonstration, we compute the \(\mathcal {H}_2\) and \(\mathcal {H}_\infty \) upper bounds and the true errors for all possible choices of \(\mathcal {V}_{\mathrm {L}}\). Since the sets of leaders are nonempty subsets of \(\mathcal {V}\), there are \(2^{10} - 1 = 1023\) possible sets of leaders. Figure 3 shows all the ratios of upper bounds to corresponding true errors, where we define \(\frac{0}{0} := 1\). We see that, in this example, every upper bound is within one order of magnitude of the corresponding true error, and in most cases the ratio is below 2.

Fig. 3
figure 3

Ratios of \(\mathcal {H}_2\) (left) and \(\mathcal {H}_\infty \) (right) upper bounds and corresponding true errors, for a fixed almost equitable partition and all possible sets of leaders. In both figures, the sets of leaders are sorted such that the ratio is increasing (in particular, the ordering of the sets of leaders is not the same)

Fig. 4
figure 4

True \(\mathcal {H}_2\) (left) and \(\mathcal {H}_\infty \) (right) errors and upper bounds, for a fixed set of leaders and all partitions with five cells. In each figure, partitions were sorted such that the true errors are increasing

Fig. 5
figure 5

First 1000 true errors and upper bounds from Fig. 4

Next, we compare the true errors with the triangle inequality-based error bounds from (29) for a fixed set of leaders and all possible partitions consisting of five cells. For the set of leaders, we take \(\mathcal {V}_{\mathrm {L}}= \{6, 7\}\), as was also used in [26]. With this choice of leaders, the system norms are and (rounded to three significant digits). Figure 4 shows true errors and upper bounds for all partitions of \(\mathcal {V}\) with five cells (there are 42,525 such partitions). We observe that the upper bounds vary significantly as the true error increases, but each bound stays within one order of magnitude of the true error. Additionally, we notice that partitions giving small \(\mathcal {H}_2\) errors also give small upper bounds, as seen more clearly in the left subfigure of Fig. 5. Furthermore, we observe a jump after the 966th partition. In fact, the 966 partitions giving the smallest \(\mathcal {H}_2\) error are exactly those partitions in which each leader is the only member of its cell. For the \(\mathcal {H}_\infty \) error this is not the case: there are partitions in which leaders share a cell with other agents that give a smaller \(\mathcal {H}_\infty \) error than partitions in which they do not. On the other hand, partitions with the smallest \(\mathcal {H}_2\) or \(\mathcal {H}_\infty \) upper bound are close to the optimal true error.
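The count of 42,525 partitions is the Stirling number of the second kind S(10, 5), i.e., the number of ways to partition 10 agents into 5 nonempty cells. A quick sketch via the standard recurrence \(S(n, k) = k\, S(n-1, k) + S(n-1, k-1)\):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Number of partitions of an n-element set into k nonempty cells."""
    if k == 0:
        return 1 if n == 0 else 0
    if n == 0:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

count = stirling2(10, 5)   # the 42,525 five-cell partitions of 10 agents
```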

In the following, we also compute, for all partitions with five cells, the error of approximating L by \(L_\mathrm {AEP}\). Figure 6 shows the relative approximation errors . We see that only a few (six, to be precise) partitions give a relative error of less than 0.1. Nevertheless, a small triangle inequality-based error bound (29) seems to be a good indicator of a suitable partition.

Fig. 6
figure 6

Relative error of L by \(L_\mathrm {AEP}\) in Frobenius norm for all partitions with five cells. The partitions are ordered such that the errors are increasing

Finally, we compare the bound (29) with those from Ishizaki et al. [15,16,17]. There are also error bounds developed in [4, 6], but they depend on the proposed model reduction methods and cannot be evaluated for an arbitrary partition. The \(\mathcal {H}_2\) and \(\mathcal {H}_\infty \) error bounds from Ishizaki et al. are based on the decomposition (see equation (31) in [16], (20) in [14], or (17) in [17])

$$\begin{aligned} S(s) - \hat{S}(s) = \varXi (s) Q Q^T X(s), \end{aligned}$$

where

, and Q is such that \(\begin{pmatrix} P&\quad Q \end{pmatrix}\) is orthogonal. The error bounds are then

for \(p = 2\) and \(p = \infty \). Figure 7 shows the comparison between these bounds, the triangle inequality-based bound (29), and the true errors. In this example, our bounds are, for most partitions, lower than those from Ishizaki et al. Yet, the two share some qualitative properties: both vary significantly as the true error increases, and the partitions with the smallest bounds are close to optimal.

Fig. 7
figure 7

Comparison with error bounds from Ishizaki et al. [15,16,17]. The first column shows the \(\mathcal {H}_2\) errors and bounds, the second column the \(\mathcal {H}_\infty \) errors and bounds. The first row contains values for all partitions with five cells, the second row only the first 1000 best ones

9 Conclusions

In this paper, we have extended results on model reduction of leader–follower networks with single integrator agent dynamics from [26] to leader–follower networks with arbitrary linear multivariable agent dynamics. We have also extended these results to the case that the approximation error is measured in the \(\mathcal {H}_\infty \)-norm. The proposed model reduction technique reduces the complexity of the network topology by clustering the agents. We have shown that clustering amounts to applying a specific Petrov–Galerkin projection associated with the graph partition. The resulting reduced order model can be interpreted as a networked multi-agent system with a weighted, directed network graph. If the original network is clustered using an almost equitable graph partition, then its consensus properties are preserved. We have provided a priori upper bounds on the \(\mathcal {H}_2\) and \(\mathcal {H}_\infty \) model reduction errors in this case. These error bounds depend on an auxiliary system related to the agent dynamics, the eigenvalues of the Laplacian matrices of the original and the reduced network, and on the number of cellmates of the leaders in the network. Finally, we have provided some insight into the general case of clustering according to arbitrary, not necessarily almost equitable, partitions. Here, direct computation of a priori upper bounds on the error is not as straightforward as in the case of almost equitable partitions. We have shown that in this more general case, one can bound the model reduction errors by first optimally approximating the original network by a new network for which the chosen partition is almost equitable, and then bounding the \(\mathcal {H}_2\) and \(\mathcal {H}_\infty \) errors using the triangle inequality.