Decomposition of arrow type positive semidefinite matrices with application to topology optimization

Decomposition of large matrix inequalities for matrices with a chordal sparsity graph has recently been used by Kojima et al. (Math Program 129(1):33–68, 2011) to reduce the problem size of large-scale semidefinite optimization (SDO) problems and thus increase the efficiency of standard SDO software. A by-product of such a decomposition is the introduction of new dense small-size matrix variables. We will show that for arrow type matrices satisfying suitable assumptions, the additional matrix variables have rank one and can thus be replaced by vector variables of the same dimensions. This leads to a significant improvement in the efficiency of standard SDO software. We will apply this idea to the problem of topology optimization formulated as a large-scale linear semidefinite optimization problem. Numerical examples will demonstrate a tremendous speed-up in the solution of the decomposed problems, as compared to the original large-scale problem. In our numerical examples the decomposed problems exhibit linear growth in complexity, compared to the more than cubic growth in the original problem formulation. We will also relate our approach to the standard theory of domain decomposition and show that the additional vector variables are outcomes of the corresponding discrete Steklov–Poincaré operators.


Introduction
General-purpose algorithms and software for semidefinite optimization (SDO) are dominated by interior point and barrier type methods. Any such software exhibits two bottlenecks with respect to computational complexity, and thus CPU time, and to memory requirements. The first one is the evaluation of the system matrix (the Schur complement matrix or the Hessian of the augmented Lagrangian) in every step of the underlying Newton method. The second one is the solution of a linear system with this matrix. For problems with large matrix inequalities, it is often the first bottleneck that dominates the CPU time and prevents the user from solving large-scale problems.
To circumvent this obstacle, the technique of decomposing a large matrix inequality into several smaller ones has proved to be efficient, at least for certain classes of problems. Decomposition of positive semidefinite matrices with a certain sparsity pattern was first investigated by Agler et al. [1] and, independently, by Griewank and Toint [4]. An extensive study has recently been published by Vandenberghe and Andersen [13]. We will call this technique chordal decomposition. It was first used in semidefinite optimization by Kojima and his co-workers; see [3,10] and, more recently, [8]. The same group also developed a preprocessing software package for semidefinite optimization named SparseCoLO [2] that performs the decomposition of matrix constraints automatically.
The goal of this paper is two-fold. Firstly, we introduce a new decomposition of arrow type positive semidefinite matrices called arrow decomposition. Unlike the chordal decomposition, which generates additional dense matrix variables, the arrow decomposition only requires additional vector variables, leading to a significant reduction in the number of variables in the decomposed problem. The second goal is to apply both decomposition techniques to the topology optimization problem. This problem arises from the finite element discretization of a partial differential equation. We will show that techniques known from domain decomposition can be used to define the matrix decomposition. In particular, we will be able to control the number and size of the decomposed matrix inequalities. Even when using the chordal decomposition, this will allow us to gain a tremendous speed-up compared to approaches based on automatic chordal completion as in [2]. We will also relate the arrow decomposition to the theory of domain decomposition and show that the additional vector variables are outcomes of the corresponding discrete Steklov–Poincaré operators.
To solve all semidefinite optimization problems, we will use the state-of-the-art solver MOSEK [9]. Numerical examples will demonstrate a tremendous speed-up in the solution of the decomposed problems, as compared to the original large-scale problem. Moreover, in our numerical examples the arrow decomposition exhibits linear growth in complexity, compared to the higher than cubic growth when solving the original problem formulation.
Notation Let S^n be the space of n × n symmetric matrices, A ∈ S^n, and I ⊂ {1, . . . , n} with s = |I|. We denote
- by (A)_{i,j} the (i, j)th element of A;
- by (A)_I the restriction of A to S^s, i.e., the s × s submatrix of A with row and column indices from I;
- by O_{m,n} the m × n zero matrix; when the dimensions are clear from the context, we simply write O.
A matrix is called dense if all its elements are non-zero; otherwise, the matrix is called sparse. A matrix-valued function A(x) is called dense if there exists x̄ such that A(x̄) is dense.
Let G(N, E) be an undirected graph with node set N = {1, . . . , n} and edge set E. Define the set of symmetric matrices with sparsity graph G,

S^n(G) = {A ∈ S^n : (A)_{i,j} = 0 whenever i ≠ j and (i, j) ∉ E},

and analogously S^n_+(G), the positive semidefinite matrices in S^n(G). For a subset N_s ⊂ N, denote by S^n(N_s) the set of matrices A ∈ S^n with (A)_{i,j} = 0 whenever i ∉ N_s or j ∉ N_s. Let G_s(N_s, E_s) be an induced subgraph of G(N, E). Notice the difference between S^n(G_s) and S^n(N_s): if A ∈ S^n(N_s) then its restriction (A)_{N_s} is, in general, a dense matrix. This is not true for A ∈ S^n(G_s), the sparsity pattern of which is given by the set of edges E_s. In particular, S^n(G_s) = S^n(N_s) if and only if G_s is a clique.
Finally, for functions from R^d to R we will use bold italics (such as u or u(ξ)), while for vectors resulting from the finite element discretization of these functions we will use the same symbol in italics (e.g. u ∈ R^n).

Matrices with chordal sparsity graphs
We first recall the well-studied case of matrices with a chordal sparsity graph. The following theorem was proved independently by Grone et al. [5], Griewank and Toint [4], and Agler et al. [1]. A new, shorter proof can be found in [7].

Theorem 1 Let G(N, E) be an undirected chordal graph with maximal cliques C_1, . . . , C_p. The following two statements are equivalent:

(i) A ∈ S^n_+(G);
(ii) there exist matrices Y_k ∈ S^n_+(C_k), k = 1, . . . , p, such that A = Y_1 + · · · + Y_p.

Notice that this decomposition is not unique. However, Kakimura [7] has shown that there exist matrices Y*_k minimizing Σ_{k=1}^p rank Y_k subject to Σ_{k=1}^p Y_k = A and Y_k ∈ S^n_+(C_k), k = 1, . . . , p, and that Σ_{k=1}^p rank Y*_k = rank A.

Matrices embedded in those with a chordal sparsity graph
Let A ∈ S^n, n ≥ 3, with sparsity graph G(N, E). Let the set of nodes N = {1, 2, . . . , n} be covered by p ≥ 2 overlapping sets I_1, . . . , I_p, i.e., I_1 ∪ · · · ∪ I_p = N. Let I_{k,ℓ} denote the intersection of the kth and ℓth set, i.e.,

I_{k,ℓ} = I_k ∩ I_ℓ, k < ℓ. (2)

Assumption 1 For every k ∈ {1, . . . , p} there exists at least one index ℓ with 1 ≤ ℓ ≤ p, ℓ ≠ k, such that I_k ∩ I_ℓ ≠ ∅.
Assumption 2 I_k ∪ I_ℓ ≠ I_k for all 1 ≤ k, ℓ ≤ p, k ≠ ℓ, i.e., no I_ℓ is a subset of any I_k.

Assumption 3
The intersections are "sparse" in the sense that for each k ∈ {1, . . . , p} there are at most p̄_k ≪ p indices ℓ such that I_{k,ℓ} ≠ ∅. In a typical situation, only I_{k,k+1}, k = 1, . . . , p − 1, are non-empty (corresponding to a block diagonal matrix with overlapping blocks), or each I_k has a non-empty intersection with up to eight other sets (see Sect. 4).
Denote the induced subgraphs of G(N, E) corresponding to I_k by G_k(I_k, E_k), k = 1, . . . , p. These subgraphs are not necessarily cliques.

Lemma 1 Let A be defined as above and let Assumptions 1–3 hold. Then there exist matrices Q_k ∈ S^n(G_k) such that

A = Q_1 + · · · + Q_p.

The proof of the above lemma is obvious, as the principal submatrices associated with the subgraphs G_k(I_k, E_k) cover all the nonzeros of A. The matrices Q_k are clearly non-unique; in our application in Sect. 3, however, they will be specified a priori and will, in fact, be used for the construction of the matrix A.
For all k = 1, . . . , p, let Ḡ_k(I_k, Ē_k) denote a completion of G_k(I_k, E_k), i.e., a clique on the node set I_k. According to Assumption 2, the Ḡ_k(I_k, Ē_k) are even maximal cliques. Clearly, Q_k ∈ S^n(Ḡ_k).

Assumption 4 The graph Ḡ(N, Ē) with Ē = Ē_1 ∪ · · · ∪ Ē_p is a chordal extension of G(N, E); see, e.g., [13, Section 8.3].
Notice that the rather restrictive Assumption 4 is satisfied when A is a block diagonal matrix with overlapping blocks. It may not be satisfied in the application in Sect. 4; we will see, however, that it will not be needed in this application.
Theorem 2 Let A be defined as above and let Assumptions 1–4 hold. Denote Θ_p = {(k, ℓ) : 1 ≤ k < ℓ ≤ p, I_{k,ℓ} ≠ ∅}. The following two statements are equivalent:

(i) A ⪰ 0;
(ii) there exist matrices S_{k,ℓ} ∈ S^n, (k, ℓ) ∈ Θ_p, with nonzero elements only at positions (i, j) ∈ I_{k,ℓ} × I_{k,ℓ}, such that

Q_k + Σ_{ℓ: ℓ<k} S_{ℓ,k} − Σ_{ℓ: ℓ>k} S_{k,ℓ} ⪰ 0, k = 1, . . . , p.

If I_{k,ℓ} = ∅ or is not defined then S_{k,ℓ} is a zero matrix.
Proof Using the chordal extension Ḡ(N, Ē) of G(N, E), we embed the matrix A into the set of matrices with chordal sparsity graph whose maximal cliques are Ḡ_k(I_k, Ē_k), k = 1, . . . , p. Then we can apply Theorem 1: there exist matrices Y_k ∈ S^n_+(I_k), k = 1, . . . , p, such that A = Y_1 + · · · + Y_p. Now, Y_k must be equal to Q_k for the "internal" indices of I_k, i.e., for all

(i, j) ∈ (I_k \ (⋃_{ℓ: ℓ>k} I_{k,ℓ} ∪ ⋃_{ℓ: ℓ<k} I_{ℓ,k}))².

Therefore the unknown elements of Y_k reduce to those on the overlaps I_{k,ℓ}.
Having Q_k and Y_k, k = 1, . . . , p, we will now define the matrices S_{k,ℓ} as follows. Firstly, for k = 1 we select any solution S_{1,ℓ}, ℓ: I_{1,ℓ} ≠ ∅, of the equation

Y_1 = Q_1 − Σ_{ℓ: ℓ>1} S_{1,ℓ}.

Notice that many elements of the matrices S_{1,ℓ} (I_{1,ℓ} ≠ ∅) are uniquely defined by this equation. Only elements with indices from non-empty intersections I_{1,ℓ} ∩ I_{1,k} are not unique, as they appear in more than one matrix S_{•,•} in the above equation. Now, for 1 < k < p, we solve the equation

Y_k = Q_k + Σ_{ℓ: ℓ<k} S_{ℓ,k} − Σ_{ℓ: ℓ>k} S_{k,ℓ}.

All matrices S_{ℓ,k}, ℓ < k, were defined in steps 1, . . . , k − 1, hence we are in the same situation as above and select any solution S_{k,ℓ}, ℓ > k, I_{k,ℓ} ≠ ∅, of the above equation. Any selection of the non-unique elements of S_{•,•} will be consistent with the last equation

Y_p = Q_p + Σ_{ℓ: ℓ<p} S_{ℓ,p},

because Σ_{k=1}^p Y_k = Σ_{k=1}^p Q_k = A.

Fig. 1 Decomposition of a five-diagonal matrix by maximal cliques according to Theorem 1 (a), the sparsity graph of this matrix (b), decomposition into two blocks according to Theorem 2 (c), and the corresponding chordal extension of the graph with two maximal cliques (d)

Theorem 2 allows us to define the decomposition at will, within the limits of the assumptions; even for matrices with a chordal sparsity graph, the decomposition does not have to be driven by the maximal cliques. This is illustrated in the following example.
Example 1 Consider an 8 × 8 five-diagonal matrix as schematically depicted in Fig. 1a; here crosses represent nonzero real numbers. The sparsity graph of this matrix is shown in Fig. 1b. This graph is chordal with six maximal cliques corresponding to the six triangles. Following Theorem 1, the chordal decomposition will use the six principal submatrices shown in Fig. 1a. However, using Theorem 2 we can choose to decompose the matrix into only two principal submatrices, as shown in Fig. 1c. In this case, I_1 = {1, 2, 3, 4, 5}, I_2 = {4, 5, 6, 7, 8} and I_{1,2} = {4, 5}. The corresponding chordal extension with two maximal cliques is shown in Fig. 1d.
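The two-block decomposition of Example 1 can be checked numerically. The following numpy sketch is our own illustration, not part of the paper: it builds a random positive definite five-diagonal matrix and splits it into two positive semidefinite matrices supported on I_1 and I_2; the particular choice of the non-unique overlap block (a Schur-complement-type matrix S) is one valid selection among many.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random positive definite five-diagonal 8x8 matrix: A = L L^T with L
# lower triangular of bandwidth 2, so A has bandwidth 2 (five diagonals).
n = 8
band = np.abs(np.subtract.outer(np.arange(n), np.arange(n))) <= 2
L = np.tril(rng.standard_normal((n, n))) * band
np.fill_diagonal(L, 4.0)                 # make L nonsingular, hence A > 0
A = L @ L.T

# Index sets of Example 1 (0-based): I1 = {1..5}, I2 = {4..8}, overlap = {4,5}
a = np.arange(0, 3)                      # I1 without the overlap
b = np.arange(3, 5)                      # the overlap I_{1,2}
c = np.arange(5, 8)                      # I2 without the overlap

# One valid choice of the non-unique overlap block
S = A[np.ix_(b, a)] @ np.linalg.solve(A[np.ix_(a, a)], A[np.ix_(a, b)])

Y1 = np.zeros((n, n)); Y2 = np.zeros((n, n))
Y1[np.ix_(a, a)] = A[np.ix_(a, a)]; Y1[np.ix_(a, b)] = A[np.ix_(a, b)]
Y1[np.ix_(b, a)] = A[np.ix_(b, a)]; Y1[np.ix_(b, b)] = S
Y2[np.ix_(b, b)] = A[np.ix_(b, b)] - S
Y2[np.ix_(b, c)] = A[np.ix_(b, c)]; Y2[np.ix_(c, b)] = A[np.ix_(c, b)]
Y2[np.ix_(c, c)] = A[np.ix_(c, c)]

assert np.allclose(Y1 + Y2, A)               # the decomposition is exact
assert np.linalg.eigvalsh(Y1).min() > -1e-8  # Y1 is positive semidefinite
assert np.linalg.eigvalsh(Y2).min() > -1e-8  # Y2 is positive semidefinite
```

With this choice of S, the matrix Y1 is singular on the overlap (its Schur complement vanishes), which is consistent with the rank-minimizing decompositions of Kakimura [7] mentioned above.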

Arrow type matrices
Let us now consider a particular type of sparse matrices, the arrow type matrices. Let again A ∈ S^n, n ≥ 3, and let I_k, I_{k,ℓ} and G_k(I_k, E_k), k = 1, . . . , p, be defined as in the previous section.
Assume again that A is a sum of matrices associated with G_k:

A = A_1 + · · · + A_p, A_k ∈ S^n(G_k).

We also define the extended index sets

Ĩ_k = I_k ∪ {n + 1, . . . , n + m}, k = 1, . . . , p,

and let B ∈ R^{n×m} be a dense matrix partitioned as B = B_1 + · · · + B_p, where B_k has nonzero elements only in the rows indexed by I_k. Finally, let C ∈ S^m be positive definite. We define the following arrow type matrix:

M = ( A  B ; Bᵀ  C ). (3)

According to the definition of A_k and B_k, we have that M = M_1 + · · · + M_p with M_k = ( A_k  B_k ; B_kᵀ  O ) for k = 1, . . . , p − 1 and M_p = ( A_p  B_p ; B_pᵀ  C ). The simplest example of an arrow type matrix is a block diagonal matrix with overlapping blocks and with additional rows and columns corresponding to the matrices B and C. The next example presents another typical situation.
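The structure just described is easy to picture in code. The following small numpy sketch (our own illustration; the index sets and data are arbitrary, not taken from the paper) assembles an arrow type matrix M from two overlapping shaft blocks and checks that the principal submatrices on the extended index sets Ĩ_k cover all nonzeros of M:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy arrow type matrix with m = 1: the "shaft" A is a sum of two blocks
# overlapping in two indices, B is a dense n-vector, C a positive scalar.
n, m = 6, 1
I1, I2 = [0, 1, 2, 3], [2, 3, 4, 5]        # overlapping node sets

def psd_on(idx, n):
    """Random PSD matrix supported on idx x idx (a stand-in for A_k)."""
    Q = np.zeros((n, n))
    G = rng.standard_normal((len(idx), len(idx)))
    Q[np.ix_(idx, idx)] = G @ G.T
    return Q

A = psd_on(I1, n) + psd_on(I2, n)          # A = A_1 + A_2, A_k in S^n(G_k)
B = rng.standard_normal((n, m))            # dense coupling block
C = np.array([[2.0]])                      # C in S^m, positive definite

M = np.block([[A, B], [B.T, C]])           # the arrow type matrix (3)

# Every nonzero of M lies in some extended block Itilde_k = I_k + arrow index
I1e, I2e = I1 + [n], I2 + [n]
cover = np.zeros_like(M, dtype=bool)
for idx in (I1e, I2e):
    cover[np.ix_(idx, idx)] = True
assert np.all((M == 0) | cover)
```

The assertion mirrors the covering argument of Lemma 1: the extended index sets reach every nonzero of M, including the dense last row and column.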
Notice that the structure of the overlapping blocks can be more complicated and that, in general, A (the arrow "shaft") does not have to be a band matrix. Such matrices arise in the application introduced later in Sect. 3; see Figs. 6 and 7. In this application we will have m = 1, so that B will be an n-vector and C ∈ R. However, in this section we consider the more general situation, which may be useful in other applications.
Before going on, we need the following auxiliary result. It tells us that we do not need to consider all intersections of the extended sets Ĩ_k but only those that are extensions of the sets I_{k,ℓ} defined in (2), i.e., only Ĩ_{k,ℓ} = I_{k,ℓ} ∪ {n + 1, . . . , n + m}.

Lemma 2 Let Ḡ(N, Ē) be a chordal graph with maximal cliques Ḡ_k(I_k, Ē_k), k = 1, . . . , p, and let H(N_H, E_H) be a complete graph with node set N_H = {n + 1, . . . , n + m}. Define the graph G̃(Ñ, Ẽ) by Ñ = N ∪ N_H and Ẽ = Ē ∪ E_H ∪ E^H_1 ∪ · · · ∪ E^H_p, where E^H_k contains all edges with one vertex in I_k and one vertex in N_H. Then G̃(Ñ, Ẽ) is a chordal graph with maximal cliques G̃_k(Ĩ_k, Ẽ_k), k = 1, . . . , p.
Proof Obviously, the G̃_k(Ĩ_k, Ẽ_k) are cliques, as all vertices in Ĩ_k are adjacent by assumption (Ḡ_k is a clique and H is complete) and by construction. Further, by construction, every vertex of N_H is adjacent to every vertex of Ñ. Suppose that K(N_K, E_K) is a maximal clique of G̃ different from all G̃_k. Then N_K \ N_H is a clique in Ḡ and is thus contained in some maximal clique Ḡ_k; hence there is a vertex v ∈ Ĩ_k \ N_K adjacent to all vertices of N_K. Hence the graph with vertices N_K ∪ {v} is a clique in G̃, which contradicts the maximality of K. Finally, G̃ is chordal, as any newly created cycle with vertices in N_H contains a chord in E_H ∪ E^H_1 ∪ · · · ∪ E^H_p.
We can now adapt Theorem 2 to the arrow type structure.

Corollary 1 Let Assumptions 1–4 hold and let M be the arrow type matrix defined in (3). The following two statements are equivalent:

(i) M ⪰ 0;
(ii) there exist matrices S̃_{k,ℓ} ∈ S^{n+m}, (k, ℓ) ∈ Θ_p, with nonzero elements only at positions (i, j) ∈ Ĩ_{k,ℓ} × Ĩ_{k,ℓ}, such that

M_k + Σ_{ℓ: ℓ<k} S̃_{ℓ,k} − Σ_{ℓ: ℓ>k} S̃_{k,ℓ} ⪰ 0, k = 1, . . . , p.

If Ĩ_{k,ℓ} = ∅ or is not defined then S̃_{k,ℓ} is a zero matrix.
Proof Let H(N_H, E_H) be the sparsity graph of the matrix C; this is a complete graph. Recall that Ḡ(N, Ē) is the chordal extension of G(N, E), the sparsity graph of the matrix A, and that the maximal cliques of Ḡ are the Ḡ_k(I_k, Ē_k). The matrix B is dense, and so all vertices from N_H are adjacent to all vertices in N. Then, by Lemma 2, the chordal extension of the sparsity graph of M has maximal cliques G̃_k(Ĩ_k, Ẽ_k). The rest is a direct application of Theorem 2 with Q_k = M_k for k = 1, . . . , p − 1, and Q_p = M_p.

Under additional assumptions, but leaving out Assumption 4, we can strengthen the above corollary as follows.

Theorem 3 Let Assumptions 1–3 hold, let A be positive definite with A_k ⪰ 0, k = 1, . . . , p, and let M be the arrow type matrix defined in (3). The following two statements are equivalent:

(i) M ⪰ 0;
(ii) there exist matrices D_{k,ℓ} ∈ R^{n×m}, (k, ℓ) ∈ Θ_p, with nonzero elements only in the rows indexed by I_{k,ℓ}, and matrices C_k ∈ S^m with C_1 + · · · + C_p = C, such that, with B̃_k = B_k + Σ_{ℓ: ℓ<k} D_{ℓ,k} − Σ_{ℓ: ℓ>k} D_{k,ℓ},

( A_k  B̃_k ; B̃_kᵀ  C_k ) ⪰ 0, k = 1, . . . , p.

If I_{k,ℓ} = ∅ or is not defined then D_{k,ℓ} is a zero matrix.
Proof We will prove the theorem by constructing matrices D_{k,ℓ} and C_k. By assumption, A is positive definite, so that we can define

U = A^{-1} B. (4)

Then

A_1 U + · · · + A_p U = A U = B. (5)

We define D_{k,ℓ} and C_k as follows. For k = 1, we solve the equation

B_1 − Σ_{ℓ: ℓ>1} D_{1,ℓ} = A_1 U.

As in the proof of Theorem 2, some elements of the thus defined D_{1,ℓ} may not be unique; in this case, we just select a solution. Then, for any 1 < k < p, we solve the equation

B_k + Σ_{ℓ: ℓ<k} D_{ℓ,k} − Σ_{ℓ: ℓ>k} D_{k,ℓ} = A_k U (6)

to define D_{k,ℓ}, ℓ > k, analogously to Theorem 2. Any selection of the non-unique elements of D_{•,•} will be consistent with the last equation

B_p + Σ_{ℓ: ℓ<p} D_{ℓ,p} = A_p U

because of (5). From (4) and (6) we see that D_{k,ℓ} is only non-zero in the rows indexed by I_{k,ℓ}, (k, ℓ) ∈ Θ_p, as required.
Define further

C_k = Uᵀ A_k U, k = 1, . . . , p − 1, and C_p = C − Σ_{k=1}^{p−1} C_k.

Now the matrices defined for k = 1, . . . , p by

M̂_k = ( A_k  A_k U ; Uᵀ A_k  Uᵀ A_k U ) = ( I ; Uᵀ ) A_k ( I  U )

are clearly positive semidefinite with (at least) m zero eigenvalues. We set M̃_k = M̂_k for k = 1, . . . , p − 1. As A_p ⪰ 0 by assumption, positive semidefiniteness of M̃_p amounts to

C_p − Uᵀ A_p U ⪰ 0,

which, by (5), is the same as

C − Bᵀ A^{-1} B ⪰ 0.

By the Schur complement theorem, the last inequality is equivalent to M ⪰ 0. This completes the proof.
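The Schur complement step used at the end of the proof is easy to verify numerically. A minimal sketch with random data of our own choosing: with A positive definite, the arrow matrix ( A B ; Bᵀ C ) is positive semidefinite exactly when C − Bᵀ A⁻¹ B is.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 5, 2
G = rng.standard_normal((n, n))
A = G @ G.T + n * np.eye(n)                 # A positive definite
B = rng.standard_normal((n, m))

M = lambda C: np.block([[A, B], [B.T, C]])  # arrow matrix as in (3)
S = B.T @ np.linalg.solve(A, B)             # B^T A^{-1} B

# C with positive semidefinite Schur complement -> M is PSD
C_ok = S + 0.1 * np.eye(m)
assert np.linalg.eigvalsh(M(C_ok)).min() > -1e-9

# C with indefinite Schur complement -> M is not PSD
C_bad = S - 0.1 * np.eye(m)
assert np.linalg.eigvalsh(M(C_bad)).min() < 0
```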
We will call the decomposition of arrow type matrices using Corollary 1 chordal decomposition and the one using Theorem 3 arrow decomposition.
Complexity remarks Let Assumptions 1–4 hold and let r be the number of non-empty sets I_{k,ℓ}, (k, ℓ) ∈ Θ_p. Comparing Corollary 1 with Theorem 3, we see that both provide us with a replacement of one "large" matrix inequality M ⪰ 0 by a number of smaller ones, M_k ⪰ 0, k = 1, . . . , p. However, while in Corollary 1 we have to introduce r additional matrix variables of sizes |Ĩ_{k,ℓ}| × |Ĩ_{k,ℓ}|, in Theorem 3 we only need r additional matrix variables of sizes |I_{k,ℓ}| × m and p matrix variables of size m × m. Recall that m < min_{k,ℓ=1,...,p; k<ℓ} |I_{k,ℓ}| and that, in our application below, m = 1, so the additional variables in Theorem 3 are vectors instead of the matrices of the same dimension in Corollary 1, thus offering a significant reduction in the dimension of the additional variables.
The example below shows that the arrow decomposition does not only lead to a problem of smaller dimension, it also allows us to use decompositions that do not satisfy Assumption 4. In particular, the arrow decompositions can be sparser, with smaller overlaps, hence leading to a sparser and smaller SDO problem.

Example 3 Consider the arrow type matrix (with m = 1) schematically depicted in Fig. 3a. The sparsity graph of this matrix is shown in Fig. 3b, where the central node corresponds to the last index in the matrix. Let us compare the arrow decomposition with the chordal decomposition.

Arrow decomposition
As in Example 2, we decompose the 12 × 12 leading principal submatrix into six 4 × 4 principal submatrices, as shown in Fig. 3c. Denoting the corresponding index sets by I^A_k, k = 1, . . . , 6, we get six intersections I^A_{k,ℓ}, all of dimension 3. This decomposition satisfies Assumptions 1–3 but not Assumption 4. Indeed, if we extend the subgraphs associated with the decomposition to cliques, we obtain the graph shown in Fig. 3d. This graph is, however, not chordal: for instance, the cycle 1-4-6-7-9-11-1 does not have a chord. Therefore we can apply none of the results based on chordal decomposition (Theorems 1, 2 and Corollary 1); we can, however, apply Theorem 3. As a result, we get six additional vector variables of dimension 3, corresponding to the intersections I^A_{k,ℓ}.

Chordal decomposition
Chordal decomposition must satisfy Assumption 4. The closest decomposition to the one above that does so is shown in Fig. 3e for the leading principal submatrix; see Fig. 3f for the corresponding extended chordal graph and Fig. 3g for the matrix associated with the extended graph. In this case, we need five additional matrix variables of dimension 4 × 4.
Two natural questions arise:
1. Are the additional assumptions of Theorem 3 too restrictive? Are there any applications satisfying them?
2. Is it worth reducing the dimension of the additional variables? Will it bring any significant savings in CPU time when solving the decomposed problem?
Both questions will be answered in the rest of the paper using a problem from structural optimization.

Application: topology optimization problem, semidefinite formulation
Consider an elastic body occupying a d-dimensional bounded domain Ω ⊂ R^d with a Lipschitz boundary ∂Ω, where d ∈ {2, 3}. By u(ξ) ∈ R^d we denote the displacement vector at a point ξ, and by e(u(ξ)) the (small-)strain tensor. We assume that our system is governed by linear Hooke's law, i.e., the stress is a linear function of the strain,

σ(ξ) = E(ξ) e(u(ξ)),

where E is the elastic (plane-stress for d = 2) stiffness tensor. Assume that the boundary of Ω is partitioned as ∂Ω = Γ_u ∪ Γ_f, where the body is fixed on Γ_u and the surface load f acts on Γ_f. The weak form of the linear elasticity problem reads: find u ∈ V, the set of kinematically admissible displacements, such that

a(x; u, v) = f(v) for all v ∈ V, (7)

where

a(x; u, v) = ∫_Ω x(ξ) ⟨E(ξ) e(u(ξ)), e(v(ξ))⟩ dξ and f(v) = ∫_{Γ_f} f(ξ) · v(ξ) dξ. (8)

In the basic topology optimization problem, the design variable is the multiplier x ∈ L^∞(Ω) of the elastic stiffness tensor E, which is a function of the space variable ξ. We will consider the following constraints on x:

∫_Ω x(ξ) dξ ≤ V and x̲(ξ) ≤ x(ξ) ≤ x̄(ξ),

with some given positive "volume" V and with x̲, x̄ ∈ L^∞(Ω) satisfying 0 ≤ x̲ ≤ x̄ and ∫_Ω x̲(ξ) dξ < V < ∫_Ω x̄(ξ) dξ. The minimum compliance single-load topology optimization problem reads as

min_x f(u(x)) subject to the above constraints, where u(x) solves (7). (9)

The objective, the so-called compliance functional, measures how well the structure can carry the load f.

Problem (9) is now discretized using the standard finite element method; the details can be found, e.g., in [6,11]. In particular, we use quadrilateral elements, an element-wise constant approximation of the function x and an element-wise bilinear approximation of the displacement field u. After discretization, the variables will be vectors x ∈ R^m and u ∈ R^n, where m is the number of finite elements and n the number of degrees of freedom (the number of finite element nodes times the spatial dimension). With every element we associate the local (symmetric and positive semidefinite) stiffness matrix K_i and (for elements including part of the boundary Γ_f) the discrete load vector f_i, i = 1, . . . , m. Now we can formulate the discretized version of the linear elasticity problem (7) as the following system of linear equations:

K(x) u = f, (10)

where K(x) = Σ_{i=1}^m x_i K_i is the global stiffness matrix and f = Σ_{i=1}^m f_i is the finite element assembly of the load vector.
The topology optimization problem (9) becomes

min_{u∈R^n, x∈R^m, γ∈R} γ subject to K(x) u = f, fᵀu ≤ γ, Σ_{i=1}^m x_i ≤ V, x̲_i ≤ x_i ≤ x̄_i, i = 1, . . . , m. (11)

Using the Schur complement theorem, the compliance constraint and the equilibrium equation can be written as one matrix inequality constraint:

Z(x) := ( K(x)  f ; fᵀ  γ ) ⪰ 0. (12)

The minimum compliance problem can then be formulated as follows:

min_{x∈R^m, γ∈R} γ subject to Z(x) ⪰ 0, Σ_{i=1}^m x_i ≤ V, x̲_i ≤ x_i ≤ x̄_i, i = 1, . . . , m. (13)

For ease of notation, in the rest of the paper we will restrict ourselves to the planar case d = 2. The generalization of all ideas to the three-dimensional case is straightforward.

Decomposition of the topology optimization problem (13)
Let Ω_h ⊂ R² be a polygonal approximation of Ω discretized by finite elements. Assume that Ω_h is partitioned into p non-overlapping subdomains D_k, k = 1, . . . , p, whose boundaries coincide with finite element boundaries. In our examples, Ω = Ω_h is a rectangle, the underlying finite element mesh is regular, and so is the partitioning into subdomains; cf. Fig. 4, which shows a typical decomposition of Ω_h into N_x × N_y subdomains.
Let I_k be the index set of all degrees of freedom associated with the subdomain D_k, k = 1, . . . , p. The intersections of these index sets contain the degrees of freedom on the respective internal boundaries and will again be denoted by I_{k,ℓ}. Denote by D_k the index set of elements belonging to subdomain D_k and define

K^{(k)}(x) = Σ_{i ∈ D_k} x_i K_i.

The matrix K^{(k)} = K^{(k)}(x) can then be partitioned as

K^{(k)} = ( K^{(k)}_{II}  K^{(k)}_{IΓ} ; K^{(k)}_{ΓI}  K^{(k)}_{ΓΓ} ),

where the set Γ collects the indices of all degrees of freedom corresponding to indices in one of the sets I_{ℓ,k} or I_{k,ℓ}, ℓ = 1, . . . , p; the set I then collects the indices of all remaining "interior" degrees of freedom in D_k.
We are now in a position to apply the theorems from Sect. 2. Notice that Assumptions 1-3 are satisfied, as well as the additional assumptions of Theorem 3. Assumption 4 is discussed below.
Case A – Chordal decomposition Let us first apply Corollary 1. It says that the matrix inequality Z(x) ⪰ 0 from (13) can be equivalently replaced by p smaller matrix inequalities associated with the subdomains. The additional variables are matrices S_{k,ℓ}, vectors σ_{k,ℓ} and scalars s_k.

Case B – Arrow decomposition Now we apply Theorem 3. In this case, the matrix inequality Z(x) ⪰ 0 from (13) can be replaced by p matrix inequalities involving only additional vector variables g_{k,ℓ} and scalar variables γ_k; these have the same dimensions as the variables σ_{k,ℓ} and s_k in Case A, respectively. Recall that Theorem 3 does not use the restrictive Assumption 4 from Sect. 2. This is important, because Assumption 4 is not satisfied when the domain Ω contains holes, so the decomposition technique would otherwise not be applicable to some practical problems. Consider, for instance, the finite element mesh in Fig. 4 and assume that the (i, j)th subdomain is not part of the domain Ω, i.e., it is a hole containing no finite elements. Then, even if we assume all matrices K^{(k)} to be dense, the sparsity graph of K(x) is not chordal, as it contains a chordless cycle connecting (more than three) nodes on the boundary of the internal hole.
Before formulating the decomposed version of problem (13), we notice that, according to Corollary 1 and Theorem 3,

γ = Σ_{k=1}^p s_k and γ = Σ_{k=1}^p γ_k, respectively.

We will therefore replace the variable γ in the decomposed problems by the variables s_k or γ_k, and the objective function by one of the above sums.
Case A Using the chordal decomposition approach, we obtain the decomposed optimization problem (20), formulated in the variables x, s_•, σ_{•,•} and S_{•,•}, with the constraint matrices Z^{(k)} defined above.
Case B Using the arrow decomposition approach, we obtain the decomposed optimization problem (21), formulated in the variables x, γ_• and g_{•,•}, with the constraint matrices defined as in (18), (19).
A versus B Consider now the finite element mesh and decomposition as in Fig. 4, with n_x × n_y finite elements and N_x × N_y subdomains. Instead of (13) we can solve one of the decomposed problems (20) and (21). In Case A, the chordal decomposition, the single matrix inequality of dimension (n + 1) × (n + 1) is replaced by N_x · N_y inequalities of dimension of order 2(n_x/N_x + 1)(n_y/N_y + 1) + 1, and we have to add N_x(N_y − 1) + (N_x − 1)N_y additional vectors σ_{•,•} of typical size 2(n_x/N_x + 1) or 2(n_y/N_y + 1), the same number of additional (dense) matrix variables S_{•,•} of the same order, and N_x · N_y scalar variables s_•. (Recall that the factor 2 stems from the fact that there are two degrees of freedom at every finite element node.) In Case B, the arrow decomposition, the number and order of the new matrix constraints are the same as above, but we only need the additional scalar and vector variables; the additional matrix variables are not needed. In Sect. 6 we will see that these decompositions lead to an enormous speed-up in the computational time of a state-of-the-art SDO solver. We will also see that the omission of the additional matrix variables in the arrow decomposition makes a big difference.
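The bookkeeping above can be sketched in a few lines. The counts below follow the formulas just stated (a dense symmetric matrix S of order s counted as s(s + 1)/2 scalar unknowns); this is a rough back-of-the-envelope illustration of our own, not the exact variable counts of problems (20) and (21).

```python
def extra_variables(nx, ny, Nx, Ny):
    """Additional scalar unknowns introduced by the two decompositions of an
    nx x ny mesh into Nx x Ny subdomains (two DOFs per finite element node)."""
    s_v = 2 * (ny // Ny + 1)      # DOFs on one vertical interface
    s_h = 2 * (nx // Nx + 1)      # DOFs on one horizontal interface
    n_v = (Nx - 1) * Ny           # number of vertical interfaces
    n_h = Nx * (Ny - 1)           # number of horizontal interfaces
    # Case A: symmetric matrix S and vector sigma per interface,
    # scalar s per subdomain
    chordal = (n_v * (s_v * (s_v + 1) // 2 + s_v)
               + n_h * (s_h * (s_h + 1) // 2 + s_h)
               + Nx * Ny)
    # Case B: only the vector g per interface and the scalar gamma per subdomain
    arrow = n_v * s_v + n_h * s_h + Nx * Ny
    return chordal, arrow

# The largest mesh used in the experiments, split into 16 x 8 subdomains
chordal, arrow = extra_variables(160, 80, 16, 8)
assert chordal > arrow > 0        # the chordal count dominates by far
```

The quadratic term s(s + 1)/2 per interface is exactly what the arrow decomposition removes.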

Example 4
The notation used in the above decomposition approaches is rather cumbersome, so let us illustrate it with a simple example. Figure 5 presents a finite element mesh with 16 elements and 25 nodes. All nodes on the left-hand side are fixed and thus eliminated from the stiffness matrix. Hence the corresponding stiffness matrix will have dimension 40 × 40 (two degrees of freedom associated with every free finite element node, as depicted in the figure). The structure of the corresponding stiffness matrix K is shown in Fig. 6; here the elements corresponding to interior degrees of freedom (index sets I) are denoted by circles, while the elements associated with the intersections I_{k,ℓ} are marked by full dots.

Decomposition by fictitious loads
So far, all the reasoning was purely algebraic. There is, however, an alternative, functional analytic view of the arrow decomposition in Theorem 3. We will present it in this section. The purpose is to illustrate a different viewpoint and so, to keep the notation simple, we will only consider the case of two subdomains.

Infinite dimensional setting
Let us recall the weak formulation (7) of the elasticity problem depending on the parameter x: find u ∈ V such that

a(x; u, v) = f(v) for all v ∈ V. (22)

Let Ω be partitioned into two subdomains Ω_1 and Ω_2 with disjoint interiors such that Ω̄ = Ω̄_1 ∪ Ω̄_2. Denote the interface boundary between the two subdomains by Γ_I; see Fig. 8. We consider the general situation in which Γ_u and Γ_f may be part of both ∂Ω_1 ∩ ∂Ω and ∂Ω_2 ∩ ∂Ω. Define a_i as the restriction of the bilinear form a to Ω_i (the integral in (8) is simply computed over Ω_i), and consider the "restricted" problems (25) and (26), in which a is replaced by a_1 and a_2, respectively, and a fictitious load g acting on Γ_I is added to the right-hand side. The following theorem forms the basis of our approach.
Theorem 4 Assume that u* solves (22). Then there exists a function g, defined on the interface Γ_I, such that the solutions of (25) and (26) coincide with the restrictions of u* to Ω_1 and Ω_2, respectively.

Proof The requested function g is the outcome of the respective Steklov–Poincaré operator applied to u*; see, e.g., [12].
In the above theorem, the function g can be interpreted as a fictitious load applied to either of the problems (25), (26). The theorem says that there exists a g such that the solutions of (25), (26) are equal to the solution of the "full" problem (22) restricted to the respective subdomains. In other words, the solutions of (25), (26) can be "glued" together to form the solution of (22).

Finite dimensional setting
Now assume that the discretization of Ω is such that the interface boundary Γ_I is a union of boundaries of some finite elements. More precisely, we assume that the index set of finite elements used for the discretization of Ω can be split into two disjoint subsets D_1 and D_2 such that Ω_i is discretized by elements with indices from D_i, i = 1, 2. Define f^(1) and f^(2) as the restrictions of the load vector f associated with Ω_1 and Ω_2, respectively. Denote the index set of degrees of freedom associated with finite element nodes on Γ_I by I_{1,2}, and let n̄ be the dimension of I_{1,2}.
Finally, for a vector z ∈ R^{n̄} denote by ↔z its extension to R^n by zeros outside I_{1,2}. The discrete version of Theorem 4 can then be formulated as follows. (The following corollary is, in fact, trivial in finite dimension; however, we need the above theorem to understand the meaning of the fictitious load and its existence in the original setting of the problem.)

Corollary 2 Assume that u* solves (10). Then for all x ∈ R^m there exists g ∈ R^{n̄} such that

( Σ_{i∈D_1} x_i K_i ) u* = f^(1) + ↔g, (27)
( Σ_{i∈D_2} x_i K_i ) u* = f^(2) − ↔g. (28)

Notice that (27), (28) are still systems of dimension n; however, many rows and columns of the matrices and right-hand sides are equal to zero, so they can be solved as systems of dimensions |N^(1)| and |N^(2)|, respectively. Hence, if we knew the fictitious load g, we could replace the large system of equations (10) by two smaller ones, which would be numerically more efficient. Of course, we do not know it. However, and this is the key idea of this section, the linear system (10) is a constraint in an optimization problem; hence we can add g among the variables and, instead of searching for the optimal design x and the corresponding u satisfying (10), search for an optimal x and a pair (u, g) satisfying the two smaller equilibrium equations (27) and (28).
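Corollary 2 can be played through on a toy example. The following numpy sketch is our own illustrative stand-in for the 2D setting: a 1D bar with made-up element stiffnesses, both end nodes fixed, split into two element groups with a single interface degree of freedom. It computes the fictitious load g as the residual of subdomain 1 at the full solution and checks that the two restricted systems reproduce u* on their subdomains.

```python
import numpy as np

# 1D bar with 6 elements, nodes 0 and 6 fixed, 5 free DOFs (0..4).
# Elements {0,1,2} form subdomain 1, {3,4,5} subdomain 2; the interface
# consists of the single DOF 2.
m = 6
x = np.array([1.0, 2.0, 1.5, 0.5, 1.0, 2.0])   # element "densities" (made up)
f = np.zeros(5); f[1] = 1.0                    # load inside subdomain 1

def K_of(elems):
    """Assemble sum_{i in elems} x_i K_i on the 5 free DOFs."""
    K = np.zeros((5, 5))
    for i in elems:
        if i == 0:
            K[0, 0] += x[0]                    # element at the fixed left end
        elif i == m - 1:
            K[4, 4] += x[m - 1]                # element at the fixed right end
        else:
            K[np.ix_([i-1, i], [i-1, i])] += x[i] * np.array([[1., -1.], [-1., 1.]])
    return K

K1, K2 = K_of(range(0, 3)), K_of(range(3, 6))
u_full = np.linalg.solve(K1 + K2, f)           # solution of the full system (10)

f1 = np.zeros(5); f1[:2] = f[:2]               # loads acting on subdomain 1
f2 = f - f1
e = np.zeros(5); e[2] = 1.0                    # extension of a scalar to R^n

# Fictitious load: residual of subdomain 1 at the full solution, cf. (27)
g = (K1 @ u_full - f1)[2]

# The restricted problems reproduce u* on their subdomains ("gluing")
u1 = np.linalg.solve(K1[:3, :3], (f1 + g * e)[:3])
u2 = np.linalg.solve(K2[2:, 2:], (f2 - g * e)[2:])
assert np.allclose(u1, u_full[:3])
assert np.allclose(u2, u_full[2:])
```

Both restricted systems are nonsingular here because each subdomain touches a fixed end of the bar; this is the discrete analogue of the gluing property stated in Theorem 4.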
We can now formulate a result regarding the decomposition of the discretized topology optimization problem (11).

Theorem 5 Problem (11) is equivalent to problem (29), in which the equilibrium equation (10) is replaced by the pair of equations (27), (28), with g added among the variables.
Using again the Schur complement theorem, we finally arrive at the decomposition of the SDO problem (13).

Corollary 3 Problem (13) can be equivalently formulated as problem (30).
Problem (30) is now exactly the same as problem (21) arising from the arrow decomposition applied to two subdomains.

Numerical experiments
The decomposition techniques described in this article were applied to an example whose data (geometry, boundary conditions and forces) are shown in Fig. 9 (left). We always use a regular decomposition of the rectangular domain; an example of a decomposition into 8 subdomains is shown in Fig. 9 (right). We have used finite element meshes with up to 160 × 80 elements.
We tested several codes to solve the SDO problems. Here we present results obtained by MOSEK, version 8.0 [9]. The reason for this choice is that MOSEK best demonstrated the decomposition idea: the speed-up achieved by the decomposition was most significant when using this software.
When solving the SDO problems, we used default MOSEK settings with the exception of duality gap parameter MSK_DPAR_INTPNT_CO_TOL_REL_GAP that was set to 10 −9 , instead of the default value 10 −8 . We will comment on the resulting accuracy of the solution later in the section.
We also tried to solve smaller problems with SparseCoLO [2], software that performs the decomposition of matrix constraints based on Theorem 1 automatically. In particular, the software checks whether the matrix in question has a chordal sparsity graph; if not, the graph is completed to be chordal. After that, maximal cliques are found and Theorem 1 is applied. Because the sparsity graph of the matrix in problem (13) is not chordal, a chordal completion is performed by SparseCoLO. Such a completion is not unique and may thus lead to different sets of maximal cliques. And here lies the main difference from our approach: while we can steer the decomposition so that it results in many matrix constraints of the same small size, the matrix constraints resulting from the application of SparseCoLO are of variable size, some small, some rather large. This fact has a big effect on the efficiency of SparseCoLO, as we will see in the examples below.
In all experiments we used a 2018 MacBook Pro with 2.3 GHz dual-core Intel Core i5, Turbo Boost up to 3.6 GHz and 16 GB RAM, and MATLAB version 9.2.0 (2017a).

Remark 1 (Element-wise decomposition)
The above text suggests that we always decompose the original finite element mesh into several (possibly uniform) sub-meshes, each of them having interior points; cf. Figs. 4, 5 and 9 and the notation used in Sect. 4. However, nothing prevents us from associating each subdomain with a single finite element. When every subdomain consists of a single finite element, the subdomains have no interior points, apart from those lying on the boundary of Ω with no neighbouring element; in Example 4, Fig. 5, these would only be the degrees of freedom 31, 32, 39, 40. In the numerical examples below we will see that the large number of additional variables makes this option less attractive than other decompositions. However, while not the most efficient of all decompositions, it is still much less computationally demanding than the original problem. The element-wise decomposition has one big advantage, namely the simplicity of data preparation: the user can use any standard finite-element mesh generator and does not have to worry about the definition of subdomains. This may be particularly advantageous in the case of highly irregular meshes.

Computational results
In the following tables we present results of the N_x × N_y examples using the chordal and the arrow decomposition. In these tables, the first row of numbers shows data for the original problem (13); the remaining rows are for the decomposed problems. The first column shows the number of subdomains, the next two the number of variables and the size of the largest matrix inequality. After that, we present the total number of iterations needed by MOSEK before it terminated. The next two columns show the total CPU time and the CPU time per iteration, and are followed by columns reporting the speed-up relative to the original problem formulation, both total and per iteration.
In the final column we see the MOSEK constant MSK_DINF_INTPNT_OPT_STATUS, a number that is supposed to converge to 1. Let us call this constant μ, for brevity. In our experience, MOSEK delivers an acceptable solution, reporting "Solution status: OPTIMAL", when 0.999 ≤ μ ≤ 1.0009.
When μ is farther away from 1, MOSEK typically announces "Solution status: NEAR_OPTIMAL" in these examples. For instance, in the 120 × 60 example with chordal decomposition and 800 subdomains, MOSEK finished with μ = 0.9946 and the final objective value was correct to 3 digits, while with 1800 subdomains MOSEK reported μ = 0.9865 and we only got 2 correct digits in the objective function.
We first present results for the 40 × 20 example using the chordal decomposition; see Table 1. The table shows that as we increase the number of subdomains (refine the decomposition), the number of variables increases (these are the additional matrix variables in the chordal decomposition) and the size of the constraints decreases. We can further see from Table 1 that the total number of iterations needed to solve any of the problem formulations is almost constant. The main message of Table 1 is in columns 5 and 6; here we can see a tremendous decrease in the CPU time when solving the decomposed problems.
We now solve the same 40 × 20 example using the arrow decomposition. The results are presented in Table 2. We have added two more columns showing the speed-up relative to the undecomposed problem, both total and per iteration.
In all examples presented in Table 2, MOSEK reported optimal solution status. Comparing results in Tables 1 and 2, we can see that the arrow decomposition is not only more efficient than the chordal one, due to the smaller number of variables, but also delivers a more accurate solution, i.e., it leads to a better conditioned SDO problem.
For comparison, in Table 3 we present results for the 40 × 20 example obtained by solving problems decomposed by the automatic decomposition software SparseCoLO. In this case, the size of the 34 matrix constraints varied from 11 to 260. The decomposed problem is still solved more efficiently than the original one, but the speed-up is negligible compared to either the chordal or the arrow decomposition from Tables 1 and 2.
Table 4 presents results for the 80 × 40 discretization and chordal decomposition, while Table 5 presents the results for the same problem using arrow decomposition. This was the largest problem we could solve by MOSEK in the original formulation (13), due to memory limitations.
As we can see, for a larger problem the speed-up obtained by arrow decomposition is even more significant; the size of each matrix inequality is always at most 19. The CPU times for the original formulation of the larger problems have been extrapolated and are denoted by the † symbol. The last row of Table 8 presents the estimate of the computational complexity of each approach, as a function cν^q of the problem size ν; in this case, ν is the number of variables of the SDO problem, as reported in the table. The exponent q is estimated from the CPU times. In the case of the original, undecomposed problem, we calculated q ≈ 3.18, which slightly underestimates the theoretical complexity of interior point methods for SDO. The decomposed problem, on the other hand, exhibits linear complexity with q ≈ 1.0006. See also Fig. 10 for a graphical representation of the complexity of the original problem (top line), a single iteration of the original problem (middle line) and the decomposed problem (bottom line). This, in our opinion, is the principal contribution of the arrow decomposition method.
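The exponent q above is obtained by fitting the model cν^q to the measured CPU times, which amounts to linear least squares in log-log space. The following sketch illustrates this fit; the data array is synthetic (generated with q = 3.18 for demonstration), not the actual measurements from the tables.

```python
import numpy as np

def fit_complexity_exponent(nu, cpu_time):
    """Fit cpu_time ~ c * nu**q by least squares in log-log space.

    nu       : array of problem sizes (number of SDO variables)
    cpu_time : array of measured CPU times
    Returns (c, q).
    """
    # log(t) = q*log(nu) + log(c) is linear; polyfit returns [slope, intercept]
    q, log_c = np.polyfit(np.log(nu), np.log(cpu_time), 1)
    return np.exp(log_c), q

# Synthetic illustration: times generated exactly with exponent 3.18
nu = np.array([1e3, 2e3, 4e3, 8e3])
times = 1e-7 * nu**3.18
c, q = fit_complexity_exponent(nu, times)
print(round(q, 2))  # prints 3.18
```

In practice the fitted exponent is read off from CPU times of a sequence of increasingly refined discretizations, as done for the last row of Table 8.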