Decomposition of arrow type positive semidefinite matrices with application to topology optimization

Decomposition of large matrix inequalities for matrices with chordal sparsity graph has been recently used by Kojima et al.\ \cite{kim2011exploiting} to reduce problem size of large scale semidefinite optimization (SDO) problems and thus increase efficiency of standard SDO software. A by-product of such a decomposition is the introduction of new dense small-size matrix variables. We will show that for arrow type matrices satisfying suitable assumptions, the additional matrix variables have rank one and can thus be replaced by vector variables of the same dimensions. This leads to significant improvement in efficiency of standard SDO software. We will apply this idea to the problem of topology optimization formulated as a large scale linear semidefinite optimization problem. Numerical examples will demonstrate tremendous speed-up in the solution of the decomposed problems, as compared to the original large scale problem. In our numerical example the decomposed problems exhibit linear growth in complexity, compared to the more than cubic growth in the original problem formulation. We will also give a connection of our approach to the standard theory of domain decomposition and show that the additional vector variables are outcomes of the corresponding discrete Steklov-Poincar\'{e} operators.

is then the solution of a linear system with this matrix. For problems with large matrix inequalities, it is often the first bottleneck that dominates the CPU time and that prevents the user from solving large scale problems.
To circumvent this obstacle, the technique of decomposition of a large matrix inequality into several smaller ones proved to be efficient, at least for certain classes of problems. Decomposition of positive semidefinite matrices with a certain sparsity pattern was first investigated in Agler et al. [1] and, independently, by Griewank and Toint [4]. An extensive study has been recently published by Vandenberghe and Andersen [13]. We will call this technique chordal decomposition. It was first used in semidefinite optimization by Kojima and his co-workers; see [3,10] and, more recently, [8]. The group also developed a preprocessing software for semidefinite optimization named SparseCoLO [2] that performs the decomposition of matrix constraints automatically.
The goal of this paper is twofold. Firstly, we introduce a new decomposition of arrow type positive semidefinite matrices called arrow decomposition. Unlike the chordal decomposition that generates additional dense matrix variables, arrow decomposition only requires additional vector variables of the same size, leading to significant reduction of number of variables in the decomposed problem. The second goal is to apply both decomposition techniques to the topology optimization problem. This problem arises from finite element discretization of a partial differential equation. We will show that techniques known from domain decomposition can be used to define the matrix decomposition. In particular, we will be able to control the number and size of the decomposed matrix inequalities. We will also give a connection of the arrow decomposition with the theory of domain decomposition and show that the additional vector variables are outcomes of the corresponding discrete Steklov-Poincaré operators.
To solve all semidefinite optimization problems, we will use the state of the art solver MOSEK [9]. Numerical examples will demonstrate tremendous speed-up in the solution of the decomposed problems, as compared to the original large scale problem. Moreover, in our numerical examples the arrow decomposition exhibits linear growth in complexity, compared to the higher than cubic growth when solving the original problem formulation.
Notation Let S n be the space of n × n symmetric matrices, A ∈ S n , and I ⊂ {1, . . . , n} with s = |I|. We denote by (A) i,j the (i, j)-th element of A; by (A) I the restriction of A to S s , i.e., the s × s submatrix of A with row and column indices from I ; by O m,n the m × n zero matrix; when the dimensions are clear from the context, we simply use O.
A matrix is called dense if all its elements are non-zeros. Otherwise, the matrix is called sparse.  and analogously S n + (G). Let G s (N s , E s ) be an induced subgraph of G(N, E). Notice the difference between S n (G s ) and S n (N s ). If A ∈ S n (N s ) then its restriction (A) N s is a dense matrix. This is not true for A ∈ S n (G s ), the sparsity pattern of which is given by the set of edges E s . In particular, S n (G s ) = S n (N s ) if and only if G s is a maximal clique.
Finally, for functions from R d → R we will use bold italics (such as u or u(ξ)), while for vectors resulting from finite element discretization of these functions, we will use the same symbol but in italics (e.g. u ∈ R n ).
2 Decomposition of positive semidefinite matrices

Matrices with chordal graphs
We first recall the well-studied case of matrices with chordal sparsity graph. The following theorem was proved independently by Grone, et al. [5], Griewank and Toint [4] and by Agler et al. [1]. A new, shorter proof can be found in [7].
Theorem 1 Let G(N, E) be an undirected graph with maximal cliques C 1 , . . . , C p . The following two statements are equivalent: Notice that this decomposition is not unique. However, Kakimura [7] has shown that there . . , p) and that p k=1 rank Y * k = rank A.

Matrices embedded in those with a chordal graph
Let A ∈ S n , n ≥ 3, with a sparsity graph G = (N, E). Let the set of nodes N = {1, 2, . . . , n} be partitioned into p ≥ 2 overlapping sets Let I k, denote the intersection of the kth and th set, i.e., There exist at least one index with 1 ≤ ≤ p, = k, such that I k ∩ I = ∅.
Assumption 2. I k ∪ I = I k for all 1 ≤ k, ≤ p, k = , i.e., no I is a subset of any I k .
Assumption 3. The intersections are "sparse" in the sense that for each k ∈ {1, . . . , p} there are at most p k indices i such that I k ∩ I i = ∅, i = 1, . . . , p k , where 1 ≤ p k p.
In a typical situation only I k,k+1 , k = 1, . . . , p − 1, are not empty (corresponding to a block diagonal matrix with overlapping blocks) or I k has a non-empty intersection with up to eight other sets (see Section 4).
Denote the induced subgraphs of G(N, E) corresponding to I k by G k (I k , E k ), k = 1, . . . , p. These subgraphs are not necessarily cliques. Assume that For all k = 1, . . . , p, let G k (I k , E k ) denote a completion of G k (I k , E k ), i.e., a clique in G(N, E). According to Assumption 2, G k (I k , E k ) are even maximal cliques. Clearly, Q k ∈ S n ( G k ).
Notice that the rather restrictive Assumption 4 is satisfied when A is a block diagonal matrix with overlapping blocks. It may not be satisfied in the application in Section 4; we will see, however, that it will not be needed in this application.
Theorem 2 Let Assumptions 1-4 hold. The following two statements are equivalent:
If I k, = ∅ or is not defined then S k, is a zero matrix.
Proof Using the chordal extension G(N, E) of G(N, E), we embed the matrix A into a set of matrices with chordal sparsity graphs with maximal cliques G k (I k , E k ), k = 1, . . . , p.
Then we can apply Theorem 1. Hence there exist matrices Y k ∈ S n + (I k ), k = 1, . . . , p, such that A = Y 1 + . . . + Y p . Now, Y k must be equal to Q k for the "internal" indices of I k , i.e., for all (i, j) ∈ I k \ : >k (I k, ) ∪ : <k (I ,k ) 2 . Therefore the unknown elements of Y k reduce to the overlaps I k, .
Having Q k and Y k , k = 1, . . . , p, we will now define matrices the S k, as follows. Firstly, for k = 1 we select any solution {S 1, } I 1, =∅ of the equation Notice that many elements of matrices S 1, (I 1, =∅ ) are uniquely defined by this equation. Only elements with indices from nonempty intersections I 1, ∩ I 1,k are not unique, as they appear in more than one matrix S •,• in the above equation. Now, for 1 < k < p, we solve the equation All matrices S ,k , < k, were defined in steps 1, . . . , k − 1, hence we are in the same situation as above and select any solution {S k, } >k,I k, =∅ of the above equation. Any selection of the non-unique elements of S •,• will be consistent with the last equation Q k and the assertion follows.

Arrow type matrices
Let us now consider a particular type of sparse matrices, the arrow type matrices. Let again A ∈ S n , n ≥ 3, and let I k , I k, and G k (I k , E k ), k = 1, . . . , p, be defined as in the previous section.
Finally, let C ∈ S m be positive definite. We define the following arrow type matrix: Let us recall that The simplest example of an arrow type matrix is a block diagonal matrix with overlapping blocks and with additional rows and columns corresponding to matrices B and C; see Figure 1. Notice that the matrices A k can also be sparse.
Notice, however, that the structure of the overlapping blocks can be more complicated and that, in general, A (the arrow "shaft") does not have to be a band matrix. Such matrices arise in the application introduced later in Section 3; see Figure 4 and 5. In this application, we will have m = 1, so that B will be an n-vector and C ∈ R. However, in this section we consider the more general situation which may be useful in other applications. We will first adapt Theorem 2 to the arrow type structure. If I k, = ∅ or is not defined then S k, is a zero matrix.
Proof A direct application of Theorem 2 with Q k = M k for k = 1, . . . , p − 1, and Q p = Under additional assumptions, we can strengthen the above corollary as follows.
Theorem 3 Let Assumptions 1-3 hold. Assume that A k 0, k = 1, . . . , p, A 0 and C 0. Let M be defined as in (1). The following two statements are equivalent: If I k, = ∅ or is not defined then D k, is a zero matrix.
Proof We will prove the theorem by constructing matrices D k,k+1 and C k . By assumption, A is positive definite, so that we can define Then We define D k,k+1 and C k as follows. For k = 1, we solve the equation As in the proof of Theorem 2, some elements of thus defined D 1, may not be unique; in this case, we just select a solution. Then, for any 1 < k < p, we solve the equation to define D k, , > k, analogously to Theorem 2. Any selection of the non-unique elements of D •,• will be consistent with the last equation (2) and (4) we see that D k, is only non-zero on I k, , (k, ) ∈ Θ p , as required.
Define further Now the matrices defined for k = 1, . . . , p by By the Schur complement theorem, the last inequality is equivalent to M 0. This completes the proof.
Let r be the number of non-empty sets I k, , (k, ) ∈ Θ p . Comparing Corollary 1 with Theorem 3 we see that both provide us with a decomposition of a "large" matrix inequality M 0 by a number of smaller ones M k 0, k = 1, . . . , p. However, while in Corollary 1 we have to introduce r additional matrix variables of sizes | I k, | × | I k, |, in Theorem 3 we only have r additional matrix variables of sizes |I k, | × m and p matrix variables of size m × m. Recall that m < min k, =1,...,p k< |I k, | and, in our application below, m = 1, so the additional variables in Theorem 3 are vectors instead of matrices of the same dimension in Corollary 1, offering thus significant reduction in the dimension of the additional variables. Notice that in Theorem 3 we only require Assumptions 1-3 to hold, we do not need the restrictive Assumption 4. This, in turn, means that if M satisfies assumptions of Theorem 3, we can apply Corollary 1 without verifying Assumption 4, because we can choose, by Theo- . This, of course, is only true for our specific definition of arrow type matrices. We will call the decomposition of arrow type matrices using Corollary 1 chordal decomposition and the one using Theorem 3 arrow decomposition.
Two natural questions arise: 1. Are the additional assumptions of Theorem 3 too restrictive? Are there any applications satisfying them? 2. Is it worth reducing the dimension of the additional variables? Will it bring any significant savings of CPU time when solving the decomposed problem?
Both questions will be answered in the rest of the paper using a problem from structural optimization.
the (small-)strain tensor. We assume that our system is governed by linear Hooke's law, i.e., the stress is a linear function of the strain where E is the elastic (plane-stress for d = 2) stiffness tensor. Assume that the boundary of Ω is partitioned as The weak form of the linear elasticity problem reads as: In the basic topology optimization, the design variable is the multiplier x ∈ L ∞ (Ω) of the elastic stiffness tensor E which is a function of the space variable ξ. We will consider the following constraints on x: in Ω with some given positive "volume" V and with x, The choice of L ∞ is due to the fact that we want to allow for material/no-material situations.
The minimum compliance single-load topology optimization problem reads as subject to u solves (5) The objective, the so called compliance functional, measures how well the structure can carry the load f . Problem (7) is now discretized using the standard finite element method; the details can be found, e.g., in [6,11]. In particular, we use quadrilateral elements, element-wise constant approximation of function x and element-wise bilinear approximation of the displacement field u. After discretization, the variables will be vectors x ∈ R m and u ∈ R n , where m is the number of finite elements and n the number of degrees of freedom (the number of finite element nodes times the spatial dimension). With every element we associate the local (symmetric and positive semidefinite) stiffness matrix K i and (for elements including part of the boundary Γ f ) the discrete load vector f i , i = 1, . . . , m. Now we can formulate the discretized version of the linear elasticity problem (5) as the following system of linear equations f i is the finite element assembly of the load vector. The topology optimization problem (7) becomes Using the Schur complement theorem, the compliance constraint and the equilibrium equation can be written as one matrix inequality constraint: The minimum compliance problem can then be formulated as follows: For ease of notation, in the rest of the paper we will restrict ourselves to the planar case d = 2. Generalization of all ideas to the three-dimensional case is straightforward.

Decomposition of the topology optimization problem (11)
Let Ω h ⊂ R 2 be a polygonal approximation of Ω discretized by finite elements. Assume that Ω h is partitioned into p non-overlapping subdomains D k , k = 1, . . . , p, whose boundaries coincide with finite element boundaries. In our examples Ω = Ω h is a rectangle, the underlying finite element mesh is regular and so is the partitioning into the subdomains. Confront Figure 2 that shows typical decomposition of Ω h into N x × N y subdomains. Let I k be the index set of all degrees of freedom associated with the subdomain D k , k = 1, . . . , p. The intersections of these index sets will include the degrees of freedom on the respective internal boundaries and will be again denoted by Denote by D k the index set of elements belonging to subdomain D k and define Matrix K (k) (x) = K (k) can then be partitioned as follows where the set Γ collects indices of all degrees of freedom corresponding with indices in one of he sets I ,k or I k, , = 1, . . . , p; the set I then collects indices of all remaining "interior" degrees of freedom in D k .
We are now in a position to apply the theorems from Section 2.
Case A -Chordal decomposition Let us first apply Corollary 1. It says that the matrix inequality Z(x) 0 from (11) can be equivalently replaced by the following matrix inequalities where The additional variables are the matrices, vectors and scalars Case B -Arrow decomposition Now we apply Theorem 3. In this case, the matrix inequality Z(x) 0 from (11) can be replaced by the following matrix inequalities where The additional variables g •,• and γ, respectively, have the same dimensions as the variables σ •,• and s in Case A. Recall that Theorem 3 does not use the restrictive Assumption 4 from Section 2. This is important, because Assumption 4 is not satisfied when the domain Ω contains holes, and so the decomposition technique would not be applicable to some practical problems. Consider, for instance, the finite element mesh in Figure 2 and assume that the (i, j)th subdomain is not part of the domain Ω, it is a hole with no finite elements. Then, even if we assume all matrices K (k) to be dense, the sparsity graph of K(x) is not chordal, as it contains the chordless cycle connecting (more than 3) nodes on the boundary of the internal hole.
Before formulating the decomposed version of problem (11) we notice that, according to Corollary 1 and Theorem 3, B , which means, in particular, that We will therefore replace the variable γ in the decomposed problems by either s k or γ k and the objective function by one of the above sums.
Case A Using the chordal decomposition approach, the decomposed optimization problem in variables A defined as in (13), (14),(15).
Case B Using the arrow decomposition approach, the decomposed optimization problem in variables defined as in (16),(17).
A versus B Consider now the finite element mesh and decomposition as in Figure 2 with n x × n y finite elements and N x × N y subdomains. Instead of (11) we can solve one of the decomposed problems (18) and (19). In Case A of the chordal decomposition the single matrix inequality of dimension (n + 1) × (n + 1) is replaced by N x · N y inequalities of dimension of order 2(n x /N x + 1)(n y /N y + 1) + 1 while we have to add N x (N y − 1) + (N x − 1)N y additional vectors σ •,• of a typical size 2(n x /N x + 1) or 2(n y /N y + 1), the same number of additional (dense) matrix variables S •,• of the same order and N x · N y scalar variables s • . (Recall that the factor 2 stems from the fact that there are two degrees of freedom at every finite element node.) In Case B of the arrow decomposition, the number and order of the new matrix constraints is the same as above but we only need the additional scalar and vector variables; the additional matrix variables are not necessary. Later in Section 6 we will see that this decomposition leads to enormous speed-up in computational time of a state-of-the-art SDO solver. We will also see that the omission of the additional matrix variables in the arrow decomposition can make a big difference.

Example 1
The notation used in the above decomposition approaches is rather cumbersome, so let us illustrate it using a simple example. Figure 3 presents  elements and 25 nodes. All nodes on the left-hand side are fixed and thus eliminated from the stiffness matrix. Hence the corresponding stiffness matrix will have dimension 40×40 (two degrees of freedom associated with every free finite element node, as depicted in the figure). The structure of the corresponding stiffness matrix K is shown in Figure 4; here the elements corresponding to interior degrees of freedom (index sets I) are denoted by circles, while elements associated with the the intersections I k, are marked by full dots.  The chordal decomposition problem (18) will have four matrix constraints, two of order 13 and two of order 19, and additional variables s ∈ R 6 , σ 1,4 , σ 2,3 ∈ R 2 , σ 1,2 ∈ R 4 , σ 1,3 , σ 2,4 , σ 3,4 ∈ R 6 and S 1,4 , S 2,3 ∈ S 2 , S 1,2 ∈ S 4 , S 1,3 , S 2,4 , S 3,4 ∈ S 6 . The arrow decomposition problem (19) will have the same number of matrix constraints as (18) and additional variables γ ∈ R 6 , g 1,4 , g 2,3 ∈ R 2 , g 1,2 ∈ R 4 , g 1,3 , g 2,4 , g 3,4 ∈ R 6 .

Decomposition by fictitious loads
So far, all the reasoning was purely algebraic. There is, however, an alternative, functional analytic view of the arrow decomposition in Theorem 3. We will present it in this section. The purpose is to illustrate a different viewpoint and so, to keep the notation simple, we will only consider the case of two subdomains.

Infinite dimensional setting
Let us recall the weak formulation (5) of the elasticity problem depending on parameter x: Let Ω be partitioned into two mutually disjoint subdomains Ω 1 and Ω 2 such that Ω 1 ∪ Ω 2 = Ω. Denote the interface boundary between the two subdomains by Γ I ; see Figure 6.
We consider the general situation when Γ u and Γ f may be a part of both, ∂Ω 1 ∩ ∂Ω and ∂Ω 2 ∩ ∂Ω. Define a i as a restriction of the bilinear form a to Ω i (the integral in (6) is simply computed over Ω i ), f i = f | ∂Ω i and Fig. 6 Partitioning of domain Ω into two subdomains with interface boundary Γ I . Consider the following "restricted" problems: The following theorem forms a basis of our approach.
Proof The requested function g is the outcome of the respective Steklov-Poincaré operator applied to u * ; see, e.g., [12].
In the above theorem, function g can be interpreted as a fictitious load applied to either of the problems (23),(24). The theorem says that there exists such a g that the solutions of (23),(24) are equivalent to the solution of the "full" problem (20) restricted to the respective subdomain. Or, in other words, the solutions of (23),(24) can be "glued" to form the solution of (20).

Finite dimensional setting
Now assume that the discretization of Ω is such that the interface boundary Γ I is a union of boundaries of some finite elements. More precisely, we assume that the index set of finite elements used to the discretization of Ω can be split into two disjoint subsets such that Ω i is discretized by elements with indices from D i , i = 1, 2. Define the restrictions of the load vector f on boundaries of Ω 1 and Ω 2 , respectively. Denote the index set of degrees of freedom associated with finite element nodes on Γ I by I 1,2 . Let n Γ be the dimension of I 1,2 .
Finally, for a vector in z ∈ R n Γ denote by ← → z its extension to R n : The discrete version of Theorem 4 can then be formulated as follows. (The following corollary is, in fact, trivial in the finite dimension; however, we need the above theorem to understand the meaning of the fictitious load and its existence in the original setting of the problem.) Corollary 2 Assume that u * solves (8). Then for all x ∈ R m there exists g ∈ R n Γ such that ( i∈D 1 Notice that (25), (26) are still systems of dimension n; however, many rows and columns in the matrix and the right hand side are equal to zero, so they can be solved as systems of dimensions |N (1) | and |N (2) |, respectively. Hence, if we knew the fictitious load g, we could replace the large system of equations (8) by two smaller ones which, numerically, would be more efficient. Of course, we do not know it. However, and this is the key idea of this section, the linear system (8) is a constraint in an optimization problem, hence we can add g among the variables and, instead of searching for the optimal design x and the corresponding u satisfying (8), search for optimal x and for a pair (u, g) satisfying two smaller equilibrium equations (25) and (26).
We can now formulate a result regarding the decomposition of the discretized topology optimization problem (9).
Proof The theorem follows from the comparison of the KKT conditions of both problems.
Using again the Shur complement theorem, we finally arrive at the decomposition of the SDO problem (11).
Corollary 3 Problem (11) can be equivalently formulated as follows: Problem (28) is now exactly the same as problem (19) arising from arrow decomposition applied to two subdomains.

Numerical experiments
The decomposition techniques described in the article were applied to an example whose data (geometry, boundary conditions and forces) are shown in Figure 7-left. We always use regular decomposition of the rectangular domain; an example of a decomposition into 8 subdomains is shown in Figure 7-right. We have used finite element meshes with up to 160×80 elements. We tested several codes to solve the SDO problems. Here we present results obtained by MOSEK, version 8.0 [9]. The reason for this is that MOSEK best demonstrated the decomposition idea; the speed-up achieved by the decomposition was most significant when using this software.
When solving the SDO problems, we used default MOSEK settings with the exception of duality gap parameter MSK_DPAR_INTPNT_CO_TOL_REL_GAP that was set to 10 −9 , instead of the default value 10 −8 . We will comment on the resulting accuracy of the solution later in the section.
We also tried to solve the smaller problems by SparseCoLO [2], software that performs the decomposition of matrix constraints based on Theorem 1 automatically. In particular, the software checks whether the matrix in question has a chordal sparsity graph; if not, the graph is completed to be chordal. After that, maximal cliques are found and Theorem 1 is applied. Because the sparsity graph of the matrix in problem (11) is not chordal, a chordal completion is performed by SparseCoLO. Such a completion is not unique and may thus lead to different sets of maximal cliques. And here is the main difference to our approach: while we can steer the decomposition to result in smaller matrix constraints of the same size, matrix constraints resulting from application of SparseCoLO are of variable size, some small, some rather large. This fact has a big effect on the efficiency of SparseCoLO, as we will see in the examples below.
In all experiments we used a 2018 MacBook Pro with 2.3GHz dual-core Intel Core i5, Turbo Boost up to 3.6GHz and 16GB RAM, and MATLAB version 9.2.0 (2017a).

Remark 1 (Element-wise decomposition)
The above text suggests that we always perform decomposition of the original finite element mesh into several (possibly uniform) sub-meshes, each of them having interior points; cf. Figures 2, 3, 7, and the notation used in Section 4. However, nothing prevents us from associating each subdomain with a finite element. When every subdomain consist of a single finite element, then the subdomains have no interior points, apart from those lying on the boundary of Ω and having no neighboring element.
For instance, in Example 1, Figure 3, these would only be degrees of freedom number 31,32,39,40. In the numerical examples below, we will see that the big number of additional variables makes this option less attractive that other decompositions. However, while not the most effective of all decompositions, it is still much less computationally demanding than the original problem. The element-wise decomposition has one big advantage in simplicity of data preparation: the user can use any standard finite-element mesh generator and does not have to worry about definition of subdomains. This may be particularly advantageous in case of highly irregular meshes.

Computational results
In the following tables, we present results of the N x ×N y examples using the chordal and arrow decomposition. In these tables the first row of numbers shows data for the original problem (11), the remaining rows are for the decomposed problems. The first column shows the number of subdomains, the next two ones the number of variables and the size of the largest matrix inequality. After that, we present the total number of iterations needed by MOSEK before it terminated. The next two columns show the total CPU time and CPU time per one iteration and are followed by columns reporting speed-up, both total and per iteration.
In the final column we see the MOSEK constant MSK_DINF_INTPNT_OPT_STATUS, a number that is supposed to converge to 1. Let us call this constant µ, for brevity. In our experience, MOSEK delivers acceptable solution reporting "Solution status: OPTIMAL" when 0.999 ≤ µ ≤ 1.0009 .
When µ is farther away from 1, MOSEK, typically in these examples, announces "Solution status: NEAR OPTIMAL." For instance, in the 120×60 example with chordal decomposition with 800 subdomains, MOSEK finished with µ = 0.9946 and the final objective value was correct to 3 digits, while with 1800 subdomains MOSEK reported µ = 0.9865 and we only got 2 correct digits in the objective function. We first present results for the 40×20 example using the chordal decomposition; see Table 1. The table shows that while we increase the number of the subdomains (refine the We now solve the same 40x20 example using the arrow decomposition. The results are presented in Table 2. We have added two more columns showing the speed-up, both total and per iteration. In all examples presented in Table 2, MOSEK reported Optimal solution status. Comparing result in Table 1 and Table 2, we can see that the arrow decomposition is not only more efficient than the chordal one, due to smaller number of variables, but also delivers more accurate solution, i.e., a better conditioned SDO problem. For a comparison, In Table 3 we present result for example 40×20 obtained by solving problems decomposed by the automatic decomposition software SparseCoLO. In this case, the size of the 34 matrix constraints varied from 11 to 260. The decomposed problem is still solved more efficiently that the original one but that speed-up is negligible, compared to either the chordal or the arrow decomposition from Tables 1 and 2. The next Table 4 presents results for the 80×40 discretization and chordal decomposition, while Table 5 present the results for the same problem using arrow decomposition. This was the largest problem we could solve by MOSEK in the original formulation 11 (due to memory restriction). As we can see, for a larger problem the speed-up obtained by arrow decomposition is even more significant.
Examples with finer discretization cannot be solved by MOSEK in the original formulation 11 (on the laptop we used for the experiments). They can, however, easily be solved in the decomposed setting. The results are presented in the next tables. In these tables, we also show estimated number of iterations and CPU time for the original problem; these numbers are extrapolated from the lower-dimensional problems (also those that are not presented here). Table 6 presents results for the 120×60 discretization and chordal decomposition, while Table 7 shows the results for the same example, this time using arrow decomposition. When using the chordal decomposition (Table 6), MOSEK had significant problems with convergence to the optimal solution. In case of 800 subdomains, the final objective value was correct to 3 digits, while for the 1800 subdomains only to 2 digits. In both cases, the solution status of MOSEK was "Nearly optimal". In case of arrow decomposition, all problems finished with "Optimal" solution status. Again, the arrow decomposition outperforms the chordal one, so from now on we will only focus on the arrow decomposition.
From the results presented so far, it seems that the most efficient decomposition is either the finest or the second-finest one (not counting the element-wise decomposition); in the first case, each subdomain contains four finite elements, in the second case 16 finite elements. To get a clearer idea about the relation of the problem size and speed-up, we present the next Table 8 of results for examples with dimension increasing from 40×20 to 160×80 elements. For each example we only consider the finest decomposition with four finite elements per subdomain. So the size of every matrix inequality is always at most 19. The CPU times for original formulation of the larger problems have been extrapolated and are denoted by the † symbol.