A predictor-corrector algorithm for semidefinite programming that uses the factor width cone

We propose an interior point method (IPM) for solving semidefinite programming problems (SDPs). The standard interior point algorithms used to solve SDPs work in the space of positive semidefinite matrices. In contrast, the proposed algorithm works in the cone of matrices of constant \emph{factor width}. This adaptation makes the proposed method more suitable for parallelization than the standard IPM. We prove global convergence and provide a complexity analysis. Our work is inspired by a series of papers by Ahmadi, Dash, Majumdar and Hall, and builds upon a recent preprint by Roig-Solvas and Sznaier [arXiv:2202.12374, 2022].


Introduction
Semidefinite programming problems (SDPs) are a generalization of linear programming problems (LPs). While capturing a much larger set of problems, SDPs can still be solved up to fixed precision in polynomial time in terms of the input data [16]; see [11] for the complexity in the Turing model of computation. In practice the situation is, however, more complicated. While we routinely solve linear programs with millions of variables and constraints, SDPs become intractable already for a few tens of thousands of constraints and for n × n matrix variables of the order n ≈ 1,000. The reason is that each iteration of a typical interior point algorithm for SDP requires O(n^3 m + n^2 m^2 + m^3) operations, where n is the size of the matrix variable and m is the number of equality constraints; see e.g. [5]. However, solving large instances of SDPs is of growing interest, due to applications in power flow problems on large power grids, SDP-based hierarchies for polynomial and combinatorial problems, etc. (see [13,22,24]). In the following we revisit a relaxation of a given SDP, where the cone of positive semidefinite matrices is replaced by a more tractable cone, namely the cone of matrices of constant factor width [8]. The simplest examples of matrices of constant factor width are non-negative diagonal matrices (corresponding to linear programming), and scaled diagonally dominant matrices (corresponding to second order cone programming) [3]. We then review how iteratively rotating the cone and solving the given optimization problem over this new set leads to a non-increasing sequence of optimal values lower bounded by the optimum of the sought SDP. This iterative procedure, due to [1], does not lead to a convergent algorithm. However, its essence can be used to construct a convergent predictor-corrector interior point method, as was done in [18]. Our paper is inspired by ideas from [3,2,1,4,18]. In particular, we extend the results in [18], and give a more concise complexity analysis in our extended setting.

Iterative approximation scheme
Let S^n denote the set of symmetric n × n matrices, where n ∈ N is a positive integer. We write [m] for the set {1, 2, ..., m}, where m ∈ N. Consider a set {A_i ∈ S^n : i ∈ [m]} of symmetric data matrices and define the linear operator A(X) = (⟨A_1, X⟩, ..., ⟨A_m, X⟩) ∈ R^m, where ⟨X, Y⟩ := tr(XY) for X, Y ∈ S^n. Further, define for b ∈ R^m the affine subspace

L := {X ∈ S^n : A(X) = b}.  (1)

Consider the following semidefinite program, which we assume to be strictly feasible:

v*_SDP = inf { ⟨A_0, X⟩ : A(X) = b, X ∈ S^n_+ }.  (2)

Replacing the cone of positive semidefinite (psd) matrices in (2) by a cone K ⊆ S^n_+, which is more tractable, leads to the following program:

v_K = inf { ⟨A_0, X⟩ : A(X) = b, X ∈ K }, where K ⊆ S^n_+.
(3)

Clearly, v_K ≥ v*_SDP. The quality of the approximation depends on the chosen cone K. In [3], focusing on sums-of-squares optimization, the authors consider the cones of diagonally dominant and scaled diagonally dominant matrices. Ahmadi and Hall developed the idea of replacing the psd cone by a simpler cone further in [1], leveraging an optimal solution of the relaxation. Essentially, the idea is as follows. Define the feasible set of (2) as F_SDP := {X ∈ S^n_+ : A(X) = b}. We will consider a sequence of strictly feasible points for (3), denoted by X_ℓ for ℓ = 0, 1, .... Since X_ℓ ≻ 0, the matrix X_ℓ^{1/2} is well-defined. One can update the data matrices as

A_i^(ℓ) := X_ℓ^{1/2} A_i X_ℓ^{1/2} for i = 0, 1, ..., m,

giving rise to a new linear operator A^(ℓ)(X) = (⟨A_1^(ℓ), X⟩, ..., ⟨A_m^(ℓ), X⟩). We may also refer to this operation as rescaling with respect to X_ℓ. Via this rescaling one obtains the following sequence of reformulations of (2):

v^(ℓ)_SDP = inf { ⟨A_0^(ℓ), X⟩ : A^(ℓ)(X) = b, X ∈ S^n_+ },  (4)

whose feasible set we define as F^(ℓ)_SDP := {X ∈ S^n_+ : A^(ℓ)(X) = b}. For each ℓ the identity matrix is feasible, i.e., we have X = I ∈ F^(ℓ)_SDP. To see this, note that for all i ∈ [m] we have ⟨A_i^(ℓ), I⟩ = tr(X_ℓ^{1/2} A_i X_ℓ^{1/2}) = ⟨A_i, X_ℓ⟩ = b_i. Similarly, the identity leads to the same objective value in (4) as X_ℓ in (3). Let X_0 be an optimal solution to (3). Rescaling with respect to X_0 we find by the same reasoning that v^(0)_K ≤ v_K. Reiterating this procedure leads to a non-increasing sequence of values v^(ℓ)_K lower bounded by v*_SDP. Unfortunately, this procedure does not converge to the true optimum of (2) in general, as mentioned in [18]. Indeed, it can happen that lim inf_{ℓ→∞} v^(ℓ)_K > v*_SDP. The rest of this paper is devoted to the development and analysis of an algorithm which converges to the optimal value v*_SDP. We thereby generalize results from [18].
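To make the rescaling step concrete, here is a small numerical sketch in Python (numpy assumed available). It applies the update A_i ↦ X^{1/2} A_i X^{1/2} described above and verifies that the identity matrix is feasible for the rescaled constraints with unchanged objective value; all helper names are ours.

```python
import numpy as np

def rescale(A_list, X):
    """Rescale data matrices A_i -> X^{1/2} A_i X^{1/2} (the scheme above)."""
    w, V = np.linalg.eigh(X)
    Xh = V @ np.diag(np.sqrt(w)) @ V.T   # symmetric square root X^{1/2}
    return [Xh @ A @ Xh for A in A_list]

rng = np.random.default_rng(0)
n, m = 4, 2
A_list = [np.eye(n)] + [(B + B.T) / 2 for B in rng.standard_normal((m, n, n))]
G = rng.standard_normal((n, n))
X = G @ G.T + n * np.eye(n)              # a strictly feasible X (positive definite)
b = np.array([np.trace(A @ X) for A in A_list[1:]])

A_resc = rescale(A_list, X)
# identity is feasible for the rescaled constraints: <A_i', I> = <A_i, X> = b_i
assert np.allclose([np.trace(A) for A in A_resc[1:]], b)
# and attains the same objective value: <A_0', I> = <A_0, X>
assert np.isclose(np.trace(A_resc[0]), np.trace(A_list[0] @ X))
```

The eigendecomposition route to X^{1/2} is one convenient choice; any symmetric square root works here.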

Outline of the paper
This paper is conceptually divided into two parts. The first part contains sections 1 and 2 and is devoted to introducing the setting as well as the algorithm. Our aim with the first part is to convey the concept in a comprehensible way. The second part consists of the remaining sections 3-6. It is more technical and contains the derivation of objects used in the algorithm as well as the formal complexity analysis.

The factor width cone
Fix n ∈ N. The cone of n × n matrices of factor width k, denoted by FW_n(k), is defined as

FW_n(k) := {M ∈ S^n : M = VV^T for some V ∈ R^{n×s}, each column of which has at most k nonzero entries}.

The notion of factor width was first used in [8], where the authors proved that FW_n(2) is the cone of scaled diagonally dominant matrices. Trivially, FW_n(1) is the cone of non-negative n × n diagonal matrices. Clearly, we have

FW_n(1) ⊆ FW_n(2) ⊆ ... ⊆ FW_n(n) = S^n_+.

It is easy to see that these cones are proper. As they define an inner approximation of the cone S^n_+, we may use them in the aforementioned iterative scheme. Define

v_FW(k) := inf { ⟨A_0, X⟩ : A(X) = b, X ∈ FW_n(k) }.  (7)

An optimization problem over the cone FW_n(k) may be formulated as an optimization problem over the cone product S^{(n,k)}_+ := S^k_+ × ... × S^k_+, with one copy of S^k_+ for each k-element subset of [n].
To see this we need to consider principal submatrices. For a matrix S ∈ R^{n×n} we define the principal submatrix S_{J,J} for J ⊆ [n] to be the restriction of S to rows and columns whose indices appear in J. Further, for a set J = {i_1, ..., i_|J|} ⊆ [n] and a matrix S ∈ R^{|J|×|J|} we define the n × n matrix S^{→n}_J by

(S^{→n}_J)_{i,j} := S_{a,b} if i = i_a and j = i_b for some a, b ∈ [|J|], and (S^{→n}_J)_{i,j} := 0 otherwise.  (6)

In other words, S^{→n}_J has S as principal submatrix indexed by J, and zeros elsewhere. Now, to write a program over FW_n(k) as an SDP note the following lemma.

Lemma 1. For any X ∈ FW_n(k) there exist matrices Y_J ∈ S^k_+, one for each k-element subset J of [n], such that X = Σ_J (Y_J)^{→n}_J.

Proof. The proof is straightforward and omitted for the sake of brevity.
Thus, we can write (7) equivalently as

inf { ⟨A_0, Ψ(Y)⟩ : A(Ψ(Y)) = b, Y ∈ S^{(n,k)}_+ }.  (8)

It is straightforward to show that the dual cone is given by

FW_n(k)* = {X ∈ S^n : X_{J,J} ⪰ 0 for all J ⊆ [n] with |J| = k}.

The dual cone has been studied in the context of semidefinite optimization in [7], where it was shown that the distance of FW_n(k)* and S^n_+ in the Frobenius norm can be upper bounded by (n−k)/(n+k−2) for matrices of trace 1. For k ≥ 3n/4 and n ≥ 97 this bound can be improved to O(n^{−3/2}) (see [7]).
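The two descriptions above, membership in FW_n(k) via lifted k × k blocks and membership in the dual cone via k × k principal submatrices, can be checked numerically. The following Python sketch assumes the characterization of FW_n(k)* by PSD principal submatrices stated above; the helper names are ours.

```python
import numpy as np
from itertools import combinations

def lift(S, J, n):
    """S^{->n}_J: embed a |J| x |J| block S into an n x n matrix, zeros elsewhere."""
    M = np.zeros((n, n))
    M[np.ix_(J, J)] = S
    return M

def in_dual_fw(S, k, tol=1e-9):
    """S in FW_n(k)^* iff every k x k principal submatrix of S is PSD."""
    n = S.shape[0]
    return all(np.linalg.eigvalsh(S[np.ix_(J, J)]).min() >= -tol
               for J in combinations(range(n), k))

rng = np.random.default_rng(1)
n, k = 4, 2
# X in FW_n(k): a sum of PSD blocks supported on k x k principal submatrices
X = sum(lift((lambda B: B @ B.T)(rng.standard_normal((k, k))), J, n)
        for J in combinations(range(n), k))
assert np.linalg.eigvalsh(X).min() >= -1e-9         # FW_n(k) is contained in S^n_+
S = np.eye(n)                                       # trivially in the dual cone
assert in_dual_fw(S, k) and np.trace(X @ S) >= 0    # duality pairing is non-negative
```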

Interior point methods and the central path
Interior point methods (IPMs) are among the most commonly used algorithms for solving conic optimization problems in practice. Notable software implementing IPMs includes Mosek [15], CSDP [9], SDPA [21,12], SeDuMi [19] and SDPT3 [20]. In the remainder of this section we closely follow the notation used in [17], since we will make use of several results from this book. Consider the following conic optimization problem for a proper convex cone K ⊂ R^n:

inf { ⟨c, x⟩ : Ax = b, x ∈ K }.

In IPMs the cone membership constraint is replaced by adding a convex penalty function f to the objective. This function f is a so-called self-concordant barrier function. Loosely speaking, the function f returns larger values the closer the input is to the boundary of the cone, and tends to infinity as the boundary is approached. In order to formally define self-concordant barrier functionals, let f : R^n ⊇ D_f → R be such that its Hessian H(x) is positive definite (pd) for all x ∈ D_f. With respect to this function, we can define a local inner product as follows:

⟨u, v⟩_x := ⟨u, H(x)v⟩,

where u, v ∈ R^n and ⟨•, •⟩ is some reference inner product. Let B_x(y, r) be the open ball centered at y with radius r > 0, where the radius is measured in || • ||_x, i.e., the norm arising from the local inner product at x.

Definition 1 (see [17, § 2.2.1]). A functional f is called (strongly non-degenerate) self-concordant if for all x ∈ D_f we have B_x(x, 1) ⊂ D_f, and whenever y ∈ B_x(x, 1) we have for all v ≠ 0

1 − ||y − x||_x ≤ ||v||_y / ||v||_x ≤ 1 / (1 − ||y − x||_x).

A functional f is called a self-concordant barrier functional if f is self-concordant and additionally satisfies

ϑ_f := sup_{x ∈ D_f} ||H(x)^{−1} g(x)||_x^2 < ∞,

where g(x) is the gradient of f.
We refer to ϑ_f as the complexity value of f (see [17, p. 35]); it will become crucial in our complexity analysis. Henceforth, let f be a self-concordant barrier functional for K and consider the following family of problems for positive η ∈ R_+:

z_η := argmin { η⟨c, x⟩ + f(x) : Ax = b }.  (9)

The minimizers z_η of (9) define a curve in the interior of K, parametrized by η. This curve is called the central path.
For η → ∞ one can show that z_η → x*. Interior point methods work by subsequently approximating a sequence of points {z_{η_i} : i = 1, ..., N} on the central path, where η_1 < η_2 < ..., such that z_{η_N} is within the desired distance to the optimal solution. The type of interior point method we consider is an adaptation of the predictor-corrector method (see [17, § 2.4.4]). This method uses the ordinary affine-scaling direction to produce a new point inside the cone with decreased objective value. Afterwards, a series of corrector steps is performed to obtain feasible solutions with the same objective value that lie increasingly close to the central path. Interior point methods typically rely on Newton's method in each step, where the convergence rate depends on the so-called Newton decrement.

Definition 2. If f : R^n → R has a gradient g(x) and positive definite Hessian H(x) ≻ 0 at a point x in its domain, then the Newton decrement of f at x is defined as

∆(f, x) := ||n(x)||_x, where n(x) := −H(x)^{−1} g(x) is the Newton step.

For self-concordant functions f, a sufficiently small value of ∆(f, x), e.g., ∆(f, x) < 1/9, implies that x is close to the minimizer of f (cf. [17, Theorem 2.2.5]).
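As a sanity check on Definition 2, the Newton decrement of the barrier f(X) = −log det(X) can be computed in closed form: using g(X) = −X^{−1} and H(X)[V] = X^{−1}VX^{−1}, the Newton step at X is X itself, so ∆(f, X) = √n at every X ≻ 0, matching the complexity value ϑ_f = n. A small numpy verification (unconstrained barrier only, no affine restriction):

```python
import numpy as np

def newton_decrement_logdet(X):
    """Newton decrement of f(X) = -log det X at X > 0 (positive definite).
    Uses g(X) = -X^{-1} and H(X)[V] = X^{-1} V X^{-1}, hence the Newton step
    is n(X) = -H^{-1}[g] = X, and ||n||_X^2 = tr(X^{-1} n X^{-1} n)."""
    Xinv = np.linalg.inv(X)
    n_step = X                      # Newton step -H(X)^{-1} g(X)
    return np.sqrt(np.trace(Xinv @ n_step @ Xinv @ n_step))

rng = np.random.default_rng(2)
G = rng.standard_normal((5, 5))
X = G @ G.T + 5 * np.eye(5)         # a random positive definite matrix
# for the log-det barrier the decrement equals sqrt(n) = sqrt(theta_f) everywhere
assert np.isclose(newton_decrement_logdet(X), np.sqrt(5))
```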
Suppose we are given a starting point x_0 which is close to z_{η_0} for some η_0 ∈ R. The affine-scaling direction is given by −c_{x_0} := −H(x_0)^{−1}c; it points approximately tangentially to the central path in the direction of decreasing objective value ⟨c, x⟩ (the direction −H(z_{η_0})^{−1}c is exactly tangential to the central path). The predictor step moves from x_0 a fixed fraction σ ∈ (0, 1) of the distance towards the boundary of the feasible set in the affine-scaling direction, thereby producing a new point x_1 satisfying ⟨c, x_1⟩ < ⟨c, x_0⟩. The new point x_1 is not necessarily close to the central path. The algorithm then proceeds to produce a sequence of feasible points x_2, x_3, ... satisfying ⟨c, x_1⟩ = ⟨c, x_i⟩ for i = 2, 3, ..., where each x_i is closer to the central path than its predecessor x_{i−1}. In other words, the algorithm targets the point z_{η_1} on the central path with the same objective value as x_1 and produces a sequence of points converging to z_{η_1}. Once an x_j is found such that ∆(f, x_j) < 1/9, the next predictor step is taken. This procedure is repeated until an ε-optimal solution is found. The corrector phase works by minimizing the self-concordant barrier restricted to the feasible affine space intersected with the set of all x ∈ R^n such that ⟨c, x⟩ = ⟨c, x_i⟩, where x_i is the point produced by the most recent predictor step. This minimization problem is solved iteratively by performing line searches along the direction given by the Newton step for the restricted functional. We provide a visualization of the predictor-corrector method in Figure 1.

Newton decrements for functions restricted to subspaces
If a self-concordant function f is restricted to a (translated) linear subspace L, and the restriction is denoted by f|_L, then the Newton decrement at x becomes

∆(f|_L, x) = ||P_{L,x} H(x)^{−1} g(x)||_x,  (10)

where || • ||_x is the norm induced by the inner product ⟨u, v⟩_x = ⟨u, H(x)v⟩, and P_{L,x} is the orthogonal projection onto L with respect to the || • ||_x norm; see [17, § 1.6].
Note that we have ∆(f, x) = ||n(x)||_x, where n(x) is the Newton step at x, i.e., n(x) = −H(x)^{−1} g(x). Hence, restricting the function f to a subspace L we find

∆(f|_L, x) = ||P_{L,x} n(x)||_x ≤ ||n(x)||_x = ∆(f, x).

A predictor-corrector method
In this subsection we propose our algorithm, which makes use of the rescaling introduced in section 1.1. Our aim is to provide a comprehensible exposition; the details are postponed to the second part of the paper, beginning with section 3.
Algorithm 1 is an adaptation of the predictor-corrector method as described in [17, § 2.2.4]. Before describing the algorithm in detail we fix some notation. Let J := {J ⊆ [n] : |J| = k} and let Y = {Y_J : J ∈ J} denote a collection of k × k matrices indexed by J. We define the operator Ψ as

Ψ(Y) := Σ_{J ∈ J} (Y_J)^{→n}_J,

where we made use of the notation defined in (6). Hence, if Y is a collection of positive semidefinite k × k matrices, then Ψ(Y) ∈ FW_n(k). Furthermore, let

Y^0 := {Y^0_J = (1/C^{n−1}_{k−1}) I_k : J ∈ J},  (11)

where we denote for n, k ∈ N the binomial coefficient as C^n_k := (n choose k), so that Ψ(Y^0) = I. Now let X_ℓ be a strictly feasible solution to a problem of form (2) and rescale the data matrices with respect to X_ℓ. Recall that the feasible set of the resulting SDP is given by

L_ℓ := {X ∈ S^n : A^(ℓ)(X) = b}.  (12)

Likewise, the feasible set of the factor width relaxation written over S^{(n,k)}_+ (cf. (8)) can be written as

L^ℓ_Ψ := {Y ∈ S^{(n,k)} : A^(ℓ)(Ψ(Y)) = b}.

Note that I ∈ L_ℓ and Y^0 ∈ L^ℓ_Ψ. We emphasize that, by definition, for any element Y ∈ L^ℓ_Ψ we have Ψ(Y) ∈ L_ℓ.
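A minimal Python sketch of the operator Ψ and the base point Y^0 (assuming, as in the text, that every block of Y^0 equals I_k scaled by 1/C^{n−1}_{k−1}, since each index i ∈ [n] lies in C^{n−1}_{k−1} of the sets J):

```python
import numpy as np
from itertools import combinations
from math import comb

def Psi(Y, n):
    """Psi maps a collection {Y_J} of k x k blocks to sum_J (Y_J)^{->n}_J."""
    M = np.zeros((n, n))
    for J, YJ in Y.items():
        M[np.ix_(J, J)] += YJ   # add block into the principal submatrix indexed by J
    return M

n, k = 6, 3
# every index i lies in C(n-1, k-1) of the sets J, so these blocks sum to I
Y0 = {J: np.eye(k) / comb(n - 1, k - 1) for J in combinations(range(n), k)}
assert np.allclose(Psi(Y0, n), np.eye(n))   # Psi(Y^0) = I
```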

Main method
The algorithm requires a feasible starting point X_0 close to the central path, which is used in the first rescaling step. We also require an ε > 0, i.e., our desired accuracy, as well as a σ ∈ (0, 1) used in the predictor step. In the following let f_FW(k) be a self-concordant barrier function for S^{(n,k)}_+ (we postpone its derivation to section 3; for now we assume it exists and is efficiently computable). In the algorithm we denote the restriction of f_FW(k) to the subspace null(L_Ψ) by f|_{null(L_Ψ)}. The algorithm initializes ℓ = 0. The outer while loop repeats until an ε-optimal solution is found. If after rescaling with respect to X_ℓ the Newton decrement at Y^0 satisfies ∆(f|_{L_Ψ(v_ℓ)}, Y^0) ≤ 1/14, the predictor subroutine is called. Here, the affine-scaling direction is projected onto the null space of L_Ψ; call the resulting direction Z. Clearly, Y^0 + sZ ∈ L_Ψ for all s ∈ R. Then the subroutine computes

s* := sup { s ≥ 0 : Y^0 + sZ ∈ S^{(n,k)}_+ },

which provides the necessary notion of distance to the boundary in terms of Y^0 and Z. The returned point Y := Y^0 + σs*Z is feasible and decreases the objective value, as shown in section 5. If the Newton decrement is not small enough, the corrector subroutine is called. Let v_ℓ = ⟨A_0, X_ℓ⟩, i.e., the objective value of the previous iteration, and consider f_FW(k)|_{L_Ψ(v_ℓ)} at a point x_i. The corrector step now performs a line search along the Newton direction of the restricted functional, computing x_{i+1}, and repeats until x_{i+1} is close enough to the central path of the rescaled problem over S^{(n,k)}_+; it then returns Y := x_{i+1}. We will prove in section 4 how this leads to a decrease in distance to the central path of the original SDP. Note that multiple calls of the corrector step may be necessary, as after rescaling the Newton decrement might not be small enough anymore. However, as we prove later on, the maximum number of corrector steps can be bounded in terms of the problem data. Let Y be the point returned by one of the subroutines. We set

X_{ℓ+1} := X_ℓ^{1/2} Ψ(Y) X_ℓ^{1/2}

and increment ℓ.
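The step length s* used in the predictor subroutine is, per block, a largest-step-to-the-boundary computation over the PSD cone. The following sketch shows one way to compute it for a single positive definite block; the blockwise version would take the minimum over all blocks Y_J. The function name is ours.

```python
import numpy as np

def step_to_boundary(Y, Z):
    """s* = sup{s >= 0 : Y + s Z is PSD}, for positive definite Y.
    Y + sZ is PSD iff I + s W is PSD, where W = Y^{-1/2} Z Y^{-1/2}."""
    w, V = np.linalg.eigh(Y)
    Yih = V @ np.diag(w ** -0.5) @ V.T            # Y^{-1/2}
    lam_min = np.linalg.eigvalsh(Yih @ Z @ Yih).min()
    return np.inf if lam_min >= 0 else -1.0 / lam_min

Y = np.eye(2)
Z = np.diag([1.0, -0.5])                          # shrinks the second eigenvalue
s = step_to_boundary(Y, Z)
assert np.isclose(s, 2.0)                         # 1 - 0.5 s = 0 at s = 2
sigma = 0.5                                       # the fraction sigma keeps us interior
assert np.linalg.eigvalsh(Y + sigma * s * Z).min() > 0
```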

Termination criterion
In the predictor as well as in the corrector subroutine we solve a linear system for y ∈ R^m. The solution of this linear system may be interpreted as a dual feasible solution, provided the current iterate is sufficiently close to the central path. Hence, we can approximate the duality gap of our problem by calculating the difference

⟨A_0, X_ℓ⟩ − b^T y,

where y is calculated in every subroutine call. We may use this as a termination criterion. Once the approximate duality gap falls below some ε > 0 chosen beforehand, we terminate with an ε-optimal solution.
Algorithm 1: Predictor-Corrector SDP algorithm using FW_n(k)

Barrier functionals for S^n_+ and FW_n(k)

In this section we derive the self-concordant barrier functional for the cone S^{(n,k)}_+ which is used in the algorithm. Note that the ordinary self-concordant barrier for S^n_+ is given by f_SDP(X) = −log(det(X)). We will emphasize parallels to the work of Roig-Solvas and Sznaier [18].
In order to construct a self-concordant barrier function for our underlying set, we introduce the notions of hyper-graphs and edge colorings, as well as a well-known result about these objects.

Definition 3. A hyper-graph H = (V, E) consists of a set V = {1, ..., n} of vertices and a set of hyper-edges E ⊆ {J ⊆ V : |J| ≥ 2}, which are subsets of the vertex set V. If all elements of E contain exactly k vertices, we call the corresponding hyper-graph k-uniform.

Definition 4. Let H = (V, E) be a hyper-graph. A proper hyper-edge coloring with m colors is a partition of the hyper-edge set E into m disjoint sets, say E = ∪_{i∈[m]} S_i, such that any two hyper-edges sharing a vertex lie in different sets. In other words, a proper hyper-edge coloring assigns a color to every hyper-edge such that, if a given vertex appears in two different hyper-edges, those hyper-edges receive different colors.
Theorem 1 (Baranyai's theorem [6]). Let k, n ∈ N such that k|n and let K^n_k be the complete k-uniform hyper-graph on n vertices. Then there exists a proper hyper-edge coloring of K^n_k using C^{n−1}_{k−1} colors, in which each color class is a perfect matching.
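For k = 2, Theorem 1 specializes to the classical 1-factorization of the complete graph K_n for even n, which can be constructed explicitly by the round-robin (circle) method. The sketch below builds the C^{n−1}_{k−1} = n − 1 color classes and checks that they form a proper coloring into perfect matchings; the helper name is ours.

```python
from itertools import combinations

def one_factorization(n):
    """Partition the edges of K_n (n even) into n-1 perfect matchings via the
    circle method: vertex n-1 stays fixed while the others rotate."""
    assert n % 2 == 0
    classes = []
    for r in range(n - 1):
        matching = [frozenset({n - 1, r})]
        for j in range(1, n // 2):
            a, b = (r + j) % (n - 1), (r - j) % (n - 1)
            matching.append(frozenset({a, b}))
        classes.append(matching)
    return classes

n = 8
classes = one_factorization(n)
assert len(classes) == n - 1                                   # n - 1 colors
all_edges = {e for c in classes for e in c}
assert all_edges == {frozenset(e) for e in combinations(range(n), 2)}
for c in classes:                                              # each class: perfect matching
    assert len({v for e in c for v in e}) == n and len(c) == n // 2
```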
In (8) we wrote a program over FW_n(k) as an equivalent program over the cone product S^{(n,k)}_+. The algorithm uses a self-concordant barrier function over said cone product. The mapping Ψ from S^{(n,k)}_+ to FW_n(k) is surjective, but not injective, since multiple elements of the former may give rise to the same element of the latter set.
Assumption 1. Throughout we assume n, k ∈ N with 2 ≤ k and k|n.
In the following we let J = {J ⊂ [n] : |J| = k} and Y = {Y_J : J ∈ J} be a collection of C^n_k matrices of size k × k. We recall that the operator Ψ is defined as Ψ(Y) = Σ_{J∈J} (Y_J)^{→n}_J, and we take as barrier the sum of log-det barriers over the blocks,

f_FW(k)(Y) := −Σ_{J∈J} log(det(Y_J)).

The following generalizes Lemma 4.4 in [18], where a similar result is proved for k = 2. It will be crucial in our analysis, as it allows us to compare the values taken by the barrier functionals on S^{(n,k)}_+ and S^n_+ at Y and Ψ(Y), respectively.
Let us emphasize here that f_FW(k) is a self-concordant barrier for S^{(n,k)}_+, not for FW_n(k). Before proving Lemma 2 we need an auxiliary result which extends Lemma A.1 from [18] to general values of k such that k|n. To prove it we will make use of Theorem 1.
Proof. Let K^n_k be the complete k-uniform hyper-graph on n vertices. We can identify each hyper-edge J with the block Y_J; by Theorem 1 the hyper-edges partition into color classes S_1, ..., S_{C^{n−1}_{k−1}}, and we set Z_i := Σ_{J∈S_i} (Y_J)^{→n}_J. Since the barrier value is finite, we know that Z_i ≻ 0. Moreover, since each S_i induces a perfect matching, there exists a permutation matrix P_i for every i = 1, ..., C^{n−1}_{k−1} such that P_i Z_i P_i^T is a block-diagonal matrix with blocks Y_J on the diagonal for J ∈ S_i. From this we find det(Z_i) = Π_{J∈S_i} det(Y_J). Hence, the claimed inequality follows, completing the proof.
We continue with the proof of Lemma 2. In the proof we use Minkowski's determinant inequality, which we restate for convenience.

Theorem 2 (Minkowski's determinant inequality; see, e.g., [14, Theorem 4.1.8]). Let A, B ∈ S^n_+. Then

(det(A + B))^{1/n} ≥ (det(A))^{1/n} + (det(B))^{1/n}.  (14)

Proof (of Lemma 2). The self-concordance of f_FW(k) on int(S^{(n,k)}_+) follows immediately from the self-concordance of −log det(X) on int(S^n_+). By assumption the blocks Y_J are positive definite, and combining Lemma 3 with Minkowski's determinant inequality (14) yields a lower bound on det(Ψ(Y))^{1/n} in terms of the determinants det(Y_J). Applying the logarithm on both sides, rearranging the left-hand side, using the concavity of the logarithm, and finally multiplying by nC^{n−1}_{k−1}, the claimed comparison of barrier values follows.
The following corollary is analogous to Corollary 4.5 from [18].
Proof. The first statement follows by noting that each i ∈ [n] lies in exactly C^{n−1}_{k−1} subsets of [n] of size k: fixing i, there are n − 1 remaining elements from which we choose the k − 1 further elements of a size-k set. The second statement follows by a direct computation.
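Minkowski's determinant inequality invoked in the proof of Lemma 2 is easy to check numerically for random positive semidefinite matrices:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
make_psd = lambda: (lambda G: G @ G.T)(rng.standard_normal((n, n)))  # random PSD matrix
A, B = make_psd(), make_psd()
lhs = np.linalg.det(A + B) ** (1 / n)
rhs = np.linalg.det(A) ** (1 / n) + np.linalg.det(B) ** (1 / n)
# Minkowski: det(A+B)^{1/n} >= det(A)^{1/n} + det(B)^{1/n} for A, B PSD
assert lhs >= rhs - 1e-9
```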

Relations of the barrier functions
To prove convergence of our algorithm we need two essential ingredients. First, we need to prove that the predictor step reduces the current objective value sufficiently; second, we must prove that the corrector step converges to a point close to the central path. Moreover, we have to show that our criterion for deciding which subroutine to call is valid. The issue here is that we compute the Newton decrement of f_FW(k) at Y^0, but we need to be able to assert that the Newton decrement of f_SDP at the corresponding SDP iterate (the identity, after rescaling) is small enough.
The next result will allow us to lower bound the progress made by the corrector step. For this we need to be able to compare the barrier functions for S^n_+ and S^{(n,k)}_+. We assume we are given a feasible solution X_ℓ, with A^(ℓ) the correspondingly rescaled operator, and consider the restricted barrier minimization

min { f_FW(k)(Y) : Y ∈ L^ℓ_Ψ(v) },  (16)

which we would like to compare to its SDP counterpart over L_ℓ(v). Suppose Y* is an approximate solution to (16). Defining

X_{ℓ+1} := X_ℓ^{1/2} Ψ(Y*) X_ℓ^{1/2},

we find that X_ℓ ∈ F_SDP for all ℓ. In other words, the points X_ℓ we obtain via this procedure are all feasible for the original SDP (2). The following lemma allows us to lower bound the decrease achieved by one corrector step in terms of an element of S^{(n,k)}_+.
Lemma 4. Let Y* be a feasible solution to (16) and Y^0 as in (11). Then the decrease f_FW(k)(Y^0) − f_FW(k)(Y*) lower bounds the corresponding decrease of f_SDP along the induced SDP iterates.

Proof. The proof follows immediately when noting that Ψ(Y^0) = I and comparing the barrier values via Lemma 2.

Relation of the Newton decrements
In this subsection we prove that we can upper bound the Newton decrement of f_SDP at the identity in terms of the Newton decrement of f_FW(k) at Y^0. We will make use of a further operator built from the data matrices, in whose definition ◦ denotes the Hadamard product. See Figure 2 for a visualization of the surjection from S^{(n,k)}_+ to FW_n(k).

This operator satisfies the adjoint-type identities we need below. An inner product on S^{(n,k)} is given by

⟨X, Y⟩_{(n,k)} := Σ_{J∈J} ⟨X_J, Y_J⟩.

Suppose now X is a feasible solution to (5) such that ⟨A_0, X⟩ = v. We define the vector b(v) := (v, b_1, ..., b_m)^T, obtained by prepending the objective value to b; that is, we may also add an equality for the objective, in which case we refer to the correspondingly extended operator. The respective subspaces will be denoted as follows:

L^ℓ_Ψ(v) := {Y ∈ S^(n,k) : A^(ℓ)(Ψ(Y)) = b, ⟨A_0^(ℓ), Ψ(Y)⟩ = v}  (17)

and

L_ℓ(v) := {X ∈ S^n : A^(ℓ)(X) = b, ⟨A_0^(ℓ), X⟩ = v}.  (18)

When we consider the subspaces defined via the operator with respect to the initial data matrices, we omit the subscript ℓ, e.g., L(v). The following lemma corresponds to Lemma A.2 in [18], and allows us to bound the Newton decrement of f_SDP|_L in terms of that of f_FW(k)|_{L_Ψ}.

Lemma 6. The Newton decrement ∆(f_SDP|_{L(v)}, I) can be upper bounded in terms of the Newton decrement ∆(f_FW(k)|_{L_Ψ(v)}, Y^0).

Proof. Following (10), we have ∆(f_SDP|_L, X) = ||P_{L,X} n(X)||_X. Evaluating the expression at Y^0, we obtain the claimed bound, where the second inequality follows from Lemma 5. Setting X = I and noting Ψ(Y^0) = I completes the proof.

Complexity analysis
We begin the complexity analysis with the following lemma, which helps us check whether the current point is close enough to the central path of the SDP.

Lemma 7. Let X be a feasible iterate for the SDP (15) and let the objective value at X be v. Define the two subspaces L_Ψ(v) and L as in (17) and (12), respectively. Then, if ∆(f_FW(k)|_{L_Ψ(v)}, Y^0) ≤ 1/14, the Newton decrement of the rotated SDP at the identity is at most 1/9.

Proof. By Lemma 6 the SDP Newton decrement at the identity is bounded in terms of the factor width Newton decrement at Y^0. Let now z(v) be the point on the central path of the rotated SDP with objective value v and let the corresponding parameter be η_v. By Theorem 2.2.5 from [17] we obtain a bound on the distance of z(v) to the identity. Let X_+ be the point returned by taking a Newton step at X = I with respect to the function f_SDP restricted to L for parameter η_v. By Theorem 2.2.3 in [17] this yields the required bound on ||z(v) − I||. The Newton decrement of the rotated SDP being smaller than 1/9 means we can safely perform the next predictor step.
If the current point is too far away from the central path and one were to perform the predictor step, the computed direction may fail to be approximately tangential to the central path. Hence, once the Newton decrement of the factor width program is small enough, so is that of the SDP, and we can perform the next predictor step, knowing the direction will be approximately tangential to the central path. After each predictor step we may have to take several corrector steps to get back close to the central path.

Corrector step
We will now find an upper bound on the number of corrector steps needed to get close to the central path. We know from Lemma 4 that a decrease in the barrier for the factor width cone leads to a decrease in the barrier function for our original SDP, meaning we have made progress towards its central path. The following lemma asserts that if we are too far away from the central path, we can attain at least a constant reduction in the barrier of the factor width cone, and therefore obtain a constant reduction in the SDP barrier as well.
Lemma 8. Let X be a feasible iterate for the SDP (15) and let the objective value at X be v. Define the subspace L_Ψ(v) as in (17). If ∆(f_FW(k)|_{L_Ψ(v)}, Y^0) > 1/14, the corrector step employs a line search towards Y*, i.e., the point in L_Ψ(v) that minimizes f_FW(k). Let n_{L_Ψ(v)}(Y^0) be the Newton step taken from Y^0 and let t := 1/(1 + ||n_{L_Ψ(v)}(Y^0)||), where the norm in the denominator is the local norm at Y^0 induced by ⟨•, •⟩_{(n,k)}. Then, for the step of length t along the Newton direction, we find by Theorem 2.2.2 in [17] that the restricted barrier decreases by at least a fixed constant.
Note that this, together with Lemma 4, implies a guaranteed decrease in the SDP barrier as well. Knowing that each line search reduces the distance to the targeted point on the central path by at least a constant amount will allow us to bound the number of line searches needed to get close enough, provided we have an upper bound on the distance between the result of the predictor step and the corresponding point on the central path of the SDP.
Lemma 9. Let X_1 be close to a point z(v_1) on the central path of the SDP in the sense that ∆(f_SDP|_{L(v_1)}, X_1) ≤ 1/9. Further, let X_2 be the result of the predictor step and z(v_2) be the point on the central path with the same objective value as X_2. Then the distance of X_2 to z(v_2) is at most n log(1/(1−σ)) + 1/154.

Proof. A proof of this statement for generic self-concordant barriers may be found on page 54 of [17]. We have used that the barrier parameter of the barrier for the psd cone is given by ϑ_{f_SDP} = n.
Lemma 10. Let v_2 be the objective value of the result X_2 of the predictor step. Then the number K of line searches needed to find a point close enough to z(v_2) is bounded, where z(v_2) is the point on the central path with objective value v_2.

Proof. We know that the distance between the result of the predictor phase and the targeted point on the central path is at most n log(1/(1−σ)) + 1/154 by Lemma 9. Moreover, using Lemma 8 we find that each corrector step reduces this distance by at least a constant amount, unless the SDP Newton decrement at I is already small enough to perform the next predictor step. If after rescaling the Newton decrement of the factor width program exceeds 1/14, thereby implying by Lemma 7 that I may not be close to the central path of the SDP, we can perform another corrector step, yielding at least a constant decrease of the distance to the central path, and rescale again. This process can be continued until we no longer obtain such a constant decrease, at which point we know we must be close enough to the central path in the sense of Lemma 7: if the constant decrease is not attained, the factor width Newton decrement is at most 1/14, from which it follows by Lemma 7 that the SDP Newton decrement at I is at most 1/9. This implies we are close enough to the central path to perform the next predictor step. Hence, after at most 154 corrector steps we are close enough to the central path to perform the next predictor step.

Predictor step
We will make use of the analysis of the short-step interior point method discussed in Section 2.4.2 of [17]. We will show that each predictor step reduces the objective value by an amount at least as large as the objective decrease of the short-step interior point method. This will allow us to conclude the maximum number of predictor steps needed to obtain an ε-optimal solution of the given SDP. The decrease in objective value obtained by our predictor method is as follows. Let X be the point from which the predictor method starts and −(A_0)_X := −H(X)^{−1}A_0 be the direction. Then for σ ≥ 1/4 we find

⟨A_0, X − s*σ(A_0)_X⟩ = ⟨A_0, X⟩ − s*σ⟨A_0, (A_0)_X⟩ ≤ ⟨A_0, X⟩ − (1/4)||(A_0)_X||_X.
This implies the decrease is at least as large as the one obtained in one iteration of the short-step method, as discussed in [17, § 2.4.2]. Renegar's analysis shows that the short-step method leads to an ε-optimal solution in at most K = 10√ϑ_f log(ϑ_f/(ε η_0)) steps, where η_0 is such that our starting point X_0 is close to z_{η_0}. By an ε-optimal solution we mean a feasible solution X such that v*_SDP ≤ ⟨A_0, X⟩ ≤ v*_SDP + ε.

Predictor and corrector steps combined
Combining the complexity analysis of predictor and corrector steps we arrive at the following theorem.

Theorem 3. Let X_0 be a feasible solution of the SDP (2) and assume it is close to some point z_{η_0} on the corresponding central path in the sense that ∆(f_SDP|_{L(v)}, X_0) < 1/14, where L is as in (18) for v = ⟨A_0, X_0⟩. Then Algorithm 1 converges to an ε-optimal solution in at most O(√n log(n/(ε η_0))) predictor steps, each followed by at most 154 corrector steps.
The assumption of a starting point "close to the central path" may be satisfied by the self-dual embedding strategy [10].Alternatively, one may first solve an auxiliary SDP problem, as in [17, § 2.4.2], by using the algorithm we have presented.The solution of this auxiliary problem then yields a point close to the central path of the original SDP problem.
Discussion and future prospects

We finish with a brief discussion of the prospects of an efficient implementation of Algorithm 1.

Parallelization
Essentially, the contribution of the present paper lies in providing an algorithm for solving SDPs which is much more suitable for parallelization than the ordinary interior point method working over S^n_+. Given common memory access, the computation of the necessary data for the respective cone factors S^k_+ is local, meaning these tasks can be distributed among processor cores, leading to a runtime decrease: each corrector step involves C^n_k parallel computations of O(k^3 m + k^2 m^2 + m^3) flops. This offers the potential to perform the centering steps much more quickly than in SDP interior point methods through parallel computation.
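The blockwise structure can be sketched as follows: the gradient of the product barrier −Σ_J log det(Y_J) decomposes into independent k × k computations (−Y_J^{−1} per block), which can be handed to a worker pool. This is an illustration of the decomposition only, not the full corrector step; the function and variable names are ours.

```python
import numpy as np
from itertools import combinations
from concurrent.futures import ThreadPoolExecutor

def block_gradient(YJ):
    """Per-block work for the product barrier -sum_J log det(Y_J): the
    gradient block is -Y_J^{-1}; only k x k linear algebra is involved."""
    return -np.linalg.inv(YJ)

n, k = 6, 2
blocks = {J: np.eye(k) for J in combinations(range(n), k)}
# the C(n, k) block computations are independent, so they can be distributed
with ThreadPoolExecutor() as pool:
    grads = dict(zip(blocks, pool.map(block_gradient, blocks.values())))
assert len(grads) == 15                       # C(6, 2) = 15 blocks
assert np.allclose(grads[(0, 1)], -np.eye(k)) # -I^{-1} = -I per block
```

A process pool (or a distributed map) could replace the thread pool when the per-block cost dominates; the structure of the computation is unchanged.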

Replacing the predictor step
In their paper [18], the authors propose to perform a fixed number of decrease steps, where a decrease step consists of solving (7) and rescaling with respect to the optimal solution. In our algorithm we considered a different method to decrease the objective value, namely the predictor method, which uses the traditional SDP affine-scaling direction.

Tractability of factor width cones
The entire approach described in this paper relies on the premise that one may optimize more efficiently over FW_n(k) than over S^n_+. In practice this has not yet been demonstrated convincingly for k > 2, although the consensus is that it should be possible. Some recent ideas that could be useful in this regard are:
• the idea to optimize over the dual cone of FW_n(k) by utilizing clique trees [22];
• a variation on the factor width cone involving fewer blocks [23].
In addition, it would be very helpful to know a computable self-concordant barrier functional for the cone FW n (k), as well as its complexity parameter.

Figure 1 :
Figure 1: Visualization of the predictor-corrector method. The initial feasible solution close to the central path (red) is given by x_1. The algorithm performs a predictor step returning x_2. Corrector steps are taken until a point close enough to the central path (x_4) is found. The next predictor step returns x_5. Corrector steps are taken until x_8 is found, which is close enough to the central path to perform the next predictor step, returning x_9. After one corrector step the final point x_10 is ε-close to x*.

Figure 2 :
Figure 2: Visualization of the surjection from S^{(n,k)}_+ to FW_n(k)