Second-order cone and semidefinite methods for the bisymmetric matrix approximation problem

Approximating a data matrix by the closest positive semi-definite bisymmetric matrix in the Frobenius norm is important in many engineering applications, communication theory and quantum physics. In this paper, we use interior point methods to solve this problem. The problem is reformulated in several ways: first as a semi-definite programming problem and then as a mixed semidefinite and second-order cone optimization problem. Numerical results comparing the efficiency of these methods with the alternating projection algorithm are reported.

To take advantage of the bisymmetric structure, an isometry operator bvec is introduced. The operator bvec maps n × n bisymmetric matrices into vectors of length r, where r is given by (1.3) and is much smaller than n^2. This operator gives our methods an advantage over the other methods.
Some notation that will be used throughout the paper is introduced now. The set of all n × n real symmetric matrices will be denoted by S^n. The cone of all n × n real symmetric positive semi-definite matrices will be denoted by S_+^n, where

S_+^n = {A : A ∈ S^n and z^T A z ≥ 0, ∀ z ∈ IR^n}   (1.1)

is a convex cone of dimension n(n + 1)/2. Q^h is the second-order cone of dimension h, defined as

Q^h = {x ∈ IR^h : x_1 ≥ ||x_{2:h}||_2},   (1.2)

where ||·||_2 stands for the Euclidean norm, ||y||_2 = (Σ_{i=1}^n y_i^2)^{1/2} for y ∈ IR^n, and x_{2:h} = [x_2, x_3, …, x_h]^T. The set of all n × n real bisymmetric matrices (matrices symmetric about both the main diagonal and the anti-diagonal) will be denoted by B^n; it is a subspace of dimension r. An n × n real bisymmetric matrix B(b) is determined by its r independent entries b = (b_1, …, b_r), where

r = mn − k, with m = n/2 and k = n(n − 2)/4 if n is even, and m = (n + 1)/2 and k = (n − 1)(n + 1)/4 if n is odd.   (1.3)

It is obvious that B^n ⊂ S^n. The norm defined on S^n is the Frobenius norm:

||W||_F = (W • W)^{1/2} = (trace(W^T W))^{1/2} = (Σ_{i,j=1}^n W_{ij}^2)^{1/2} = (vec^T(W) vec(W))^{1/2},   (1.4)

where vec(W) is the vectorization operator formed by stacking the columns of the matrix W on top of each other as one column, and vec^T is the transpose of vec. We denote the partial orderings induced by S_+^n on S^n and by Q^h on IR^h by ⪰ and ≥_Q, respectively; that is, A ⪰ B means A − B ∈ S_+^n and x ≥_Q y means x − y ∈ Q^h, while y ≥ 0 for a vector y ∈ IR^n stands for each component of y being nonnegative. We denote the zero and identity matrices by 0 and I, respectively.

We can now describe our problem in mathematical notation as follows. Given a data matrix G ∈ IR^{n×n}, find the closest positive semi-definite bisymmetric matrix B(b) to G, i.e. such that ||G − B(b)||_F is minimal. Thus, we have the minimization problem

min_{b ∈ IR^r} ||G − B(b)||_F   subject to   B(b) ⪰ 0.   (1.5)

In Sect. 2, we briefly describe the alternating projection method. Although its rate of convergence is slow, the modified alternating projection method converges globally to the optimal solution, so it provides accurate solutions against which our methods of Sects. 3 and 4 can be compared. A brief description of semi-definite and second-order cone minimization problems, together with the interior point formulations of problem (1.5) in each class, is given in Sects. 3 and 4, respectively. The performance of these primal-dual path-following methods against the alternating projection method is reported in the numerical results of Sect. 5.

The alternating projection method
The algorithm of this section is a modified alternating projection method of the type originally proposed by von Neumann [18] for finding the minimum distance from a given fixed point to an intersection of convex sets.
When applying the projection method to approximate by a bisymmetric matrix, it is convenient to use the Frobenius norm (1.4). To apply the projection method, the projection maps P_S(·) and P_B(·) are needed. These are the maps from K = {G : G ∈ IR^{n×n}} onto S_+^n and B^n, respectively. The projection P_S(G) onto S_+^n is given by the formula [10]

P_S(G) = Q Λ_+ Q^T,   Λ_+ = [Λ_r 0; 0 0],   (2.1)

where Q Λ Q^T is an eigendecomposition of the symmetric part of G and Λ_r = diag[λ_1, λ_2, …, λ_r] is the diagonal matrix consisting of the nonnegative eigenvalues of Λ. The projection P_B(G) onto B^n is given by

P_B(G) = (G + G^T + J G J + J G^T J)/4,   (2.2)

where J is the reversal (exchange) matrix. The single-valued projection mappings P_S(G) and P_B(G) given in (2.1) and (2.2) can now be used to execute the Dykstra algorithm. Given a data matrix G ∈ IR^{n×n}, the method is initialized by G^(0) = G, and the iteration is given by

G^(k+1) = G^(k) + P_B(P_S(G^(k))) − P_S(G^(k)),   k = 0, 1, 2, …   (2.3)

Both sequences {P_S(G^(k))} and {P_B(P_S(G^(k)))} produced by (2.3) converge globally to the optimal solution B* of (1.5) [8].
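As a concrete illustration of this section, the projections and the Dykstra iteration can be sketched in a few lines. The sketch below is ours, not the paper's Matlab code; it assumes NumPy, and the function names are illustrative:

```python
import numpy as np

def proj_psd(A):
    """P_S: project onto the PSD cone by clipping negative eigenvalues."""
    S = (A + A.T) / 2.0                       # symmetrize first
    lam, Q = np.linalg.eigh(S)
    return (Q * np.maximum(lam, 0.0)) @ Q.T

def proj_bisym(A):
    """P_B: orthogonal projection onto bisymmetric matrices, obtained by
    averaging A with its transpose and their anti-diagonal reflections."""
    J = np.fliplr(np.eye(A.shape[0]))         # reversal (exchange) matrix
    return (A + A.T + J @ A @ J + J @ A.T @ J) / 4.0

def dykstra(G, tol=1e-5, max_iter=10000):
    """Modified alternating projections for the nearest bisymmetric PSD
    matrix; the correction term is Dykstra's, in the simplified form valid
    because the bisymmetric matrices form a subspace."""
    Gk = G.copy()
    for _ in range(max_iter):
        PS = proj_psd(Gk)
        PB = proj_bisym(PS)
        if np.linalg.norm(PB - PS) <= tol:    # stopping rule used in Sect. 5
            break
        Gk = Gk + PB - PS
    return PB
```

Returning the bisymmetric iterate guarantees the result is exactly bisymmetric, while its smallest eigenvalue is within the stopping tolerance of zero; both sequences converge to the same limit.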

Semidefinite programming approach
The primal standard form of the semi-definite programming (SDP) problem is

min D • X   s.t.   A_i • X = b_i, i = 1, …, m,   X ⪰ 0,   (3.1)

where all A_i, D ∈ S^n and b ∈ IR^m are given and the variable is X ∈ S^n. The minimization problem (3.1) is a convex minimization problem since its constraints and objective are convex. The dual problem of (3.1) is

max b^T y   s.t.   D − Σ_{i=1}^m y_i A_i ⪰ 0,   (3.2)

where the variable is y ∈ IR^m. Many problems are special cases of problems (3.1) and (3.2), in particular (1.5). The following lemma is useful in writing (1.5) in the form of (3.2):

Formulation I (SDV)
where t is a real nonnegative scalar. Lemma 3.1 gives the equivalence of the above inequalities. Therefore, we can write (1.5) as the SDP problem (3.3) in the dual form (3.2), whose dimensions are r + 1 (the number of variables; see (1.3)) and n^2 + n + 2 (the size of the matrices).

Formulation II (SDB)
Now, the SDP problem (3.3) is very large even for a modest data matrix G. For example, a 30 × 30 matrix G grows into a problem of dimensions 240 and 932; thus it is not efficient to solve (1.5) using formulation (3.3). Furthermore, the bisymmetric structure of the matrix B(b) is not utilized. An alternative approach is to develop an SDP problem that has acceptable dimensions and utilizes the bisymmetric structure of B(b). This can be achieved by means of the following isometry operator.
where m = n/2 and g = 2 if n is even and m = (n + 1)/2 and g = 4 if n is odd.
It is clear that bvec is a linear operator taking the set of all n × n real bisymmetric matrices into IR^r. The properties of bvec are given by the following lemma:

Lemma 3.3 Given the operator bvec defined above, the following conditions hold for any bisymmetric matrices:

Proof It is clear from the definition of the operator bvec that Part 1 is satisfied. Part 2 is a consequence of Part 1.
From Part 1, it is clear that bvec is an isometry. To take advantage of the above lemma, we need G to be bisymmetric. Projecting G onto B^n using the orthogonal projection given in (2.2) produces a bisymmetric matrix, say Ḡ. In the following proposition, we show that the closest bisymmetric positive semi-definite matrix to Ḡ is also the nearest to G.

Proposition 3.4 Let Ḡ be the orthogonal projection of G onto B^n and let B(b) be the closest bisymmetric positive semi-definite matrix to Ḡ. Then B(b) is also the closest such matrix to G.
Proof The claim is immediate if Ḡ is positive semi-definite. If not, then for any T ∈ B^n we have (G − Ḡ) • (Ḡ − T) = 0, because Ḡ is the orthogonal projection of G onto the subspace B^n. Thus

||G − T||_F^2 = ||G − Ḡ||_F^2 + ||Ḡ − T||_F^2,

so minimizing ||G − T||_F over bisymmetric positive semi-definite T is equivalent to minimizing ||Ḡ − T||_F. As a result of this proposition, the following problem is equivalent to (1.5): From Lemma 3.1, the following are equivalences (for t ≥ 0 ∈ IR): Therefore, we have the following SDP problem: The dimensions of this problem are r and n + r + 2. These dimensions are much better than those of (3.3), where r = mn − k, m = n/2 and k = n(n − 2)/4 if n is even, and m = (n + 1)/2 and k = (n − 1)(n + 1)/4 if n is odd.
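The dimension formula r = mn − k of (1.3) can be sanity-checked numerically. The sketch below is ours (assuming NumPy; the helper names are illustrative): it computes dim B^n as the rank of the orthogonal projection onto bisymmetric matrices, applied to all standard basis matrices, and compares it with the formula:

```python
import numpy as np

def bisym_dim(n):
    """Dimension of B^n: rank of the image of the orthogonal projection
    onto bisymmetric matrices, applied to every standard basis matrix."""
    J = np.fliplr(np.eye(n))
    cols = []
    for i in range(n):
        for j in range(n):
            E = np.zeros((n, n))
            E[i, j] = 1.0
            # orthogonal projection of E onto the bisymmetric subspace
            P = (E + E.T + J @ E @ J + J @ E.T @ J) / 4.0
            cols.append(P.ravel())
    return np.linalg.matrix_rank(np.array(cols))

def r_formula(n):
    """r = mn - k as in (1.3)."""
    if n % 2 == 0:
        m, k = n // 2, n * (n - 2) // 4
    else:
        m, k = (n + 1) // 2, (n - 1) * (n + 1) // 4
    return m * n - k
```

For n = 30 the formula gives r = 240, matching the variable count quoted for the 30 × 30 example earlier in this section.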

Formulation III (SDQ)
An alternative method for formulating (1.5) is by means of the definition of the Frobenius norm, which expresses the objective as a quadratic y^T P y + 2 q^T y + s in the vector of unknowns y. For a nonnegative real scalar t, we have

(P^{1/2} y)^T (P^{1/2} y) + 2 q^T y + s ≤ t  ⇔  t − 2 q^T y − s − (P^{1/2} y)^T I^{-1} (P^{1/2} y) ≥ 0.

Therefore, we deduce the following SDP problem: This is an SDP problem in the dual form (3.2), and its dimensions are r + 1 and n + r + 2. Although the dimensions of problem (3.6) are the same as those of problem (3.5), problem (3.6) is less efficient to solve with an SDP method, especially when G is large. In practice, as we will see later in Sect. 5, the performance of the SDQ formulation is not as good as that of SDB. The reason for this inefficiency is that the matrix P has full rank, which makes the system badly conditioned. A more efficient method can be obtained by reformulating over the second-order cone, as described in Sect. 4 (see [12]).

The SDQ formulation appears straightforward; nonetheless, we found that using it to solve related problems was not an adequate option. The cause is explained in the next section, where the mixed SDP and second-order cone programming method is discussed, and is confirmed in Sect. 5 on large numerical examples with n > 50. The SDV formulation does not compete favorably with the other two formulations, SDB and SDQ, due to the quantity of work per iteration of an interior-point method applied to it, which is O(n^6), where n is the dimension of G and O(·) denotes the order of the operation count. The SDV formulation is even slower than the projection method in some cases; hence, using the SDV formulation to solve (1.5) is time consuming. The above discussion makes the SDB formulation the best choice, since we anticipate good performance while it has neither the ill-conditioning of SDQ nor the large size of SDV.
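The Schur-complement step that converts the quadratic inequality above into a linear matrix inequality can be checked numerically. The following sketch is ours (assuming NumPy, with illustrative names); it builds the arrow-shaped block matrix and verifies that its positive semi-definiteness coincides with the scalar inequality:

```python
import numpy as np

def schur_block(F, q, s, y, t):
    """[[I, F y], [(F y)^T, t - 2 q^T y - s]]: PSD if and only if
    y^T (F^T F) y + 2 q^T y + s <= t, by the Schur complement."""
    u = F @ y
    n = len(y)
    M = np.zeros((n + 1, n + 1))
    M[:n, :n] = np.eye(n)
    M[:n, n] = u
    M[n, :n] = u
    M[n, n] = t - 2 * q @ y - s
    return M

rng = np.random.default_rng(0)
n = 5
F = rng.standard_normal((n, n))   # any factor with P = F^T F (so P is PSD)
q = rng.standard_normal(n)
s = 1.3
y = rng.standard_normal(n)
quad = y @ (F.T @ F) @ y + 2 * q @ y + s

# the LMI holds exactly when the quadratic is below the threshold t
for t in (quad + 0.5, quad - 0.5):
    is_psd = np.linalg.eigvalsh(schur_block(F, q, s, y, t)).min() >= -1e-9
    assert is_psd == (quad <= t)
```

Here F plays the role of P^{1/2}; any factor with P = F^T F works, since the Schur argument only uses u = F y.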

Mixed semidefinite and second-order cone approach
Now, we explain the primal and dual mixed semi-definite and second-order cone programming (SOCP) problem, which is of the form: The variables are X_S ∈ S^n and X_Q ∈ IR^k. The given data are C_S, (D_S)_i ∈ S^n and C_Q, (D_Q)_i ∈ IR^k, ∀i. Each of the two inequalities above has a different meaning: X_S ⪰ 0 means X_S ∈ S_+^n, and X_Q ≥_Q 0 means X_Q ∈ Q^k.
The dual problem of (4.1) is: Here, y ∈ IR m is the variable. The objective function of problem (1.5) can now be rewritten as a dual SOCP in three different ways.

Formulation IV (SQV)
One direction is to define the second-order cone variable from G − B(b). Hence, if we impose ||G − B(b)||_F ≤ t for t ∈ IR_+, the second-order cone definition gives [t; vec(G − B(b))] ∈ Q^{n^2+1}. Therefore, we have the following problem equivalent to (1.5): where t ∈ IR_+. The above problem is in the form of problem (4.2); the second-order cone constraint is what distinguishes this form from SDV. The dimensions of this SQV problem are r + 1 variables, an SDP part of size n + 1, and an SOCP part of size n^2 + 1. This makes us anticipate much lower efficiency from SQV in practice.

Formulation V (SQQ)
The second formulation is based on the quadratic expansion established in Sect. 3.3, i.e.
Therefore, we have the following problem, which is equivalent to problem (1.5): But y^T P y + 2 q^T y + s = ||P^{1/2} y + P^{-1/2} q||_2^2 + s − q^T P^{-1} q. Hence, we can optimize ||G − B(b)||_F^2 by minimizing ||P^{1/2} y + P^{-1/2} q||_2. Then, we have the following equivalent problem: where t ∈ IR_+. The above problem is in the form of problem (4.2); the second-order cone constraint is what distinguishes this form from SDQ. The dimensions of this SQQ problem are r + 1, with an SDP part of size n + 1 and an SOCP part of size r + 1. It may be noticed that we did not impose explicitly the constraint that B(b) be bisymmetric; this is because the structure of the bisymmetric matrix B(b) is embedded in the other constraints.
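The completion-of-squares identity used above, y^T P y + 2 q^T y + s = ||P^{1/2} y + P^{-1/2} q||_2^2 + s − q^T P^{-1} q, can be verified numerically. A small sketch of ours, assuming NumPy and a positive definite P (all names illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
A = rng.standard_normal((n, n))
P = A @ A.T + n * np.eye(n)                    # symmetric positive definite
lam, Q = np.linalg.eigh(P)
P_half = Q @ np.diag(np.sqrt(lam)) @ Q.T       # symmetric square root P^{1/2}
P_half_inv = Q @ np.diag(1.0 / np.sqrt(lam)) @ Q.T

q = rng.standard_normal(n)
s = 0.7
y = rng.standard_normal(n)

# both sides of the completion-of-squares identity
lhs = y @ P @ y + 2 * q @ y + s
rhs = (np.linalg.norm(P_half @ y + P_half_inv @ q) ** 2
       + s - q @ np.linalg.solve(P, q))
assert abs(lhs - rhs) < 1e-7
```

Minimizing the left-hand side over y is therefore the same as minimizing ||P^{1/2} y + P^{-1/2} q||_2, since the remaining terms do not depend on y.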

Formulation VI (SQB)
The last formulation takes advantage of the bisymmetric structure of B(b) explicitly. In Sect. 3, we introduced the vectorization operator bvec on bisymmetric matrices; this operator will be used to develop the SQB formulation. Lemma 3.3 leads to: The second-order cone dimension in this formulation is r + 1, the same as that of SQQ. The mixed formulations are expected to be more efficient in practice than the SDP-only formulations, particularly SQQ and SQB, which have the smallest dimension in the second-order cone constraint. For interior point methods, SOCP has better worst-case complexity than SDP. Nevertheless, SDB has a much smaller SDP dimension and none of the weaknesses of SDQ, which makes SDB the preferable choice among the SDP formulations; this is due to the economical vectorization operator bvec. Practical experiments show that the SQB formulation is competitive with SQQ and behaves similarly to SDB (see Sect. 5).

Numerical results
In this section, we present and compare the performance of the methods discussed in the previous sections. First, we present numerical results for the projection method; for the interior-point methods, we use the NT direction of the primal-dual path-following method. We then present numerical results for all six formulations of Sects. 3 and 4 and compare them with the alternating projection method.
To implement the modified alternating projection method, a Matlab code was written, and the iteration is stopped when ||P_B(P_S(G^(j))) − P_S(G^(j))||_F ≤ 10^{-5}. For the six SDP and SOCP formulations, we used the software SDPT3 ver. 3.0 [17] because of its stability and its ability to exploit sparsity very efficiently.
Problem (1.5) was transformed into six different formulations as explained in Sects. 3 and 4. For each formulation, we wrote a Matlab code that transforms the problem and passes it to SDPT3 for a first run. A second run is then made with the minimal iterate from the first run as the starting point, and this process is repeated until no further progress is detected. All numerical experiments in this section were executed using Matlab 9.0.
We applied all approaches to problems ranging from small dense problems with n = 10 up to large problems with n = 150. The test problems are constructed as follows: we generate a positive definite bisymmetric matrix B at random, with elements of B varying between −10.0 and 10.0, and perturb it by adding a random noise matrix N. The problem is then to retrieve the matrix before the noise was added. The optimal solution is found in all cases with high accuracy, at least seven decimals, except for the projection method, where we stop at five decimals of accuracy. Table 1 shows how close, in the Frobenius norm, the minimal solution B* of each method is to the data matrix G for selected test problems. Figure 1 shows the comparison of CPU time (in ln of seconds) against the size of the data matrix G, illustrating the correlation between matrix size and CPU time.
The SDV formulation does not compete favorably with the other formulations due to the volume of work at each iteration of the interior-point methods, which is O(n^6), where O(·) denotes the order of the operation count. The SDV formulation is sometimes even slower than the projection method; hence, using the SDV formulation to solve (1.5) is time consuming. This leaves us with the SDB and SQB formulations, from which we anticipate good performance since they have neither the weakness of SDQ and SQQ nor the huge size of SDV and SQV. In practice, however, we found that all formulations work well except SDV and SQV.