
Computational Complexity of Decomposing a Symmetric Matrix as a Sum of Positive Semidefinite and Diagonal Matrices

Published in: Foundations of Computational Mathematics

Abstract

We study several variants of decomposing a symmetric matrix into a sum of a low-rank positive-semidefinite matrix and a diagonal matrix. Such decompositions have applications in factor analysis, and they have been studied for many decades. On the one hand, we prove that when the rank of the positive-semidefinite matrix in the decomposition is bounded above by an absolute constant, the problem can be solved in polynomial time. On the other hand, we prove that, in general, these problems, as well as certain approximation versions of them, are all NP-hard. Finally, we prove that many of these low-rank decomposition problems are complete in the first-order theory of the reals, i.e., given any system of polynomial equations, we can write down a low-rank decomposition problem in polynomial time so that the original system has a solution if and only if the corresponding decomposition problem has a feasible solution of a certain (lowest) rank.


Notes

  1. for a definition, see [17]

References

  1. A. A. Albert. The matrices of factor analysis. Proc. Nat. Acad. Sci. U.S.A., 30:90–95, 1944.


  2. Sal Barone and Saugata Basu. Refined bounds on the number of connected components of sign conditions on a variety. Discrete & Computational Geometry, 47(3):577–597, 2012.


  3. Saugata Basu, Richard Pollack, and Marie-Françoise Roy. Algorithms in real algebraic geometry, volume 10 of Algorithms and Computation in Mathematics. Springer-Verlag, Berlin, second edition, 2006.

  4. Dimitris Bertsimas, Martin S. Copenhaver, and Rahul Mazumder. Certifiably optimal low rank factor analysis. The Journal of Machine Learning Research, 18(1):907–959, 2017.

  5. Dimitris Bertsimas, Ryan Cory-Wright, and Jean Pauphilet. Mixed-projection conic optimization: a new paradigm for modeling rank constraints. Oper. Res., 70(6):3321–3344, 2022.


  6. L. Blum, M. Shub, and S. Smale. On a theory of computation and complexity over the real numbers: NP-completeness, recursive functions and universal machines. Bulletin of the American Mathematical Society, 21:1–46, 1989.


  7. Christopher W. Brown and James H. Davenport. The complexity of quantifier elimination and cylindrical algebraic decomposition. In Proceedings of the 2007 International Symposium on Symbolic and Algebraic Computation, ISSAC ’07, pages 54–60, New York, NY, USA, 2007. Association for Computing Machinery.

  8. John Canny. Some algebraic and geometric computations in PSPACE. In Proceedings of the Twentieth Annual ACM Symposium on Theory of Computing, STOC ’88, pages 460–467, New York, NY, USA, 1988. Association for Computing Machinery.

  9. George E. Collins. Quantifier elimination for real closed fields by cylindrical algebraic decomposition. In H. Brakhage, editor, Automata Theory and Formal Languages, pages 134–183, Berlin, Heidelberg, 1975. Springer Berlin Heidelberg.

  10. Giacomo Della Riccia and Alexander Shapiro. Minimum rank and minimum trace of covariance matrices. Psychometrika, 47(4):443–448, 1982.

  11. Mathias Drton, Bernd Sturmfels, and Seth Sullivant. Algebraic factor analysis: tetrads, pentads and beyond. Probability Theory and Related Fields, 138:463–493, 2007.


  12. M. Fazel, H. Hindi, and S.P. Boyd. A rank minimization heuristic with application to minimum order system approximation. In Proceedings of the 2001 American Control Conference. (Cat. No.01CH37148), volume 6, pages 4734–4739 vol.6, 2001.

  13. Bin Gao and P.-A. Absil. A Riemannian rank-adaptive method for low-rank matrix completion. Computational Optimization and Applications, 81:1–24, 2022.

  14. Gene H. Golub and Charles F. Van Loan. Matrix computations. Johns Hopkins Studies in the Mathematical Sciences. Johns Hopkins University Press, Baltimore, MD, fourth edition, 2013.

  15. Monique Laurent. Cuts, matrix completions and graph rigidity. Math. Programming, 79(1-3, Ser. B):255–283, 1997.

  16. László Lovász. Graphs and geometry, volume 65 of American Mathematical Society Colloquium Publications. American Mathematical Society, Providence, RI, 2019.

  17. Nikolai E Mnëv. The universality theorems on the classification problem of configuration varieties and convex polytopes varieties. In Topology and geometry-Rohlin seminar, pages 527–543. Springer, 1988.

  18. A. Nemirovskii. Several NP-hard problems arising in robust stability analysis. Math. Control Signals Systems, 6(2):99–105, 1993.


  19. René Peeters. Orthogonal representations over finite fields and the chromatic number of graphs. Combinatorica, 16(3):417–431, 1996.


  20. Svatopluk Poljak and Jiří Rohn. Checking robust nonsingularity is NP-hard. Math. Control Signals Systems, 6(1):1–9, 1993.


  21. C Ramya. Recent progress on matrix rigidity–a survey. arXiv preprint arXiv:2009.09460, 2020.

  22. Benjamin Recht, Maryam Fazel, and Pablo A. Parrilo. Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization. SIAM Review, 52(3):471–501, 2010.


  23. Benjamin Recht and Christopher Ré. Parallel stochastic gradient algorithms for large-scale matrix completion. Mathematical Programming Computation, 5(2):201–226, 2013.


  24. F. Rouillier, M.-F. Roy, and M. Safey El Din. Finding at least one point in each connected component of a real algebraic set defined by a single equation. Journal of Complexity, 16(4):716–750, 2000.


  25. J. Saunderson, V. Chandrasekaran, P. A. Parrilo, and A. S. Willsky. Diagonal and low-rank matrix decompositions, correlation matrices, and ellipsoid fitting. SIAM Journal on Matrix Analysis and Applications, 33(4):1395–1416, 2012.


  26. J. B. Saxe. Two papers on graph embedding problems. Technical Report CMU-CS80-102, Department of Computer Science, Carnegie-Mellon University, 1980.

  27. Alexander Shapiro. Statistical inference of semidefinite programming. Math. Program., 174(1-2, Ser. B):77–97, 2019.

  28. Yaroslav Shitov. How hard is the tensor rank? arXiv preprint arXiv:1611.01559, 2016.

  29. Yilei Wu, Yingli Qin, and Mu Zhu. High-dimensional covariance matrix estimation using a low-rank and diagonal decomposition. Canadian Journal of Statistics, 48(2):308–337, 2020.


  30. Fuzhen Zhang, editor. The Schur complement and its applications, volume 4. Springer Science & Business Media, 2006.


Acknowledgements

The authors are grateful to two anonymous referees for their careful reading of the paper and helpful comments that have significantly improved the presentation.

Author information

Corresponding author

Correspondence to Stephen A. Vavasis.

Additional information

Communicated by Jim Renegar.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Levent Tunçel: Research of this author was supported in part by Discovery Grants from the Natural Sciences and Engineering Research Council (NSERC) of Canada, and by U.S. Office of Naval Research under Award Number N00014-18-1-2078.

Stephen A. Vavasis: Research of this author was supported in part by Discovery Grants from the Natural Sciences and Engineering Research Council (NSERC) of Canada.

Jingye Xu: Work of this author was completed during 2021–2022 while in the undergraduate program at David R. Cheriton School of Computer Science and the Department of Statistics and Actuarial Science, Faculty of Mathematics, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada. Research of this author was supported in part by Discovery Grant from the Natural Sciences and Engineering Research Council (NSERC) of Canada.

Appendix A: NP-hardness of \(({{\,\mathrm{{\overset{\sim }{\hbox {P}}}}\,}}_2)\)


A.1 Restatement of \(({{\,\mathrm{{\overset{\sim }{\hbox {P}}}}\,}}_2)\)

A polynomial p(n) is fixed in advance. One is given \(B\in \mathbb {S}^n\) such that \({{\,\mathrm{{\text {diag}}}\,}}(B)={\varvec{0}}\), an integer r, and a number \(\epsilon >0\). Assume that there exists a vector \({\textbf{d}}_0\ge {\varvec{0}}\) and matrix \(H_0\) such that \(\Vert H_0\Vert _F\le \epsilon \), \(B+H_0+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}_0)\succeq 0\), and \({{\,\mathrm{{\text {rank}}}\,}}(B+H_0+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}_0))\le r\). Find a vector \({\textbf{d}}\ge {\varvec{0}}\) such that there exists H such that \(\Vert H\Vert _F\le p(n)\epsilon \), \(B+H+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}})\succeq 0\), and \({{\,\mathrm{{\text {rank}}}\,}}(B+H+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}))\le r\).
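As an illustration of the three conditions in this restatement, the following is a minimal numerical sketch of a feasibility check for a candidate pair \((H,{\textbf{d}})\). The function name `is_feasible`, the tolerance `tol`, and the toy instance are assumptions made for this example only; rank is estimated numerically from eigenvalues, which is not an exact test.

```python
import numpy as np

def is_feasible(B, H, d, r, h_bound, tol=1e-8):
    """Numerically check d >= 0, ||H||_F <= h_bound,
    B + H + Diag(d) PSD, and rank(B + H + Diag(d)) <= r."""
    M = B + H + np.diag(d)
    eigvals = np.linalg.eigvalsh((M + M.T) / 2)   # ascending eigenvalues
    scale = max(1.0, float(np.abs(eigvals).max()))
    psd = eigvals[0] >= -tol * scale
    numerical_rank = int(np.sum(eigvals > tol * scale))
    small_h = np.linalg.norm(H, "fro") <= h_bound + tol
    nonneg_d = bool(np.all(np.asarray(d) >= 0))
    return bool(psd and numerical_rank <= r and small_h and nonneg_d)

# Toy instance (hypothetical): B has zero diagonal and
# B + Diag(d) is the all-ones matrix, which is PSD of rank 1.
B = np.ones((3, 3)) - np.eye(3)
d = np.ones(3)
H = np.zeros((3, 3))
feasible_rank1 = is_feasible(B, H, d, r=1, h_bound=0.0)
infeasible_rank0 = is_feasible(B, H, d, r=0, h_bound=0.0)
```

In the notation of the restatement, `h_bound` would be \(\epsilon \) when checking the promise and \(p(n)\epsilon \) when checking an algorithm's output.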

The construction in this appendix takes as input an undirected graph \({\mathcal {G}}\) and produces a matrix B, integer r, and number \(\epsilon >0\). If the graph is 3-colorable, then there exist \({\textbf{d}}\) and H such that \(B+H+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}})\) is positive semidefinite and has rank at most r, and \(\Vert H\Vert _F\le \epsilon \). Therefore, a candidate algorithm that correctly solves \(({{\,\mathrm{{\overset{\sim }{\hbox {P}}}}\,}}_2)\) must find some H and \({\textbf{d}}\) satisfying \(B+H+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}})\succeq 0\), \({{\,\mathrm{{\text {rank}}}\,}}(B+H+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}))\le r\), and \(\Vert H\Vert _F\le p(n)\epsilon \). On the other hand, if \({\mathcal {G}}\) is not 3-colorable, our construction has the property that there is no \((H,{\textbf{d}})\) satisfying the properties in the previous sentence, so a candidate algorithm that correctly solves \(({{\,\mathrm{{\overset{\sim }{\hbox {P}}}}\,}}_2)\) must report failure on such an instance. In this way, we prove that any candidate algorithm for \(({{\,\mathrm{{\overset{\sim }{\hbox {P}}}}\,}}_2)\) can solve the NP-hard problem of determining graph 3-colorability.

A.2 Preliminary Graph Construction

Suppose we are given a graph \({\mathcal {G}}\) that we wish to test for 3-colorability. Let \(c>0\) be a fixed integer. From \({\mathcal {G}}\), it is possible to construct a larger graph \({\mathcal {G}}'\) with the following property. If \({\mathcal {G}}\) is 3-colorable, then so is \({\mathcal {G}}'\). However, if \({\mathcal {G}}\) is not 3-colorable, then neither is any induced subgraph of \({\mathcal {G}}'\) whose number of nodes is \(|{\mathcal {V}}({\mathcal {G}}')|-c\) or larger. In other words, \({\mathcal {G}}'\) is “robustly” non-3-colorable in the sense that non-3-colorability is preserved under deletion of up to c nodes.

The construction of \({\mathcal {G}}'\) is as follows. Replace every node in \({\mathcal {G}}\) with a gadget consisting of \(3(c+1)\) nodes. Connect these nodes with a complete 3-partite graph. In other words, partition the nodes of the gadget into three sets of size \((c+1)\) each and then connect every node to all the nodes in the other two partitions. This requires a total of \(3(c+1)^2\) edges. The point is that in any 3-coloring of the gadget, the three partitions must be three different colors, and this property holds robustly under the deletion of any c nodes.

Now, for each original node of \({\mathcal {G}}\), pick one of the three partitions of its gadget to be “exposed”. For every edge \(\{i,j\}\) of the original graph \({\mathcal {G}}\), join all \(c+1\) exposed copies of i to all \(c+1\) exposed copies of j, so that the original edge of \({\mathcal {G}}\) is replaced by \((c+1)^2\) edges in \({\mathcal {G}}'\). It is an easy exercise to show that this construction has the claimed property.
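The construction of \({\mathcal {G}}'\) described above can be sketched in a few lines. This is an illustrative sketch only: the function name `robustify`, the node labels `(v, part, copy)`, and the choice of part 0 as the "exposed" partition are assumptions of the example, not notation from the text.

```python
from itertools import product

def robustify(nodes, edges, c):
    """Build the node list and edge set of G' from G.

    Each node v of G becomes 3*(c+1) nodes (v, part, copy), part in {0, 1, 2};
    part 0 plays the role of the "exposed" partition (an arbitrary choice).
    """
    copies = range(c + 1)
    new_nodes = [(v, p, a) for v in nodes for p in range(3) for a in copies]
    new_edges = set()
    # Complete 3-partite gadget inside each original node.
    for v in nodes:
        for p, q in ((0, 1), (0, 2), (1, 2)):
            for a, b in product(copies, copies):
                new_edges.add(frozenset({(v, p, a), (v, q, b)}))
    # Each original edge {i, j} joins all exposed copies of i to those of j.
    for i, j in edges:
        for a, b in product(copies, copies):
            new_edges.add(frozenset({(i, 0, a), (j, 0, b)}))
    return new_nodes, new_edges

# Triangle graph with c = 2: three gadgets of 9 nodes each.
nodes, edges = [0, 1, 2], [(0, 1), (1, 2), (0, 2)]
big_nodes, big_edges = robustify(nodes, edges, c=2)
```

For the triangle with \(c=2\), each gadget contributes \(3(c+1)^2=27\) edges and each original edge contributes \((c+1)^2=9\), so the sketch produces 27 nodes and 108 edges.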

For the remainder of this appendix, we assume that, first, the input graph is replaced by its Peeters supergraph, and second, the transformation described in this subsection has been applied to the supergraph. We therefore dispense with the notation \({\mathcal {G}}'\) for the transformed graph.

A.3 A Linear Algebraic Lemma

Lemma A.1

Let \(A\in \mathbb {S}^n\) be positive semidefinite, and suppose that all of its off-diagonal entries are negative. Then, \({{\,\mathrm{{\text {rank}}}\,}}(A)\ge n-1\). Furthermore,

  • If \({{\,\mathrm{{\text {rank}}}\,}}(A)=n-1\), then \({{\,\mathrm{{\text {Null}}}\,}}(A)\) is spanned by a vector all of whose entries are positive. Furthermore, no vector all of whose entries are positive can be in \({{\,\mathrm{{\text {Range}}}\,}}(A)\).

  • If \({{\,\mathrm{{\text {rank}}}\,}}(A)=n\), then all entries of \(A^{-1}\) are positive.

Proof

The proof is by induction on n. The base case \(n=1\) is obvious. Now suppose the claim holds for \(n-1\). Let us rewrite

$$\begin{aligned} A=\left( \begin{array}{cc} \alpha &{} {\textbf{w}}^{\top } \\ {\textbf{w}}&{} A' \end{array} \right) , \end{aligned}$$

in which \({\textbf{w}}<{\textbf{0}}\), and \(A'\in \mathbb {S}^{n-1}\) is semidefinite with negative off-diagonal entries. We also know \(\alpha >0\) by Lemma 2.1 since \({\textbf{w}}\ne {\textbf{0}}\). Therefore, the Schur complement \(S:=A'-{\textbf{w}}{\textbf{w}}^{\top }/\alpha \) exists and is semidefinite. However, observe that the subtracted term \({\textbf{w}}{\textbf{w}}^{\top }/\alpha \) is positive in all entries by the sign assumption. Since \(A'\) previously had negative off-diagonal entries, it follows that S must also have this property. Therefore, by the induction hypothesis, \({{\,\mathrm{{\text {rank}}}\,}}(S)\ge n-2\), hence \({{\,\mathrm{{\text {rank}}}\,}}(A)\ge n-1\).

Next, consider two subcases based on whether \({{\,\mathrm{{\text {rank}}}\,}}(S)=n-2\) or \({{\,\mathrm{{\text {rank}}}\,}}(S)=n-1\). In the first subcase, by induction, the 1-dimensional null space of S is spanned by some \({\textbf{v}}>{\varvec{0}}\). Let \({{\bar{{\textbf{v}}}}}=[-{\textbf{w}}^{\top }{\textbf{v}}/\alpha ; {\textbf{v}}]\). Then, one checks by multiplying out that \({{\bar{{\textbf{v}}}}}\) is in the null space of A. Vector \({{\bar{{\textbf{v}}}}}\) has all positive entries. Next, for a symmetric matrix, the range space and null space are orthogonal complements. Since the null space is spanned by an all-positive vector, it is not possible for an all-positive vector to be in the range (because the inner product of two positive vectors is positive rather than 0).
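The null-vector construction in this subcase can be checked numerically. The sketch below uses a hypothetical \(4\times 4\) example (a weighted complete-graph Laplacian rescaled by a positive diagonal matrix, chosen only because it is positive semidefinite of rank \(n-1\) with negative off-diagonal entries); all variable names are assumptions of the example.

```python
import numpy as np

# Weighted complete-graph Laplacian, rescaled by a positive diagonal T:
# PSD, negative off-diagonal entries, rank n - 1.
W = np.array([[0., 1., 2., 3.],
              [1., 0., 4., 5.],
              [2., 4., 0., 6.],
              [3., 5., 6., 0.]])
L = np.diag(W.sum(axis=1)) - W
T = np.diag([1.0, 2.0, 0.5, 4.0])
A = T @ L @ T

alpha, w, A_sub = A[0, 0], A[1:, 0], A[1:, 1:]
S = A_sub - np.outer(w, w) / alpha            # Schur complement of alpha

# Null vector of S: eigenvector of the smallest eigenvalue, sign-normalized.
eigvals, eigvecs = np.linalg.eigh(S)
v = eigvecs[:, 0]
v = v if v.sum() > 0 else -v

v_bar = np.concatenate(([-w @ v / alpha], v))  # the vector [-w^T v / alpha; v]
residual = np.linalg.norm(A @ v_bar)           # should vanish up to rounding
```

Here `v` and `v_bar` come out entrywise positive and `residual` is at rounding level, matching the claim that \({{\bar{{\textbf{v}}}}}\) is a positive null vector of A.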

The other subcase is that \({{\,\mathrm{{\text {rank}}}\,}}(S)=n-1\). Then, one checks by multiplying out that

$$\begin{aligned} A^{-1}=\left( \begin{array}{cc} 1/\alpha + {\textbf{w}}^{\top }S^{-1}{\textbf{w}}/\alpha ^2 &{} -{\textbf{w}}^{\top }S^{-1}/\alpha \\ -S^{-1}{\textbf{w}}/\alpha &{} S^{-1} \end{array}\right) \end{aligned}$$

By the assumptions that \({\textbf{w}}<{\varvec{0}}\) and induction hypothesis that \(S^{-1}>0\), one confirms that all four blocks of this matrix are positive. \(\square \)
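The block formula for \(A^{-1}\) and the positivity claim can be confirmed on a small example. The matrix below is a hypothetical instance (strictly diagonally dominant with negative off-diagonal entries, hence positive definite); the variable names are assumptions of this sketch.

```python
import numpy as np

# Positive definite matrix with negative off-diagonal entries
# (strictly diagonally dominant with positive diagonal).
A = np.array([[ 7., -1., -2., -3.],
              [-1., 11., -4., -5.],
              [-2., -4., 13., -6.],
              [-3., -5., -6., 15.]])
alpha, w, A_sub = A[0, 0], A[1:, 0], A[1:, 1:]
S_inv = np.linalg.inv(A_sub - np.outer(w, w) / alpha)

# Assemble A^{-1} block by block, following the displayed formula.
top_left = 1 / alpha + w @ S_inv @ w / alpha**2
off_diag = -S_inv @ w / alpha
A_inv_blocks = np.block([[np.array([[top_left]]), off_diag[None, :]],
                         [off_diag[:, None], S_inv]])

max_error = np.abs(A_inv_blocks - np.linalg.inv(A)).max()
all_positive = bool((A_inv_blocks > 0).all())
```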

A.4 Construction of B and r

The reduction showing that \(({{\,\mathrm{{\overset{\sim }{\hbox {P}}}}\,}}_2)\) is NP-hard is again from graph 3-coloring and is presented in this subsection. Let \(\mathcal {G}\) be the graph to be tested for 3-colorability. Based on the construction in A.2, we will assume that \(\mathcal {G}\) has already been transformed as described there, with robustness parameter c to be determined later. We define

$$\begin{aligned} B=\left( \begin{array}{cc} D &{} K^{\top } \\ K &{} A \end{array} \right) , \end{aligned}$$
(14)

where the blocks D, K, and A will be defined in the remainder of this subsection. The entries of these matrices will depend on a ‘large’ parameter \(s>0\) and ‘small’ parameter \(\delta >0\). The precise value of \(\delta \) is given in (17) below. The value of s is not precisely specified; instead, s must be chosen large enough so that several inequalities involving the other parameters hold. All of these inequalities are lower bounds on s, so there is no possibility of conflicting inequalities for s. The valid range of parameter \(\epsilon \) (which appears in the statement of \(({{\,\mathrm{{\overset{\sim }{\hbox {P}}}}\,}}_2)\)) is determined by (21).

First, block A is the same as the block A that was constructed before Theorem 3.10 and used for both the \(({{\,\mathrm{{\text {P}}}\,}}_1)\) and \(({{\,\mathrm{{\text {P}}}\,}}_2)\) input constructions. Thus, \(A\in \mathbb {R}^{3n\times 3n}\), where for this appendix \(n:=|\mathcal {V}(\mathcal {G})|\).

As earlier, use the term “nonedge” to refer to a pair \(\{i,j\}\in \mathcal {K}(\mathcal {V}(\mathcal {G})){\setminus } \mathcal {E}(\mathcal {G})\). In other words, \(\{i,j\}\in \mathcal {E}({\bar{\mathcal {G}}})\), the complement of \(\mathcal {G}\). Let \({{\bar{m}}}\) denote the number of nonedges, which is equal to \(\left( {\begin{array}{c}n\\ 2\end{array}}\right) -|\mathcal {E}(\mathcal {G})|\).

Next, we turn to block K, which has nine columns per nonedge of \(\mathcal {G}\), i.e., \(9{\bar{m}}\) columns. Focus on one of those nine columns, say column u, for one particular nonedge \(\{i,j\}\in \mathcal {E}({\bar{\mathcal {G}}})\). This column is in correspondence with a particular ‘2’ and its symmetric partner in A (out of the nine pairs that correspond to \(\{i,j\}\)), say the ‘2’ that appears in positions \((k,l)\) and \((l,k)\) of A. Then,

$$\begin{aligned} K(t,u)=\left\{ \begin{array}{ll} s, &{} \text{ if } t=k \text{ or } t=l, \\ s\delta , &{} \text{ else. } \end{array} \right. \end{aligned}$$

Thus, K has two kinds of entries, ‘small’ (namely, \(s\delta \)) and ‘large’ (namely s) with exactly two large entries per column. In what follows, we write \(K=K^\textrm{lg}+K^\textrm{sm}\) with the large and small entries, respectively.

The parameter denoted “p(n)” in the statement of \(({{\,\mathrm{{\overset{\sim }{\hbox {P}}}}\,}}_2)\) is actually \(p(9{\bar{m}}+3n)\) in the context of the construction of this section. Since the argument of \(p(\cdot )\) does not change in this section, we will abbreviate \(p(9{{\bar{m}}}+3n)\) as \({\hat{p}}\). We will assume that \({\hat{p}}>1\) throughout.

Define D to be a \(9{{\bar{m}}}\times 9{{\bar{m}}}\) matrix all of whose off-diagonal entries equal \(-2{\hat{p}} \epsilon \) and whose diagonal entries are 0’s. Finally, define the rank cutoff \(r:=9{{\bar{m}}} + 3\).
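The blocks K and D can be assembled directly from this description. The sketch below is illustrative: block A is constructed earlier in the paper and is not reproduced here, so the list `large_positions` (one \((k,l)\) pair per column of K) is treated as a given input, and the function name, argument names, and toy values are all assumptions of the example.

```python
import numpy as np

def build_K_and_D(num_rows, large_positions, s, delta, p_hat, eps):
    """Assemble K and D as described above.

    num_rows        -- 3n, the order of block A
    large_positions -- one (k, l) pair per column of K: the position in A
                       of the '2' that the column is matched with
    """
    num_cols = len(large_positions)            # 9 * m_bar in the construction
    K = np.full((num_rows, num_cols), s * delta)   # 'small' entries everywhere
    for u, (k, l) in enumerate(large_positions):
        K[k, u] = s                            # exactly two 'large' entries
        K[l, u] = s                            # per column
    # Zero diagonal, every off-diagonal entry equal to -2 * p_hat * eps.
    D = -2.0 * p_hat * eps * (np.ones((num_cols, num_cols)) - np.eye(num_cols))
    return K, D

# Hypothetical toy input: 6 rows and three matched positions.
K, D = build_K_and_D(6, [(0, 3), (1, 4), (2, 5)], s=100.0, delta=1e-3,
                     p_hat=2.0, eps=0.01)
```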

The main theorem about this construction is as follows. Its proof occupies the remainder of this appendix. This theorem shows that graph 3-colorability can be reduced to \(({{\,\mathrm{{\overset{\sim }{\hbox {P}}}}\,}}_2)\), and thus, the latter is NP-hard.

Theorem A.2

Given a graph \(\mathcal {G}\), form \(\mathcal {G}'\) as in Section A.2, and from \(\mathcal {G}'\) form B given by (14). If \(\mathcal {G}\) is 3-colorable, then there exists an \(H\in \mathbb {S}^{9{\bar{m}}+3n}\) such that \(\Vert H\Vert _F\le \epsilon \) and a vector \({\textbf{d}}\in \mathbb {R}^{9{{\bar{m}}} + 3n}\) such that \({{\,\mathrm{{\text {rank}}}\,}}(B+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}})+H)\le 9{{\bar{m}}} + 3\). On the other hand, if \(\mathcal {G}\) is not 3-colorable, then for any H such that \(\Vert H\Vert _F\le {\hat{p}}\epsilon \) and for any vector \({\textbf{d}}\), \({{\,\mathrm{{\text {rank}}}\,}}(B+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}})+H)\ge 9{{\bar{m}}} + 4\).

A.5 Rank when \(\mathcal {G}\) is 3-Colorable

We argue that if \(\mathcal {G}\) is 3-colorable, then there exists a nonnegative \({\textbf{d}}\in \mathbb {R}^{9{\bar{m}}+3n}\) and perturbation \(H\in \mathbb {S}^{9{{\bar{m}}} + 3n}\) such that \(\Vert H\Vert _F\le \epsilon \) and such that \({{\,\mathrm{{\text {rank}}}\,}}(B+H+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}))\le 9{{\bar{m}}} +3\). Write \({\textbf{d}}=[{\textbf{d}}^\mathcal {E},{\textbf{d}}^\mathcal {V}]\), and partition

$$\begin{aligned} H=\left( \begin{array}{cc} H_1 &{} H_2^{\top } \\ H_2 &{} H_3 \end{array} \right) \end{aligned}$$

conformally with B. Each nonedge of \(\mathcal {G}\) corresponds to nine entries of \({\textbf{d}}^\mathcal {E}\). Consider diagonal entry \(u\in \{1,\ldots ,9{\bar{m}}\}\) corresponding to nonedge \(\{i,j\}\). Then, we choose

$$\begin{aligned} d^\mathcal {E}(u)= \left\{ \begin{array}{ll} s^2/2 &{} \text{ if } i, j \text{ have } \text{ different } \text{ colors }, \\ s^2 &{} \text{ else. } \end{array} \right. \end{aligned}$$

We choose \(H_1=0\), \(H_2=0\), and \(H_3\) will be specified below.

The goal is to show that we can choose \({\textbf{d}}^{\mathcal {V}}\) and \(H_3\) so that the Schur complement of the (1, 1) block of \(B+H+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}})\), which is,

$$\begin{aligned} S:=A-K(D+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}^\mathcal {E}))^{-1}K^{\top }+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}^{\mathcal {V}})+H_3, \end{aligned}$$

is positive semidefinite and of rank 3. We write this as \(S=S'+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}^\mathcal {V})+H_3\) where

$$\begin{aligned} S':=A-K(D+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}^\mathcal {E}))^{-1}K^{\top }. \end{aligned}$$

We begin by considering

$$\begin{aligned} S'':=A-K^\textrm{lg}{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}^\mathcal {E})^{-1}(K^\textrm{lg})^{\top }, \end{aligned}$$

and below we apply Lemma 3.2 to bound \(\Vert S'-S''\Vert \).

Consider some entry \((k,l)\) of A corresponding to nonedge \(\{i,j\}\) such that i and j have different colors. The entry \((k,l)\) of A, which equals 2, in turn corresponds to a column u of K. Entry \((k,l)\) of the Schur complement update \(K^\textrm{lg}{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}^\mathcal {E})^{-1}(K^\textrm{lg})^{\top }\) is determined by column u of \(K^\textrm{lg}\) because no other column of \(K^\textrm{lg}\) has nonzero entries in both positions k and l. In this case, \(K^\textrm{lg}(k,u)d^\mathcal {E}(u)^{-1}K^\textrm{lg}(l,u)= s(s^2/2)^{-1}s=2\), hence \(S''(k,l)=2-2=0\). On the other hand, if i and j have the same color, then \(K^\textrm{lg}(k,u)d^\mathcal {E}(u)^{-1}K^\textrm{lg}(l,u)= s(s^2)^{-1}s=1\) and hence \(S''(k,l)=2-1=1\). Therefore, all off-diagonal entries of \(S''\) corresponding to nonedges in the same color class are 1’s, while those corresponding to different color classes are 0’s. Entries corresponding to edges are 0’s by construction and are not updated by \(K^\textrm{lg}{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}^\mathcal {E})^{-1}(K^\textrm{lg})^{\top }\). Finally, entries corresponding to the same vertex but different representatives (i.e., off-diagonal entries of the \(3\times 3\) principal diagonal submatrices) are also 1’s by construction and are not updated. Thus, the off-diagonal entries of \(S''\) consist of three disjoint blocks of all 1’s.
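The two entry computations, and the reason the block pattern matters, can be illustrated numerically. In the sketch below the value of `s` and the block sizes are hypothetical, and the diagonal is set to 1, which is what the end of this subsection arranges through the choice of \({\textbf{d}}^\mathcal {V}\).

```python
import numpy as np

s = 50.0                                    # hypothetical 'large' parameter
update_diff = s * (s**2 / 2) ** -1 * s      # endpoints with different colors
update_same = s * (s**2) ** -1 * s          # endpoints with the same color
entry_diff = 2 - update_diff                # resulting entry of S''
entry_same = 2 - update_same

# Three disjoint all-ones blocks (hypothetical sizes 2, 3, 4), unit diagonal.
sizes = [2, 3, 4]
M = np.zeros((sum(sizes), sum(sizes)))
start = 0
for size in sizes:
    M[start:start + size, start:start + size] = 1.0
    start += size

rank_M = np.linalg.matrix_rank(M)           # one rank-one block per color class
min_eig = np.linalg.eigvalsh(M).min()       # nonnegative up to rounding
```

Each all-ones block is a rank-one positive semidefinite matrix, so a matrix with three such disjoint blocks is positive semidefinite of rank 3.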

Next, Lemma 3.2 states that

$$\begin{aligned} \Vert S''-S'\Vert _F&\le \frac{1}{2} \Vert {{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}^\mathcal {E})^{-1}\Vert _F\cdot \Vert K^\textrm{lg}\Vert _F \cdot \nonumber \\&\quad \left( 9 \Vert K^\textrm{lg}\Vert _F \cdot \Vert {{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}^\mathcal {E})^{-1}\Vert _F \cdot \Vert D\Vert _F + 5\Vert K^\textrm{sm}\Vert _F\right) . \end{aligned}$$
(15)

Note that the two hypotheses of the lemma can be confirmed using the formula (17) below and the assumption that s is sufficiently large. We have the following straightforward upper bounds based on the construction of \(K,{\textbf{d}}^\mathcal {E},D\) and the fact that \(\Vert A\Vert _F\le \Vert A\Vert _\textrm{max}\cdot \sqrt{\textrm{nnz}(A)}\):

$$\begin{aligned} \Vert K^\textrm{lg}\Vert _F&\le \sqrt{18{\bar{m}}}s, \\ \Vert K^\textrm{sm}\Vert _F&\le \sqrt{27{\bar{m}}n}s\delta , \\ \Vert {{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}^\mathcal {E})^{-1}\Vert _F&\le 18{\bar{m}}/s^2, \\ \Vert D\Vert _F&\le 6{\bar{m}}{\hat{p}} \epsilon . \end{aligned}$$

Substituting these bounds in (15) yields

$$\begin{aligned} \Vert S'-S''\Vert _F&\le 27\sqrt{2} {{{\bar{m}}}}^{3/2}s^{-1}\left( 15\sqrt{3{\bar{m}}n}s\delta +2916 \sqrt{2}{\bar{m}}^{5/2}{\hat{p}} s^{-1} \epsilon \right) \nonumber \\&\quad = 3^4 \cdot 5 \sqrt{6}{{{\bar{m}}}}^2n^{1/2}\delta + 2^3 \cdot 3^9 \cdot {\bar{m}}^{4}{\hat{p}}\epsilon /s^2. \end{aligned}$$
(16)

We can ensure both terms are at most \(\epsilon /2\) by defining

$$\begin{aligned} \delta :=\frac{\epsilon }{10 \cdot 3^4 \cdot \sqrt{6} {{{\bar{m}}}}^2 n^{1/2}}, \end{aligned}$$
(17)

and choosing s sufficiently large to upper bound the second term of (16) also by \(\epsilon /2\). Since \(\Vert S'-S''\Vert _F\le \epsilon \), we define \(H_3\) to equal \(S''-S'\) in off-diagonal entries, and equal to 0 on the diagonal. In this way, \(S'+H_3\) is a matrix whose off-diagonal entries are 0’s and 1’s, and the 1’s are arranged in three disjoint square blocks. Observe that the diagonal entries of \(S'+H_3\) are all negative since the diagonal entries of A and \(H_3\) are 0’s, while the diagonal entries of \(-K(D+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}^\mathcal {E}))^{-1}K^{\top }\) are negative. Therefore, there is some positive vector \({\textbf{d}}^\mathcal {V}\) such that the diagonal entries of \(S'+H_3+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}^\mathcal {V})\) are all 1’s. This matrix is positive semidefinite, its rank is 3, and it is the Schur complement after eliminating the (1, 1) block of \(B+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}})+H\).
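The constants in (16) and the choice of \(\delta \) in (17) can be double-checked with a few lines of arithmetic. The sample values of \({\bar{m}}\), n, \({\hat{p}}\), and \(\epsilon \) below are hypothetical, and `s_min` covers only the lower bound on s coming from the second term of (16); the appendix imposes further lower bounds on s elsewhere.

```python
import math

m_bar, n, p_hat, eps = 5, 4, 2.0, 0.01      # hypothetical sample values

# Constants in (16): prefactor 27*sqrt(2) times the two inner coefficients.
coeff_delta = 27 * math.sqrt(2) * 15 * math.sqrt(3)   # 3^4 * 5 * sqrt(6)
coeff_eps = 27 * math.sqrt(2) * 2916 * math.sqrt(2)   # 2^3 * 3^9

# delta from (17) makes the first term of (16) equal to eps / 2.
delta = eps / (10 * 3**4 * math.sqrt(6) * m_bar**2 * math.sqrt(n))
first_term = 3**4 * 5 * math.sqrt(6) * m_bar**2 * math.sqrt(n) * delta

# Smallest s for which the second term of (16) is at most eps / 2.
s_min = math.sqrt(2 * 2**3 * 3**9 * m_bar**4 * p_hat)
second_term = 2**3 * 3**9 * m_bar**4 * p_hat * eps / s_min**2
```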

A.6 Rank when \(\mathcal {G}\) is not 3-Colorable

The hypothesis of this subsection is that \(\mathcal {G}\) is not 3-colorable, and therefore, by the robustness described in A.2, neither is any induced subgraph with at least \(|\mathcal {V}(\mathcal {G})|-c\) nodes, where c is to be determined. We prove that the rank remains at least \(9{{\bar{m}}}+4\) after adding to B any perturbation of Frobenius norm at most \(\hat{p}\cdot \epsilon \) and any diagonal matrix that together yield a positive semidefinite matrix. For this section, let H be a perturbation to B, and let \({\textbf{d}}\) be a vector, and suppose that \(\Vert H\Vert _F\le \hat{p}\epsilon \) and \(B+H+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}})\succeq 0\). We will argue that \({{\,\mathrm{{\text {rank}}}\,}}(B+H+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}))\ge 9{{\bar{m}}} + 4\).

As in the previous section, partition H and \({\textbf{d}}\) conformally with B. The first aim of this subsection is to show that the (1, 1) block of \(B+H+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}})\), namely \(D+H_1+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}^\mathcal {E})\), is invertible. Once this is proved, the main task of this subsection is to show that the rank of the Schur complement, that is, the rank of

$$\begin{aligned} S:=A+H_3-(K+H_2)(D+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}^\mathcal {E})+H_1)^{-1}(K+H_2)^{\top }+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}^\mathcal {V}) \end{aligned}$$
(18)

is at least 4.

We first observe that since \(\Vert H_1\Vert _F\le {\hat{p}}\epsilon \), all off-diagonal entries of \(D+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}^\mathcal {E})+H_1\) are negative because \(H_1\) is not large enough to cancel the negative entries of D, each of which equals \(-2{\hat{p}}\epsilon \). This means that \({{\,\mathrm{{\text {rank}}}\,}}(D+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}^\mathcal {E})+H_1)\ge 9{\bar{m}}-1\) by Lemma A.1. In fact, we claim more strongly that \({{\,\mathrm{{\text {rank}}}\,}}(D+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}^\mathcal {E})+H_1)= 9{\bar{m}}\). Consider summing all columns of \(K^{\top }\) to yield a \(9{\bar{m}}\)-length vector \(K^{\top }{\textbf{e}}\). Since each row of \(K^{\top }\) has exactly two entries equal to s and \(3n-2\) entries equal to \(s\delta \), each entry of \(K^{\top }{\textbf{e}}\) is exactly \((2+(3n-2)\delta )s\). Thus, each entry of \(K^{\top }{\textbf{e}}\) is at least 2s. Then, each entry of \((K+H_2)^{\top }{\textbf{e}}\) is at least 1.9s since \(\Vert H_2\Vert _F\le {\hat{p}}\epsilon \), and \({\hat{p}}\epsilon \), with \(\epsilon \le 1\), is much smaller than s (i.e., choose s large enough to ensure this). Since \(B+H+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}})\) is positive semidefinite, it follows from Lemma 2.1 that \((K+H_2)^{\top }{\textbf{e}}\in {{\,\mathrm{{\text {Range}}}\,}}(D+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}^\mathcal {E})+H_1)\). Since this vector \((K+H_2)^{\top }{\textbf{e}}\) has all positive entries, and by Lemma A.1 no all-positive vector can lie in the range when the rank is \(9{\bar{m}}-1\), it follows that \({{\,\mathrm{{\text {rank}}}\,}}(D+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}^\mathcal {E})+H_1)=9{\bar{m}}\).

Thus, for the rest of this analysis, S given by (18) is well-defined, and we define \(\Pi :=(D+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}^\mathcal {E})+H_1)^{-1}\), which is a factor appearing in the formula for S. By Lemma A.1, \(\Pi >0\). As noted in Sect. 3, the hurdle in the analysis of \(B+H+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}})\) is that there is no prior lower bound on the entries of \({\textbf{d}}^\mathcal {E}\), in other words, no prior upper bound on \(\Vert \Pi \Vert _F\). Therefore, we take cases depending on the sizes of diagonal entries of \(\Pi \). (Recall that the largest entry of a positive-semidefinite matrix always appears on the diagonal.)

Case 1: For at least one \(u^*\in \{1,\ldots ,9{\bar{m}}\}\), \(\Pi (u^*,u^*)\ge \pi _1\). Here,

$$\begin{aligned} \pi _1= \frac{2{\hat{p}}\epsilon }{s^2\delta ^2}. \end{aligned}$$
(19)

Let us denote the entry of A corresponding to \(u^*\) as \(A(k,l)\). Therefore, \(A(k,l)=2\), and \((k,l)\) corresponds to a nonedge \(\{i,j\}\) of \(\mathcal {G}\). We claim that the entries in columns k and l of \(A-(K+H_2)\Pi (K+H_2)^{\top }\) are negative. The update \((K+H_2)\Pi (K+H_2)^{\top }\) may be written as a sum of rank-one matrices:

$$\begin{aligned} (K+H_2)\Pi (K+H_2)^{\top }=\sum _{u=1}^{9{{\bar{m}}}}\sum _{u'=1}^{9{{\bar{m}}}}(K+H_2)(:,u)\Pi (u,u')(K+H_2)(:,u')^{\top }. \end{aligned}$$

Consider one term, namely \((K+H_2)(:,u^*)\Pi (u^*,u^*)(K+H_2)(:,u^*)^{\top }\). Note that \((K+H_2)({:,u^*})\) has entries at least 0.9s in positions k and l and entries at least \(0.9s\delta \) in the remaining positions because K has entries s and \(s\delta \) in these positions, and \(H_2\) is much smaller assuming s is chosen sufficiently large. Therefore, every entry in row k of \((K+H_2)({:,u^*})\Pi (u^*,u^*)(K+H_2)({:,u^*})^{\top }\) is at least \(0.9s\cdot \pi _1\cdot 0.9s\delta =1.62s^2\delta ({\hat{p}}\epsilon )/(s^2\delta ^2)=1.62{\hat{p}}\epsilon /\delta >3\), by the assumption that \({\hat{p}}\ge 1\) and because \(\epsilon /\delta >10\) by (17). This is just one term in the update, but since all \((9{{\bar{m}}})^2\) terms are positive matrices, the remaining terms can only further increase the overall product. Thus, since \(A(k,l')\le 2\) for all \(l'=1,\ldots ,3n\), and the \((k,l')\) entry of the update is at least 3, we conclude that the \((k,l')\) entry of \(A-(K+H_2)\Pi (K+H_2)^{\top }\) is bounded above by \(-1\). This same analysis applies to the \((l,l')\) entry for all \(l'=1,\ldots ,3n\).

Next, consider an entry \(A(k',l')\) corresponding to an edge \(\{i',j'\}\) in \(\mathcal {G}\). By our construction, \(A(k',l')=0\). Observe that entry \((k',l')\) of \((K+H_2)\Pi (K+H_2)^{\top }\) is at least \((0.9s\delta )^2\pi _1=1.62s^2\delta ^2({\hat{p}}\epsilon )/(s^2\delta ^2)=1.62{\hat{p}}\epsilon \). This is seen by considering the same term used in the previous paragraph, namely entry \((k',l')\) of \((K+H_2)(:,u^*)\Pi (u^*,u^*)(K+H_2)(:,u^*)^{\top }\), noting that the other terms can only increase the update. Thus, the \((k',l')\) entry of \(A -(K+H_2)\Pi (K+H_2)^{\top }\) is bounded above by \(-1.62{\hat{p}}\epsilon \).

Since the off-diagonal entries of S (given by (18)) equal \(A-(K+H_2)\Pi (K+H_2)^{\top }+H_3\) and \(\Vert H_3\Vert _F\le {\hat{p}}\epsilon \), we conclude that all entries of S analyzed in the previous few paragraphs are negative. In other words, all entries of S in columns k and l and all off-diagonal entries corresponding to edges of \(\mathcal {G}\) are negative.

Find a 3-cycle in \(\mathcal {G}\) that does not include either i or j, say \(i_1,i_2,i_3\in \mathcal {V}(\mathcal {G})\). Note that \(\mathcal {G}\) has many 3-cycles if we first apply the robustifying transformation described in A.2, so certainly such a cycle can be found. Each of these three nodes corresponds in turn to three rows of A; let \(l_1,l_2,l_3\) be three such rows of A (choose among the three representatives of each of \(i_1,i_2,i_3\) arbitrarily). Now consider the \(5\times 5\) principal submatrix of S indexed by \(k,l,l_1,l_2,l_3\). All the off-diagonal entries in this matrix are negative as argued in the previous paragraph. Therefore, the rank of this submatrix is at least 4 by Lemma A.1. This concludes the analysis of Case 1.

\(\underline{\hbox {Case 2, } \Pi (u,u')<\pi _1 \hbox { for all } u,u'=1,\ldots ,9{\bar{m}}.}\) As in Case 1, \(\pi _1\) is given by (19).

The assumption of this case allows us to derive a stronger upper bound than \(\pi _1\) on the off-diagonal entries of \(\Pi \). We already know that \(\Pi \) is positive definite and positive. Recall from the definition of \(\Pi \) that \(\Pi (D+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}^\mathcal {E})+H_1)=I\). Focusing on a diagonal entry of this product, we know that for each \(u=1,\ldots ,9{{\bar{m}}}\),

$$\begin{aligned} \sum _{u'=1}^{9{{\bar{m}}}} \Pi (u,u')(D(u,u')+H_1(u,u')) + \Pi (u,u)d(u) = 1. \end{aligned}$$

Since \(\Pi (u,u')\le \max (\Pi (u,u),\Pi (u',u'))<\pi _1=2{\hat{p}}\epsilon /(s^2\delta ^2)\), \(|D(u,u')|\le 2{\hat{p}}\epsilon \), and \(|H_1(u,u')|\le {\hat{p}}\epsilon \), we conclude that each term in the above summation is at most \(6{\hat{p}}^2\epsilon ^2/(s^2\delta ^2)\) in magnitude. By choosing s sufficiently large, we are assured that every term in the summation is at most \(1/(90{{\bar{m}}})\) in magnitude. Then, we conclude that the first summation is at most 1/10 in magnitude, and therefore, \(\Pi (u,u)d(u)\ge 0.9\) so \(d(u)\ge 0.9/\pi _1\) for each \(u=1,\ldots ,9{{\bar{m}}}\).

Now consider an off-diagonal entry of the product \(\Pi (D+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}^\mathcal {E})+H_1)\), say entry \((u,u')\) to obtain

$$\begin{aligned} \sum _{u''=1}^{9{{\bar{m}}}} \Pi (u,u'')(D(u,u'')+H_1(u,u''))+\Pi (u,u')d(u')=0. \end{aligned}$$

The magnitude of each term in the summation is at most \(6{\hat{p}}^2\epsilon ^2/(s^2\delta ^2)\) as in the last paragraph. By choosing s sufficiently large, we can ensure that this is at most \(\delta ^2/(90{{\bar{m}}})\), so the summation has magnitude at most \(\delta ^2/10\). Also, \(d(u')\ge 0.9/\pi _1\) as shown in the last paragraph. Therefore,

$$\begin{aligned} \Pi (u,u') \le \frac{\delta ^2}{10\cdot 0.9/\pi _1}, \end{aligned}$$

for each \(u,u'=1,\ldots ,9{{\bar{m}}}\) with \(u\ne u'\), which implies that \(\Pi (u,u')\le \delta ^2\pi _1/9\), which is stronger than the upper bound of \(\pi _1\) that holds by the assumption of this case.
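Spelled out, the step from the off-diagonal identity to this bound can be written as follows. This is only a worked restatement of the quantities already introduced above; it uses \(\Pi (u,u')>0\) and \(d(u')\ge 0.9/\pi _1>0\):

$$\begin{aligned} \Pi (u,u')\,d(u')&=-\sum _{u''=1}^{9{{\bar{m}}}}\Pi (u,u'')(D(u,u'')+H_1(u,u''))\le \frac{\delta ^2}{10},\\ \Pi (u,u')&\le \frac{\delta ^2}{10\,d(u')}\le \frac{\delta ^2}{10}\cdot \frac{\pi _1}{0.9}=\frac{\delta ^2\pi _1}{9}. \end{aligned}$$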

Next, let us define two ranges of \(\Pi (u,u)\) for \(u=1,\ldots ,9{{\bar{m}}}\).

  • Small entries satisfy \(\Pi (u,u)\in (0,\pi _0]\).

  • Large entries satisfy \(\Pi (u,u)\in (\pi _0,\pi _1)\).

Here,

$$\begin{aligned} \pi _0=\frac{\epsilon _0}{27 {{\bar{m}}} n s^2\delta }. \end{aligned}$$
(20)

In this formula, \(\epsilon _0\) is the universal constant introduced in Theorem 3.4. Note that the case that \(\Pi (u,u)>\pi _1\) (which could be called “very large”) was already handled by Case 1.

We now form a graph \(\mathcal {G}_\textrm{lg}\) defined as follows. The nodes \(\mathcal {V}(\mathcal {G}_\textrm{lg})\) of the graph are the integers \(1,\ldots ,3n\), in correspondence with the rows/columns of A. Given a node in \(\mathcal {G}\), we will say that it “owns” the three nodes of \(\mathcal {G}_\textrm{lg}\) associated with that original node. Let k, l be a pair of nodes of \(\mathcal {G}_\textrm{lg}\), \(k\ne l\), that correspond to a nonedge of \(\mathcal {G}\). In this case, there is an index \(u\in \{1,\ldots ,9{{\bar{m}}}\}\) that corresponds to the pair \((k,l)\). Include the edge \(\{k,l\}\) in \(\mathcal {E}( \mathcal {G}_\textrm{lg})\) whenever \(\Pi (u,u)\) is large according to the above dichotomy.

We now take two subcases of Case 2.

\(\underline{\hbox {Case 2a, the edges of } {\mathcal {G}}_{\hbox {lg}} \hbox { can be covered by } c\hbox { nodes of } {\mathcal {G}}_{\hbox {lg}}.}\)

Here, c is the constant that makes non-colorability of \(\mathcal {G}\) robust; the precise value of c will be selected in Case 2b below. Note that this case includes the subcase that \(\mathcal {G}_\textrm{lg}\) has no edges, i.e., there are no large diagonal entries of \(\Pi \).
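To make the split between Case 2a and Case 2b concrete, here is a minimal Python sketch; it is an illustration only and not part of the proof. The names `pi_diag` (the diagonal entries \(\Pi (u,u)\)) and `pair_of` (a map from each index u to its pair \((k,l)\)) are hypothetical, and the vertex-cover test is a brute-force search intended only for tiny examples.

```python
from itertools import combinations

def large_edges(pi_diag, pair_of, pi0, pi1):
    """Edges {k, l} of G_lg: pairs whose diagonal entry Pi(u,u) lies in (pi0, pi1)."""
    return {frozenset(pair_of[u]) for u, val in enumerate(pi_diag)
            if u in pair_of and pi0 < val < pi1}

def coverable(edges, c):
    """True if some set of at most c nodes touches every edge (brute force)."""
    nodes = sorted({v for e in edges for v in e})
    for size in range(min(c, len(nodes)) + 1):
        for cover in combinations(nodes, size):
            chosen = set(cover)
            if all(e & chosen for e in edges):
                return True
    return False

# Hypothetical toy data: three indices u, two of them "large".
edges = large_edges([0.5, 0.05, 0.7], {0: (0, 1), 1: (1, 2), 2: (2, 3)}, 0.1, 1.0)
case = "2a" if coverable(edges, 1) else "2b"
```

In this toy instance the two large entries give disjoint edges, so one node cannot cover them and the sketch reports Case 2b for c = 1. An empty edge set is always coverable, matching the remark that Case 2a includes the subcase with no large diagonal entries.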

Let this vertex cover be denoted \(\Sigma _c\subseteq \mathcal {V}(\mathcal {G}_\textrm{lg})\). Let \(\Sigma \subseteq \mathcal {V}(\mathcal {G})\) be the at most c nodes of \(\mathcal {G}\) that collectively own \(\Sigma _c\), and let \(\Sigma _\textrm{lg}\) be the at most 3c nodes of \(\mathcal {G}_\textrm{lg}\) owned by \(\Sigma \). By construction \(\Sigma _c\subseteq \Sigma _\textrm{lg}\).

Let \({{\bar{\Sigma }}}=\mathcal {V}(\mathcal {G}){\setminus } \Sigma \) and let \({{\bar{\Sigma }}}_\textrm{lg}\) be the \(3|{{\bar{\Sigma }}}|\) nodes of \(\mathcal {G}_\textrm{lg}\) (equivalently, rows of A) owned by \({{\bar{\Sigma }}}\). Since \(\mathcal {G}\) is robustly not 3-colorable with parameter c, the induced subgraph \(\mathcal {G}[{{\bar{\Sigma }}}]\) is not 3-colorable. Recall that S stands for the Schur complement of the (1, 1) block of \(B+H+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}})\). We will argue that \(S({{\bar{\Sigma }}}_\textrm{lg},{{\bar{\Sigma }}}_\textrm{lg})\) may be written as \(A+L+H'\), where \(A,L,H'\) satisfy conditions 2(a)–(d) of Theorem 3.4. Since \(\mathcal {G}[{{\bar{\Sigma }}}]\) is not 3-colorable, it follows from the theorem that \({{\,\mathrm{{\text {rank}}}\,}}(S({{\bar{\Sigma }}}_\textrm{lg}, {{\bar{\Sigma }}}_\textrm{lg}))\ge 4\), which in turn implies \({{\,\mathrm{{\text {rank}}}\,}}(S)\ge 4\), thus concluding Case 2a. The matrix A in this formula for \(S({{\bar{\Sigma }}}_\textrm{lg},{{\bar{\Sigma }}}_\textrm{lg})\) is the same as the matrix A defined earlier in the appendix, all of whose entries are 0’s, 1’s, or 2’s.

Recall from the notation of the theorem that L may have arbitrary entries in positions corresponding to nonedges of \(\mathcal {G}[{{\bar{\Sigma }}}]\) and in diagonal positions. Therefore, these entries of \(S({{\bar{\Sigma }}}_\textrm{lg},{{\bar{\Sigma }}}_\textrm{lg})\) do not need any further analysis in our decomposition of \(S({{\bar{\Sigma }}}_\textrm{lg},{{\bar{\Sigma }}}_\textrm{lg})\) as the sum \(A+L+H'\). The entries of S that need consideration are therefore the entries corresponding to edges (which are 0’s in A) and the six off-diagonal entries of the \(3\times 3\) diagonal blocks (which are 1’s in A).

Therefore, we now proceed to bound the entries of the update to A

$$\begin{aligned} H':=H_3 - (K+H_2)\Pi (K+H_2)^{\top } \end{aligned}$$

that correspond to edges of \(\mathcal {G}[{{\bar{\Sigma }}}]\) and to off-diagonal entries of diagonal blocks of \(S({{\bar{\Sigma }}}_\textrm{lg},{{\bar{\Sigma }}}_\textrm{lg})\). Let us call these the “distinguished” entries of \(H'\). As above, let us write the second term of \(H'\) as a sum of \((9{{\bar{m}}})^2\) rank-one matrices:

$$\begin{aligned} (K+H_2)\Pi (K+H_2)^{\top }=\sum _{u=1}^{9{{\bar{m}}}}\sum _{u'=1}^{9{{\bar{m}}}} (K+H_2)(:,u)\Pi (u,u')(K+H_2)(:,u')^{\top } \end{aligned}$$

and classify the terms into three categories: (a) terms such that \(u=u'\) and \(\Pi (u,u)\) is large, (b) terms such that \(u=u'\) and \(\Pi (u,u')\) is small, and (c) terms such that \(u\ne u'\).

For u in category (a), a large diagonal entry of \(\Pi \), we know \(\Pi (u,u)\le \pi _1\) by the hypothesis of Case 2. Furthermore, we know that entries of \(K(:,u)K(:,u)^{\top }\) in distinguished positions are \((s\delta )^2\) (that is, small entries of K squared) because the two large entries of column K( : , u), say in positions k and l, are such that both k and l are excluded from \({{\bar{\Sigma }}}_\textrm{lg}\) by construction of \({{\bar{\Sigma }}}_\textrm{lg}\). The perturbation \(H_2\) raises this to at most \(1.1(s\delta )^2\) (since \(\Vert H_2\Vert _F\le {\hat{p}}\epsilon \), and we can make s sufficiently large). Therefore, if \((k,l)\) is a distinguished entry and u is in category (a), we have shown, using \(\pi _1=2{\hat{p}}\epsilon /(s^2\delta ^2)\), that

$$\begin{aligned} ((K+H_2)(:,u)\Pi (u,u)(K+H_2)(:,u)^{\top })(k,l)&\le 1.1(s\delta )^2\pi _1 \\&\quad =2.2{\hat{p}}\epsilon . \end{aligned}$$

Next, if u is in category (b), a small diagonal entry of \(\Pi \), we know that \(\Pi (u,u)\le \pi _0\). Furthermore, we know that entries of \(K(:,u)K(:,u)^{\top }\) in distinguished positions are either \((s\delta )^2\) or \(s^2\delta \), i.e., a product of two small entries of K or of a small and a large entry. This is because the products of two large entries appear only in positions \((k,l)\) and \((l,k)\), where k and l are the indices of the rows associated with a nonedge by construction of K. However, entries corresponding to nonedges are not distinguished in this analysis. Therefore, an upper bound on this entry is \(s^2\delta \), and an upper bound on the corresponding entry of \((K+H_2)(:,u)(K+H_2)(:,u)^{\top }\) is \(1.1s^2\delta \). Therefore, if \((k,l)\) is a distinguished entry and u is in category (b), we have shown that

$$\begin{aligned}&((K+H_2)(:,u)\Pi (u,u)(K+H_2)(:,u)^{\top })(k,l) \le 1.1s^2\delta \pi _0 \\&\quad =\frac{0.1\epsilon _0}{27{{\bar{m}}} n}. \end{aligned}$$

Finally, if \((u,u')\) is in category (c), i.e., an off-diagonal entry of \(\Pi \), then entries of \(K({:,u})K({:,u'})^{\top }\) can be as large as \(s^2\), and \(\Pi (u,u')\le \pi _1\delta ^2/9\) by the analysis at the beginning of Case 2, so for any entry \((k,l)\) of the product (distinguished or not),

$$\begin{aligned}&((K+H_2)(:,u)\Pi (u,u')(K+H_2)(:,u')^{\top })(k,l) \le 1.1s^2\cdot \pi _1\delta ^2/9 \\&\quad \le 0.3{\hat{p}} \epsilon . \end{aligned}$$

In all three cases, we have an upper bound on the contribution of one rank-one term to a distinguished entry of \(H'\). The contributions from category (b) add up to at most \(0.1\epsilon _0/(3n)\) since there are at most \(9{\bar{m}}\) category (b) terms. The sum of terms from categories (a) and (c) is at most \(2.2\cdot (9{{\bar{m}}})^2{\hat{p}}\epsilon \) since there are at most \(9{{\bar{m}}}\) category-(a) terms and at most \((9{{\bar{m}}})^2-9{{\bar{m}}}\) category-(c) terms. We can place an upper bound of \(0.9\epsilon _0/(3n)\) on the sum of terms of categories (a) and (c) if we impose the assumption:

$$\begin{aligned} \epsilon \le \frac{\epsilon _0}{600{{\bar{m}}}^2 n {\hat{p}}}. \end{aligned}$$
(21)

With this assumption in place, we can now claim that the sum of all contributions to a distinguished entry of \(H'\) is at most \(\epsilon _0/(3n)\) in magnitude. The non-distinguished entries of \(H'\) may be set to 0 since these entries correspond to nonedges or diagonal positions (which are covered by L or \({{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}^{\mathcal {V}})\)). Since the size of \(H'\) is \(3n\times 3n\), it follows that \(\Vert H'\Vert _F\le \epsilon _0\), and thus, all hypotheses 2(a)–(d) of Theorem 3.4 are satisfied except the rank assumption. Since the graph is not 3-colorable, this implies that \({{\,\mathrm{{\text {rank}}}\,}}(S({{\bar{\Sigma }}}_\textrm{lg},{{\bar{\Sigma }}}_\textrm{lg}))\ge 4\). This concludes the analysis of Case 2a.
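For the reader checking constants, the two numerical steps used in Case 2a can be spelled out as follows. This is a worked restatement under assumption (21), not an additional hypothesis:

$$\begin{aligned} 2.2\cdot (9{{\bar{m}}})^2{\hat{p}}\epsilon&\le 178.2\,{{\bar{m}}}^2{\hat{p}}\cdot \frac{\epsilon _0}{600{{\bar{m}}}^2 n{\hat{p}}}=\frac{0.297\,\epsilon _0}{n}\le \frac{0.9\,\epsilon _0}{3n},\\ \Vert H'\Vert _F&\le \sqrt{(3n)^2\left( \frac{\epsilon _0}{3n}\right) ^2}=\epsilon _0. \end{aligned}$$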

\(\underline{\hbox {Case 2b, more than } c \hbox { nodes of } {\mathcal {G}}_{\hbox {lg}} \hbox { are required to cover the edges of } {\mathcal {G}}_{\hbox {lg}}.}\)

We will now specify \(c=5\).

Let u be a large diagonal entry of \(\Pi \) corresponding to entry \((k,l)\) of A. This entry also corresponds to an edge of \(\mathcal {G}_\textrm{lg}\) and to a nonedge \(\{i,j\}\) of the original graph \(\mathcal {G}\). We have the following lower bound:

$$\begin{aligned} ((K+H_2)\Pi (K+H_2)^{\top })(k,l)&\ge 0.9s^2\pi _0\\&= \frac{\epsilon _0}{30{{\bar{m}}} n \delta }\\&\ge \frac{\epsilon _0 \cdot 27 \sqrt{6} {{\bar{m}}}}{ n^{1/2} \epsilon } \\&\ge 16200 \sqrt{6}{{\bar{m}}}^3 n^{1/2}{\hat{p}}\\&> 3.96\cdot 10^4 \cdot {{\bar{m}}}^3 n^{1/2}{\hat{p}}. \end{aligned}$$

The first line is obtained because \(K(k,u)=K(l,u)=s\) (i.e., large entries of K, whose product is \(s^2\)), \(H_2\) does not perturb this much assuming s is chosen sufficiently large, and \(\Pi (u,u)\ge \pi _0\) by assumption that u is a large diagonal entry. Contributions from the other entries of \(\Pi \) can only increase this since \(K+H_2\) and \(\Pi \) are both positive matrices. The second line follows from (20), the third from (17), and the fourth from (21). Thus,

$$\begin{aligned} S(k,l)&= A(k,l)+H_3(k,l)-((K+H_2)\Pi (K+H_2)^{\top })(k,l) \\&\quad< 2 + \epsilon - 3.96\cdot 10^4\\&\quad < -3.95 \cdot 10^4. \end{aligned}$$
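The numerical constants in the chain of inequalities above can be double-checked with a few lines of Python. This is only an arithmetic sanity check of the coefficients; the variable names are ours, and the step that invokes (17) is not re-derived here.

```python
from fractions import Fraction
import math

# 0.9 * s^2 * pi_0 with pi_0 = eps0 / (27 m n s^2 delta): the numeric factor is 0.9/27.
factor = Fraction(9, 10) / 27

# Substituting eps0/eps >= 600 m^2 n p (from (21)) into 27*sqrt(6)*m*eps0/(sqrt(n)*eps)
# multiplies the coefficient 27*sqrt(6) by 600.
coeff = 27 * 600
lower = coeff * math.sqrt(6)

print(factor, coeff, lower > 3.96e4)
```

The check confirms that the factor in the second line is 1/30, that the integer coefficient in the fourth line is 16200, and that \(16200\sqrt{6}\) exceeds \(3.96\cdot 10^4\).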

Let us rename S as \(S^{(0)}\) and consider performing the following operations for \(\mu =0,1,2,\ldots \)

  (1) Let \({\tilde{S}}^{(\mu )}\in \mathbb {S}^{3n-\mu }\) denote \(PS^{(\mu )}P^{\top }\), where P is a permutation matrix chosen so that the largest diagonal entry of \({\tilde{S}}^{(\mu )}\) is in the (1, 1) position.

  (2) Terminate if \({\tilde{S}}^{(\mu )}(1,1)=0\).

  (3) Else let \(S^{(\mu +1)}\in \mathbb {S}^{3n-\mu -1}\) be the Schur complement of the (1, 1) entry of \({\tilde{S}}^{(\mu )}\).
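The three steps above can be sketched in plain Python as follows. This is an illustrative sketch rather than code from the paper: the function name `schur_pivot_count` and the tolerance `tol` (used in place of the exact test for a zero pivot) are our assumptions, and instead of permuting the pivot to position (1, 1) the sketch eliminates the pivot row and column in place, which yields the same Schur complement up to ordering.

```python
def schur_pivot_count(S, tol=1e-12):
    """Count Schur-complement steps taken before the largest diagonal entry is ~0.

    S is a symmetric positive-semidefinite matrix given as a list of lists.
    """
    S = [row[:] for row in S]  # work on a copy
    steps = 0
    while S:
        n = len(S)
        # Step (1): locate the largest diagonal entry (the pivot).
        p = max(range(n), key=lambda i: S[i][i])
        # Step (2): terminate if the pivot is (numerically) zero.
        if S[p][p] <= tol:
            break
        # Step (3): Schur complement of the pivot entry.
        keep = [i for i in range(n) if i != p]
        S = [[S[i][j] - S[i][p] * S[p][j] / S[p][p] for j in keep] for i in keep]
        steps += 1
    return steps

# Rank-2 example: v1 v1^T + v2 v2^T with v1 = (1, 0, 1), v2 = (0, 1, 1).
example = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.0, 1.0, 2.0]]
count = schur_pivot_count(example)
```

For a positive-semidefinite input, each step removes exactly one from the rank, so the number of completed steps is a lower bound on (and, run to completion, equal to) the rank; in the example two steps are completed.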

The matrices produced by this iteration are all symmetric and positive semidefinite by the Schur complement lemmas. Recall the following fact: the entry of largest magnitude in a positive-semidefinite matrix must occur on the diagonal. Therefore, \({\tilde{S}}^{(\mu )}(1,1)\) is the entry with largest magnitude of \({\tilde{S}}^{(\mu )}\). This in turn means that the iteration will not terminate in step 2, provided that \(S^{(\mu )}\) has any nonzero entry.
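As a brief reminder of why the recalled fact holds (a standard argument, not specific to this paper): every \(2\times 2\) principal minor of a positive-semidefinite matrix M is nonnegative, so for \(i\ne j\),

$$\begin{aligned} M(i,i)M(j,j)-M(i,j)^2\ge 0\quad \Longrightarrow \quad |M(i,j)|\le \sqrt{M(i,i)M(j,j)}\le \max (M(i,i),M(j,j)). \end{aligned}$$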

We claim that we can perform these operations at least 4 times before termination in step 2. Assuming we prove this claim, this implies that \({{\,\mathrm{{\text {rank}}}\,}}(S)\ge 4\) by the Schur complement lemmas, and this would therefore conclude the analysis of Case 2b.

To prove that we can perform the above iteration at least 4 times, we establish the following claims by induction.

Claims:

  (1) The largest positive off-diagonal entry in \(S^{(\mu )}\) is at most \((2+\epsilon )^{\mu +1}\).

  (2) Let \(r_\mu :=(2+\epsilon )^{\mu +1} - 10^4\). Call the entries of \(S^{(\mu )}\) whose value is less than \(r_\mu \) the “big negative entries.” The big negative entries cannot be covered by fewer than \(c+1-\mu \) rows/columns of \(S^{(\mu )}\).

Once we prove these claims by induction, then the main result follows because \(r_\mu <0\) for \(\mu =0,1,2,3,4\) and \(c+1-\mu >0\) for \(\mu =0,1,2,3,4\), so the second claim implies that there is at least one nonzero entry in \(S^{(\mu )}\), and hence, step 2 of the above algorithm will not terminate.
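The two sign conditions used here are easy to tabulate. The sketch below is illustrative only: the helper name `thresholds` is ours, and the sample value `eps = 0.01` is an assumed stand-in for \(\epsilon \), which the text only constrains through (17) and (21).

```python
def thresholds(eps, c=5, steps=5):
    """Return (r_mu, c + 1 - mu) for mu = 0..steps-1, with r_mu = (2+eps)^(mu+1) - 10^4."""
    return [((2 + eps) ** (mu + 1) - 10**4, c + 1 - mu) for mu in range(steps)]

table = thresholds(0.01)  # eps = 0.01 is an assumed sample value
ok = all(r < 0 and cover > 0 for r, cover in table)
```

With \(c=5\), the covering bounds are 6, 5, 4, 3, 2 for \(\mu =0,\ldots ,4\), and \(r_\mu \) stays far below zero because \((2+\epsilon )^5\) is tiny compared with \(10^4\) for any small \(\epsilon \).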

The base of the induction is as follows. Recall that

$$\begin{aligned} S^{(0)}=S=A+H_3-(K+H_2)\Pi (K+H_2)^{\top }+{{\,\mathrm{{\text {Diag}}}\,}}({\textbf{d}}^\mathcal {V}). \end{aligned}$$

The maximum entry of A is 2, the maximum entry of \(H_3\) is \(\epsilon \), the third term is negative, and the fourth term does not affect off-diagonal entries. Thus, the first induction claim holds for \(\mu =0\). We argued earlier that \(S^{(0)}\) contains an entry smaller than \(-10^4\) in position \((k,l)\) for every \((k,l)\) corresponding to a large diagonal entry of \(\Pi \), i.e., edges of \(\mathcal {G}_\textrm{lg}\). The hypothesis for Case 2b is that these entries cannot be covered by fewer than \(c+1\) rows/columns of \(S^{(0)}\), i.e., nodes of \(\mathcal {G}_\textrm{lg}\). Thus, the second induction claim also holds for \(\mu =0\).

Now assume the hypothesis holds for \(\mu \); we show that it holds for \(\mu +1\). Observe that \(S^{(\mu +1)}={\tilde{S}}^{(\mu )}(2:3n-\mu ,2:3n-\mu )-{\textbf{w}}{\textbf{w}}^{\top }/{\tilde{S}}^{(\mu )}(1,1)\), where \({\textbf{w}}={\tilde{S}}^{(\mu )}(2:3n-\mu ,1)\). Consider an off-diagonal entry of \(S^{(\mu +1)}\), say entry \((i,j)\) with \(i\ne j\). We have: \(S^{(\mu +1)}(i,j)={\tilde{S}}^{(\mu )}(i+1,j+1)-w_iw_j/{\tilde{S}}^{(\mu )}(1,1)\). There are two cases: either \(w_i w_j\ge 0\) or \(w_iw_j<0\). In the first case, \(S^{(\mu +1)}(i,j)\le {\tilde{S}}^{(\mu )}(i+1,j+1)\). Since the bounds appearing in the induction claims are both upper bounds on the entries of \(S^{(\mu )}\) and these upper bounds increase with \(\mu \), a decrease in an entry can only further sharpen the induction claim.

On the other hand, if \(w_iw_j<0\), say, without loss of generality, that \(w_i<0\) and \(w_j>0\), then \(-{\tilde{S}}^{(\mu )}(1,1)\le w_i<0\) since, as noted earlier, the largest magnitude entry of \({\tilde{S}}^{(\mu )}\) appears in the (1, 1) position. Therefore, \(|w_iw_j/{\tilde{S}}^{(\mu )}(1,1)|\le |w_j|\le (2+\epsilon )^{\mu +1}\), the second inequality arising from induction claim 1 since \(w_j>0\) is an off-diagonal entry of \({\tilde{S}}^{(\mu )}\). Thus, \(S^{(\mu +1)}(i,j)\le {\tilde{S}}^{(\mu )}(i+1,j+1)+(2+\epsilon )^{\mu +1}\). Combined with the induction hypothesis, this means that the largest positive value in \(S^{(\mu +1)}\) in an off-diagonal position is at most \(2(2+\epsilon )^{\mu +1}\le (2+\epsilon )^{\mu +2}\), thus establishing the first induction claim. Similarly, since \(r_{\mu +1}=r_\mu +(1+\epsilon )(2+\epsilon )^{\mu +1}\ge r_\mu +(2+\epsilon )^{\mu +1}\), the big negative entries of \({\tilde{S}}^{(\mu )}\) remain big and negative in \(S^{(\mu +1)}\).

To finish the induction, we also need to show that the covering number of the big negative entries of \(S^{(\mu +1)}\) is at least \(c+1-(\mu +1)\). This follows because the preceding paragraphs show that all big negative entries of \({\tilde{S}}^{(\mu )}\) persist in \(S^{(\mu +1)}\) except for those lost in the discarded first row and column of \({\tilde{S}}^{(\mu )}\). Therefore, since the big negative entries of \({\tilde{S}}^{(\mu )}\) needed at least \(c+1-\mu \) rows/columns to be covered, those of \(S^{(\mu +1)}\) need at least \(c+1-\mu -1\). This concludes the induction and hence the analysis of Case 2b.

Cite this article

Tunçel, L., Vavasis, S.A. & Xu, J. Computational Complexity of Decomposing a Symmetric Matrix as a Sum of Positive Semidefinite and Diagonal Matrices. Found Comput Math (2023). https://doi.org/10.1007/s10208-023-09637-4
