Lower bounds on nonnegative rank via nonnegative nuclear norms

  • Full Length Paper
  • Series B
Mathematical Programming

Abstract

The nonnegative rank of an entrywise nonnegative matrix \(A \in \mathbb {R}^{m \times n}_+\) is the smallest integer \(r\) such that \(A\) can be written as \(A=UV\) where \(U \in \mathbb {R}^{m \times r}_+\) and \(V \in \mathbb {R}^{r \times n}_+\) are both nonnegative. The nonnegative rank arises in different areas such as combinatorial optimization and communication complexity. Computing this quantity is NP-hard in general, and it is thus important to find efficient bounding techniques, especially in the context of the aforementioned applications. In this paper we propose a new lower bound on the nonnegative rank which, unlike most existing lower bounds, does not rely solely on the matrix sparsity pattern and applies to nonnegative matrices with arbitrary support. The idea involves computing a certain nuclear norm with nonnegativity constraints, which allows us to lower bound the nonnegative rank in the same way the standard nuclear norm gives lower bounds on the standard rank. Our lower bound is expressed as the solution of a copositive programming problem and can be relaxed to obtain polynomial-time computable lower bounds using semidefinite programming. We compare our lower bound with existing ones, and we show examples of matrices where our lower bound performs better than currently known ones.

Notes

  1. Throughout the paper, a nonnegative matrix is a matrix whose entries are all nonnegative.

  2. Note that the matrix considered in this example is symmetric, and so one could use the reduced semidefinite program (15) to simplify the computation of \(\nu _+^{[0]}(A)\). However for simplicity and for illustration purposes, we used in this example the original formulation (3).

  3. Note that the trivial representation of \(P\) uses \(f\) linear inequalities, where \(f\) is the number of facets of \(P\). However by introducing new variables (i.e., allowing projections) one can sometimes reduce dramatically the number of inequalities needed to represent \(P\). For example the cross-polytope \(P = \{x \in \mathbb {R}^n : w^\mathsf T x \le 1\; \forall w \in \{-1,1\}^n\}\) has \(2^n\) facets but can be represented using only \(2n\) linear inequalities after introducing \(n\) additional variables: \(P = \{x \in \mathbb {R}^n \; : \; \exists y \in \mathbb {R}^n \; y \ge x, \; y \ge -x, \; 1^\mathsf T y = 1\}\).

  4. Since we are interested in obtaining a lower bound for the specific perturbation \(A_{\varepsilon }\) of \(A\), we can obtain a better bound than the one of Theorem 3 by normalizing by \(\Vert A_{\varepsilon }\Vert _F^2\) directly.

  5. Indeed, observe first that if \(W\) is feasible for (18) then \(W^\mathsf T \) is also feasible and has the same objective value. Thus by averaging one can assume that \(W\) is symmetric. Then note that for any feasible symmetric \(W\) and any permutation \(\sigma \), the new matrix \(W'_{i,j} = W_{\sigma (i),\sigma (j)}\) is also feasible and has the same objective value as \(W\). Hence again by averaging we can assume \(W\) to be constant on the diagonal and constant on the off-diagonal.

Author information

Corresponding author

Correspondence to Hamza Fawzi.

Additional information

This research was funded in part by AFOSR FA9550-11-1-0305.

Appendix: Proof of Proposition 3: Slack matrix of hypercube

In this appendix we prove Proposition 3 concerning the nonnegative rank of the slack matrix of the hypercube. We restate the proposition here for convenience:

Proposition

Let \(C_n = [0,1]^n\) be the hypercube in \(n\) dimensions and let \(S(C_n) \in \mathbb {R}^{2n \times 2^n}\) be its slack matrix. Then

$$\begin{aligned} {{\mathrm{rank}}}_+(S(C_n)) = \left( \frac{\nu _+^{[0]}(S(C_n))}{\Vert S(C_n)\Vert _F}\right) ^2 = 2n. \end{aligned}$$

Proof

The facets of the hypercube \(C_n = [0,1]^n\) are given by the linear inequalities \(\{x_k \ge 0\}, \; k=1,\ldots ,n\) and \(\{x_k \le 1\}, \; k=1,\ldots ,n\), and the vertices of \(C_n\) are given by the set \(\{0,1\}^n\) of binary words of length \(n\). It is easy to see that the slack matrix of the hypercube is a 0/1 matrix: in fact, for a given facet \(F\) and vertex \(V\), the \((F,V)\)’th entry of \(S(C_n)\) is given by:

$$\begin{aligned} S(C_n)_{F,V} = \left\{ \begin{array}{ll} 1 &{} \quad \text { if }V \notin F\\ 0 &{} \quad \text { if }V \in F \end{array}\right. . \end{aligned}$$
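As an illustration, the slack matrix described above can be built explicitly for small \(n\). The following is a minimal sketch, not taken from the paper: the function name `hypercube_slack_matrix` and the ordering of rows and columns (facets \(x_k \ge 0\) first, then facets \(x_k \le 1\); vertices in lexicographic order) are assumptions made only for this example.

```python
import itertools
import numpy as np

def hypercube_slack_matrix(n):
    """Return the 2n x 2^n slack matrix of [0,1]^n as a 0/1 array.

    Ordering (an assumption of this sketch): rows are the facets
    x_k >= 0 for k = 1..n, followed by the facets x_k <= 1 for k = 1..n;
    columns are the vertices of {0,1}^n in lexicographic order.
    """
    vertices = np.array(list(itertools.product([0, 1], repeat=n)))  # shape (2^n, n)
    lower = vertices.T      # slack of x_k >= 0 at vertex v is v_k
    upper = 1 - vertices.T  # slack of x_k <= 1 at vertex v is 1 - v_k
    return np.vstack([lower, upper])

S = hypercube_slack_matrix(3)
print(S.shape)  # 2n rows and 2^n columns
```

Each row of the resulting matrix has exactly \(2^{n-1}\) ones, because each facet contains exactly half of the vertices.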

Since the slack matrix of the hypercube \(S(C_n)\) has \(2n\) rows we clearly have

$$\begin{aligned} \left( \frac{\nu _+^{[0]}(S(C_n))}{\Vert S(C_n)\Vert _F}\right) ^2 \le {{\mathrm{rank}}}_+(S(C_n)) \le 2n. \end{aligned}$$

To show that we indeed have equality, we exhibit a particular feasible point \(W\) of the semidefinite program (3) and we show that this \(W\) satisfies \((\langle S(C_n), W \rangle / \Vert S(C_n)\Vert _F)^2 = 2n\).

Let

$$\begin{aligned} W = \frac{1}{\sqrt{2^{n-1}}} (2 S(C_n) - J). \end{aligned}$$
(19)

Observe that \(W\) is the matrix obtained from \(S(C_n)\) by changing the ones into \(\frac{1}{\sqrt{2^{n-1}}}\) and zeros into \(-\frac{1}{\sqrt{2^{n-1}}}\). For this \(W\) we verify using a simple calculation that

$$\begin{aligned} \left( \frac{\langle S(C_n), W \rangle }{\Vert S(C_n)\Vert _F}\right) ^2 = 2n. \end{aligned}$$
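For completeness, this calculation can be spelled out as follows. Each facet of \(C_n\) contains exactly half of the vertices, so every row of \(S(C_n)\) has \(2^{n-1}\) ones and \(S(C_n)\) has \(2n \cdot 2^{n-1}\) ones in total. Since \(W\) equals \(\frac{1}{\sqrt{2^{n-1}}}\) on these entries, we get

$$\begin{aligned} \langle S(C_n), W \rangle = \frac{2n \cdot 2^{n-1}}{\sqrt{2^{n-1}}} = 2n \sqrt{2^{n-1}}, \qquad \Vert S(C_n)\Vert _F^2 = 2n \cdot 2^{n-1}, \end{aligned}$$

and therefore

$$\begin{aligned} \left( \frac{\langle S(C_n), W \rangle }{\Vert S(C_n)\Vert _F}\right) ^2 = \frac{4n^2 \cdot 2^{n-1}}{2n \cdot 2^{n-1}} = 2n. \end{aligned}$$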

It thus remains to prove that \(W\) is indeed a feasible point of the semidefinite program (3), i.e., that the matrix

$$\begin{aligned} \begin{bmatrix} I&\quad -W\\ -W^\mathsf T&\quad I\end{bmatrix} \in \mathbb {R}^{(2n + 2^n)\times (2n+2^n)} \end{aligned}$$

can be written as the sum of a nonnegative matrix and a positive semidefinite one. This is the main part of the proof and for this we introduce some notation. Let \(\fancyscript{F}\) be the set of facets of the hypercube, and \(\fancyscript{V}=\{0,1\}^n\) be the set of vertices. If \(F \in \fancyscript{F}\) is a facet of the hypercube, we denote by \(\bar{F}\) the opposite facet to \(F\) (namely, if \(F\) is given by \(x_k \ge 0\), then \(\bar{F}\) is the facet \(x_k \le 1\) and vice-versa). Similarly for a vertex \(V \in \fancyscript{V}\) we denote by \(\bar{V}\) the opposite vertex obtained by complementing the binary word \(V\). Denote by \(N_{\fancyscript{F}}:\mathbb {R}^{\fancyscript{F}}\rightarrow \mathbb {R}^{\fancyscript{F}}\) and \(N_{\fancyscript{V}}:\mathbb {R}^{\fancyscript{V}}\rightarrow \mathbb {R}^{\fancyscript{V}}\) the “negation” maps, so that we have:

$$\begin{aligned} \forall g \in \mathbb {R}^{\fancyscript{F}},\quad \forall F \in \fancyscript{F},\quad (N_{\fancyscript{F}} g)(F) = g(\bar{F}) \end{aligned}$$
(20)
$$\begin{aligned} \forall h \in \mathbb {R}^{\fancyscript{V}},\quad \forall V \in \fancyscript{V},\quad (N_{\fancyscript{V}} h)(V) = h(\bar{V}) \end{aligned}$$
(21)

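Note that in a suitable ordering of the facets and the vertices, the matrix representations of \(N_{\fancyscript{F}}\) and \(N_{\fancyscript{V}}\) both take the following antidiagonal form, with ones on the antidiagonal and zeros elsewhere (\(N_{\fancyscript{F}}\) is of size \(2n\times 2n\) and \(N_{\fancyscript{V}}\) is of size \(2^n \times 2^n\)):

$$\begin{aligned} \begin{bmatrix} 0 & \cdots & 0 & 1\\ 0 & \cdots & 1 & 0\\ \vdots & & \vdots & \vdots \\ 1 & \cdots & 0 & 0 \end{bmatrix} \end{aligned}$$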

Consider now the following decomposition of the matrix \({\begin{bmatrix} I&\quad -W\\ -W^\mathsf T&\quad I\end{bmatrix}}\):

$$\begin{aligned} \begin{bmatrix} I&\quad -W\\ -W^\mathsf T&\quad I \end{bmatrix} = \begin{bmatrix} N_{\fancyscript{F}}&\quad 0\\ 0&\quad N_{\fancyscript{V}} \end{bmatrix} + \begin{bmatrix} I-N_{\fancyscript{F}}&\quad -W\\ -W^\mathsf T&\quad I-N_{\fancyscript{V}} \end{bmatrix} \end{aligned}$$
(22)

Clearly the first matrix in the decomposition is nonnegative. The next lemma states that the second matrix is actually positive semidefinite:

Lemma 2

Let \(\fancyscript{F}\) and \(\fancyscript{V}\) be respectively the sets of facets and vertices of the hypercube \(C_n=[0,1]^n\) and let \(\widehat{W} \in \mathbb {R}^{\fancyscript{F}\times \fancyscript{V}}\) be the matrix:

$$\begin{aligned} \widehat{W}_{F,V} = \left\{ \begin{array}{ll} 1 &{} \quad \text { if }V \notin F\\ -1 &{} \quad \text { if }V \in F \end{array}\right. \;\; \forall F \in \fancyscript{F},\; V \in \fancyscript{V}\end{aligned}$$

Then the matrix

$$\begin{aligned} \begin{bmatrix} I-N_{\fancyscript{F}}&\quad -\gamma \widehat{W}\\ -\gamma \widehat{W}^\mathsf T&\quad I-N_{\fancyscript{V}} \end{bmatrix} \end{aligned}$$
(23)

is positive semidefinite for \(\gamma = 1/\sqrt{2^{n-1}}\) (where \(N_{\fancyscript{F}}\) and \(N_{\fancyscript{V}}\) are defined in (20) and (21)).

Proof

We use the Schur complement to show that the matrix (23) is positive semidefinite. In fact we show that

  1. \(I-N_{\fancyscript{V}} \succeq 0\),

  2. \({{\mathrm{range}}}(\widehat{W}^\mathsf T ) \subseteq {{\mathrm{range}}}(I-N_{\fancyscript{V}})\), and

  3. \(I-N_{\fancyscript{F}} - \gamma ^2 \widehat{W} (I-N_{\fancyscript{V}})^{-1} \widehat{W}^\mathsf T \succeq 0\),

where \((I-N_{\fancyscript{V}})^{-1}\) denotes the pseudo-inverse of \(I-N_{\fancyscript{V}}\).

Observe that for any \(k \in \mathbb {N}\), the \(2k \times 2k\) matrix \(I_{2k}-N_{2k}\), where \(N_{2k}\) denotes the \(2k \times 2k\) matrix with ones on the antidiagonal and zeros elsewhere, is positive semidefinite: in fact one can see that \(\frac{1}{2}(I_{2k}-N_{2k})\) is the orthogonal projection onto the subspace spanned by its columns (i.e., the subspace of dimension \(k\) spanned by \(\{e_i - e_{2k+1-i}:\;i=1,\ldots ,k\}\) where \(e_i\) is the \(i\)’th unit vector). Hence this shows that \(I-N_{\fancyscript{V}}\) is positive semidefinite. Since \(I-N_{\fancyscript{V}}\) is twice an orthogonal projection, its pseudo-inverse is half of that projection, which shows that \((I-N_{\fancyscript{V}})^{-1} = \frac{1}{4} (I-N_{\fancyscript{V}})\).
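The projection and pseudo-inverse claims can be checked numerically for a small size. The following sketch is only a sanity check under the assumption that the negation map is represented by the antidiagonal permutation matrix; the variable names and the choice \(k=3\) are illustrative and do not come from the paper.

```python
import numpy as np

k = 3                      # illustrative size; the matrices are 2k x 2k
I = np.eye(2 * k)
N = np.fliplr(I)           # ones on the antidiagonal, zeros elsewhere
P = 0.5 * (I - N)

# P is symmetric and idempotent, hence an orthogonal projection.
is_projection = np.allclose(P, P.T) and np.allclose(P @ P, P)
# Its range has dimension k.
rank_P = np.linalg.matrix_rank(P)
# The pseudo-inverse of I - N equals (1/4)(I - N).
pinv_matches = np.allclose(np.linalg.pinv(I - N), 0.25 * (I - N))

print(is_projection, rank_P, pinv_matches)
```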

Now we show that \({{\mathrm{range}}}(\widehat{W}^\mathsf T ) \subseteq {{\mathrm{range}}}(I-N_{\fancyscript{V}})\). For any \(F \in \fancyscript{F}\), the \(F\)’th column of \(\widehat{W}^\mathsf T \) satisfies \((\widehat{W}^\mathsf T )_{V,F} = -(\widehat{W}^\mathsf T )_{\bar{V},F}\) for any \(V \in \fancyscript{V}\), and thus \({{\mathrm{range}}}(\widehat{W}^\mathsf T ) \subseteq {{\mathrm{span}}}(e_{V} - e_{\bar{V}} \; : \; V \in \fancyscript{V}) = {{\mathrm{range}}}(I-N_{\fancyscript{V}})\).

It thus remains to show that

$$\begin{aligned} I-N_{\fancyscript{F}} - \gamma ^2 \widehat{W} (I-N_{\fancyscript{V}})^{-1} \widehat{W}^\mathsf T \succeq 0 \end{aligned}$$

First note that since \(\frac{1}{2} (I-N_{\fancyscript{V}})\) is an orthogonal projection and that \({{\mathrm{range}}}(\widehat{W}^\mathsf T ) \subseteq {{\mathrm{range}}}(I-N_{\fancyscript{V}})\), we have \((I-N_{\fancyscript{V}})^{-1} \widehat{W}^\mathsf T = \frac{1}{2} \widehat{W}^\mathsf T \). Thus we now have to show that

$$\begin{aligned} I-N_{\fancyscript{F}} - \frac{\gamma ^2}{2} \widehat{W} \widehat{W}^\mathsf T \succeq 0. \end{aligned}$$

The main observation here is that the matrix \(\widehat{W}\widehat{W}^\mathsf T \) is actually equal to \(2^{n} (I-N_{\fancyscript{F}})\). For any \(F,G \in \fancyscript{F}\), we have:

$$\begin{aligned} (\widehat{W}\widehat{W}^\mathsf T )_{F,G} = \sum \limits _{a \in \fancyscript{V}}\widehat{W}_{F,a} \widehat{W}_{G,a} = \left\{ \begin{array}{ll} 2^n &{} \quad \text { if } F=G\\ -2^n &{} \quad \text { if } F=\bar{G}\\ 0 &{} \quad \text { else} \end{array} \right. \end{aligned}$$

First it is clear that if \(F=G\), then \((\widehat{W}\widehat{W}^\mathsf T )_{F,G} = 2^n\). Also if \(F=\bar{G}\) then \((\widehat{W}\widehat{W}^\mathsf T )_{F,G} = -2^n\) since if \(F=\bar{G}\) then \(a \in F \Leftrightarrow a \notin G\) hence \(\widehat{W}_{F,a} \widehat{W}_{G,a} = -1\) for all \(a \in \fancyscript{V}\). In the case that \(F \ne G\) and \(F \ne \bar{G}\), it is easy to verify by simple counting that \(\sum \nolimits _{a \in \fancyscript{V}}\widehat{W}_{F,a} \widehat{W}_{G,a} = 0\).
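One way to carry out the counting in the last case is the following sketch, in which the indices \(j,k\) and the signs \(\epsilon _F, \epsilon _G\) are notation introduced only for this remark. If \(F\) is one of the two facets in coordinate \(j\), then \(\widehat{W}_{F,a} = \epsilon _F (2a_j - 1)\) for all \(a \in \fancyscript{V}\), with \(\epsilon _F = 1\) if \(F\) is the facet \(x_j \ge 0\) and \(\epsilon _F = -1\) if \(F\) is the facet \(x_j \le 1\). When \(F \ne G\) and \(F \ne \bar{G}\), the facets \(F\) and \(G\) correspond to different coordinates \(j \ne k\), and summing over the remaining \(n-2\) coordinates gives

$$\begin{aligned} \sum \limits _{a \in \fancyscript{V}}\widehat{W}_{F,a} \widehat{W}_{G,a} = \epsilon _F \epsilon _G \, 2^{n-2} \sum \limits _{a_j, a_k \in \{0,1\}} (2a_j-1)(2a_k-1) = \epsilon _F \epsilon _G \, 2^{n-2} (1 - 1 - 1 + 1) = 0. \end{aligned}$$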

Hence we have \(I-N_{\fancyscript{F}} - \gamma ^2 \widehat{W} (I-N_{\fancyscript{V}})^{-1} \widehat{W}^\mathsf T = I-N_{\fancyscript{F}} - \frac{\gamma ^2}{2} \widehat{W}\widehat{W}^\mathsf T = (1 - \frac{\gamma ^2}{2}2^n) (I-N_{\fancyscript{F}})\), which is positive semidefinite for \(\gamma = 1/\sqrt{2^{n-1}}\): for this value of \(\gamma \) we have \(\frac{\gamma ^2}{2}2^n = 1\), so the expression is the zero matrix.

Using this lemma, Eq. (22) shows that the matrix \(W\) is feasible for the semidefinite program (3), and thus that \(\nu _+^{[0]}(S(C_n)) \ge \langle S(C_n), W \rangle = \sqrt{2^{n-1}} \cdot 2n\). Hence since \(\Vert S(C_n)\Vert _F = \sqrt{2^{n-1} \cdot 2n}\) we get that

$$\begin{aligned} {{\mathrm{rank}}}_+(S(C_n)) \ge \left( \frac{\nu _+^{[0]}(S(C_n))}{\Vert S(C_n)\Vert _F}\right) ^2 \ge \left( \frac{\sqrt{2^{n-1}} \cdot 2n}{\sqrt{2^{n-1} \cdot 2n}}\right) ^2 = 2n \end{aligned}$$

which completes the proof.
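The certificate constructed in this proof can also be checked numerically for small \(n\). The sketch below is a floating-point sanity check, not a substitute for the proof: it verifies that the second matrix in the decomposition (22) has no negative eigenvalue beyond rounding error and that the resulting bound equals \(2n\). The function name `check_certificate` and the ordering of facets and vertices are assumptions of this example.

```python
import itertools
import numpy as np

def check_certificate(n):
    """Return (smallest eigenvalue of the PSD part of (22), value of the bound)."""
    # Vertices of {0,1}^n in lexicographic order (assumed ordering).
    V = np.array(list(itertools.product([0, 1], repeat=n)))
    # Rows: facets x_k >= 0 for k = 1..n, then facets x_k <= 1 for k = 1..n.
    S = np.vstack([V.T, 1 - V.T]).astype(float)
    W = (2 * S - 1) / np.sqrt(2 ** (n - 1))        # Eq. (19), J is the all-ones matrix

    # Negation maps in this ordering: facet k is swapped with facet k + n,
    # and vertex number i is swapped with vertex number 2^n - 1 - i.
    NF = np.roll(np.eye(2 * n), n, axis=1)
    NV = np.fliplr(np.eye(2 ** n))

    psd_part = np.block([[np.eye(2 * n) - NF, -W],
                         [-W.T, np.eye(2 ** n) - NV]])
    min_eig = np.linalg.eigvalsh(psd_part).min()
    bound = (np.sum(S * W) / np.linalg.norm(S)) ** 2
    return min_eig, bound

for n in range(1, 5):
    min_eig, bound = check_certificate(n)
    print(n, min_eig >= -1e-9, np.isclose(bound, 2 * n))
```

The first matrix in (22) is a permutation matrix in this ordering and is therefore entrywise nonnegative, so only the positive semidefinite part needs a numerical check.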

About this article

Cite this article

Fawzi, H., Parrilo, P.A. Lower bounds on nonnegative rank via nonnegative nuclear norms. Math. Program. 153, 41–66 (2015). https://doi.org/10.1007/s10107-014-0837-2

  • DOI: https://doi.org/10.1007/s10107-014-0837-2
