Lower bounds on nonnegative rank via nonnegative nuclear norms

Fawzi, Hamza; Parrilo, Pablo A.

doi:10.1007/s10107-014-0837-2

Full Length Paper
Series B
Published: 12 November 2014

Mathematical Programming, Volume 153, pages 41–66 (2015)

Hamza Fawzi¹ &
Pablo A. Parrilo¹


Abstract

The nonnegative rank of an entrywise nonnegative matrix $A \in \mathbb {R}^{m \times n}_+$ is the smallest integer $r$ such that $A$ can be written as $A=UV$ where $U \in \mathbb {R}^{m \times r}_+$ and $V \in \mathbb {R}^{r \times n}_+$ are both nonnegative. The nonnegative rank arises in different areas such as combinatorial optimization and communication complexity. Computing this quantity is NP-hard in general, and it is thus important to find efficient bounding techniques, especially in the context of the aforementioned applications. In this paper we propose a new lower bound on the nonnegative rank which, unlike most existing lower bounds, does not rely solely on the matrix sparsity pattern and applies to nonnegative matrices with arbitrary support. The idea involves computing a certain nuclear norm with nonnegativity constraints, which allows us to lower bound the nonnegative rank in the same way the standard nuclear norm gives lower bounds on the standard rank. Our lower bound is expressed as the solution of a copositive programming problem and can be relaxed to obtain polynomial-time computable lower bounds using semidefinite programming. We compare our lower bound with existing ones, and we show examples of matrices where our lower bound performs better than currently known ones.


Notes

Throughout the paper, a nonnegative matrix is a matrix whose entries are all nonnegative.
Note that the matrix considered in this example is symmetric, and so one could use the reduced semidefinite program (15) to simplify the computation of $\nu _+^{[0]}(A)$. However for simplicity and for illustration purposes, we used in this example the original formulation (3).
Note that the trivial representation of $P$ uses $f$ linear inequalities, where $f$ is the number of facets of $P$. However by introducing new variables (i.e., allowing projections) one can sometimes reduce dramatically the number of inequalities needed to represent $P$. For example the cross-polytope $P = \{x \in \mathbb {R}^n : w^\mathsf T x \le 1\; \forall w \in \{-1,1\}^n\}$ has $2^n$ facets but can be represented using only $2n$ linear inequalities after introducing $n$ additional variables: $P = \{x \in \mathbb {R}^n \; : \; \exists y \in \mathbb {R}^n \; y \ge x, \; y \ge -x, \; 1^\mathsf T y = 1\}$.
Since we are interested in obtaining a lower bound for the specific perturbation $A_{\varepsilon }$ of $A$, we can obtain a better bound than the one of Theorem 3 by normalizing by $\Vert A_{\varepsilon }\Vert _F^2$ directly.
Indeed, observe first that if $W$ is feasible for (18) then $W^\mathsf T $ is also feasible and has the same objective value. Thus by averaging one can assume that $W$ is symmetric. Then note that for any feasible symmetric $W$ and any permutation $\sigma $, the new matrix $W'_{i,j} = W_{\sigma (i),\sigma (j)}$ is also feasible and has the same objective value as $W$. Hence again by averaging we can assume $W$ to be constant on the diagonal and constant on the off-diagonal.
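The cross-polytope note above can be checked numerically. The following is a minimal sketch, not taken from the paper: the function names `in_cross_polytope_facets` and `lift_certificate` are hypothetical, and the choice of certificate $y = |x| + \frac{1 - \Vert x\Vert _1}{n} 1$ is one convenient assumption among many valid ones.

```python
from itertools import product

def in_cross_polytope_facets(x, tol=1e-12):
    """Membership test using all 2^n facet inequalities w^T x <= 1, w in {-1,1}^n."""
    return all(
        sum(w_i * x_i for w_i, x_i in zip(w, x)) <= 1 + tol
        for w in product((-1, 1), repeat=len(x))
    )

def lift_certificate(x, tol=1e-12):
    """Return some y with y >= x, y >= -x, sum(y) = 1, or None if no such y exists.

    Such a y exists exactly when sum(|x_i|) <= 1; here the leftover mass
    is spread evenly over the coordinates (an arbitrary choice).
    """
    n = len(x)
    slack = 1 - sum(abs(x_i) for x_i in x)
    if slack < -tol:
        return None
    return [abs(x_i) + slack / n for x_i in x]

if __name__ == "__main__":
    for point in [(0.25, -0.5), (0.75, 0.5)]:
        print(point, in_cross_polytope_facets(point), lift_certificate(point))
```

The facet test enumerates $2^n$ inequalities, whereas the lifted description only needs the $2n$ inequalities $y \ge x$, $y \ge -x$ and one equality, which is the point of the note.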

References

Arora, S., Ge, R., Kannan, R., Moitra, A.: Computing a nonnegative matrix factorization—provably. In: Proceedings of the Forty-fourth Annual ACM Symposium on Theory of Computing, STOC ’12, pp. 145–162. ACM (2012)
Bernstein, D.S.: Matrix Mathematics: Theory, Facts, and Formulas, 2nd edn. Princeton University Press, Princeton (2009)
Braun, G., Jain, R., Lee, T., Pokutta, S.: Information-theoretic approximations of the nonnegative rank. ECCC preprint TR13-158 (2013)
Berman, A., Shaked-Monderer, N.: Completely Positive Matrices. World Scientific Pub Co Inc, Singapore (2003)
Cohen, J.E., Rothblum, U.G.: Nonnegative ranks, decompositions, and factorizations of nonnegative matrices. Linear Algebra Appl. 190, 149–168 (1993)
Dür, M.: Copositive programming – a survey. In: Diehl, M., Glineur, F., Jarlebring, E., Michiels, W. (eds.) Recent Advances in Optimization and its Applications in Engineering, pp. 3–20. Springer, Berlin (2010). doi:10.1007/978-3-642-12598-0_1
Doan, X., Vavasis, S.: Finding approximately rank-one submatrices with the nuclear norm and $\ell _1$-norm. SIAM J. Optim. 23(4), 2502–2540 (2013)
Fiorini, S., Kaibel, V., Pashkovich, K., Theis, D.O.: Combinatorial bounds on nonnegative rank and extended formulations. Discret. Math. 313(1), 67–83 (2013)
Fawzi, H., Parrilo, P.A.: Self-scaled bounds for atomic cone ranks: applications to nonnegative rank and cp-rank. arXiv:1404.3240 (2014)
Gillis, N., Glineur, F.: On the geometric interpretation of the nonnegative rank. Linear Algebra Appl. 437(11), 2685–2712 (2012)
Gillis, N.: Sparse and unique nonnegative matrix factorization through data preprocessing. J. Mach. Learn. Res. 13, 3349–3386 (2012)
Goemans, M.X.: Smallest compact formulation for the permutahedron. Math. Program. 1–7 (2014). doi:10.1007/s10107-014-0757-1
Gatermann, K., Parrilo, P.A.: Symmetry groups, semidefinite programs, and sums of squares. J. Pure Appl. Algebra 192(1), 95–128 (2004)
Gouveia, J., Parrilo, P.A., Thomas, R.R.: Approximate cone factorizations and lifts of polytopes. arXiv:1308.2162 (2013)
Gouveia, J., Parrilo, P.A., Thomas, R.R.: Lifts of convex sets and cone factorizations. Math. Oper. Res. 38(2), 248–264 (2013)
Gouveia, J., Robinson, R.Z., Thomas, R.R.: Polytopes of minimum positive semidefinite rank. Discret. Comput. Geom. 50(3), 679–699 (2013)
Jameson, G.J.O.: Summing and Nuclear Norms in Banach Space Theory, vol. 8. Cambridge University Press, Cambridge (1987)
Jain, R., Shi, Y., Wei, Z., Zhang, S.: Efficient protocols for generating bipartite classical distributions and quantum states. IEEE Trans. Inf. Theory 59(8), 5171–5178 (2013)
Kushilevitz, E., Nisan, N.: Communication Complexity. Cambridge University Press, New York (2006)
de Klerk, E., Pasechnik, D.V.: Approximation of the stability number of a graph via copositive programming. SIAM J. Optim. 12(4), 875–892 (2002)
Löfberg, J.: YALMIP: a toolbox for modeling and optimization in MATLAB. In: Proceedings of the CACSD Conference, Taipei, Taiwan (2004)
Löfberg, J.: Pre- and post-processing sum-of-squares programs in practice. IEEE Trans. Autom. Control 54(5), 1007–1011 (2009)
Lovász, L.: Communication complexity: a survey. In: Korte, B., Prömel, H.J., Graham, R.L. (eds.) Paths, Flows, and VLSI-Layout, pp. 235–265. Springer, New York (1990)
Lee, T., Shraibman, A.: Lower bounds in communication complexity. Found. Trends Theor. Comput. Sci. 3(4), 263–399 (2009)
Parrilo, P.A.: Structured Semidefinite Programs and Semialgebraic Geometry Methods in Robustness and Optimization. PhD thesis, California Institute of Technology (2000)
Rothvoss, T.: The matching polytope has exponential extension complexity. In: Proceedings of the 46th Annual ACM Symposium on Theory of Computing, STOC ’14, pp. 263–272, ACM (2014)
Vavasis, S.A.: On the complexity of nonnegative matrix factorization. SIAM J. Optim. 20(3), 1364–1377 (2009)
Yannakakis, M.: Expressing combinatorial optimization problems by linear programs. J. Comput. Syst. Sci. 43(3), 441–466 (1991)
Zhang, S.: Quantum strategic game theory. In: Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, pp. 39–59, ACM (2012)


Author information

Authors and Affiliations

Laboratory for Information and Decision Systems, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
Hamza Fawzi & Pablo A. Parrilo


Corresponding author

Correspondence to Hamza Fawzi.

Additional information

This research was funded in part by AFOSR FA9550-11-1-0305.

Appendix: Proof of Proposition 3: Slack matrix of hypercube

In this appendix we prove Proposition 3 concerning the nonnegative rank of the slack matrix of the hypercube. We restate the proposition here for convenience:

Proposition

Let $C_n = [0,1]^n$ be the hypercube in $n$ dimensions and let $S(C_n) \in \mathbb {R}^{2n \times 2^n}$ be its slack matrix. Then

$$\begin{aligned} {{\mathrm{rank}}}_+(S(C_n)) = \left( \frac{\nu _+^{[0]}(S(C_n))}{\Vert S(C_n)\Vert _F}\right) ^2 = 2n. \end{aligned}$$

Proof

The facets of the hypercube $C_n = [0,1]^n$ are given by the linear inequalities $\{x_k \ge 0\}, \; k=1,\ldots ,n$ and $\{x_k \le 1\}, \; k=1,\ldots ,n$, and the vertices of $C_n$ are given by the set $\{0,1\}^n$ of binary words of length $n$. It is easy to see that the slack matrix of the hypercube is a 0/1 matrix: in fact, for a given facet $F$ and vertex $V$, the $(F,V)$’th entry of $S(C_n)$ is given by:

$$\begin{aligned} S(C_n)_{F,V} = \left\{ \begin{array}{ll} 1 &{} \quad \text { if }V \notin F\\ 0 &{} \quad \text { if }V \in F \end{array}\right. . \end{aligned}$$
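To make the 0/1 structure concrete, here is a small sketch that builds $S(C_n)$ directly from the facet inequalities. It is an illustration only: the function name `hypercube_slack` and the row and column ordering (facets $x_k \ge 0$ first, then $x_k \le 1$; vertices in lexicographic order) are assumptions, not conventions fixed by the paper.

```python
from itertools import product

def hypercube_slack(n):
    """Slack matrix of [0,1]^n as a list of 2n rows, each of length 2^n.

    Rows 0..n-1 correspond to the facets x_k >= 0 (slack x_k),
    rows n..2n-1 to the facets x_k <= 1 (slack 1 - x_k).
    Columns are the vertices in {0,1}^n in lexicographic order.
    """
    vertices = list(product((0, 1), repeat=n))
    lower = [[v[k] for v in vertices] for k in range(n)]
    upper = [[1 - v[k] for v in vertices] for k in range(n)]
    return lower + upper

if __name__ == "__main__":
    for row in hypercube_slack(2):
        print(row)
```

Each entry is 1 exactly when the vertex does not lie on the facet, so every row contains $2^{n-1}$ ones.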

Since the slack matrix of the hypercube $S(C_n)$ has $2n$ rows we clearly have

$$\begin{aligned} \left( \frac{\nu _+^{[0]}(S(C_n))}{\Vert S(C_n)\Vert _F}\right) ^2 \le {{\mathrm{rank}}}_+(S(C_n)) \le 2n. \end{aligned}$$

To show that we indeed have equality, we exhibit a particular feasible point $W$ of the semidefinite program (3) and we show that this $W$ satisfies $(\langle S(C_n), W \rangle / \Vert S(C_n)\Vert _F)^2 = 2n$.

Let

$$\begin{aligned} W = \frac{1}{\sqrt{2^{n-1}}} (2 S(C_n) - J). \end{aligned}$$

(19)

Observe that $W$ is the matrix obtained from $S(C_n)$ by changing the ones into $\frac{1}{\sqrt{2^{n-1}}}$ and zeros into $-\frac{1}{\sqrt{2^{n-1}}}$. For this $W$ we verify using a simple calculation that

$$\begin{aligned} \left( \frac{\langle S(C_n), W \rangle }{\Vert S(C_n)\Vert _F}\right) ^2 = 2n. \end{aligned}$$
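The "simple calculation" can be reproduced numerically for small $n$. The sketch below is illustrative and assumes the same hypothetical helper `hypercube_slack` and ordering as a matter of convenience; it uses floating-point arithmetic, so the comparison with $2n$ is only up to rounding.

```python
import math
from itertools import product

def hypercube_slack(n):
    """Slack matrix of [0,1]^n: 2n facet rows, 2^n vertex columns (0/1 entries)."""
    vertices = list(product((0, 1), repeat=n))
    lower = [[v[k] for v in vertices] for k in range(n)]      # facets x_k >= 0
    upper = [[1 - v[k] for v in vertices] for k in range(n)]  # facets x_k <= 1
    return lower + upper

def bound_value(n):
    """Evaluate (<S, W> / ||S||_F)^2 for W = (2S - J) / sqrt(2^(n-1))."""
    S = hypercube_slack(n)
    scale = 1.0 / math.sqrt(2 ** (n - 1))
    W = [[scale * (2 * s - 1) for s in row] for row in S]
    inner = sum(s * w for row_s, row_w in zip(S, W) for s, w in zip(row_s, row_w))
    frobenius = math.sqrt(sum(s * s for row in S for s in row))
    return (inner / frobenius) ** 2

if __name__ == "__main__":
    for n in range(1, 6):
        print(n, bound_value(n))
```

In exact terms, $S(C_n)$ has $2n \cdot 2^{n-1}$ ones, so $\langle S(C_n), W \rangle = 2n \sqrt{2^{n-1}}$ and $\Vert S(C_n)\Vert _F^2 = 2n \cdot 2^{n-1}$, which gives the ratio $2n$.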

It thus remains to prove that $W$ is indeed a feasible point of the semidefinite program (3), that is, that the matrix

$$\begin{aligned} \begin{bmatrix} I&\quad -W\\ -W^\mathsf T&\quad I\end{bmatrix} \in \mathbb {R}^{(2n + 2^n)\times (2n+2^n)} \end{aligned}$$

can be written as the sum of a nonnegative matrix and a positive semidefinite one. This is the main part of the proof, and for this we introduce some notation. Let $\fancyscript{F}$ be the set of facets of the hypercube, and $\fancyscript{V}=\{0,1\}^n$ be the set of vertices. If $F \in \fancyscript{F}$ is a facet of the hypercube, we denote by $\bar{F}$ the opposite facet to $F$ (namely, if $F$ is given by $x_k \ge 0$, then $\bar{F}$ is the facet $x_k \le 1$ and vice versa). Similarly, for a vertex $V \in \fancyscript{V}$ we denote by $\bar{V}$ the opposite vertex obtained by complementing the binary word $V$. Denote by $N_{\fancyscript{F}}:\mathbb {R}^{\fancyscript{F}}\rightarrow \mathbb {R}^{\fancyscript{F}}$ and $N_{\fancyscript{V}}:\mathbb {R}^{\fancyscript{V}}\rightarrow \mathbb {R}^{\fancyscript{V}}$ the “negation” maps, so that we have:

$$\begin{aligned} \forall g \in \mathbb {R}^{\fancyscript{F}},\quad \forall F \in \fancyscript{F},\quad (N_{\fancyscript{F}} g)(F) = g(\bar{F}) \end{aligned}$$

(20)

$$\begin{aligned} \forall h \in \mathbb {R}^{\fancyscript{V}},\quad \forall V \in \fancyscript{V},\quad (N_{\fancyscript{V}} h)(V) = h(\bar{V}) \end{aligned}$$

(21)

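Note that in a suitable ordering of the facets and the vertices, the matrix representations of $N_{\fancyscript{F}}$ and $N_{\fancyscript{V}}$ take the following antidiagonal form ($N_{\fancyscript{F}}$ is of size $2n\times 2n$ and $N_{\fancyscript{V}}$ is of size $2^n \times 2^n$):

$$\begin{aligned} N_{\fancyscript{F}} = \begin{bmatrix} 0&\quad \cdots&\quad 0&\quad 1\\ 0&\quad \cdots&\quad 1&\quad 0\\ \vdots&\quad &\quad &\quad \vdots \\ 1&\quad 0&\quad \cdots&\quad 0 \end{bmatrix}, \qquad N_{\fancyscript{V}} = \begin{bmatrix} 0&\quad \cdots&\quad 0&\quad 1\\ 0&\quad \cdots&\quad 1&\quad 0\\ \vdots&\quad &\quad &\quad \vdots \\ 1&\quad 0&\quad \cdots&\quad 0 \end{bmatrix} \end{aligned}$$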

Consider now the following decomposition of the matrix ${\begin{bmatrix} I&\quad -W\\ -W^\mathsf T&\quad I\end{bmatrix}}$:

$$\begin{aligned} \begin{bmatrix} I&\quad -W\\ -W^\mathsf T&\quad I \end{bmatrix} = \begin{bmatrix} N_{\fancyscript{F}}&\quad 0\\ 0&\quad N_{\fancyscript{V}} \end{bmatrix} + \begin{bmatrix} I-N_{\fancyscript{F}}&\quad -W\\ -W^\mathsf T&\quad I-N_{\fancyscript{V}} \end{bmatrix} \end{aligned}$$

(22)

Clearly the first matrix in the decomposition is nonnegative. The next lemma states that the second matrix is actually positive semidefinite:

Lemma 2

Let $\fancyscript{F}$ and $\fancyscript{V}$ be respectively the set of facets and vertices of the hypercube $C_n=[0,1]^n$ and let $\widehat{W} \in \mathbb {R}^{\fancyscript{F}\times \fancyscript{V}}$ be the matrix:

$$\begin{aligned} \widehat{W}_{F,V} = \left\{ \begin{array}{ll} 1 &{} \quad \text { if }V \notin F\\ -1 &{} \quad \text { if }V \in F \end{array}\right. \;\; \forall F \in \fancyscript{F},\; V \in \fancyscript{V}\end{aligned}$$

Then the matrix

$$\begin{aligned} \begin{bmatrix} I-N_{\fancyscript{F}}&\quad -\gamma \widehat{W}\\ -\gamma \widehat{W}^\mathsf T&\quad I-N_{\fancyscript{V}} \end{bmatrix} \end{aligned}$$

(23)

is positive semidefinite for $\gamma = 1/\sqrt{2^{n-1}}$ (where $N_{\fancyscript{F}}$ and $N_{\fancyscript{V}}$ are defined in (20) and (21)).

Proof

We use the Schur complement to show that the matrix (23) is positive semidefinite. In fact we show that

1. $I-N_{\fancyscript{V}} \succeq 0$,
2. ${{\mathrm{range}}}(\widehat{W}^\mathsf T ) \subseteq {{\mathrm{range}}}(I-N_{\fancyscript{V}})$, and
3. $I-N_{\fancyscript{F}} - \gamma ^2 \widehat{W} (I-N_{\fancyscript{V}})^{-1} \widehat{W}^\mathsf T \succeq 0$,

where $(I-N_{\fancyscript{V}})^{-1}$ denotes the pseudo-inverse of $I-N_{\fancyscript{V}}$.

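Observe that for any $k \in \mathbb {N}$, the $2k \times 2k$ matrix given by:

$$\begin{aligned} I_{2k}-N_{2k} = \begin{bmatrix} 1&\quad 0&\quad \cdots&\quad 0&\quad -1\\ 0&\quad 1&\quad \cdots&\quad -1&\quad 0\\ \vdots&\quad &\quad &\quad &\quad \vdots \\ 0&\quad -1&\quad \cdots&\quad 1&\quad 0\\ -1&\quad 0&\quad \cdots&\quad 0&\quad 1 \end{bmatrix} \end{aligned}$$

(where $N_{2k}$ denotes the $2k \times 2k$ antidiagonal matrix of ones) is positive semidefinite: in fact one can see that $\frac{1}{2}(I_{2k}-N_{2k})$ is the orthogonal projection onto the subspace spanned by its columns (i.e., the subspace of dimension $k$ spanned by $\{e_i - e_{2k+1-i}:\;i=1,\ldots ,k\}$ where $e_i$ is the $i$’th unit vector). Hence this shows that $I-N_{\fancyscript{V}}$ is positive semidefinite, and it also shows that $(I-N_{\fancyscript{V}})^{-1} = \frac{1}{4} (I-N_{\fancyscript{V}})$.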
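Both claims in this step (that half of the identity minus the antidiagonal matrix is an orthogonal projection, and that the pseudo-inverse is one quarter of the matrix itself) can be checked in exact arithmetic. The following sketch is illustrative; the helper names are hypothetical, and the pseudo-inverse is verified through the four Moore–Penrose conditions rather than computed.

```python
from fractions import Fraction

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(M):
    return [list(col) for col in zip(*M)]

def identity_minus_flip(size):
    """I - N, where N is the antidiagonal matrix of ones of the given even size."""
    return [
        [(1 if i == j else 0) - (1 if i + j == size - 1 else 0) for j in range(size)]
        for i in range(size)
    ]

def check_projection_and_pinv(k):
    """Return (is_projection, is_pseudo_inverse) for the 2k x 2k matrix I - N."""
    A = identity_minus_flip(2 * k)
    P = [[Fraction(a, 2) for a in row] for row in A]       # candidate projection
    A_pinv = [[Fraction(a, 4) for a in row] for row in A]  # candidate pseudo-inverse
    is_projection = matmul(P, P) == P and transpose(P) == P
    A_Ap = matmul(A, A_pinv)
    Ap_A = matmul(A_pinv, A)
    # The four Moore-Penrose conditions characterize the pseudo-inverse uniquely.
    is_pinv = (
        matmul(A_Ap, A) == A
        and matmul(Ap_A, A_pinv) == A_pinv
        and transpose(A_Ap) == A_Ap
        and transpose(Ap_A) == Ap_A
    )
    return is_projection, is_pinv

if __name__ == "__main__":
    print([check_projection_and_pinv(k) for k in range(1, 5)])
```

The underlying identity is $(I-N)^2 = 2(I-N)$, which follows from $N^2 = I$.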

Now we show that ${{\mathrm{range}}}(\widehat{W}^\mathsf T ) \subseteq {{\mathrm{range}}}(I-N_{\fancyscript{V}})$. For any $F \in \fancyscript{F}$, the $F$’th column of $\widehat{W}^\mathsf T $ satisfies $(\widehat{W}^\mathsf T )_{V,F} = -(\widehat{W}^\mathsf T )_{\bar{V},F}$ for any $V \in \fancyscript{V}$ (since exactly one of $V$ and $\bar{V}$ lies on $F$), and thus ${{\mathrm{range}}}(\widehat{W}^\mathsf T ) \subseteq {{\mathrm{span}}}(e_{V} - e_{\bar{V}} \; : \; V \in \fancyscript{V}) = {{\mathrm{range}}}(I-N_{\fancyscript{V}})$.

It thus remains to show that

$$\begin{aligned} I-N_{\fancyscript{F}} - \gamma ^2 \widehat{W} (I-N_{\fancyscript{V}})^{-1} \widehat{W}^\mathsf T \succeq 0 \end{aligned}$$

First note that since $\frac{1}{2} (I-N_{\fancyscript{V}})$ is an orthogonal projection and that ${{\mathrm{range}}}(\widehat{W}^\mathsf T ) \subseteq {{\mathrm{range}}}(I-N_{\fancyscript{V}})$, we have $(I-N_{\fancyscript{V}})^{-1} \widehat{W}^\mathsf T = \frac{1}{2} \widehat{W}^\mathsf T $. Thus we now have to show that

$$\begin{aligned} I-N_{\fancyscript{F}} - \frac{\gamma ^2}{2} \widehat{W} \widehat{W}^\mathsf T \succeq 0. \end{aligned}$$

The main observation here is that the matrix $\widehat{W}\widehat{W}^\mathsf T $ is actually equal to $2^{n} (I-N_{\fancyscript{F}})$. For any $F,G \in \fancyscript{F}$, we have:

$$\begin{aligned} (\widehat{W}\widehat{W}^\mathsf T )_{F,G} = \sum \limits _{a \in \fancyscript{V}}\widehat{W}_{F,a} \widehat{W}_{G,a} = \left\{ \begin{array}{ll} 2^n &{} \quad \text { if } F=G\\ -2^n &{} \quad \text { if } F=\bar{G}\\ 0 &{} \quad \text { else} \end{array} \right. \end{aligned}$$

First, it is clear that if $F=G$, then $(\widehat{W}\widehat{W}^\mathsf T )_{F,G} = 2^n$. If $F=\bar{G}$, then $a \in F \Leftrightarrow a \notin G$, hence $\widehat{W}_{F,a} \widehat{W}_{G,a} = -1$ for all $a \in \fancyscript{V}$ and $(\widehat{W}\widehat{W}^\mathsf T )_{F,G} = -2^n$. Finally, if $F \ne G$ and $F \ne \bar{G}$, then $F$ and $G$ constrain two distinct coordinates $x_k$ and $x_l$; each of the four values of $(a_k,a_l)$ is taken by exactly $2^{n-2}$ vertices $a$, and the product $\widehat{W}_{F,a} \widehat{W}_{G,a}$ equals $+1$ for two of these values and $-1$ for the other two, so that $\sum \nolimits _{a \in \fancyscript{V}}\widehat{W}_{F,a} \widehat{W}_{G,a} = 0$.

Hence we have $I-N_{\fancyscript{F}} - \gamma ^2 \widehat{W} (I-N_{\fancyscript{V}})^{-1} \widehat{W}^\mathsf T = I-N_{\fancyscript{F}} - \frac{\gamma ^2}{2} \widehat{W}\widehat{W}^\mathsf T = (1 - \frac{\gamma ^2}{2}2^n) (I-N_{\fancyscript{F}})$ which is positive semidefinite for $\gamma = 1/\sqrt{2^{n-1}}$.
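The key identity $\widehat{W}\widehat{W}^\mathsf T = 2^n (I-N_{\fancyscript{F}})$ and the vanishing of the Schur complement coefficient can be confirmed exactly for small $n$. This is a sketch under an assumed facet ordering (facets $x_k \ge 0$ first, then $x_k \le 1$), in which the opposite of facet $i$ is facet $(i+n) \bmod 2n$; in this ordering $N_{\fancyscript{F}}$ is a permutation matrix but not antidiagonal. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def signed_slack(n):
    """The +/-1 matrix W_hat: +1 if the vertex is off the facet, -1 if it is on it."""
    vertices = list(product((0, 1), repeat=n))
    lower = [[2 * v[k] - 1 for v in vertices] for k in range(n)]  # facets x_k >= 0
    upper = [[1 - 2 * v[k] for v in vertices] for k in range(n)]  # facets x_k <= 1
    return lower + upper

def gram_matches_identity(n):
    """Check W_hat W_hat^T == 2^n (I - N_F) entry by entry, in integer arithmetic."""
    W_hat = signed_slack(n)
    gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in W_hat] for r1 in W_hat]
    expected = [
        [2 ** n * (int(i == j) - int(j == (i + n) % (2 * n))) for j in range(2 * n)]
        for i in range(2 * n)
    ]
    return gram == expected

def schur_coefficient(n):
    """The scalar 1 - (gamma^2 / 2) * 2^n for gamma^2 = 1 / 2^(n-1)."""
    gamma_squared = Fraction(1, 2 ** (n - 1))
    return 1 - gamma_squared / 2 * 2 ** n

if __name__ == "__main__":
    for n in range(1, 6):
        print(n, gram_matches_identity(n), schur_coefficient(n))
```

Since the coefficient is exactly zero for $\gamma = 1/\sqrt{2^{n-1}}$, the Schur complement is the zero matrix, so this value of $\gamma$ is the largest one for which the argument goes through.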

Using this lemma, Eq. (22) shows that the matrix $W$ is feasible for the semidefinite program (3), and thus that $\nu _+^{[0]}(S(C_n)) \ge \langle S(C_n), W \rangle = \sqrt{2^{n-1}} \cdot 2n$. Hence since $\Vert S(C_n)\Vert _F = \sqrt{2^{n-1} \cdot 2n}$ we get that

$$\begin{aligned} {{\mathrm{rank}}}_+(S(C_n)) \ge \left( \frac{\nu _+^{[0]}(S(C_n))}{\Vert S(C_n)\Vert _F}\right) ^2 \ge \left( \frac{\sqrt{2^{n-1}} \cdot 2n}{\sqrt{2^{n-1} \cdot 2n}}\right) ^2 = 2n \end{aligned}$$

which completes the proof.


About this article

Cite this article

Fawzi, H., Parrilo, P.A. Lower bounds on nonnegative rank via nonnegative nuclear norms. Math. Program. 153, 41–66 (2015). https://doi.org/10.1007/s10107-014-0837-2


Received: 19 March 2013
Accepted: 27 October 2014
Published: 12 November 2014
Issue Date: October 2015
DOI: https://doi.org/10.1007/s10107-014-0837-2
