Abstract
It is shown that the relative distance in Frobenius norm of a real symmetric order-d tensor of rank-two to its best rank-one approximation is upper bounded by \(\sqrt{1-(1-1/d)^{d-1}}\). This is achieved by determining the minimal possible ratio between spectral and Frobenius norm for symmetric tensors of border rank two, which equals \(\left( 1-{1}/{d}\right) ^{(d-1)/{2}}\). These bounds are also verified for arbitrary real rank-two tensors by reducing to the symmetric case.
1 Introduction
It is a well-known fact that the minimal possible ratio between spectral and Frobenius norm of a real \(n \times n\) matrix is \(1 / \sqrt{n}\), and is achieved for any matrix with identical singular values, that is, for multiples of orthogonal matrices. Since the spectral norm of a matrix measures the length of its best rank-one approximation, this statement has the geometric meaning that orthogonal matrices achieve the largest possible relative distance to rank-one matrices. More generally, using singular value decomposition, one can show that the minimal ratio between spectral and Frobenius norm of a rank-k matrix is \(1/\sqrt{k}\) and is achieved when all nonzero singular values are equal.
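These matrix facts are easy to verify numerically; the following sketch (the matrices are our own arbitrary examples, not from the text) computes the ratio of spectral and Frobenius norm from the singular values:

```python
import numpy as np

# Ratio ||A||_sigma / ||A||_F computed from singular values: the spectral norm
# is the largest singular value, the Frobenius norm is the 2-norm of all of them.
def norm_ratio(A):
    s = np.linalg.svd(A, compute_uv=False)
    return s[0] / np.linalg.norm(s)

n = 4
# An orthogonal matrix (all singular values equal to 1) attains the minimum 1/sqrt(n).
Q, _ = np.linalg.qr(np.random.default_rng(0).standard_normal((n, n)))
ratio_orthogonal = norm_ratio(Q)                       # = 1/sqrt(4) = 0.5

# A rank-2 matrix with equal nonzero singular values attains 1/sqrt(k) = 1/sqrt(2).
rank2 = np.outer([1, 0, 0, 0], [1, 0, 0, 0]) + np.outer([0, 1, 0, 0], [0, 1, 0, 0])
ratio_rank2 = norm_ratio(rank2)
```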
There has been considerable interest in determining the minimal possible ratio between spectral norm \(\Vert A \Vert _\sigma\) and Frobenius norm \(\Vert A \Vert _F\) of an \(n_1 \times \dots \times n_d\) tensor A; see, e.g., [1, 8, 9, 11, 12, 13]. As in the matrix case, this ratio measures the distance of A to the set of rank-one tensors, and is hence of both theoretical and practical relevance in problems of low-rank approximation and entanglement. The precise relation between the spectral norm of A and its distance to rank-one tensors is as follows:
\(\min _{{{\,\mathrm{rank}\,}}B \le 1} \Vert A - B \Vert _F = \left( \Vert A \Vert _F^2 - \Vert A \Vert _\sigma ^2 \right) ^{1/2}. \qquad (1)\)
Therefore, the minimal possible ratio \(\Vert A \Vert _\sigma / \Vert A \Vert _F\) that can be achieved is also called the best rank-one approximation ratio of the given tensor space [13]. By (1), it expresses the maximum relative distance of a tensor to the set of rank-one tensors.
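In the matrix case (d = 2) this relation can be verified directly with the SVD; the sketch below (our own numerical illustration) checks it for a random matrix:

```python
import numpy as np

# By Eckart-Young, the best rank-one approximation of a matrix A is
# sigma_1 * u_1 v_1^T, and the distance of A to the rank-one matrices equals
# sqrt(||A||_F^2 - ||A||_sigma^2), i.e., relation (1) for d = 2.
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 5))
U, s, Vt = np.linalg.svd(A)
best = s[0] * np.outer(U[:, 0], Vt[0])     # best rank-one approximation
dist = np.linalg.norm(A - best)            # actual distance in Frobenius norm
rhs = np.sqrt(np.linalg.norm(A) ** 2 - s[0] ** 2)
gap = abs(dist - rhs)                      # should be at machine-precision level
```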
Despite some recent progress achieved in the aforementioned references and others, determining the best rank-one approximation ratio for tensors remains a difficult problem in general and is largely open. One reason is the lack of a suitable analog to the singular value decomposition. Moreover, the best rank-one approximation ratio of tensors usually differs over the real and complex field, as well as for nonsymmetric and symmetric tensors of the same size.
The available results in the literature focus on the best rank-one approximation ratio in the full tensor space. As for matrices, it would however also be useful to estimate its value in dependence of the tensor rank. In this work, we take a first step in this direction. We determine the minimal ratio between spectral and Frobenius norm of real rank-two tensors, and obtain that it is actually the same for symmetric and general tensors. Recall that for matrices this value equals \(1/\sqrt{2}\).
For tensors, one should also take into account that the set of tensors of rank at most two is not closed. Our main result is on symmetric tensors and reads as follows.
Theorem 1.1
Let A be a real symmetric tensor of order \(d\ge 3\) and rank at most two. Then,
\(\frac{\Vert A \Vert _\sigma }{\Vert A \Vert _F} > \left( 1-\frac{1}{d}\right) ^{(d-1)/2},\)
and this bound is sharp. In particular,
\(\min \left\{ \frac{\Vert A \Vert _\sigma }{\Vert A \Vert _F} : {{\,\mathrm{brank}\,}}A \le 2 \right\} = \left( 1-\frac{1}{d}\right) ^{(d-1)/2}, \qquad (2)\)
where \({{\,\mathrm{brank}\,}}\) denotes border rank, and the minimum is taken over real symmetric tensors. Up to orthogonal transformation and scaling, the minimum is achieved only for the tensor
\(W_d = d\, e_1^{d-1} e_2.\)
Here, \(e_1,e_2\) are two orthonormal vectors, \(u^d\) abbreviates \(u \otimes \dots \otimes u\) (d times) and \(u^{d-1}v\) denotes the symmetric part of \(u^{d-1} \otimes v\) (see below for notation).
The proof of Theorem 1.1 constitutes the main part of this work and is given in Sect. 2. The result however raises the question of whether the same bounds hold for general nonsymmetric tensors of rank two. In Sect. 3, we show that the answer is affirmative by reducing the question to the symmetric case.
Theorem 1.2
Let A be a real \(n_1 \times \dots \times n_d\) tensor of rank at most two. Then,
\(\frac{\Vert A \Vert _\sigma }{\Vert A \Vert _F} > \left( 1-\frac{1}{d}\right) ^{(d-1)/2},\)
and this bound is sharp. In particular, assuming \(n_i \ge 2\) for \(i=1,\dots ,d\),
\(\min \left\{ \frac{\Vert A \Vert _\sigma }{\Vert A \Vert _F} : {{\,\mathrm{brank}\,}}A \le 2 \right\} = \left( 1-\frac{1}{d}\right) ^{(d-1)/2},\)
where \({{\,\mathrm{brank}\,}}\) denotes border rank, and the minimum is taken over real \(n_1 \times \dots \times n_d\) tensors.
Note that while for symmetric tensors the notions of rank and symmetric rank are not the same in general [14], they coincide for rank-two tensors; see, e.g., [15].
Due to relation (1), the theorems above are equivalent to the following statement on the maximum relative distance of a real rank-two tensor to the set of rank-one tensors.
Theorem 1.3
Let A be a real tensor of order \(d\ge 3\) and rank at most two. Then,
\(\min _{{{\,\mathrm{rank}\,}}B \le 1} \frac{\Vert A - B \Vert _F}{\Vert A \Vert _F} \le \sqrt{1-\left( 1-\frac{1}{d}\right) ^{d-1}},\)
and this bound is sharp both for general as well as for symmetric tensors. Equality is achieved for the symmetric tensor \(W_d\) as above.
It is interesting to note that for \(d\rightarrow \infty\) our results imply
\(\min \left\{ \frac{\Vert A \Vert _\sigma }{\Vert A \Vert _F} : {{\,\mathrm{brank}\,}}A \le 2 \right\} = \left( 1-\frac{1}{d}\right) ^{(d-1)/2} \rightarrow \frac{1}{\sqrt{e}}\)
and
\(\max _{{{\,\mathrm{rank}\,}}A \le 2}\, \min _{{{\,\mathrm{rank}\,}}B \le 1} \frac{\Vert A - B \Vert _F}{\Vert A \Vert _F} \rightarrow \sqrt{1-\frac{1}{e}}.\)
In particular, both quantities are bounded independently of d.
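A quick numerical sanity check of these limits (our own illustration):

```python
import math

# For large d, the minimal norm ratio (1 - 1/d)^((d-1)/2) approaches e^(-1/2),
# and the maximal relative distance sqrt(1 - (1 - 1/d)^(d-1)) approaches sqrt(1 - 1/e).
d = 10 ** 6
ratio = (1 - 1 / d) ** ((d - 1) / 2)
dist = math.sqrt(1 - (1 - 1 / d) ** (d - 1))
err_ratio = abs(ratio - math.exp(-0.5))            # ~ O(1/d)
err_dist = abs(dist - math.sqrt(1 - math.exp(-1)))  # ~ O(1/d)
```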
1.1 Notation
We consider the subspace \({{\,\mathrm{Sym}\,}}_d({{\,\mathrm{\mathbb {R}}\,}}^n)\) of real symmetric \(n \times \dots \times n\) tensors \(A = [a_{i_1,\dots ,i_d}]\) of order d. It inherits the Euclidean inner product \(\langle A,B \rangle _F = \sum _{i_1,\dots ,i_d} a_{i_1 \dots i_d} b_{i_1\dots i_d}\) from the ambient space, which induces the Frobenius norm via
\(\Vert A \Vert _F = \langle A, A \rangle _F^{1/2}.\)
It will be convenient to introduce the notation
\(u^d = u \otimes \dots \otimes u \quad (d \text { times})\)
for symmetric rank-one tensors, and similarly
\(u_1^{} u_2^{} \cdots u_d^{} = \frac{1}{d!} \sum _{\pi \in S_d} u_{\pi (1)} \otimes u_{\pi (2)} \otimes \dots \otimes u_{\pi (d)}\)
for the symmetrization of a nonsymmetric rank-one tensor \(u_1 \otimes u_2 \otimes \dots \otimes u_d\). It equals the orthogonal projection of \(u_1 \otimes u_2 \otimes \dots \otimes u_d\) onto \({{\,\mathrm{Sym}\,}}_d({{\,\mathrm{\mathbb {R}}\,}}^n)\). Specifically, the notation \(u^k v^\ell\) denotes the symmetrization of the rank-one tensor \(u^{\otimes k} \otimes v^{\otimes \ell }\). For symmetric rank-one tensors \(u^d\) and \(v^d\), it holds that \(\langle u^d,v^d\rangle _F=\langle u,v\rangle ^d\) and, therefore, \(\Vert u^d\Vert _F=\Vert u\Vert ^d\).
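The identity \(\langle u^d,v^d\rangle _F=\langle u,v\rangle ^d\) is easy to confirm numerically by forming the elementary tensors explicitly (a small NumPy sketch; the vectors are arbitrary):

```python
import numpy as np

# Build u^d = u ⊗ ... ⊗ u as a d-way array by repeated outer products.
def tensor_power(w, d):
    T = np.array(1.0)
    for _ in range(d):
        T = np.tensordot(T, w, axes=0)   # axes=0 is the outer product
    return T

rng = np.random.default_rng(2)
u, v = rng.standard_normal(3), rng.standard_normal(3)
d = 4
lhs = np.sum(tensor_power(u, d) * tensor_power(v, d))  # Frobenius inner product
rhs = np.dot(u, v) ** d                                 # <u, v>^d
gap = abs(lhs - rhs)
```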
To any symmetric tensor A, one associates a homogeneous polynomial
\(p_A(x) = \langle A, x^d \rangle _F = \sum _{i_1,\dots ,i_d} a_{i_1 \dots i_d}\, x_{i_1} \cdots x_{i_d}.\)
The spectral norm of A is then defined as
\(\Vert A \Vert _\sigma = \max _{\Vert w \Vert = 1} {\left| p_A(w) \right|}.\)
Due to a result of Banach [2], this definition of spectral norm for symmetric tensors is consistent with the general one, which is given in (15). If w is a normalized maximizer of \(\frac{1}{\Vert w\Vert ^d}{\left|p_A(w) \right|}\), then \(\lambda w^d\) with \(\lambda = p_A(w)=\langle A,w^d\rangle _F\) is a best symmetric rank-one approximation of A in Frobenius norm, that is, it satisfies
\(\Vert A - \lambda w^d \Vert _F = \min _{\mu \in {{\,\mathrm{\mathbb {R}}\,}},\ \Vert x \Vert = 1} \Vert A - \mu x^d \Vert _F,\)
and vice versa.
A symmetric tensor of rank at most two takes the form
\(A = \alpha u^d - \beta v^d, \qquad \Vert u \Vert = \Vert v \Vert = 1, \qquad (3)\)
for vectors u, v and scalars \(\alpha ,\beta \ne 0\), and the rank is equal to two if and only if u and v are linearly independent. Note that the difference notation will turn out to be convenient later. Technically, this defines tensors of symmetric rank at most two. But since for rank two both notions of rank coincide [15], we can just use the word rank throughout. It is well-known that the set of tensors of rank at most two is not closed [7]. This is also true when restricting to symmetric tensors. The tensors in the closure are said to have border rank at most two, denoted as \({{\,\mathrm{brank}\,}}A \le 2\).
2 Proof of the main result
For proving Theorem 1.1, we will determine the infimum value of the optimization problem
\(F(\alpha ,\beta ,u,v) = \frac{\Vert \alpha u^d - \beta v^d \Vert _\sigma ^2}{\Vert \alpha u^d - \beta v^d \Vert _F^2} \rightarrow \min \quad \text {s.t.} \quad \Vert u \Vert = \Vert v \Vert = 1,\ \alpha ,\beta \ne 0. \qquad (4)\)
Here, we can always additionally assume that \(\langle u,v\rangle \ge 0\) and \(\alpha >0\). We will proceed in several steps. First, in Sect. 2.1, we validate that the tensor \(W_d\), which has symmetric border rank two, achieves equality in (2). Hence the infimum in (4) cannot be larger than \((1-\frac{1}{d})^{d-1}\). We next consider in Sect. 2.2 the first-order necessary optimality condition for (4) and show that it cannot be fulfilled for rank-two tensors admitting a unique symmetric best rank-one approximation (Proposition 2.1). In other words, the potential candidates for achieving the infimum in (4) are rank-two tensors with more than one symmetric best rank-one approximation. In Sect. 2.3, we therefore derive a criterion for a symmetric rank-two tensor to have a unique symmetric best rank-one approximation (Proposition 2.3), and validate by hand in Sects. 2.4 and 2.5 that for tensors which do not satisfy this criterion the value of F is strictly larger than \((1-\frac{1}{d})^{d-1}\). It then remains to show in Sect. 2.6 that among the tensors of border rank two, and up to orthogonal transformation, only the tensor \(W_d\) achieves the infimum. Taken together, these steps provide a complete proof of Theorem 1.1.
In our proofs, we will frequently assume that \(\alpha u^d - \beta v^d\in {{\,\mathrm{Sym}\,}}_d({{\,\mathrm{\mathbb {R}}\,}}^2)\) since we can always restrict to \({{\,\mathrm{Sym}\,}}_d({{\,\mathrm{span}\,}}\{u,v\})\).
2.1 The ratio for tensor \(W_d\)
Recall that \(W_d = d\, e_1^{d-1}e_2^{} = \frac{d}{dt}(e_1+te_2)^d|_{t=0}\). We have \(\Vert W_d\Vert _F^2=d\). The spectral norm is given by the following optimization problem:
\(\Vert W_d \Vert _\sigma = \max _{x^2+y^2=1} {\left| \langle W_d, (x e_1 + y e_2)^d \rangle _F \right|} = \max _{x^2+y^2=1} d\, {\left| x \right|}^{d-1} {\left| y \right|}.\)
The KKT conditions for this problem lead to the relation
\(x^{d-2} \left( (d-1) y^2 - x^2 \right) = 0,\)
that is, either \(x=0\), or \(x^2=(d-1)y^2\). We find that \(x=\sqrt{\frac{d-1}{d}}\) and \(y=\frac{1}{\sqrt{d}}\) is a maximizer with the value \(\Vert W_d\Vert _\sigma =d \left( \frac{d-1}{d}\right) ^{(d-1)/2}\frac{1}{\sqrt{d}}\), and therefore
\(\frac{\Vert W_d \Vert _\sigma }{\Vert W_d \Vert _F} = \left( \frac{d-1}{d}\right) ^{(d-1)/2} = \left( 1-\frac{1}{d}\right) ^{(d-1)/2}.\)
2.2 Optimality condition for symmetric rank-two tensors
The target function in (4) can be written as a composition
\(F = G \circ \varphi ,\)
where
\(G(A) = \frac{\Vert A \Vert _\sigma ^2}{\Vert A \Vert _F^2}\)
and
\(\varphi (\alpha ,\beta ,u,v) = \alpha u^d - \beta v^d.\)
While \(\varphi\) is smooth, the map G is not differentiable in all points. However, it is the quotient of the convex function \(A\mapsto \Vert A\Vert _\sigma ^2\) and the smooth function \(A\mapsto \Vert A\Vert _F^2\). Therefore, the rules for generalized gradients of regular functions are applicable; see [5, Section 2.3]. It follows that the subdifferential of G in a point A can be computed using a quotient rule, which yields
\(\partial G(A) = \frac{2}{\Vert A \Vert _F^4} \left( \Vert A \Vert _F^2\, \Vert A \Vert _\sigma \, \partial (\Vert A \Vert _\sigma ) - \Vert A \Vert _\sigma ^2\, A \right) . \qquad (5)\)
Here, \(\partial (\Vert A\Vert _\sigma ^{})\) denotes the subdifferential of the spectral norm in A. The derivative of \(\varphi\) equals
\(\varphi '(\alpha ,\beta ,u,v)[\delta \alpha ,\delta \beta ,\delta u,\delta v] = u^d\, \delta \alpha - v^d\, \delta \beta + d\alpha \, u^{d-1}\delta u - d\beta \, v^{d-1}\delta v,\)
which leads to
\(\partial F(\alpha ,\beta ,u,v) = \varphi '(\alpha ,\beta ,u,v)^*\, \partial G(A)\)
with \(A = \varphi (\alpha ,\beta ,u,v) = \alpha u^d - \beta v^d\). The subdifferential of the spectral norm can be characterized as
\(\partial (\Vert A \Vert _\sigma ) = {{\,\mathrm{conv}\,}}\left\{ \epsilon \, w^d : \Vert w \Vert = 1,\ \epsilon \in \{-1,1\},\ \epsilon \, p_A(w) = \Vert A \Vert _\sigma \right\} ; \qquad (6)\)
see [4, Theorem 2.1] in general, and [1, Section 2.3] in particular. In words, \(\partial (\Vert A\Vert _\sigma )\) equals the convex hull of the normalized symmetric best rank-one approximations of A.
From (5) and (6), one concludes that the first-order optimality condition \(0\in \partial F(\alpha , \beta ,u,v)\) (see, e.g., [5, Proposition 2.3.2]) for problem (4) implies that there exists X in the convex set (6) such that
\(\langle X - \lambda A,\ u^d\, \delta \alpha - v^d\, \delta \beta + d\alpha \, u^{d-1}\delta u - d\beta \, v^{d-1}\delta v \rangle _F = 0\)
for all \((\delta \alpha , \delta \beta ,\delta u, \delta v)\) and some \(\lambda \in {{\,\mathrm{\mathbb {R}}\,}}\). This is equivalent to just requiring
\(\langle X - \lambda A,\ u^{d-1}\delta u + v^{d-1}\delta v \rangle _F = 0\)
for all \(\delta u\) and \(\delta v\). Let \(P_{u,v}\) denote the orthogonal projection onto the linear subspace \(\{u^{d-1}\delta u+v^{d-1}\delta v:\delta u,\delta v\in {{\,\mathrm{\mathbb {R}}\,}}^n\}\) of \({{\,\mathrm{Sym}\,}}_d({{\,\mathrm{\mathbb {R}}\,}}^n)\). Taking into account that \(P_{u,v} A = P_{u,v} (\alpha u^d-\beta v^d) = \alpha u^d-\beta v^d\), we conclude that the optimality condition can be written as
\(P_{u,v} X = \lambda A. \qquad (7)\)
We now show that condition (7) cannot hold for tensors \(\alpha u^d -\beta v^d\) admitting a unique best symmetric rank-one approximation. This is an interesting analogy to the fact that matrices achieving a minimal ratio of spectral and Frobenius norm have equal singular values.
Proposition 2.1
Let \(A=\alpha u^d -\beta v^d\) have rank two. If A has a unique best symmetric rank-one approximation, then A is not a critical point of the optimization problem (4).
We use the following lemma that shows \(P_{u,v} w^d=au^{d-1}w+bv^{d-1}w\) for any \(w\in {{\,\mathrm{\mathbb {R}}\,}}^n\) with some \(a,b\in {{\,\mathrm{\mathbb {R}}\,}}\).
Lemma 2.2
Let \(\Vert u\Vert =\Vert v\Vert =1\). The projection \(P_{u,v}w^d\) is given by
Proof
This follows from the definition of orthogonal projection by a direct calculation. \(\square\)
Proof of Proposition 2.1
Let one of \(\pm w^d\) be the normalized best symmetric rank-one approximation of A. Since it is unique, the optimality condition becomes
\(P_{u,v}\, w^d = \lambda A. \qquad (8)\)
From \(p_A(w)=\langle A, w^d\rangle _F\ne 0\) and \(A = \alpha u^d -\beta v^d\in \{u^{d-1}\delta u+v^{d-1}\delta v:\delta u,\delta v\in {{\,\mathrm{\mathbb {R}}\,}}^n\}\), we have \(P_{u,v}w^d\ne 0\), which excludes \(\lambda = 0\). By Lemma 2.2, \(P_{u,v}w^d= au^{d-1}w+bv^{d-1} w\) for some \(a,b\in {{\,\mathrm{\mathbb {R}}\,}}\). However, since u and v are linearly independent, we have the decomposition
\(\{u^{d-1}\delta u+v^{d-1}\delta v :\delta u,\delta v\in {{\,\mathrm{\mathbb {R}}\,}}^n\} = \{u^{d-1}\delta u :\delta u\in {{\,\mathrm{\mathbb {R}}\,}}^n\} \oplus \{v^{d-1}\delta v :\delta v\in {{\,\mathrm{\mathbb {R}}\,}}^n\}\)
into two complementary subspaces. Therefore, (8) would only be possible if w is both a multiple of u and of v, which contradicts the linear independence of u and v. \(\square\)
2.3 A condition for unique symmetric best rank-one approximation
We now present a class of symmetric rank-two tensors admitting unique best symmetric rank-one approximations. By the result of Proposition 2.1, these can then be excluded from the further discussion on the minimal norm ratio.
Proposition 2.3
Let
\(A = \alpha u^d - \beta v^d\)
with \(u\ne v\), \(\Vert u\Vert =\Vert v\Vert =1\), \(\langle u,v\rangle \ge 0\) and \(\alpha> \beta >0\). Then, A has exactly one best symmetric rank-one approximation.
For the proof, we require auxiliary results. One is the following fact about polynomials.
Lemma 2.4
Let \(a, \gamma >0\) and \(b \ge 0\) and \(d\ge 2\). The equation \(x=\gamma (x-a)(x+b)^{d-1}\) has two real solutions if d is even, and three real solutions if d is odd.
Proof
Let \(p(x)=\gamma (x-a)(x+b)^{d-1}-x\). Then by the intermediate value theorem, p must have at least two real zeros, namely one in the interval \([-b,0]\) and another one in the interval \((a,\infty )\). On the other hand,
\(p'(x) = \gamma (x+b)^{d-2} \left( dx - (d-1)a + b \right) - 1\)
has at most two sign changes, one at a value larger than \(\frac{(d-1)a-b}{d}\) and another one at a value smaller than \(-b\) if d is odd. Therefore, p has at most three real zeros. The statement follows from the fact that the number of real zeros of a polynomial with real coefficients has the same parity as its degree. \(\square\)
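The root count of Lemma 2.4 can also be observed numerically (a sketch with illustrative sample values \(\gamma =1\), \(a=1\), \(b=1/2\), which are ours, not from the text):

```python
import numpy as np

# Count real solutions of x = gamma (x - a)(x + b)^(d-1) by forming the
# polynomial gamma (x - a)(x + b)^(d-1) - x and inspecting its roots.
def real_root_count(gamma, a, b, d, tol=1e-7):
    p = np.array([1.0, -a])                  # coefficients of (x - a), highest first
    for _ in range(d - 1):
        p = np.polymul(p, [1.0, b])          # multiply by (x + b)
    p = gamma * p
    p[-2] -= 1.0                             # subtract x
    roots = np.roots(p)
    return int(np.sum(np.abs(roots.imag) < tol))

even_count = real_root_count(1.0, 1.0, 0.5, 4)   # d even: two real solutions
odd_count = real_root_count(1.0, 1.0, 0.5, 5)    # d odd: three real solutions
```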
The second lemma narrows the possible locations of maximizers of the homogeneous form \({\left|p_A \right|}\).
Lemma 2.5
Under the assumptions of Proposition 2.3, let w be a maximizer of \({\left|p_A(w) \right|} = {\left|\langle \alpha u^d-\beta v^d,w^d \rangle _F \right|}\) subject to \(\Vert w\Vert =c>0\). Then, \({\left|\langle u,w\rangle \right|} \ge {\left|\langle v,w\rangle \right|}\).
Proof
Assume to the contrary that \({\left|\langle u,w\rangle \right|} < {\left|\langle v,w\rangle \right|}\) and without loss of generality \(\langle v,w\rangle >0\). Let Q be the symmetric orthogonal matrix mapping u to v and v to u (i.e., \(Q = I - 2 z z^T\) with \(z = (u-v)/\Vert u-v\Vert\)), and let \(\bar{w}=Qw\). Then, \(\langle u,w\rangle =\langle v,\bar{w}\rangle\) and \(\langle v,w\rangle =\langle u,\bar{w}\rangle\). By assumption, we then have
If \({\left|\langle \alpha u^d-\beta v^d,{w}^d \rangle _F \right|}=\langle \alpha u^d-\beta v^d,{w}^d\rangle _F\) , this yields \({\left|\langle \alpha u^d-\beta v^d,\bar{w}^d \rangle _F \right|}>{\left|\langle \alpha u^d-\beta v^d,{w}^d \rangle _F \right|}\) (by using \((\alpha +\beta )\langle v,w\rangle ^d>(\alpha +\beta )\langle u, w\rangle ^d\)) which contradicts the optimality of w. In the other case, \({\left|\langle \alpha u^d-\beta v^d,{w}^d \rangle _F \right|}= - \langle \alpha u^d-\beta v^d,{w}^d\rangle _F\), optimality implies \(\beta (\langle u,w\rangle ^d+\langle v,w\rangle ^d)>\alpha (\langle u,w\rangle ^d+\langle v,w\rangle ^d)\) which contradicts \(\alpha > \beta\). \(\square\)
We are now in the position to prove Proposition 2.3.
Proof of Proposition 2.3
We can assume that \(A \in {{\,\mathrm{Sym}\,}}_d({{\,\mathrm{\mathbb {R}}\,}}^2)\), so that \(u,v \in {{\,\mathrm{\mathbb {R}}\,}}^2\). Without loss of generality, since we can change coordinates, we can consider \(\alpha =1\), \(u=\begin{pmatrix}0\\ 1 \end{pmatrix}\) and \(\root d \of {\beta }v=\begin{pmatrix}a\\ b\end{pmatrix}\) with \(a>0\), \(b\ge 0\) (since \(\langle u,v\rangle \ge 0\)), and \(a^2+b^2<1\) (since \(\beta <\alpha =1\)). Writing \(w = \lambda \begin{pmatrix}x\\ y\end{pmatrix}\) for points on the unit circle, where \(\lambda > 0\) is a normalization constant, we then have
\(p_A(w) = \lambda ^d \left( y^d - (ax+by)^d \right) . \qquad (9)\)
Critical points on the circle are characterized by \(\langle w^{\perp }, \nabla p_A(w) \rangle = 0\) with \(w^{\perp } \propto \begin{pmatrix} -y\\ x \end{pmatrix}\), which means
\(x y^{d-1} = (bx - ay)(ax+by)^{d-1},\)
independent of \(\lambda\). Note that here \(y=0\) is not possible since both a and b are nonzero. Recall that a symmetric best rank-one approximation of A is given as \(p_A(w)w^d\), where w maximizes \({\left|p_A(w) \right|}\) on the circle. Since \(p_A(-w)=(-1)^dp_A(w)\), in order to prove the assertion it suffices to show that \({\left|p_A(w) \right|}\) has exactly one maximizer w with \(y=1\). The optimality condition at such a w reduces to
\(x = (bx - a)(ax + b)^{d-1}. \qquad (10)\)
Hence, we only need to show that there is exactly one solution x of this equation corresponding to a global maximum of \({\left|p_A \right|}\) on the unit circle.
If \(y=1\), then \(p_A\) in (9) has a zero at \(x_0=\frac{1-b}{a}\). Then,
This shows that (10) has at least one solution \(x^*>x_0\). We consider such a solution \(x^*\) such that the corresponding unit vector \(w=\lambda \begin{pmatrix} x^*\\ 1 \end{pmatrix}\) is a local maximum of \({\left|p_A \right|}\) on the unit circle. We have
By Lemma 2.5, w is not a global maximum of \({\left|p_A \right|}\). If d is even, then by Lemma 2.4, equation (10) has exactly two solutions, and therefore only one corresponds to a global maximum. If d is odd, then by the same lemma, (10) has three solutions. Taking into account that \(p_A\) in (9) has only one zero for \(y=1\), one of these solutions corresponds to a local minimizer of \({\left|p_A \right|}\). Hence, there is only one global maximizer. \(\square\)
2.4 The case \(\alpha > 0\ge \beta\)
We show that \(\frac{\Vert \alpha u^d-\beta v^d\Vert _\sigma ^2}{\Vert \alpha u^d-\beta v^d\Vert _F^2} \ge \frac{1}{2}\) if \(\langle u^d,v^d\rangle _F\ge 0\) and \(\alpha > 0 \ge \beta\). This shows that for \(d> 2\) such tensors do not attain the infimum in (4) since \(\frac{1}{2}> \left( 1-\frac{1}{d}\right) ^{d-1}\). We formulate this statement without \(\alpha\) and \(\beta\) by removing the restriction \(\Vert u \Vert = \Vert v \Vert = 1\).
Proposition 2.6
Let \(u \ne v\) and \(\langle u,v \rangle \ge 0\). Then, \(\frac{\Vert u^d+v^d\Vert ^2_\sigma }{\Vert u^d+v^d\Vert ^2_F}\ge \frac{1}{2}\).
Proof
We can assume \(\Vert u\Vert \ge \Vert v\Vert\). Using that \(\Vert u^d+v^d\Vert _\sigma \ge \langle u^d+v^d,\frac{u^d}{\Vert u\Vert ^d}\rangle _F\), we have
\(\Vert u^d + v^d \Vert _\sigma ^2 \ge \left( \Vert u \Vert ^d + \frac{\langle u,v \rangle ^d}{\Vert u \Vert ^d} \right) ^2 \ge \Vert u \Vert ^{2d} + 2 \langle u,v \rangle ^d \ge \frac{1}{2} \left( \Vert u \Vert ^{2d} + 2 \langle u,v \rangle ^d + \Vert v \Vert ^{2d} \right) = \frac{1}{2} \Vert u^d + v^d \Vert _F^2,\)
as asserted. \(\square\)
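Proposition 2.6 can be probed empirically (our own Monte-Carlo sketch; since one may restrict to \({{\,\mathrm{Sym}\,}}_3({{\,\mathrm{\mathbb {R}}\,}}^2)\), the spectral norm is approximated on a fine grid of the unit circle):

```python
import numpy as np

# For A = u^d + v^d with <u, v> >= 0, check ||A||_sigma^2 / ||A||_F^2 >= 1/2.
# Uses p_A(w) = <u,w>^d + <v,w>^d and ||A||_F^2 = ||u||^(2d) + 2<u,v>^d + ||v||^(2d).
rng = np.random.default_rng(3)
d = 3
theta = np.linspace(0, 2 * np.pi, 200001)
w = np.vstack([np.cos(theta), np.sin(theta)])   # unit vectors on a grid
min_ratio = np.inf
for _ in range(200):
    u, v = rng.standard_normal(2), rng.standard_normal(2)
    if u @ v < 0:
        v = -v                                   # enforce <u, v> >= 0
    spec2 = np.max(np.abs((u @ w) ** d + (v @ w) ** d)) ** 2
    frob2 = (u @ u) ** d + 2 * (u @ v) ** d + (v @ v) ** d
    min_ratio = min(min_ratio, spec2 / frob2)
```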
2.5 The case \(\alpha =\beta >0\)
In this section, we verify by a direct calculation that the infimum in (4) is not attained for the difference of two rank-one tensors with the same norm, i.e., when \(\alpha = \beta\) in (4).
Proposition 2.7
Let \(u\ne v\), \(\Vert u\Vert =\Vert v\Vert \ne 0\), \(\langle u,v\rangle \ge 0\) and \(d\ge 3\). Then,
\(\frac{\Vert u^d - v^d \Vert _\sigma ^2}{\Vert u^d - v^d \Vert _F^2} > \left( 1-\frac{1}{d}\right) ^{d-1}.\)
We require the following version of Jensen’s inequality.
Lemma 2.8
Let \(f:[a,b]\rightarrow {{\,\mathrm{\mathbb {R}}\,}}\) be convex and continuously differentiable. If \(a+b=a'+b'\) and \(a<a'<b'<b\), then
\(\frac{f(a)+f(b)}{2} \ge \frac{f(a')+f(b')}{2} \ge f\left( \frac{a+b}{2}\right) .\)
The inequalities are strict if f is strictly convex.
Proof
Without loss of generality let \(a=-b\) and \(a'=-b'\). Then, using substitution, we have
\(\frac{f(b)+f(-b)}{2} - \frac{f(b')+f(-b')}{2} = \frac{1}{2} \int _{b'}^{b} \left( f'(t) - f'(-t) \right) \mathrm {d}t \ge 0\)
by monotonicity of the derivative of a convex function. This shows the first of the asserted inequalities. The second inequality is just Jensen’s inequality, noting that \(\frac{a+b}{2}=\frac{a'+b'}{2}\). If f is strictly convex, then \(f'\) is strictly monotone and the inequalities are strict. \(\square\)
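A small numerical instance of Lemma 2.8 (reading the chain of inequalities as \(\tfrac{1}{2}(f(a)+f(b)) \ge \tfrac{1}{2}(f(a')+f(b')) \ge f(\tfrac{a+b}{2})\); the choice \(f(t)=t^4\) and the sample points are ours):

```python
# f(t) = t^4 is strictly convex on [0, 4]; take nested pairs with equal sums.
f = lambda t: t ** 4
a, b = 0.0, 4.0
a2, b2 = 1.0, 3.0              # a + b == a2 + b2 and a < a2 < b2 < b

outer_mean = (f(a) + f(b)) / 2  # 128.0
inner_mean = (f(a2) + f(b2)) / 2  # 41.0
midpoint = f((a + b) / 2)       # 16.0
```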
Proof of Proposition 2.7
We can assume that \(A \in {{\,\mathrm{Sym}\,}}_d({{\,\mathrm{\mathbb {R}}\,}}^2)\), so that \(u,v \in {{\,\mathrm{\mathbb {R}}\,}}^2\). After rotation and rescaling, we have \(u=\left( \begin{array}{l}1\\ t \end{array}\right)\) and \(v=\left( \begin{array}{l}1\\ -t\end{array}\right)\) with \(t\in (0,1].\) Then,
\(\Vert u^d - v^d \Vert _F^2 = 2 \left( (1+t^2)^d - (1-t^2)^d \right) . \qquad (11)\)
First, we apply the estimate
\(\Vert u^d - v^d \Vert _\sigma \ge \left\langle u^d - v^d, \frac{u^d}{\Vert u \Vert ^d} \right\rangle _F = \frac{(1+t^2)^d - (1-t^2)^d}{(1+t^2)^{d/2}},\)
which yields
\(\frac{\Vert u^d - v^d \Vert _\sigma ^2}{\Vert u^d - v^d \Vert _F^2} \ge \frac{1}{2} \left( 1 - \left( \frac{1-t^2}{1+t^2} \right) ^d \right) .\)
The right-hand side is monotonically increasing in the interval (0, 1]. For \(t=\sqrt{\frac{1}{d-1}}\) it equals
\(\frac{1}{2} \left( 1 - \left( \frac{d-2}{d} \right) ^d \right) = \frac{d^d - (d-2)^d}{2 d^d}.\)
This value is larger than \(\left( 1-\frac{1}{d}\right) ^{d-1}=\left( \frac{d-1}{d}\right) ^{d-1}\) since, using Lemma 2.8 with \(f(t)=t^{d-1}\), it holds that \(d^d-(d-2)^d> 2 d (d-1)^{d-1}\) for \(d\ge 3\). This shows that
\(\frac{\Vert u^d - v^d \Vert _\sigma ^2}{\Vert u^d - v^d \Vert _F^2} > \left( 1-\frac{1}{d}\right) ^{d-1}\)
for all \(t\in \left[ \sqrt{\frac{1}{d-1}},1\right]\). It hence remains to verify this inequality for all \(t\in \left( 0,\sqrt{\frac{1}{d-1}}\right)\), which is a little bit more involved. The starting point is another lower bound for the spectral norm, namely
Note that \(\frac{u^d-v^d}{\Vert u^d-v^d\Vert _F}\rightarrow \frac{W_d}{\Vert W_d\Vert _F}\) for \(t\rightarrow 0\). This can be seen by taking the limit of \(\frac{u^d-v^d}{t}\) and noting that \(g(t)=\Vert u^d-v^d\Vert ^2_F\) is of order \(t^2\) by (11). We therefore have
where the second and third equalities are shown in Sect. 2.1. We now claim that
which then proves the assertion. This claim is equivalent to the positivity of
Elementary manipulations give
Note that for \(t\in \left( 0,\sqrt{\frac{1}{d-1}}\right)\) we have \(b> b'> a' > a\) and
Therefore with \(f(t)=(d-1) t^{d-2}\), we can rewrite (12) as
Moreover,
and therefore \(a'':= \frac{a+b -(b'-a')}{2}>a'>a\) and \(b>b'':= \frac{a+b+(b'-a')}{2}>b'\). Since \({a''+b''}={a+b}\) and \(a''-b''=a'-b'\), Lemma 2.8 yields
where the second inequality follows from monotonicity of f. This shows that (12) is positive. \(\square\)
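The elementary inequality \(d^d-(d-2)^d> 2 d (d-1)^{d-1}\) invoked in the proof above can be confirmed in exact integer arithmetic (our own quick check):

```python
# Verify d^d - (d-2)^d > 2 d (d-1)^(d-1) for a range of d >= 3.
# Python integers are exact, so no rounding issues arise.
checks = [d ** d - (d - 2) ** d > 2 * d * (d - 1) ** (d - 1) for d in range(3, 50)]
all_hold = all(checks)
```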
2.6 Tensors of border rank two
We now consider tensors lying on the boundary of the set of symmetric rank-two tensors.
Proposition 2.9
Let A be a limit of symmetric rank-two tensors and \({{\,\mathrm{rank}\,}}A>2\). Then,
\(\frac{\Vert A \Vert _\sigma }{\Vert A \Vert _F} \ge \left( 1-\frac{1}{d}\right) ^{(d-1)/2},\)
and equality is attained if and only if \(A=u^{d-1}v\) for some orthogonal u and v, that is, for tensors arising from scaling and orthogonal transformations of the tensor \(W_d\).
The boundary of rank-two tensors is well-studied. We require the following well-known parametrization, see, e.g., [3]. We offer a self-contained proof for completeness.
Lemma 2.10
Let A be a limit of symmetric rank-two tensors and \({{\,\mathrm{rank}\,}}A>2\). Then, A is of the form
\(A = a\, u^d + b\, d\, u^{d-1} v, \qquad b \ne 0,\)
with \(\langle u,v\rangle =0\) and \(\Vert u\Vert =\Vert v\Vert =1\).
Proof
Let \(A_n=u_n^d\pm v_n^d\) with \(\lim _{n\rightarrow \infty }A_n=A\) or \(\lim _{n\rightarrow \infty }A_n=-A\). It is not difficult to see that \(u_n\) and \(v_n\) must be unbounded since otherwise there is a subsequence of \(A_n\) converging to a tensor of rank at most two, contradicting \({{\,\mathrm{rank}\,}}A>2\). We write \(v_n=s_n u_n+t_n w_n\) with \(\Vert w_n\Vert =1\) and \(\langle u_n, w_n\rangle =0\). Then,
\(A_n = (1 \pm s_n^d)\, u_n^d \pm \sum _{k=1}^{d} \left( {\begin{array}{c}d\\ k\end{array}}\right) s_n^{d-k} t_n^k\, u_n^{d-k} w_n^k,\)
and it can be checked that all terms are pairwise orthogonal. Hence, since \(A_n\) converges, all terms must be bounded and by passing to a subsequence we can assume that all of them converge. Due to \(\Vert u_n\Vert \rightarrow \infty\) we have \(1\pm s_n^d\rightarrow 0\) for the first term, which implies that the sequence \(s_n\) is bounded. Therefore, considering the term \(k=1\), the sequence \(t_n\Vert u_n\Vert ^{d-1}\) is bounded which automatically implies \(t_n^k\Vert u_n\Vert ^{d-k}\rightarrow 0\) for all \(k>1\). We conclude that
\(A = \lim _{n\rightarrow \infty } \left( (1 \pm s_n^d)\, u_n^d \pm d\, s_n^{d-1} t_n\, u_n^{d-1} w_n \right) = a\, u^d + b\, d\, u^{d-1} v,\)
which proves the assertion. \(\square\)
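The limit process behind border rank two can be illustrated numerically: for orthonormal u, v, the rank-two difference quotients \(((u+tv)^d - u^d)/t\) converge linearly in t to \(d\, u^{d-1}v\), written out as a sum over slots (a NumPy sketch, our own construction):

```python
import numpy as np

# Build a rank-one tensor from a list of factor vectors.
def rank_one(vectors):
    T = np.array(1.0)
    for w in vectors:
        T = np.tensordot(T, w, axes=0)
    return T

d = 4
u, v = np.array([1.0, 0.0]), np.array([0.0, 1.0])

# W = d u^(d-1) v: sum over all slots of u ⊗ ... ⊗ v ⊗ ... ⊗ u.
W = sum(rank_one([v if i == k else u for i in range(d)]) for k in range(d))

def diff_quotient(t):
    return (rank_one([u + t * v] * d) - rank_one([u] * d)) / t

err_coarse = np.linalg.norm((diff_quotient(1e-1) - W).ravel())
err_fine = np.linalg.norm((diff_quotient(1e-2) - W).ravel())
norm2 = np.sum(W * W)   # squared Frobenius norm, equals d for orthonormal u, v
```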
Proof of Proposition 2.9
Using Lemma 2.10, scaling and orthogonal transformations, we can assume \(A=a e_1 ^d+bd e_1^{d-1}e_2 \in {{\,\mathrm{Sym}\,}}_d({{\,\mathrm{\mathbb {R}}\,}}^2)\) with \(a,b\ge 0\). Then, \(\Vert A\Vert _F^2=a^2+b^2d\) since the tensors \(e_1^d\) and \(e_1^{d-1}e_2\) are orthogonal and \(\Vert de_1^{d-1}e_2\Vert _F^2=d\). We have the following two lower bounds for the spectral norm:
\(\Vert A \Vert _\sigma \ge p_A\!\left( \sqrt{\tfrac{d-1}{d}}\, e_1 + \tfrac{1}{\sqrt{d}}\, e_2 \right) = \left( \tfrac{d-1}{d}\right) ^{(d-1)/2} \left( a \sqrt{\tfrac{d-1}{d}} + b \sqrt{d} \right) \qquad (13)\)
and
\(\Vert A \Vert _\sigma \ge p_A(e_1) = a. \qquad (14)\)
We can restrict to tensors A with Frobenius norm \(\Vert A\Vert ^2_F=a^2+b^2d=1\) and need to show that
\(\Vert A \Vert _\sigma > \left( 1-\tfrac{1}{d}\right) ^{(d-1)/2}\)
whenever \(a>0\). The first lower bound (13) implies that this is true whenever \(b> \frac{\sqrt{d}-a\sqrt{d-1}}{d}\). Together with \(1=a^2+b^2d\) and \(a,b\ge 0\) this verifies the claim for \(0< a< \frac{2\sqrt{d(d-1)}}{2d-1}\). If \(a\ge \frac{2\sqrt{d(d-1)}}{2d-1}\), then the second lower bound (14) yields the desired estimate
\(\Vert A \Vert _\sigma \ge a \ge \frac{2\sqrt{d(d-1)}}{2d-1} > \left( 1-\frac{1}{d}\right) ^{(d-1)/2}\)
for \(d\ge 3\). \(\square\)
This concludes the proof of Theorem 1.1.
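As an empirical complement to Theorem 1.1, random symmetric rank-two tensors in \({{\,\mathrm{Sym}\,}}_3({{\,\mathrm{\mathbb {R}}\,}}^2)\) can be sampled and their norm ratio compared against the bound \((1-1/3)^{1} = 2/3\) (our own Monte-Carlo sketch; the spectral norm is evaluated on a circle grid):

```python
import numpy as np

# Sample A = alpha u^3 - beta v^3 with unit u, v and check
# ||A||_sigma / ||A||_F > (1 - 1/d)^((d-1)/2) for d = 3.
rng = np.random.default_rng(4)
d = 3
theta = np.linspace(0, 2 * np.pi, 200001)
w = np.vstack([np.cos(theta), np.sin(theta)])
bound = (1 - 1 / d) ** ((d - 1) / 2)   # = 2/3
min_ratio = np.inf
for _ in range(200):
    u, v = rng.standard_normal(2), rng.standard_normal(2)
    u, v = u / np.linalg.norm(u), v / np.linalg.norm(v)
    alpha, beta = rng.standard_normal(2)
    spec = np.max(np.abs(alpha * (u @ w) ** d - beta * (v @ w) ** d))
    frob = np.sqrt(alpha ** 2 - 2 * alpha * beta * (u @ v) ** d + beta ** 2)
    min_ratio = min(min_ratio, spec / frob)
```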
3 Approximation ratio for nonsymmetric rank-two tensors
Recall that the spectral norm for general \(n_1 \times \dots \times n_d\) tensors is defined as
\(\Vert A \Vert _\sigma = \max \left\{ {\left| \langle A, w_1 \otimes \dots \otimes w_d \rangle _F \right|} : \Vert w_1 \Vert = \dots = \Vert w_d \Vert = 1 \right\} . \qquad (15)\)
The result for symmetric tensors raises the question whether the inequality
\(\frac{\Vert A \Vert _\sigma }{\Vert A \Vert _F} > \left( 1-\frac{1}{d}\right) ^{(d-1)/2}\)
is also true for general real tensors of order \(d \ge 3\) and rank at most two. As stated in Theorem 1.2, the answer is indeed affirmative and a consequence of the following interesting fact.
is also true for general real tensors of order \(d \ge 3\) and rank at most two. As stated in Theorem 1.2, the answer is indeed affirmative and a consequence of the following interesting fact.
Proposition 3.1
Let A be a real \(n_1 \times \dots \times n_d\) tensor of rank at most two. Then there is a symmetric rank-two tensor \(A_\mathsf S\in {{\,\mathrm{Sym}\,}}_d ({{\,\mathrm{\mathbb {R}}\,}}^2)\) with \(\Vert A\Vert _F=\Vert A_{\mathsf {S}}\Vert _F\) and \(\Vert A\Vert _\sigma \ge \Vert A_{\mathsf {S}}\Vert _\sigma\).
For the proof, we will require two lemmas. The first is on the behavior of successively taking geometric means of positive real numbers, and the second on the relation of Frobenius and spectral norm of two particular \(2\times 2\) matrices.
Lemma 3.2
Let \(x,z\ge 0\), \(k>0\), and define the sequence
Then, \(\lim _{\ell \rightarrow \infty } y_\ell =\left( x^k z\right) ^{\frac{1}{k+1}}\).
Proof
We may assume \(x,z> 0\), otherwise the result follows immediately. We show via induction that
The cases \(\ell =0\) and \(\ell =1\) follow directly. Now let (16) be true for \(1,\ldots ,\ell +1\). Then,
proving (16). Taking the limit \(\ell \rightarrow \infty\) gives the result. \(\square\)
Lemma 3.3
Let \(a,b \in {{\,\mathrm{\mathbb {R}}\,}}\) and \(0\le x_1,x_2\le 1\). Define the matrices
\(S = a\, e_1^{} e_1^T + b\, v v^T \quad \text {with} \quad v = \begin{pmatrix} \sqrt{x_1 x_2} \\ \sqrt{1-x_1 x_2} \end{pmatrix}, \qquad T = a\, e_1^{} e_1^T + b\, v_1^{} v_2^T \quad \text {with} \quad v_i = \begin{pmatrix} x_i \\ \sqrt{1-x_i^2} \end{pmatrix}.\)
Then, \(\Vert S\Vert _F=\Vert T\Vert _F\) and \(\Vert S\Vert _\sigma \le \Vert T\Vert _\sigma\).
Proof
A direct calculation shows that \(\Vert S\Vert _F=\Vert T\Vert _F\). The singular values of \(2\times 2\) matrices are given by \(\sigma _{1,2}^2={F^2}/{2}\pm \sqrt{{F^4}/{4}-{\left|D \right|}^2}\), where F is the Frobenius norm and D is the determinant of the matrix. We have
\({\left| \det S \right|} = {\left| ab \right|} \left( 1-x_1x_2 \right) \quad \text {and} \quad {\left| \det T \right|} = {\left| ab \right|} \sqrt{(1-x_1^2)(1-x_2^2)}.\)
Since \(2x_1x_2\le x_1^2+x_2^2\) implies \({\left|\det T \right|}^2 \le {\left|\det S \right|}^2\), the largest singular value of T, which equals its spectral norm, is larger than or equal to the largest singular value of S. \(\square\)
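Lemma 3.3 lends itself to a numerical check; the sketch below assumes the reading \(S = a\,e_1e_1^T + b\,vv^T\) with \(v = (\sqrt{x_1x_2}, \sqrt{1-x_1x_2})^T\) and \(T = a\,e_1e_1^T + b\,v_1v_2^T\) with \(v_i = (x_i, \sqrt{1-x_i^2})^T\) (our reading, consistent with the determinant comparison in the proof):

```python
import numpy as np

# Random check that ||S||_F == ||T||_F and ||S||_sigma <= ||T||_sigma.
rng = np.random.default_rng(5)
E11 = np.outer([1.0, 0.0], [1.0, 0.0])
max_frob_gap, sigma_ok = 0.0, True
for _ in range(500):
    a, b = rng.standard_normal(2)
    x1, x2 = rng.uniform(0, 1, 2)
    v1 = np.array([x1, np.sqrt(1 - x1 ** 2)])
    v2 = np.array([x2, np.sqrt(1 - x2 ** 2)])
    v = np.array([np.sqrt(x1 * x2), np.sqrt(1 - x1 * x2)])
    S = a * E11 + b * np.outer(v, v)
    T = a * E11 + b * np.outer(v1, v2)
    max_frob_gap = max(max_frob_gap, abs(np.linalg.norm(S) - np.linalg.norm(T)))
    # ord=2 is the largest singular value, i.e., the spectral norm
    sigma_ok = sigma_ok and np.linalg.norm(S, 2) <= np.linalg.norm(T, 2) + 1e-10
```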
Proof of Proposition 3.1
Write \(A=\alpha U + \beta V\) where \(U=u_1 \otimes \dots \otimes u_d\) and \(V=v_1 \otimes \dots \otimes v_d\) with \(\Vert u_i\Vert =\Vert v_i\Vert =1\). Then, \(\Vert A\Vert _F^2=\alpha ^2 + 2\alpha \beta \langle U, V\rangle _F+\beta ^2\). We may assume that \(u_i,v_i\in {{\,\mathrm{\mathbb {R}}\,}}^2\) and after an orthogonal change of bases and possibly changing sign of \(\beta\), we may also assume that
\(u_i = e_1 \quad \text {and} \quad v_i = x_i e_1 + \sqrt{1-x_i^2}\, e_2 \quad \text {with} \quad x_i = \langle u_i, v_i \rangle \in [0,1].\)
Our goal is to show that replacing any k factors \(v_{i_1},\ldots ,v_{i_k}\) of V with the same unit norm vector v defined by
\(v = y\, e_1 + \sqrt{1-y^2}\, e_2, \qquad y = \left( x_{i_1} \cdots x_{i_k} \right) ^{1/k},\)
leads to a tensor with the same Frobenius norm but smaller spectral norm. Since Frobenius and spectral norm are invariant under permutation of tensor factors, it suffices to prove this for the case that the first k vectors \(v_1,\dots ,v_k\) are replaced in this way. The resulting tensor is denoted by \(A_k=\alpha U+\beta V_k\) with \(V_k= v \otimes \dots \otimes v \otimes v_{k+1} \otimes \dots \otimes v_d\) and since
\(\langle U, V_k \rangle _F = y^k \prod _{i=k+1}^{d} x_i = \prod _{i=1}^{d} x_i = \langle U, V \rangle _F,\)
the Frobenius norms of A and \(A_k\) indeed coincide. In the remainder of the proof, we show by induction that the spectral norm does not increase with k, i.e., \(\Vert A_{k+1}\Vert _\sigma \le \Vert A_{k}\Vert _\sigma \le \Vert A\Vert _\sigma\). For \(k=d\), this provides a symmetric tensor with the desired properties.
We start with \(k=2\). Let \(w_1,\ldots , w_d\) be the maximizers in
\(\Vert A_2 \Vert _\sigma = \max \left\{ {\left| \langle A_2, w_1 \otimes \dots \otimes w_d \rangle _F \right|} : \Vert w_1 \Vert = \dots = \Vert w_d \Vert = 1 \right\} .\)
Let \(a=\alpha \prod _{i=3}^d\langle u_i,w_i \rangle\), \(b=\beta \prod _{i=3}^d\langle v_i,w_i \rangle\), and consider the matrices
\(S = a\, e_1^{} e_1^T + b\, v v^T\)
and
\(T = a\, e_1^{} e_1^T + b\, v_1^{} v_2^T.\)
They represent the bilinear forms
in \({\tilde{w}}_1\) and \({\tilde{w}}_2\). Clearly,
\(\Vert A_2 \Vert _\sigma \le \Vert S \Vert _\sigma\)
and
\(\Vert T \Vert _\sigma \le \Vert A \Vert _\sigma .\)
Lemma 3.3 implies \(\Vert S\Vert _\sigma \le \Vert T\Vert _\sigma\) and therefore \(\Vert A_2\Vert _\sigma \le \Vert A\Vert _\sigma\).
For the induction step, let \(2 \le k < d\) and assume that replacing any k factors of V in the described manner always results in a tensor with a smaller or equal spectral norm. Note that here V was in principle arbitrary. Starting from the given V, we now construct a sequence \({\widetilde{V}}_0, {\widetilde{V}}_1, \dots\) of rank-one tensors in which the first k factors and then the second to \((k+1)\)-st factors are successively replaced:
and so on (the term in brackets disappears when \(k=d-1\)). By induction hypothesis, the corresponding sequence \(B_\ell =\alpha U+\beta \widetilde{V}_\ell\) of tensors has nonincreasing spectral norm and in particular \(\Vert B_\ell \Vert _\sigma \le \Vert B_0 \Vert _\sigma = \Vert A_k \Vert _\sigma \le \Vert A \Vert _\sigma\). We claim the \(B_\ell\) converge to \(A_{k+1}\), which proves \(\Vert A_{k+1} \Vert _\sigma \le \Vert A_k \Vert _\sigma \le \Vert A \Vert _\sigma\) as desired. Indeed, the unit norm vectors
above are constructed according to
By Lemma 3.2, this sequence converges to
that is, the \({\tilde{V}}_\ell\) converge to \(V_{k+1}\) and hence the \(B_\ell\) converge to \(A_{k+1}\). This concludes the proof. \(\square\)
Based on Proposition 3.1, we obtain a proof of Theorem 1.2 for general real rank-two tensors directly from Theorem 1.1.
Proof of Theorem 1.2
Again, since A has rank at most two, it suffices to prove the statement for general (i.e., nonsymmetric) \(2 \times \dots \times 2\) tensors. Obviously, the minimal ratio \(\Vert A \Vert _\sigma / \Vert A \Vert _F\) over general \(2 \times \dots \times 2\) tensors is smaller or equal than the minimum over symmetric ones. However, by Proposition 3.1, the converse is also true. The result hence follows from Theorem 1.1. \(\square\)
Theorem 1.2 suggests an interesting relation between results in [6, 10]. The authors in [6] found that the minimal possible ratio of spectral and Frobenius norm among all tensors in \({{\,\mathrm{\mathbb {C}}\,}}^2\otimes {{\,\mathrm{\mathbb {C}}\,}}^2\otimes {{\,\mathrm{\mathbb {C}}\,}}^2\) is \(\frac{2}{3}\), while in [10], it is shown that the minimal ratio for tensors in \({{\,\mathrm{\mathbb {R}}\,}}^2\otimes {{\,\mathrm{\mathbb {R}}\,}}^2\otimes {{\,\mathrm{\mathbb {R}}\,}}^2\) is only \(\frac{1}{2}\). However, Theorem 1.2 states that border rank-two tensors in \({{\,\mathrm{\mathbb {R}}\,}}^{2\times 2\times 2}\) have the minimal ratio \(\frac{2}{3}\). This might be related to the fact that tensors of real rank two and three both have positive volume in \({{\,\mathrm{\mathbb {R}}\,}}^2\otimes {{\,\mathrm{\mathbb {R}}\,}}^2\otimes {{\,\mathrm{\mathbb {R}}\,}}^2\), while almost all tensors in \({{\,\mathrm{\mathbb {C}}\,}}^2\otimes {{\,\mathrm{\mathbb {C}}\,}}^2\otimes {{\,\mathrm{\mathbb {C}}\,}}^2\) have complex rank two.
References
Agrachev, A., Kozhasov, K., Uschmajew, A.: Chebyshev polynomials and best rank-one approximation ratio. SIAM J. Matrix Anal. Appl. 41(1), 308–331 (2020)
Banach, S.: Über homogene Polynome in (\(L^{2}\)). Studia Math. 7, 36–44 (1938)
Buczyński, J., Landsberg, J.M.: On the third secant variety. J. Algebraic Combin. 40(2), 475–502 (2014)
Clarke, F.H.: Generalized gradients and applications. Trans. Am. Math. Soc. 205, 247–262 (1975)
Clarke, F.H.: Optimization and Nonsmooth Analysis, 2nd edn. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA (1990)
Cobos, F., Kühn, T., Peetre, J.: Extreme points of the complex binary trilinear ball. Studia Math. 138(1), 81–92 (2000)
de Silva, V., Lim, L.-H.: Tensor rank and the ill-posedness of the best low-rank approximation problem. SIAM J. Matrix Anal. Appl. 30(3), 1084–1127 (2008)
Derksen, H., Friedland, S., Lim, L.-H., Wang, L.: Theoretical and computational aspects of entanglement. arXiv:1705.07160 (2017)
Kong, X., Meng, D.: The bounds for the best rank-1 approximation ratio of a finite dimensional tensor space. Pac. J. Optim. 11(2), 323–337 (2015)
Kühn, T., Peetre, J.: Embedding constants of trilinear Schatten-von Neumann classes. Proc. Estonian Acad. Sci. Phys. Math. 55(3), 174–181 (2006)
Li, Z., Nakatsukasa, Y., Soma, T., Uschmajew, A.: On orthogonal tensors and best rank-one approximation ratio. SIAM J. Matrix Anal. Appl. 39(1), 400–425 (2018)
Li, Z., Zhao, Y.-B.: On norm compression inequalities for partitioned block tensors. Calcolo 57(1), 27 (2020)
Qi, L.: The best rank-one approximation ratio of a tensor space. SIAM J. Matrix Anal. Appl. 32(2), 430–442 (2011)
Shitov, Y.: A counterexample to Comon’s conjecture. SIAM J. Appl. Algebra Geom. 2(3), 428–443 (2018)
Zhang, X., Huang, Z.-H., Qi, L.: Comon’s conjecture, rank decomposition, and symmetric rank decomposition of symmetric tensors. SIAM J. Matrix Anal. Appl. 37(4), 1719–1728 (2016)
Funding
Open Access funding enabled and organized by Projekt DEAL.
Eisenmann, H., Uschmajew, A. Maximum relative distance between real rank-two and rank-one tensors. Annali di Matematica 202, 993–1009 (2023). https://doi.org/10.1007/s10231-022-01268-w