Maximum relative distance between real rank-two and rank-one tensors

It is shown that the relative distance in Frobenius norm of a real symmetric order-d tensor of rank two to its best rank-one approximation is upper bounded by √(1 − (1 − 1/d)^{d−1}). This is achieved by determining the minimal possible ratio between spectral and Frobenius norm for symmetric tensors of border rank two, which equals (1 − 1/d)^{(d−1)/2}. These bounds are also verified for arbitrary real rank-two tensors by reducing to the symmetric case.


Introduction
It is a well-known fact that the minimal possible ratio between spectral and Frobenius norm of a real n × n matrix is 1/√n, and is achieved for any matrix with identical singular values, that is, for multiples of orthogonal matrices. Since the spectral norm of a matrix measures the length of its best rank-one approximation, this statement has the geometric meaning that orthogonal matrices achieve the largest possible relative distance to rank-one matrices. More generally, using the singular value decomposition, one can show that the minimal ratio between spectral and Frobenius norm of a rank-k matrix is 1/√k and is achieved when all nonzero singular values are equal. There has been considerable interest in determining the minimal possible ratio between spectral norm ‖A‖_σ and Frobenius norm ‖A‖_F of an n_1 × ··· × n_d tensor A; see, e.g., [13,9,8,11,12,1]. As in the matrix case, this ratio measures the distance of A to the set of rank-one tensors, and is hence of both theoretical and practical relevance in problems of low-rank approximation and entanglement. The precise relation between the spectral norm of A and its distance to rank-one tensors is as follows:

min_{rank B ≤ 1} ‖A − B‖_F² = ‖A‖_F² − ‖A‖_σ².   (1)

Therefore, the minimal possible ratio ‖A‖_σ/‖A‖_F that can be achieved is also called the best rank-one approximation ratio of the given tensor space [13]. By (1), it expresses the maximum relative distance of a tensor to the set of rank-one tensors. Despite some recent progress achieved in the aforementioned references and others, determining the best rank-one approximation ratio for tensors remains a difficult problem in general and is largely open. One reason is the lack of a suitable analog of the singular value decomposition. Moreover, the best rank-one approximation ratio of tensors usually differs over the real and complex field, as well as for nonsymmetric and symmetric tensors of the same size.
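In the matrix case (d = 2) relation (1) can be checked directly via the singular value decomposition. The following small NumPy sketch (the variable names are ours, not from the paper) verifies that the squared distance to the best rank-one approximation equals ‖A‖_F² − ‖A‖_σ²:

```python
import numpy as np

# Check relation (1) for matrices: the best rank-one approximation of A
# is sigma_1 * u_1 v_1^T, and
#   min_{rank B <= 1} ||A - B||_F^2 = ||A||_F^2 - ||A||_sigma^2,
# where ||A||_sigma = sigma_1 is the largest singular value.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 7))

U, s, Vt = np.linalg.svd(A)
B = s[0] * np.outer(U[:, 0], Vt[0, :])  # best rank-one approximation

lhs = np.linalg.norm(A - B, "fro") ** 2
rhs = np.linalg.norm(A, "fro") ** 2 - s[0] ** 2
assert abs(lhs - rhs) < 1e-10
```

For higher-order tensors no analogous decomposition is available, which is precisely the difficulty discussed below.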
The available results in the literature focus on the best rank-one approximation ratio in the full tensor space. As for matrices, it would however also be useful to estimate its value in dependence of the tensor rank. In this work we take a first step in this direction. We determine the minimal ratio between spectral and Frobenius norm of real rank-two tensors, and obtain that it is actually the same for symmetric and general tensors. Recall that for matrices this value equals 1/√2. For tensors one should also take into account that the set of tensors of rank at most two is not closed. Our main result is on symmetric tensors and reads as follows.
Theorem 1.1. Let A be a real symmetric tensor of order d ≥ 3 and rank at most two. Then

‖A‖_σ ≥ (1 − 1/d)^{(d−1)/2} ‖A‖_F,   (2)

and this bound is sharp. In particular,

min { ‖A‖_σ / ‖A‖_F : brank A ≤ 2 } = (1 − 1/d)^{(d−1)/2},

where brank denotes border rank, and the minimum is taken over real symmetric tensors. Up to orthogonal transformation and scaling, the minimum is achieved only for the tensor

W_d = √d · e_1^{d−1} e_2.

Here e_1, e_2 are two orthogonal unit vectors, and e_1^{d−1} e_2 denotes the symmetrization of e_1^{⊗(d−1)} ⊗ e_2.

The proof of Theorem 1.1 constitutes the main part of this work and will be given in section 2. The result however raises the question whether the same bounds hold for general nonsymmetric tensors of rank two. In section 3 we show that the answer is affirmative by reducing the question to the symmetric case.

Theorem 1.2. Let A be a real n_1 × ··· × n_d tensor of rank at most two. Then the bound (2) holds as well, and it is sharp. In particular, assuming n_i ≥ 2 for i = 1, …, d,

min { ‖A‖_σ / ‖A‖_F : brank A ≤ 2 } = (1 − 1/d)^{(d−1)/2},

where brank denotes border rank, and the minimum is taken over real n_1 × ··· × n_d tensors.

Note that while for symmetric tensors the notions of rank and symmetric rank are not the same in general [14], they coincide for rank-two tensors, see, e.g., [15].
Due to the relation (1) the theorems above are equivalent to the following statement on the maximum relative distance of a real rank-two tensor to the set of rank-one tensors.
Theorem 1.3. Let A be a real tensor of order d ≥ 3 and rank at most two. Then

min_{rank B ≤ 1} ‖A − B‖_F ≤ √(1 − (1 − 1/d)^{d−1}) ‖A‖_F,

and this bound is sharp both for general as well as for symmetric tensors. Equality is achieved for the symmetric tensor W_d as above.
It is interesting to note that for d → ∞ our results imply

min { ‖A‖_σ / ‖A‖_F : brank A ≤ 2 } → e^{−1/2} ≈ 0.6065,

with the corresponding limit √(1 − e^{−1}) ≈ 0.7951 for the maximum relative distance to rank-one tensors. In particular, both quantities are bounded independently of d.
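The two limits are easy to sanity-check numerically; the sketch below (plain Python, the function names are ours) evaluates the minimal ratio (1 − 1/d)^{(d−1)/2} and the maximal relative distance √(1 − (1 − 1/d)^{d−1}) for large d, and also confirms that the two quantities are complementary in the sense of relation (1):

```python
import math

def min_ratio(d):
    # minimal ratio ||A||_sigma / ||A||_F over real border rank-two tensors
    return (1 - 1 / d) ** ((d - 1) / 2)

def max_rel_distance(d):
    # maximal relative distance to the set of rank-one tensors
    return math.sqrt(1 - (1 - 1 / d) ** (d - 1))

# By relation (1): ratio^2 + distance^2 = 1
assert abs(min_ratio(7) ** 2 + max_rel_distance(7) ** 2 - 1) < 1e-12

# For d -> infinity they tend to e^{-1/2} and sqrt(1 - 1/e), respectively:
assert abs(min_ratio(10**6) - math.exp(-0.5)) < 1e-6
assert abs(max_rel_distance(10**6) - math.sqrt(1 - math.exp(-1))) < 1e-6
```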

Notation
We consider the subspace Sym^d(R^n) of symmetric tensors A = [a_{i_1…i_d}] of order d. It inherits the Euclidean inner product ⟨A, B⟩_F = Σ_{i_1,…,i_d} a_{i_1…i_d} b_{i_1…i_d} from the ambient space of n × ··· × n tensors, which induces the Frobenius norm ‖A‖_F. It will be convenient to introduce the notation u^d = u ⊗ ··· ⊗ u for symmetric rank-one tensors, and similarly for the symmetrization of a nonsymmetric rank-one tensor: specifically, u^k v denotes the symmetrization of the rank-one tensor u^{⊗k} ⊗ v^{⊗(d−k)}. For symmetric rank-one tensors u^d and v^d it holds that ⟨u^d, v^d⟩_F = ⟨u, v⟩^d and, therefore, ‖u^d‖_F = ‖u‖^d. To any symmetric tensor A one associates the homogeneous polynomial p_A(w) = ⟨A, w^d⟩_F. The spectral norm of A is then defined as ‖A‖_σ = max_{‖w‖=1} |p_A(w)|. Due to a result of Banach [2], this definition of spectral norm for symmetric tensors is consistent with the general one, which is given in (15) further below. If w is a normalized maximizer of |p_A|, then p_A(w) w^d is a best symmetric rank-one approximation of A, and vice versa.
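The identities ⟨u^d, v^d⟩_F = ⟨u, v⟩^d and ‖u^d‖_F = ‖u‖^d are easy to verify numerically; here is a small NumPy sketch for d = 3 (the helper name is ours):

```python
import numpy as np

def sym_rank_one(v, d):
    """The symmetric rank-one tensor v^d = v (x) ... (x) v (d factors)."""
    T = v
    for _ in range(d - 1):
        T = np.multiply.outer(T, v)
    return T

rng = np.random.default_rng(1)
u, v, d = rng.standard_normal(4), rng.standard_normal(4), 3
A, B = sym_rank_one(u, d), sym_rank_one(v, d)

# <u^d, v^d>_F = <u, v>^d, and therefore ||u^d||_F = ||u||^d
assert abs(np.vdot(A, B) - np.dot(u, v) ** d) < 1e-9
assert abs(np.linalg.norm(A) - np.linalg.norm(u) ** d) < 1e-9
```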
A symmetric tensor of rank at most two takes the form A = αu^d − βv^d for vectors u, v and scalars α, β ≠ 0, and the rank is equal to two if and only if u and v are linearly independent. Note that the difference notation will turn out to be convenient later. Technically, this defines tensors of symmetric rank at most two. But since for rank two both notions of rank coincide [15], we can just use the word rank throughout. It is well known that the set of tensors of rank at most two is not closed [7]. This is also true when restricting to symmetric tensors. The tensors in the closure are said to have border rank at most two, denoted as brank A ≤ 2.

Proof of the main result
For proving Theorem 1.1 we will determine the infimum value of the optimization problem

inf F(α, β, u, v) = ‖αu^d − βv^d‖_σ² / ‖αu^d − βv^d‖_F².   (4)

Here we can always additionally assume that ⟨u, v⟩ ≥ 0 and α > 0. We will proceed in several steps. First, in section 2.1, we validate that the tensor W_d, which has symmetric border rank two, achieves equality in (2). Hence the infimum in (4) cannot be larger than (1 − 1/d)^{d−1}. We next consider in section 2.2 the first-order necessary optimality condition for (4) and show that it cannot be fulfilled for rank-two tensors admitting a unique symmetric best rank-one approximation (Proposition 2.1). In other words, the potential candidates for achieving the infimum in (4) are rank-two tensors with more than one symmetric best rank-one approximation. In section 2.3 we therefore derive a criterion for a symmetric rank-two tensor to have a unique symmetric best rank-one approximation (Proposition 2.3), and validate by hand in sections 2.4 and 2.5 that for tensors which do not satisfy this criterion the value of F is strictly larger than (1 − 1/d)^{d−1}. It then remains to show in section 2.6 that among the tensors of border rank two, and up to orthogonal transformation, only the tensor W_d achieves the infimum. Taken together, these steps provide a complete proof of Theorem 1.1.
In our proofs we will frequently assume that αu^d − βv^d ∈ Sym^d(R²), since we can always restrict to Sym^d(span{u, v}).

The ratio for the tensor W_d
The spectral norm of W_d = √d e_1^{d−1} e_2 is given by the following optimization problem:

‖W_d‖_σ = max { √d x^{d−1} y : x² + y² = 1 }.

The KKT conditions for this problem lead to the relation x² = (d − 1) y², hence y² = 1/d, and therefore

‖W_d‖_σ = √d (1 − 1/d)^{(d−1)/2} d^{−1/2} = (1 − 1/d)^{(d−1)/2}.

Since ‖W_d‖_F = 1, this shows that W_d achieves equality in (2).
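For a concrete d one can confirm this value numerically by maximizing p_{W_d}(w) = √d x^{d−1} y over the unit circle on a fine grid; a NumPy sketch (assuming the normalization W_d = √d e_1^{d−1} e_2, so that ‖W_d‖_F = 1):

```python
import numpy as np

d = 5
theta = np.linspace(0, 2 * np.pi, 200001)
x, y = np.cos(theta), np.sin(theta)

# p_{W_d}(w) = <W_d, w^d>_F = sqrt(d) * x^(d-1) * y for w = (x, y) on the circle
p = np.sqrt(d) * x ** (d - 1) * y
spectral_norm = np.abs(p).max()

# ||W_d||_F = 1, so the norm ratio equals the spectral norm itself
assert abs(spectral_norm - (1 - 1 / d) ** ((d - 1) / 2)) < 1e-6
```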

Optimality condition for symmetric rank-two tensors
The target function in (4) can be written as a composition F = G ∘ ϕ, where ϕ(α, β, u, v) = αu^d − βv^d. While ϕ is smooth, the map G is not differentiable in all points. However, it is the quotient of the smooth function A ↦ ‖A‖_F² and the convex function A ↦ ‖A‖_σ². Therefore, the rules for generalized gradients of regular functions are applicable; see [5, Section 2.3]. It follows that the subdifferential of G in a point A can be computed using a quotient rule, which yields (5). Here ∂(‖A‖_σ) denotes the subdifferential of the spectral norm in A. Combining this with the derivative of ϕ leads to the subdifferential of F at (α, β, u, v), with A = ϕ(α, β, u, v) = αu^d − βv^d. The subdifferential of the spectral norm can be characterized as in (6); see [4, Theorem 2.1] in general, and [1, Section 2.3] in particular. In words, ∂(‖A‖_σ) equals the convex hull of the normalized symmetric best rank-one approximations of A. From (5) and (6) one concludes that the first-order optimality condition 0 ∈ ∂F(α, β, u, v) (see, e.g., [5, Proposition 2.3.2]) for problem (4) implies that there exists X in the convex set (6) such that a stationarity relation holds for all variations (δα, δβ, δu, δv) and some λ ∈ R. This is equivalent to a condition involving only the variations δu and δv. Let P_{u,v} denote the orthogonal projection onto the linear subspace span{u, v}; with it, the optimality condition can be written as (7).

We now show that condition (7) cannot hold for tensors αu^d − βv^d admitting a unique best symmetric rank-one approximation. This is an interesting analogy to the fact that matrices achieving the minimal ratio of spectral and Frobenius norm have equal singular values.

Proposition 2.1. Let A = αu^d − βv^d have rank two. If A has a unique best symmetric rank-one approximation, then A is not a critical point of the optimization problem (4).
We use the following auxiliary lemma on the projection P_{u,v}.

Proof. This follows from the definition of orthogonal projection by a direct calculation.
Proof of Proposition 2.1. Let one of ±w^d be the normalized best symmetric rank-one approximation of A. Since it is unique, the optimality condition becomes (8). However, since u and v are linearly independent, we have a decomposition into two complementary subspaces. Therefore, (8) would only be possible if w were both a multiple of u and a multiple of v, which contradicts the linear independence of u and v.

A condition for unique symmetric best rank-one approximation
We now present a class of symmetric rank-two tensors admitting unique best symmetric rank-one approximations. By the result of Proposition 2.1, these can then be excluded from the further discussion of the minimal norm ratio.
Proposition 2.3. Let A = αu^d − βv^d with linearly independent unit vectors u, v satisfying ⟨u, v⟩ ≥ 0 and with α > β > 0. Then A has exactly one best symmetric rank-one approximation.
For the proof we require two auxiliary results, Lemmas 2.4 and 2.5 below. The first is a fact about polynomials. We are now in the position to prove Proposition 2.3.

Proof of Proposition 2.3. We can assume that A ∈ Sym^d(R²). Without
Hence, we only need to show that there is exactly one solution x of this equation corresponding to a global maximum of |p_A| on the unit circle. If y = 1, then p_A in (9) has a zero at x_0 = (1 − b)/a. It follows that (10) has at least one solution x* > x_0. We consider such a solution x* for which the corresponding unit vector w = λ(x*, 1)^T is a local maximum of |p_A| on the unit circle. We have

The case α > 0 ≥ β
We show that the infimum in (4) is not attained in this case.

Proposition 2.6. Let u ≠ v and ⟨u, v⟩ ≥ 0. Then

Proof. We can assume ‖u‖ ≥ ‖v‖. Using that

as asserted.

The case α = β > 0

In this section we verify by a direct calculation that the infimum in (4) is not attained for the difference of two rank-one tensors with the same norm, i.e., when α = β in (4).
We require the following version of Jensen's inequality (Lemma 2.8 below). The inequalities there are strict if f is strictly convex.
Proof. Without loss of generality let a = −b and a′ = −b′. Then, using substitution,

by monotonicity of the derivative of a convex function. This shows the first of the asserted inequalities. The second inequality is just Jensen's inequality, noting that (a + b)/2 = (a′ + b′)/2. If f is strictly convex, then f′ is strictly monotone and the inequalities are strict.
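The chain of inequalities in Lemma 2.8 is easy to test on a concrete strictly convex function, e.g. f(x) = x⁴ with the nested pairs (0, 4) and (1, 3) of equal sum (a minimal sketch; the numbers are our choices for illustration):

```python
# Lemma 2.8 on a strictly convex example: f(x) = x^4,
# outer pair (a, b) = (0, 4), inner pair (a2, b2) = (1, 3), equal sums.
f = lambda t: t ** 4
a, b, a2, b2 = 0.0, 4.0, 1.0, 3.0
assert a + b == a2 + b2
# 2 f((a+b)/2) <= f(a2) + f(b2) <= f(a) + f(b), strictly here:
assert 2 * f((a + b) / 2) < f(a2) + f(b2) < f(a) + f(b)
```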
Proof of Proposition 2.7. We can assume that A ∈ Sym^d(R²), so that u, v ∈ R². After rotation and rescaling we have u = (1, t)^T and v = (1, −t)^T with t ∈ (0, 1]. First, we apply the estimate

The right-hand side is monotonically increasing in the interval (0, 1]. For t =

This value is larger than (1 − 1/d)^{d−1}. It hence remains to verify this inequality for all smaller t, which is a little bit more involved. The starting point is another lower bound for the spectral norm, namely

This can be seen by taking the limit, since ‖A‖_F is of order t² by (11). We therefore have

where the second and third equalities have been shown in section 2.1. We now claim that

which then proves the assertion. This claim is equivalent to the positivity of (12). Elementary manipulations give

Note that for t in the relevant interval,

Therefore, with f(t) = (d − 1)t^{d−2}, we can rewrite (12) as

where the second inequality follows from monotonicity of f. This shows that (12) is positive.

Tensors of border rank two
We now consider tensors lying on the boundary of the set of symmetric rank-two tensors.

Proposition 2.9. Let A be a limit of symmetric rank-two tensors with rank A > 2. Then

‖A‖_σ ≥ (1 − 1/d)^{(d−1)/2} ‖A‖_F,

and equality is attained if and only if A = u^{d−1} v for some orthogonal u and v, that is, for tensors arising from scaling and orthogonal transformations of the tensor W_d.
The boundary of the set of rank-two tensors is well studied. We require the following well-known parametrization; see, e.g., [3]. We offer a self-contained proof for completeness.

Lemma 2.10. Let A be a limit of symmetric rank-two tensors A_n with rank A > 2. Then A is of the form A = a u^d + b u^{d−1} w with orthogonal vectors u, w and b ≠ 0.

Proof. It is not difficult to see that the sequences u_n and v_n must be unbounded, since otherwise there is a subsequence of A_n converging to a tensor of rank at most two, contradicting rank A > 2. We write v_n = s_n u_n + t_n w_n with ‖w_n‖ = 1 and ⟨u_n, w_n⟩ = 0. Then

and it can be checked that all terms are pairwise orthogonal. Hence, since A_n converges, all terms must be bounded, and by passing to a subsequence we can assume that all of them converge. Due to ‖u_n‖ → ∞ we have 1 ± s_n^d → 0 for the first term, which implies that the sequence s_n is bounded. Therefore, considering the term k = 1, the sequence t_n ‖u_n‖^{d−1} is bounded, which automatically implies t_n^k ‖u_n‖^{d−k} → 0 for all k > 1. We conclude that

which proves the assertion.
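The parametrization reflects the standard construction of border rank-two tensors as limits of differences of two nearby rank-one terms; a NumPy sketch for d = 3 (the vectors and the scaling sequence are our choices for illustration):

```python
import numpy as np

def rank_one(*vs):
    """Outer product v_1 (x) v_2 (x) ... as a dense tensor."""
    T = vs[0]
    for v in vs[1:]:
        T = np.multiply.outer(T, v)
    return T

rng = np.random.default_rng(2)
u, w = rng.standard_normal(2), rng.standard_normal(2)

# Limit tensor: w placed once in each of the three slots
# (a multiple of the symmetrization u^2 w in the paper's notation).
limit = rank_one(u, u, w) + rank_one(u, w, u) + rank_one(w, u, u)

# Difference of two nearby rank-one terms, blown up by n:
n = 10**5
A_n = n * (rank_one(u + w / n, u + w / n, u + w / n) - rank_one(u, u, u))
assert np.linalg.norm(A_n - limit) < 1e-3
```

The limit has rank three in general, even though every A_n has rank (at most) two, which is exactly the non-closedness phenomenon described above.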
Proof of Proposition 2.9. Using Lemma 2.10, scaling and orthogonal transformations, we can assume A = a e_1^d + b d e_1^{d−1} e_2 with a, b ≥ 0. We can restrict to tensors A with Frobenius norm ‖A‖_F² = a² + b² d = 1 and need to show the asserted bound whenever a > 0. The first lower bound (13) implies that this is true when b is above a certain threshold; below it, the second lower bound (14) yields the desired estimate. This concludes the proof of Theorem 1.1.
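As a numerical cross-check of Theorem 1.1, one can sample random symmetric rank-two tensors for d = 3 and verify that the ratio ‖A‖_σ/‖A‖_F never falls below (1 − 1/3)^{(3−1)/2} = 2/3. The spectral norm is evaluated via Banach's result by maximizing |p_A| on a grid of the unit circle (a NumPy sketch; all names and the grid resolution are our choices):

```python
import numpy as np

d = 3
bound = (1 - 1 / d) ** ((d - 1) / 2)  # = 2/3 for d = 3

theta = np.linspace(0, 2 * np.pi, 100001)
W = np.stack([np.cos(theta), np.sin(theta)])  # unit vectors on the circle

def rank_one3(v):
    return np.multiply.outer(np.multiply.outer(v, v), v)

rng = np.random.default_rng(3)
for _ in range(100):
    u, v = rng.standard_normal(2), rng.standard_normal(2)
    alpha, beta = rng.uniform(0.1, 2.0, size=2)
    A = alpha * rank_one3(u) - beta * rank_one3(v)
    frob = np.linalg.norm(A)
    # Banach: ||A||_sigma = max |p_A(w)|, p_A(w) = alpha <u,w>^3 - beta <v,w>^3
    spec = np.abs(alpha * (u @ W) ** d - beta * (v @ W) ** d).max()
    assert spec >= (bound - 1e-4) * frob
```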

Proof of Proposition 3.1. Write
We may assume that u_i, v_i ∈ R², and after an orthogonal change of bases and possibly changing the sign of β, we may also assume that

Our goal is to show that replacing any k factors v_{i_1}, …, v_{i_k} of V with the same unit norm vector v̄ defined by

leads to a tensor with the same Frobenius norm but smaller spectral norm. Since Frobenius and spectral norm are invariant under permutation of tensor factors, it suffices to prove this for the case that the first k vectors v_1, …, v_k are replaced in this way. The resulting tensor is denoted by A_k. By construction, the Frobenius norms of A and A_k indeed coincide. In the remainder of the proof we show by induction that the spectral norm does not increase with k, i.e., ‖A_{k+1}‖_σ ≤ ‖A_k‖_σ ≤ ‖A‖_σ. For k = d this provides a symmetric tensor with the desired properties. We start with k = 2. Let w_1, …, w_d be the maximizers in

and consider the matrices
They represent the bilinear forms in w_1 and w_2. Clearly, Lemma 3.3 implies ‖S‖_σ ≤ ‖T‖_σ and therefore ‖A_2‖_σ ≤ ‖A‖_σ. For the induction step, let 2 ≤ k < d and assume that replacing any k factors of V in the described manner always results in a tensor with a smaller or equal spectral norm. Note that here V was in principle arbitrary. Starting from the given V, we now construct a sequence V_0, V_1, … of rank-one tensors in which the first k factors and then the second to (k + 1)-st factors are successively replaced. That is, the Ṽ converge to V_{k+1} and hence the B converge to A_{k+1}. This concludes the proof.
Based on Proposition 3.1, we obtain a proof of Theorem 1.2 for general real rank-two tensors directly from Theorem 1.1.
Proof of Theorem 1.2. Again, since A has rank at most two, it suffices to prove the statement for general (i.e., nonsymmetric) 2 × ··· × 2 tensors. Obviously, the minimal ratio ‖A‖_σ/‖A‖_F over general 2 × ··· × 2 tensors is smaller than or equal to the minimum over symmetric ones. However, by Proposition 3.1 the converse is also true. The result hence follows from Theorem 1.1.

Theorem 1.2 suggests an interesting relation between results in [6] and [10]. The authors in [6] found that the minimal possible ratio of spectral and Frobenius norm among all tensors in C² ⊗ C² ⊗ C² is 2/3, while in [10] it is shown that the minimal ratio for tensors in R² ⊗ R² ⊗ R² is only 1/2. However, Theorem 1.2 states that border rank-two tensors in R^{2×2×2} have the minimal ratio 2/3. This might be related to the fact that tensors of real rank two and three both have positive volume in R² ⊗ R² ⊗ R², while almost all tensors in C² ⊗ C² ⊗ C² have complex rank two.

Lemma 2.4. Let a, γ > 0, b ≥ 0 and d ≥ 2. The equation x = γ(x − a)(x + b)^{d−1} has two real solutions if d is even, and three real solutions if d is odd.

Proof. Let p(x) = γ(x − a)(x + b)^{d−1} − x. Then by the intermediate value theorem, p must have at least two real zeros, namely one in the interval [−b, 0] and another one in the interval (a, ∞). On the other hand,

p′(x) = γd(x + b)^{d−2} (x − ((d − 1)a − b)/d) − 1

has at most two sign changes, one at a value larger than ((d − 1)a − b)/d and another one at a value smaller than −b if d is odd. Therefore, p has at most three real zeros. The statement follows from the fact that the number of real zeros of a polynomial with real coefficients has the same parity as its degree.

The second lemma narrows the possible locations of maximizers of the homogeneous form |p_A|.

Lemma 2.5. Under the assumptions of Proposition 2.3, let w be a maximizer of |p_A(w)| = |⟨αu^d − βv^d, w^d⟩_F| subject to ‖w‖ = c > 0. Then |⟨u, w⟩| ≥ |⟨v, w⟩|.

Proof. Assume to the opposite that |⟨u, w⟩| < |⟨v, w⟩|, and without loss of generality ⟨v, w⟩ > 0. Let Q be the symmetric orthogonal matrix mapping u to v and v to u (i.e. Q = I − 2zz^T with z = (u − v)/‖u − v‖), and let w̃ = Qw. Then ⟨u, w̃⟩ = ⟨v, w⟩ and ⟨v, w̃⟩ = ⟨u, w⟩. Since ‖w̃‖ = c, optimality of w gives |⟨αu^d − βv^d, w̃^d⟩_F| ≤ |⟨αu^d − βv^d, w^d⟩_F|. If ⟨αu^d − βv^d, w^d⟩_F = |⟨αu^d − βv^d, w^d⟩_F|, this yields ⟨αu^d − βv^d, w̃^d⟩_F > ⟨αu^d − βv^d, w^d⟩_F (by using (α + β)⟨v, w⟩^d > (α + β)⟨u, w⟩^d), which contradicts the optimality of w. In the other case, ⟨αu^d − βv^d, w^d⟩_F = −|⟨αu^d − βv^d, w^d⟩_F|, optimality implies β(⟨u, w⟩^d + ⟨v, w⟩^d) ≥ α(⟨u, w⟩^d + ⟨v, w⟩^d), which contradicts α > β.
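The root count of Lemma 2.4 can be checked numerically for sample parameters by expanding p(x) = γ(x − a)(x + b)^{d−1} − x and counting its real roots (a NumPy sketch; the tolerance and the test parameters are our choices):

```python
import numpy as np

def real_solution_count(gamma, a, b, d, tol=1e-7):
    """Count real solutions of x = gamma*(x - a)*(x + b)**(d-1)."""
    q = np.array([1.0])
    for _ in range(d - 1):
        q = np.polymul(q, [1.0, b])        # build (x + b)^(d-1)
    p = gamma * np.polymul([1.0, -a], q)   # gamma*(x - a)*(x + b)^(d-1)
    p[-2] -= 1.0                           # subtract x
    roots = np.roots(p)
    return int(np.sum(np.abs(roots.imag) < tol))

# Lemma 2.4: three real solutions for odd d, two for even d
assert real_solution_count(1.0, 1.0, 1.0, 3) == 3
assert real_solution_count(1.0, 1.0, 1.0, 4) == 2
```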

Without loss of generality, since we can change coordinates, we can consider α = 1, u = (0, 1)^T and β^{1/d} v = (a, b)^T with a > 0, b ≥ 0 (since ⟨u, v⟩ ≥ 0), and a² + b² < 1 (since β < α = 1). Writing w = λ(x, y)^T for points on the unit circle, where λ > 0 is a normalization constant, we then have

p_A(w) = λ^d [y^d − (ax + by)^d].   (9)

Critical points on the circle are characterized by ⟨w^⊥, ∇p_A(w)⟩ = 0, which means

y^{d−1} x − (bx − ay)(ax + by)^{d−1} = 0,

independent of λ. Note that here y = 0 is not possible since both a and b are nonzero. Recall that a symmetric best rank-one approximation of A is given as p_A(w) w^d, where w maximizes |p_A(w)| on the circle. Since p_A(−w) = (−1)^d p_A(w), in order to prove the assertion it suffices to show that |p_A| has exactly one maximizer w with y = 1. The optimality condition at such a w reduces to

x = (bx − a)(ax + b)^{d−1}.   (10)

Lemma 2.8. Let f : [a, b] → R be convex and continuously differentiable. If a + b = a′ + b′ and a < a′ < b′ < b, then

2 f((a + b)/2) ≤ f(a′) + f(b′) ≤ f(a) + f(b).

and therefore ã := (a + b − (b′ − a′))/2 > a and b > b̃ := (a + b + (b′ − a′))/2. Since ã + b̃ = a + b and b̃ − ã = b′ − a′, Lemma 2.8 yields

with a, b ≥ 0. Then ‖A‖_F² = a² + b² d, since the tensors e_1^d and e_1^{d−1} e_2 are orthogonal and ‖d e_1^{d−1} e_2‖_F² = d. We have the following two lower bounds for the spectral norm:

and so on (the term in brackets disappears when k = d − 1). By induction hypothesis, the corresponding sequence B = αU + β Ṽ of tensors has nonincreasing spectral norm, and in particular ‖B‖_σ ≤ ‖B_0‖_σ = ‖A_k‖_σ ≤ ‖A‖_σ. We claim the B converge to A_{k+1}, which proves ‖A_{k+1}‖_σ ≤ ‖A_k‖_σ ≤ ‖A‖_σ as desired. Indeed, the unit norm vectors ṽ = y e_1 + √(1 − y²) e_2 above are constructed according to y_0 = x̄ = ∏_{i=1}^k x_i^{1/k}, y_1 = x̄^{(k−1)/k} x_{k+1}^{1/k}, y_{ℓ+2} = y_{ℓ+1}^{(k−1)/k} y_ℓ^{1/k}, and so on.