A Note on Nonclosed Tensor Formats

Various tensor formats exist which allow a data-sparse representation of tensors. Some of these formats are not closed. The consequences are (i) possible non-existence of best approximations and (ii) divergence of the representing parameters when a tensor within the format tends to a border tensor outside. The paper tries to describe the nature of this divergence. A particular question is whether the divergence is uniform for all border tensors.


Introduction
Given (finite-dimensional) vector spaces $V_j$ ($1 \le j \le d$), we denote the corresponding tensor space by

$$\mathbf{V} := \bigotimes_{j=1}^{d} V_j .$$

Since the dimension $\prod_{j=1}^{d} \dim(V_j)$ of $\mathbf{V}$ is usually huge, the numerical treatment of tensors needs special data-sparse representation techniques. The oldest one is the $r$-term format: given a representation rank $r$, we form all tensors which can be written as a sum of $r$ elementary tensors, where $r \in \mathbb{N}_0 := \mathbb{N} \cup \{0\}$. This yields the subset

$$\mathcal{R}_r := \Big\{ \sum_{i=1}^{r} \bigotimes_{j=1}^{d} v_i^{(j)} : v_i^{(j)} \in V_j \Big\}$$

Section 3.2). Although this approach may be very successful for certain problems, it also has an unpleasant property which we are going to explain.
The tensor rank of $v \in \mathbf{V}$ is defined as the smallest $r$ with $v \in \mathcal{R}_r$:

$$\operatorname{rank}(v) := \min\{ r \in \mathbb{N}_0 : v \in \mathcal{R}_r \}.$$
This allows us to describe $\mathcal{R}_r$ by $\{v \in \mathbf{V} : \operatorname{rank}(v) \le r\}$. In the case of $d = 2$, tensor spaces are isomorphic to matrix spaces, and the tensor rank coincides with the usual matrix rank. For matrices it is well known that a convergent sequence $\{M_k\}$ with $\operatorname{rank}(M_k) \le r$ has a limit $M$ with $\operatorname{rank}(M) \le r$, i.e., the set of matrices of rank $\le r$ is closed. This is not true for tensors of order $d \ge 3$. As an example consider the tensor-valued function

$$w(t) := (a + tb) \otimes (a + tb) \otimes (a + tb),$$

where $a, b \in V$ are linearly independent vectors (i.e., $V = V_j$ and $\dim(V) \ge 2$). The derivative $v := w'(0)$ is the symmetric tensor

$$v = b \otimes a \otimes a + a \otimes b \otimes a + a \otimes a \otimes b. \tag{1}$$

The derivative can be approximated by Newton's divided difference quotient

$$\tilde{v}(h) := \frac{1}{h}\big(w(h) - w(0)\big). \tag{2}$$

Remark 1 (a) It can be proved that $\operatorname{rank}(v) = 3$ for $v$ in (1). (b) Since $w(t)$ is of rank 1, the approximation satisfies $\operatorname{rank}(\tilde{v}(h)) \le 2$, i.e., $\tilde{v}(h) \in \mathcal{R}_2$. (c) From (a) and (b) we conclude that $\mathcal{R}_2$ is not closed.
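Remark 1(b) and (c) can be illustrated by a small exact computation. The following Python sketch assumes $w(t) = (a+tb)\otimes(a+tb)\otimes(a+tb)$ and, purely for illustration, takes $a, b$ to be the unit vectors of $\mathbb{R}^2$; all function names are hypothetical. It forms the two rank-1 terms of the one-sided divided difference and reports the maximum-norm error together with the size of the terms.

```python
from fractions import Fraction

def outer3(x, y, z):
    """Entries of the elementary tensor x ⊗ y ⊗ z, flattened to a list."""
    return [xi * yj * zk for xi in x for yj in y for zk in z]

def max_norm(t):
    """Maximum norm of a flattened tensor."""
    return max(abs(p) for p in t)

a = [Fraction(1), Fraction(0)]  # assumed choice of a
b = [Fraction(0), Fraction(1)]  # assumed choice of b

def w(t):
    """Rank-1 tensor (a + t b) ⊗ (a + t b) ⊗ (a + t b)."""
    x = [ai + t * bi for ai, bi in zip(a, b)]
    return outer3(x, x, x)

# v = w'(0) = b⊗a⊗a + a⊗b⊗a + a⊗a⊗b
v = [p + q + r for p, q, r in
     zip(outer3(b, a, a), outer3(a, b, a), outer3(a, a, b))]

def divided_difference(h):
    """Return the two rank-1 terms w(h)/h and -w(0)/h and their sum."""
    term1 = [p / h for p in w(h)]
    term2 = [-p / h for p in w(0)]
    return term1, term2, [p + q for p, q in zip(term1, term2)]

results = []
for k in (1, 2, 3, 4):
    h = Fraction(1, 10 ** k)
    t1, t2, approx = divided_difference(h)
    err = max_norm([p - q for p, q in zip(approx, v)])
    results.append((h, err, max_norm(t1), max_norm(t2)))
    print(f"h={float(h):.0e}  error={float(err):.1e}  "
          f"term size={float(max_norm(t1)):.1e}")
```

With this choice of $a$ and $b$ the sum of the two terms approaches $v$ as $h \to 0$ while each individual term grows like $1/h$, which is the behaviour behind the nonclosedness of the set of tensors of rank at most 2.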
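Part (c) states that, in general, the $r$-term representation is not closed. This leads to the notion of the border rank

$$\underline{\operatorname{rank}}(v) := \min\{ r \in \mathbb{N}_0 : v \in \overline{\mathcal{R}_r} \}.$$

It is related to the usual rank by $\underline{\operatorname{rank}}(v) \le \operatorname{rank}(v) \le \big(\underline{\operatorname{rank}}(v)\big)^{d-1}$. The first inequality is trivial. For the second one, let $v_i \in \mathcal{R}_r$ ($r := \underline{\operatorname{rank}}(v)$) be tensors converging to $v$.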
The nonclosedness implies that a typical approximation problem such as

$$\text{given } v \in \mathbf{V}, \text{ find } w \in \mathcal{R}_r \text{ with } \|v - w\| = \inf\{ \|v - w'\| : w' \in \mathcal{R}_r \} \tag{3}$$

might be unsolvable, since a sequence $w_k \in \mathcal{R}_r$ with $\|v - w_k\| \to \inf \|v - w\|$ may tend to a tensor $w$ of larger rank (but with $\underline{\operatorname{rank}}(w) \le r$) outside of $\mathcal{R}_r$. Since the border set $\overline{\mathcal{R}_r} \setminus \mathcal{R}_r$ is of measure zero, one might consider this a marginal problem. However, De Silva-Lim [4] prove that the set of those $v \in \mathbf{V}$ for which problem (3) is unsolvable has positive measure. To be precise, this result holds for $\mathbb{R}$ as underlying field. A new result by Qi-Michałek-Lim [13] states that in the complex case this exceptional set is of measure zero. Many numerical approaches lead to optimisation problems within the set $\mathcal{R}_r$. If the minimiser does not exist, any numerical method is in trouble. As in the case of (2), the individual terms grow to infinity although their sum stays bounded. This leads to the typical numerical cancellation. The instability of divided difference quotients is well known in numerical mathematics.
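The cancellation effect mentioned here can be observed directly in floating-point arithmetic. The sketch below is only an illustration: it assumes $w(t) = (a+tb)\otimes(a+tb)\otimes(a+tb)$ with arbitrarily chosen vectors $a, b \in \mathbb{R}^2$ (the numerical values and the function names are not taken from the text) and evaluates the one-sided divided difference for decreasing step sizes.

```python
def outer3(x, y, z):
    """Entries of the elementary tensor x ⊗ y ⊗ z, flattened to a list."""
    return [xi * yj * zk for xi in x for yj in y for zk in z]

a = [1.0, 0.3]  # arbitrary illustrative vectors (linearly independent)
b = [0.7, 1.0]

def w(t):
    """Rank-1 tensor (a + t b) ⊗ (a + t b) ⊗ (a + t b) in double precision."""
    x = [ai + t * bi for ai, bi in zip(a, b)]
    return outer3(x, x, x)

# Exact derivative v = b⊗a⊗a + a⊗b⊗a + a⊗a⊗b (up to rounding of the products)
v = [p + q + r for p, q, r in
     zip(outer3(b, a, a), outer3(a, b, a), outer3(a, a, b))]

def quotient_error(h):
    """Maximum-norm error of (w(h) - w(0)) / h compared with v."""
    w0 = w(0.0)
    approx = [(p - q) / h for p, q in zip(w(h), w0)]
    return max(abs(p - q) for p, q in zip(approx, v))

for h in (1e-2, 1e-4, 1e-7, 1e-10, 1e-14):
    print(f"h={h:.0e}  error={quotient_error(h):.2e}")
```

For moderate $h$ the error is dominated by the truncation error and decreases with $h$; for very small $h$ the two terms of size $1/h$ cancel almost completely and rounding errors dominate, so the error grows again.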
The paper is not restricted to the format $\mathcal{R}_r$ but addresses rather general nonclosed formats. An example of another nonclosed tensor format is the cyclic matrix product representation. A special variant is the site-independent cyclic matrix product representation for the case of Such tensors define the set $C_{\mathrm{ind}}(d, r, n)$. Already $C_{\mathrm{ind}}(3, 2, 3)$ and $C_{\mathrm{ind}}(4, 2, 2)$ are nonclosed (cf. [5]). The general conclusion is that graph-based tensor formats are nonclosed if the graph does not degenerate to a tree (cf. Landsberg [11, Theorem 14.1.2.2]).
In the following we characterise the divergence of the coefficients of a sequence $w_k \in \mathcal{F}$ converging to a border tensor (i.e., a tensor in $\overline{\mathcal{F}} \setminus \mathcal{F}$). In particular it is interesting to know how strong the divergence is and whether it is uniform for all such tensors. If the divergence could be arbitrarily weak, the numerical instability would be negligible.
We also study the order of divergence in a neighbourhood of a border tensor and whether this quantity behaves continuously.

Tensor Representation
$\mathbb{K} \in \{\mathbb{R}, \mathbb{C}\}$ denotes the underlying field of the following vector spaces. In general, a tensor representation is given by a map $\rho$ from a parameter set into the tensor space. We suppose that

$P$ is a vector space with $\dim(P) < \infty$, $D \subset P$ a closed subset, (4a)
$\mathbf{V}$ is a tensor space with $\dim(\mathbf{V}) < \infty$, $\mathcal{F} \subset \mathbf{V}$ a subset of two-sided cone structure, (4b)
$\rho : D \to \mathcal{F}$ continuous and surjective, (4c)

In the sequel, (4a-4d) are assumed to be valid. By definition, the tensor subset $\mathcal{F}$ is the range of $\rho$. The cone structure ensures that with $v$ also $\lambda v$ belongs to $\mathcal{F}$ for all $\lambda \in \mathbb{K}$. In most of the examples, $\rho$ is not injective, and $D = P$ holds (cf. Section 2.2). The standard representations $\rho$ are multilinear so that (4a-4d) is an easy consequence. In the following, we choose some norms on $P$ and $\mathbf{V}$, both denoted by $\|\cdot\|$. Because of the finite dimensions, the choice of the norm is not essential for the following considerations.
Let $v = \rho(p)$. The ratio $\|p\|/\|v\| = \|p\|/\|\rho(p)\|$ may be considered as a stability measure for the representation of $v$ by $p$ (cf. Section 1). Since $\rho$ is not necessarily injective, there might be many $p$ with $v = \rho(p)$. Therefore, we define the quantity $\sigma(v)$ as an infimum over all $p \in D$ with $v = \rho(p)$. By compactness, the infimum may be replaced by a minimum. Since, in general, there is no scale invariance (cf. Section 2.2), we shall consider the ratio $\|p\|/\|v\|$ only for normalised tensors so that $\|p\|/\|v\| = \|p\|$.
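In numerical applications we often work with approximations instead of the exact tensor. Therefore, the quantity $\sigma(\tilde{v})$ is of interest for $\tilde{v}$ in a neighbourhood of $v$. For this purpose we define the $\varepsilon$-neighbourhood of some $v \in \overline{\mathcal{F}}$ by $U_{\mathcal{F},\varepsilon}(v) := \{w \in \mathcal{F} : \|v - w\| < \varepsilon\}$ and the stability quantity by

$$\sigma_\varepsilon(v) := \sup\{ \sigma(w) : w \in U_{\mathcal{F},\varepsilon}(v) \}.$$

Note that $\sigma_\varepsilon$ is defined for all $v \in \overline{\mathcal{F}}$ and that $\sigma_\varepsilon(v) = \infty$ may happen.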
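Since $\sigma_\varepsilon(v)$ is weakly decreasing as $\varepsilon \searrow 0$ and bounded from below by zero, the improper limit

$$\sigma_0(v) := \lim_{\varepsilon \searrow 0} \sigma_\varepsilon(v) \in [0, \infty]$$

exists.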

Standard Example
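In the case of the $r$-term format $\mathcal{F} = \mathcal{R}_r$ we may choose the vector space $P = (V_1 \times \cdots \times V_d)^r$, the domain $D = P$, and the mapping

$$\rho\big( (v_i^{(j)})_{1 \le i \le r,\, 1 \le j \le d} \big) := \sum_{i=1}^{r} \bigotimes_{j=1}^{d} v_i^{(j)} .$$

To ensure scalability we may instead choose $P := \mathbf{V}^r$, which we restrict to the domain of $r$-tuples of elementary tensors:

$$D := (\mathcal{R}_1)^r = \{ (e_1, \ldots, e_r) : e_i \in \mathcal{R}_1 \}.$$

Since $\mathcal{R}_1$ is closed (cf. Hackbusch [9, Lemma 9.11]), $D \subset P$ satisfies (4a). Now the tensor representation is

$$\rho(e_1, \ldots, e_r) := \sum_{i=1}^{r} e_i .$$

As $\rho(\lambda p) = \lambda \rho(p)$, the ratio $\|p\|/\|\rho(p)\|$ is scale invariant.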
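The difference between the two parametrisations can be made concrete for $d = 3$ and $r = 2$. The following Python sketch is an illustration under assumptions: the function names `rho_factors` and `rho_terms` and the integer test data are hypothetical. It checks that scaling all factor vectors by $\lambda$ scales the tensor by $\lambda^d$, whereas scaling the elementary tensors themselves scales the tensor by $\lambda$.

```python
def outer3(x, y, z):
    """Entries of the elementary tensor x ⊗ y ⊗ z, flattened to a list."""
    return [xi * yj * zk for xi in x for yj in y for zk in z]

def rho_factors(p):
    """Variant 1: p is a list of r factor triples (x, y, z); rho is multilinear."""
    terms = [outer3(x, y, z) for x, y, z in p]
    return [sum(entries) for entries in zip(*terms)]

def rho_terms(p):
    """Variant 2: p is a list of r elementary tensors; rho simply adds them."""
    return [sum(entries) for entries in zip(*p)]

# Arbitrary integer test data for d = 3, r = 2, dim(V_j) = 2
p_factors = [([1, 2], [0, 1], [1, 1]), ([1, 0], [3, 1], [0, 2])]
p_terms = [outer3(x, y, z) for x, y, z in p_factors]

lam = 2
scaled_factors = [tuple([lam * c for c in vec] for vec in triple)
                  for triple in p_factors]
scaled_terms = [[lam * c for c in term] for term in p_terms]

v = rho_factors(p_factors)
# Factor variant: scaling the parameters by lam scales the tensor by lam**3
print(rho_factors(scaled_factors) == [lam ** 3 * c for c in v])
# Term variant: scaling the parameters by lam scales the tensor by lam
print(rho_terms(scaled_terms) == [lam * c for c in v])
```

Only in the second variant is the ratio of parameter norm to tensor norm unchanged under scaling, which is the reason for restricting $P = \mathbf{V}^r$ to tuples of elementary tensors.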

Nonclosed Formats
As mentioned above, we assume throughout the article that (4a-4d) hold.

Instability Properties
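Now we suppose that the format $\mathcal{F}$ is not closed. Then there is a nonempty set $\mathcal{B}$ such that the closure of $\mathcal{F}$ can be split into

$$\overline{\mathcal{F}} = \mathcal{F} \cup \mathcal{B} \quad \text{(disjoint union)}.$$

We call $\mathcal{B}$ the border set since in the case of $\mathcal{F} = \mathcal{R}_r$ it consists of tensors with border rank $\le r$, while the usual rank is $> r$. Any tensor $v \in \mathcal{B}$ is the limit of tensors $v_i$ in $\mathcal{F}$. The next statement is a simple observation, but fundamental for the following.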
By negation we conclude the following result.
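Note that $\sigma_\varepsilon(v) \ge 0$ is well-defined for $v \in \mathcal{B}$ and $\varepsilon > 0$ since $U_{\mathcal{F},\varepsilon}(v)$ is a nonempty subset of $\mathcal{F}$. A consequence of Lemma 2 is $\sigma_\varepsilon(v) = \infty$ for all $v \in \mathcal{B}$ and $\varepsilon > 0$. In Section 3.2 we shall comment on the continuity of $\sigma$. A general negative result follows.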

Discussion of $\mathcal{F} = \mathcal{R}_r$
Let $\mathbb{K} = \mathbb{C}$ be the underlying field. An interesting question is whether the quantity $\sigma(v)$ is continuous. As known from algebraic geometry, general tensors in $\mathcal{R}_r$ admit only finitely many (essentially different) decompositions (cf. Section 1), and these decompositions depend continuously on the tensor, at least for $r$ not too large. Then $\sigma_\varepsilon(v) < \infty$ holds for sufficiently small $\varepsilon > 0$ and has the limit $\sigma_0(v) = \sigma(v)$.
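Above we require that $r$ be not too large. In the case of $d = 3$ and $\mathbf{V} = \mathbb{K}^n \otimes \mathbb{K}^m \otimes \mathbb{K}^p$ the concrete condition is as follows. The term 'general tensor' admits the existence of exceptional tensors. A particular exceptional situation holds for the tensor in

For $c$ linearly independent of $a$, the tensor $w := c \otimes a \otimes a + a \otimes c \otimes a + a \otimes a \otimes c$ is well known to be in $\mathcal{B} = \overline{\mathcal{R}_2} \setminus \mathcal{R}_2$ (cf. Remark 1a). Let $\varphi$ be an isomorphism on $V$ with $\varphi(a) = a$ and $\varphi(c) = \frac{1}{t} c$, while $\psi = t \cdot \mathrm{id}$. Then $\Phi := \psi \otimes \varphi \otimes \varphi$ is an isomorphism on $\mathbf{V}$ with $u := \Phi(w) = t\, c \otimes a \otimes a + a \otimes c \otimes a + a \otimes a \otimes c$. Hence also $u \in \mathcal{B}$. The substitution $c = b + \frac{1}{t+2} a$ yields $u = v(t)$ and proves $v(t) \in \mathcal{B}$ for $t > 0$.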
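The substitution in the last step can be checked by an exact computation. The sketch below assumes that $a, b$ are the unit vectors of $\mathbb{R}^2$ (an illustrative choice) and compares $u$ with the expanded form $a \otimes a \otimes a + t\, b \otimes a \otimes a + a \otimes b \otimes a + a \otimes a \otimes b$. This closed form is obtained here by multiplying out the substitution and is not quoted from the text; the function names are hypothetical.

```python
from fractions import Fraction

def outer3(x, y, z):
    """Entries of the elementary tensor x ⊗ y ⊗ z, flattened to a list."""
    return [xi * yj * zk for xi in x for yj in y for zk in z]

def combine(*weighted):
    """Linear combination; each argument is a pair (coefficient, tensor)."""
    size = len(weighted[0][1])
    return [sum(c * t[i] for c, t in weighted) for i in range(size)]

a = [Fraction(1), Fraction(0)]  # assumed choice of a
b = [Fraction(0), Fraction(1)]  # assumed choice of b

def u_of(t):
    """u = t c⊗a⊗a + a⊗c⊗a + a⊗a⊗c with c = b + a/(t+2)."""
    c = [bi + ai / (t + 2) for ai, bi in zip(a, b)]
    return combine((t, outer3(c, a, a)),
                   (1, outer3(a, c, a)),
                   (1, outer3(a, a, c)))

def v_of(t):
    """Expanded form a⊗a⊗a + t b⊗a⊗a + a⊗b⊗a + a⊗a⊗b (derived, not quoted)."""
    return combine((1, outer3(a, a, a)), (t, outer3(b, a, a)),
                   (1, outer3(a, b, a)), (1, outer3(a, a, b)))

checks = [u_of(t) == v_of(t)
          for t in (Fraction(1, 3), Fraction(1), Fraction(5, 2))]
print(checks)
```

The three contributions $\frac{t}{t+2}$, $\frac{1}{t+2}$ and $\frac{1}{t+2}$ to the coefficient of $a \otimes a \otimes a$ add up to 1, which explains the particular choice of the factor $\frac{1}{t+2}$.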
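The interesting conclusion from Example 1 is that the set $\mathcal{B}$ is not closed. We define

$$\partial\mathcal{B} := \overline{\mathcal{B}} \setminus \mathcal{B}. \tag{5}$$

Note that the tensor $v(0)$ defined in Example 1 belongs to $\partial\mathcal{B}$.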

General Case
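If $\mathcal{B}$ is nonempty, also $\partial\mathcal{B} \ne \emptyset$ holds since

$$0 \in \partial\mathcal{B} \tag{6}$$

is always true (consider $\lambda v$ with $v \in \mathcal{B}$ for $\lambda \to 0$ and note that $0 \in \mathcal{F}$ because of (4d)). Example 1 ensures that $\partial\mathcal{B}$ may also contain nontrivial tensors of $\mathcal{F} = \mathcal{R}_2$. We remark that $\partial\mathcal{B} \subset \mathcal{F}$ since $\partial\mathcal{B} \subset \overline{\mathcal{B}} \subset \overline{\mathcal{F}} = \mathcal{F} \cup \mathcal{B}$ and $\partial\mathcal{B} \cap \mathcal{B} = \emptyset$ (cf. (5)).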
Proof By definition of $\partial\mathcal{B}$ there is some $w \in \mathcal{B}$ with $0 < \eta := \|v - w\| < \varepsilon/2$ and

A consequence is the discontinuity of $\sigma$ on $\partial\mathcal{B}$. Conclusion 2 ensures the existence of a sequence $v_i \to v$ with $\sigma(v_i) \to \infty > \sigma(v)$. This proves:

On the Strength of Divergence
The numerical instability is caused by the fact that $\sigma(v_i) \to \infty$ holds for any sequence $\mathcal{F} \ni v_i \to v \in \mathcal{B}$. Whether this is a severe problem or not depends on the order of divergence. The introductory example (2) is the classical one-sided difference quotient. Using the step size $h$ for $\tilde{v}(h)$, we get $\sigma(\tilde{v}(h)) = O(1/h)$ and the accuracy $\varepsilon = O(h)$. Expressing the quantity $\sigma$ as a function of the accuracy $\varepsilon$, $\sigma = O(1/\varepsilon)$ shows divergence of first order. However, this is not the general behaviour. We may instead choose the central difference quotient $\frac{1}{2h}\big(w(h) - w(-h)\big)$, for which again $\sigma = O(1/h)$ holds, while the accuracy improves to $\varepsilon = O(h^2)$, i.e., $\sigma = O(\varepsilon^{-1/2})$. This example shows that, given an accuracy $\varepsilon > 0$, we have to look for an approximation $w \in \mathcal{F}$ with minimal $\sigma(w)$. This leads to the following definition.
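The two difference quotients can be compared in exact arithmetic. The following Python sketch is an illustration under assumptions: it takes $w(t) = (a+tb)\otimes(a+tb)\otimes(a+tb)$ with $a, b$ the unit vectors of $\mathbb{R}^2$, measures the accuracy in the maximum norm, and uses the size of the largest rank-1 term as a stand-in for the parameter norm. Since $\sigma$ is defined via an infimum over all representations, this size is only an upper bound of the true quantity. All function names are hypothetical.

```python
from fractions import Fraction

def outer3(x, y, z):
    """Entries of the elementary tensor x ⊗ y ⊗ z, flattened to a list."""
    return [xi * yj * zk for xi in x for yj in y for zk in z]

def max_norm(t):
    """Maximum norm of a flattened tensor."""
    return max(abs(p) for p in t)

a = [Fraction(1), Fraction(0)]  # assumed choice of a
b = [Fraction(0), Fraction(1)]  # assumed choice of b

def w(t):
    """Rank-1 tensor (a + t b) ⊗ (a + t b) ⊗ (a + t b)."""
    x = [ai + t * bi for ai, bi in zip(a, b)]
    return outer3(x, x, x)

v = [p + q + r for p, q, r in
     zip(outer3(b, a, a), outer3(a, b, a), outer3(a, a, b))]

def report(terms):
    """Return (accuracy, size of the largest term) for a sum of rank-1 terms."""
    approx = [sum(entries) for entries in zip(*terms)]
    accuracy = max_norm([p - q for p, q in zip(approx, v)])
    return accuracy, max(max_norm(t) for t in terms)

def forward(h):
    """One-sided quotient (w(h) - w(0)) / h."""
    return report([[p / h for p in w(h)], [-p / h for p in w(0)]])

def central(h):
    """Central quotient (w(h) - w(-h)) / (2h)."""
    return report([[p / (2 * h) for p in w(h)],
                   [-p / (2 * h) for p in w(-h)]])

for k in (1, 2, 3):
    h = Fraction(1, 10 ** k)
    (ef, sf), (ec, sc) = forward(h), central(h)
    print(f"h={float(h):.0e}  forward: eps={float(ef):.1e} size={float(sf):.1e}"
          f"  central: eps={float(ec):.1e} size={float(sc):.1e}")
```

In this setting the one-sided quotient has accuracy $h$ with terms of size $1/h$, whereas the central quotient reaches accuracy $h^2$ with terms of size $1/(2h)$, so for a given accuracy $\varepsilon$ the terms of the central quotient are only of size $O(\varepsilon^{-1/2})$.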

Definition 1
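Let $v \in \mathcal{B}$ and $\varepsilon > 0$. The instability of the approximation problem in $\mathcal{F}$ is described by

$$\delta(v, \varepsilon) := \inf\{ \sigma(w) : w \in U_{\mathcal{F},\varepsilon}(v) \}.$$

Note that $\delta(v, \varepsilon)$ is the infimum, whereas $\sigma_\varepsilon(v)$ is the supremum over the same set. Again, $\delta(v, \varepsilon)$ diverges as $\varepsilon \to 0$.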

Definitions
The function $\delta(v, \cdot)$ is the exact description of the kind of divergence at $v \in \mathcal{B}$. In the case of the model example (1) and $\mathcal{F} = \mathcal{R}_2$ we have seen that the divergence is not stronger than $O(\varepsilon^{-1/2})$. There are difference formulae of higher consistency order, but they involve more than two terms, i.e., such approximations are not in $\mathcal{R}_2$. This leads to the conjecture that $\delta(v, \varepsilon) \sim 1/\sqrt{\varepsilon}$ holds for $v$ in (1). The next question is whether this behaviour holds for all $v \in \mathcal{B}$. The answer is negative: there are particular tensors which behave differently. On the algebraic side, one might expand the difference quotient with step size $h$ into a power series $v + \sum_j v_j h^j$. In general, the central difference leads to $v_1 = 0$ and some $v_2$. However, for certain $v$ it might happen that $v_2 = v_3 = \cdots = v_{k-1} = 0$ so that $\delta(v, \varepsilon) \sim \varepsilon^{-1/k}$. The characterisation of the largest possible $k$ seems to be an unsolved problem. In the following we try to treat this problem by analytic tools.

Proposition 2 Uniform divergence as in (8a-8b) holds if and only if B ∪ {0} is closed.
Proof As remarked in (6), zero does not belong to $\mathcal{B}$, but to its closure. Therefore, closedness of $\mathcal{B} \cup \{0\}$ means that $\partial\mathcal{B}$ contains no nontrivial tensor. In particular $\mathcal{B}_1 := \mathcal{B} \cap \{v : \|v\| = 1\}$ would be closed. (a) Let $\mathcal{B} \cup \{0\}$ be closed. For an indirect proof assume $\lim_{\varepsilon \to 0} \delta_0(\varepsilon) =: K < \infty$. Then for any $\varepsilon = 1/n$, $n \in \mathbb{N}$, there are $v_n \in \mathcal{B}_1$ and $w_n \in \mathcal{F}$ with $\sigma(w_n) \le K + 1$ and $\|v_n - w_n\| \le 1/n$. By compactness we may take subsequences, again denoted by $v_n$, $w_n$, so that $v_n \to v$ and $w_n \to w$. Since $\mathcal{B}_1$ is closed, we obtain $v \in \mathcal{B}_1 \subset \mathcal{B}$. As $\sigma(w_n)$ is uniformly bounded, the limit belongs to $\mathcal{F}$ (cf. Lemma 1), i.e., $w \in \mathcal{F}$. Now $\|v_n - w_n\| \le 1/n$ yields $v = w$, a contradiction since $\mathcal{F}$ and $\mathcal{B}$ are disjoint.
In the interesting case of $\mathcal{F} = \mathcal{R}_r$ we know that $\mathcal{B} \cup \{0\}$ is not closed (cf. Example 1). Hence uniform divergence (8a-8b) does not hold for $\mathcal{F} = \mathcal{R}_r$. Nevertheless it is possible to refine the definition of divergence.

Weaker Form of Uniform Divergence
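In the case of $\mathcal{F} = \mathcal{R}_r$, the exceptional set $\partial\mathcal{B} = \overline{\mathcal{B}} \setminus \mathcal{B}$ is a rather small subset of $\mathcal{F}$. In the following we formulate an inequality involving the distance from $\partial\mathcal{B}$.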
The interpretation of Theorem 4 depends on the topological structure of ∂B as seen next.

Remark 4
If ∂B is closed, the distance dist(v, ∂B) is positive for all v ∈ B. This yields a nontrivial estimate (10b) for all v ∈ B.
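Proof $v \in \mathcal{B}$ and $\partial\mathcal{B} \subset \mathcal{F}$ imply $v \notin \partial\mathcal{B}$. Note that for a closed set $\partial\mathcal{B}$, $\operatorname{dist}(v, \partial\mathcal{B}) = 0$ is equivalent to $v \in \partial\mathcal{B}$.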
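Finally we consider the case of a nonclosed set $\partial\mathcal{B}$. We split the closure $\overline{\partial\mathcal{B}}$ into disjoint sets $\overline{\partial\mathcal{B}} = \partial\mathcal{B} \cup \mathcal{C}$.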
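In the latter case there is some $w \in \overline{\partial\mathcal{B}}$ with $\|v - w\| = \operatorname{dist}(v, \partial\mathcal{B}) = 0$, i.e., $v = w$. Comparing $v \in \mathcal{B}$ and $w \in \overline{\partial\mathcal{B}} = \partial\mathcal{B} \cup \mathcal{C}$ and noting that $\partial\mathcal{B} \subset \mathcal{F}$, it follows that $v = w \in \mathcal{C}$.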
In case of a nonclosed ∂B, the estimate (10b) degenerates to δ(v, ε) ≥ 0 if and only if v ∈ C.
Remark 6 For F = R r it is not hard to prove that the divergence behaviour only depends on the border rank and the order d of the tensor, but not on dim(V j ).

Example $\otimes^3 \mathbb{R}^2$
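Obviously, it is interesting to know more about the topological structure of $\mathcal{B}$ for various nonclosed tensor formats. Finally we consider the tensor space

$$\mathbf{V} = \otimes^3 \mathbb{R}^2 = \mathbb{R}^2 \otimes \mathbb{R}^2 \otimes \mathbb{R}^2,$$

which is the smallest nontrivial example. The maximal rank in $\mathbf{V}$ is 3 (cf. Kruskal [10]). Hence $\mathcal{R}_3$ coincides with $\mathbf{V}$ and is obviously closed. As seen by the tensor (1), $\mathcal{R}_2$ is not closed. In fact, (1) describes all border tensors up to tensor space isomorphisms:

$$\mathcal{B} = \Big\{ \Big( \bigotimes_{j=1}^{3} \phi^{(j)} \Big)(v) : \phi^{(j)} \in L(\mathbb{R}^2, \mathbb{R}^2) \text{ isomorphisms} \Big\},$$

where $v$ is defined in (1) with $\{a, b\}$ being a fixed basis of $\mathbb{R}^2$. Let $\phi^{(2)} = \phi^{(3)}$ be the identity and define $\phi^{(1)}$ by $\phi^{(1)}(a) = a$, $\phi^{(1)}(b) = a + tb$. For $t \ne 0$, $\phi^{(1)}$ is an isomorphism, whereas for $t = 0$ it is not invertible. Note that with these mappings $\big( \bigotimes_{j=1}^{3} \phi^{(j)} \big)(v)$ coincides with the tensor in Example 1. For $t = 0$ we obtain the tensor

$$a \otimes a \otimes a + a \otimes b \otimes a + a \otimes a \otimes b.$$

The same construction with respect to the directions $j = 2$ and $j = 3$ yields

$$a \otimes a \otimes a + b \otimes a \otimes a + a \otimes a \otimes b \quad \text{and} \quad a \otimes a \otimes a + b \otimes a \otimes a + a \otimes b \otimes a.$$

We obtain all tensors in $\partial\mathcal{B}$ by $\big( \bigotimes_{j=1}^{3} \phi^{(j)} \big)(v)$ when at least one $\phi^{(j)} \in L(\mathbb{R}^2, \mathbb{R}^2)$ is not invertible. Such a tensor can be written as Since $L(\mathbb{R}^2, \mathbb{R}^2)$ is closed we obtain the desired result:

Proposition 3 $\partial\mathcal{B}$ is closed.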
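The action of $\phi^{(1)} \otimes \mathrm{id} \otimes \mathrm{id}$ on the tensor $v = b \otimes a \otimes a + a \otimes b \otimes a + a \otimes a \otimes b$ can be verified by a short computation. The Python sketch below is an illustration under assumptions: it identifies $a, b$ with the unit vectors of $\mathbb{R}^2$, uses hypothetical helper names, and assumes that $\phi^{(1)}$ maps $a$ to $a$ and $b$ to $a + tb$. It also checks that the degenerate case $t = 0$ yields a tensor with an explicit two-term representation.

```python
def outer3(x, y, z):
    """Entries of x ⊗ y ⊗ z, flattened with index 4*i + 2*j + k."""
    return [xi * yj * zk for xi in x for yj in y for zk in z]

def add(*tensors):
    """Entrywise sum of flattened tensors."""
    return [sum(entries) for entries in zip(*tensors)]

def apply_mode1(M, T):
    """Apply the 2x2 matrix M to the first direction of a 2x2x2 tensor T."""
    return [sum(M[i][l] * T[4 * l + 2 * j + k] for l in range(2))
            for i in range(2) for j in range(2) for k in range(2)]

a, b = [1, 0], [0, 1]  # assumed basis vectors
v = add(outer3(b, a, a), outer3(a, b, a), outer3(a, a, b))

def phi1(t):
    """Matrix of phi^(1) in the basis {a, b}: columns are a and a + t b."""
    return [[1, 1], [0, t]]

def expected(t):
    """a⊗a⊗a + t b⊗a⊗a + a⊗b⊗a + a⊗a⊗b, obtained by applying phi^(1) termwise."""
    tb = [t * c for c in b]
    return add(outer3(a, a, a), outer3(tb, a, a),
               outer3(a, b, a), outer3(a, a, b))

matches = [apply_mode1(phi1(t), v) == expected(t) for t in (2, 1, 0, -3)]
print(matches)

# For t = 0 the image is a ⊗ (a+b) ⊗ (a+b) - a ⊗ b ⊗ b, a sum of two elementary tensors.
ab = [1, 1]
degenerate = apply_mode1(phi1(0), v)
two_term = [p - q for p, q in zip(outer3(a, ab, ab), outer3(a, b, b))]
print(degenerate == two_term)
```

The second check makes visible why the limit $t = 0$ leaves the border set: the image then has the form $a \otimes M$ with a matrix $M$ of rank 2, hence it belongs to $\mathcal{R}_2$.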