# A Note on Nonclosed Tensor Formats


## Abstract

Various tensor formats exist which allow a data-sparse representation of tensors. Some of these formats are not closed. The consequences are (i) possible non-existence of best approximations and (ii) divergence of the representing parameters when a tensor within the format tends to a border tensor outside. The paper tries to describe the nature of this divergence. A particular question is whether the divergence is uniform for all border tensors.

## Keywords

Tensor representation · Tensor format · Nonclosed tensor format · Numerical instability

## Mathematics Subject Classification (2010)

15A69 · 65F99

## 1 Introduction

Given finite-dimensional vector spaces *V*_{j} (1 ≤ *j* ≤ *d*), we denote the corresponding tensor space by

\[\mathbf{V}:=\bigotimes_{j=1}^{d}V_{j}.\]

Since the dimension of **V** is rather huge, the numerical treatment of tensors needs special data-sparse representation techniques. The oldest one is the *r*-term format: given a *representation rank* *r*, we form all tensors which can be written as a sum of *r* elementary tensors, where \(r\in \mathbb {N}_{0}:=\mathbb {N} \cup \{0\}\). This yields the subset

\[\mathcal{R}_{r}:=\left\{\sum_{\nu=1}^{r}\bigotimes_{j=1}^{d}v_{\nu}^{(j)}:v_{\nu}^{(j)}\in V_{j}\right\}\]

of **V**. Under certain conditions a tensor in \(\mathcal {R}_{r}\) might have an *essentially unique* representation, i.e., different representations only differ by the order of the terms and the scaling of the vectors \(\{v_{\nu }^{(j)}:1\leq j\leq d\}\) (cf. Section 3.2).

Although this approach may be very successful for certain problems, it also has an unpleasant property which we are going to explain.

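The *tensor rank* of **v** ∈ **V** is defined as the smallest *r* with \(\mathbf {v}\in \mathcal {R}_{r}\):^{1}

\[\text{rank}(\mathbf{v}):=\min\{r\in\mathbb{N}_{0}:\mathbf{v}\in\mathcal{R}_{r}\}.\]

Hence \(\mathcal{R}_{r}=\{\mathbf{v}\in\mathbf{V}:\text{rank}(\mathbf{v})\leq r\}\). In the case of *d* = 2, tensor spaces are isomorphic to matrix spaces. Then the tensor rank coincides with the usual matrix rank. For matrices it is well known that a convergent sequence {*M*_{k}} with rank(*M*_{k}) ≤ *r* has a limit *M* with rank(*M*) ≤ *r*, i.e., the set of matrices of rank ≤ *r* is closed. This is not true for tensors of order *d* ≥ 3. As an example consider the tensor-valued function

\[\mathbf{w}(t):=(a+tb)\otimes(a+tb)\otimes(a+tb),\]

where *a*, *b* ∈ *V* are linearly independent vectors (i.e., *V* = *V*_{j} and \(\dim (V)\geq 2\)). The derivative \(\mathbf {v}:=\mathbf {w}^{\prime }(0)\) is the symmetric tensor

\[\mathbf{v}=a\otimes a\otimes b+a\otimes b\otimes a+b\otimes a\otimes a. \tag{1}\]

It is the limit of the difference quotient

\[{\tilde{\mathbf{v}}}(h):=\frac{1}{h}\left(\mathbf{w}(h)-\mathbf{w}(0)\right) \tag{2}\]

as \(h\rightarrow 0\).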

### *Remark 1*

- (a) It can be proved that rank(**v**) = 3 for **v** in (1).
- (b) Since **w**(*t*) is of rank 1, the approximation satisfies \(\text {rank}({\tilde {\mathbf {v}}}(h))\leq 2,\) i.e., \({\tilde {\mathbf {v}}}(h)\in \mathcal {R}_{2}\).
- (c) From (a) and (b) we conclude that \(\mathcal {R}_{2}\) is not closed.
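The mechanism behind Remark 1 can be observed numerically. The following is a minimal sketch, assuming \(V=\mathbb{R}^{2}\) with unit vectors *a*, *b* and the one-sided difference quotient \((\mathbf{w}(h)-\mathbf{w}(0))/h\); the helper names `elementary`, `quotient_terms` and `report` are chosen for this illustration only. It shows that the two elementary terms grow like 1/*h* while their sum approaches **v**.

```python
import numpy as np

def elementary(x, y, z):
    """Elementary tensor x ⊗ y ⊗ z stored as a 3-way array."""
    return np.einsum("i,j,k->ijk", x, y, z)

# Assumption for the illustration: V = R^2 with the unit vectors a, b.
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])

# Symmetric tensor v = a⊗a⊗b + a⊗b⊗a + b⊗a⊗a.
v = elementary(a, a, b) + elementary(a, b, a) + elementary(b, a, a)

def w(t):
    """Rank-1 curve w(t) = (a + t b) ⊗ (a + t b) ⊗ (a + t b)."""
    x = a + t * b
    return elementary(x, x, x)

def quotient_terms(h):
    """The two elementary terms whose sum is the difference quotient."""
    return w(h) / h, -w(0.0) / h

def report(h):
    """Return (approximation error, norm of the largest term)."""
    t1, t2 = quotient_terms(h)
    error = np.linalg.norm(t1 + t2 - v)
    largest_term = max(np.linalg.norm(t1), np.linalg.norm(t2))
    return error, largest_term

for h in (1e-1, 1e-2, 1e-3):
    err, big = report(h)
    print(f"h={h:.0e}  error={err:.3e}  largest term norm={big:.3e}")
```

The error shrinks proportionally to *h*, whereas the norms of the two terms increase like 1/*h*; this is the cancellation effect discussed in the text.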

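Hence the *r*-term representation is not closed. This leads to the notion of the *border rank*

\[\underline{\text{rank}}(\mathbf{v}):=\min\{r\in\mathbb{N}_{0}:\mathbf{v}\in\overline{\mathcal{R}_{r}}\}.\]

It satisfies

\[\underline{\text{rank}}(\mathbf{v})\leq\text{rank}(\mathbf{v})\leq\underline{\text{rank}}(\mathbf{v})^{d-1}.\]

The first inequality is obvious. For the second one, set \(r:=\underline{\text{rank}}(\mathbf{v})\) and let \(\mathbf{v}_{i}\in\mathcal{R}_{r}\) be a sequence converging to **v**. Then \(\mathbf {v}_{i}\in {\bigotimes }_{j=1}^{d}U_{j}^{\min \limits }(\mathbf {v}_{i})\) holds for the minimal subspaces \(U_{j}^{\min \limits }(\mathbf {v}_{i})\) (cf. Hackbusch [9, Section 6]). \(\mathbf {v}_{i}\in \mathcal {R}_{r}\) implies \(\dim U_{j}^{\min \limits }(\mathbf {v}_{i})\leq r\). [9, Theorem 6.24] proves \(\dim U_{j}^{\min \limits }(\mathbf {v})\leq r\) for the corresponding subspaces in \(\mathbf {v}\in \mathbf {U}:={\bigotimes }_{j=1}^{d}U_{j}^{\min \limits }(\mathbf {v})\). The maximal rank in **U** is bounded by *r*^{d−1}, proving the second inequality (cf. [9, Section 3.2.6.4]).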

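A typical task is the best approximation of a tensor **v** ∈ **V** in \(\mathcal{R}_{r}\): find \(\mathbf{u}\in\mathcal{R}_{r}\) with

\[\|\mathbf{v}-\mathbf{u}\|=\inf\{\|\mathbf{v}-\mathbf{w}\|:\mathbf{w}\in\mathcal{R}_{r}\}. \tag{3}\]

Since \(\mathcal{R}_{r}\) is not closed, a minimiser need not exist: the infimum may be attained only by a tensor **w** of larger rank (but with \(\underline {\text {rank}}(\mathbf {w})\leq r\)) outside of \(\mathcal {R}_{r}\). Since the border set \(\overline {\mathcal {R}_{r}}\backslash \mathcal {R}_{r}\) is of measure zero, one might consider this as a marginal problem. However, De Silva–Lim [4] prove that those **v** ∈ **V** for which problem (3) is unsolvable have a positive measure. To be precise, this result holds for \(\mathbb {R}\) as underlying field. A new result by Qi–Michałek–Lim [13] states that in the complex case this exceptional set is of measure zero.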

Many numerical approaches lead to optimisation problems within the set \(\mathcal {R}_{r}\). If the minimiser does not exist, any numerical method is in trouble. As in the case of (2), the individual terms increase to infinity although their sum remains bounded. This leads to the typical numerical cancellation. The instability of difference quotients is well known in numerical mathematics.

Another tensor format of interest is the *cyclic matrix product representation*. A special variant is the *site-independent cyclic matrix product representation* for the case of *V*_{j} = *V* (cf. Perez-García et al. [12, Section 3.2.1]). Let \(n:=\dim (V)\). For tuples (*M*[*i*] : 1 ≤ *i* ≤ *n*) of *r* × *r* matrices define the corresponding tensor **v** ∈ ⊗^{d}*V* componentwise by

\[\mathbf{v}[i_{1},i_{2},\ldots,i_{d}]:=\text{trace}\left(M[i_{1}]\,M[i_{2}]\cdots M[i_{d}]\right).\]

In the following we characterise the divergence of the coefficients of a sequence \(\mathbf {w}_{k}\in \mathcal {F}\) converging to a border tensor (i.e., a tensor in \(\overline {\mathcal {F}}\backslash \mathcal {F}\)). In particular it is interesting to know how strong the divergence is and whether it is uniform for all such tensors. If the divergence could be arbitrarily weak, the numerical instability would be negligible.

We also study the order of divergence in a neighbourhood of a border tensor and whether this quantity behaves continuously.
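To make the site-independent cyclic matrix product representation mentioned above concrete, here is a small sketch. It assumes the usual trace form \(\mathbf{v}[i_{1},\ldots,i_{d}]=\text{trace}(M[i_{1}]\cdots M[i_{d}])\); the function name `cyclic_mps_tensor` and the random test matrices are illustrative choices, not part of the original text.

```python
import itertools
import numpy as np

def cyclic_mps_tensor(matrices, d):
    """Tensor with entries v[i1,...,id] = trace(M[i1] @ ... @ M[id])."""
    n = len(matrices)           # n = dim(V)
    r = matrices[0].shape[0]    # all matrices are r x r
    v = np.empty((n,) * d)
    for index in itertools.product(range(n), repeat=d):
        product = np.eye(r)
        for i in index:
            product = product @ matrices[i]
        v[index] = np.trace(product)
    return v

rng = np.random.default_rng(0)
M = [rng.standard_normal((2, 2)) for _ in range(3)]   # n = 3, r = 2
v = cyclic_mps_tensor(M, d=3)

# The trace is invariant under cyclic shifts of the factors,
# so v is invariant under cyclic permutation of its indices.
shifted = np.transpose(v, (1, 2, 0))
print(np.allclose(v, shifted))

# Scaling all matrices by lam scales the tensor by lam**d.
lam = 2.0
scaled = cyclic_mps_tensor([lam * m for m in M], d=3)
print(np.allclose(scaled, lam**3 * v))
```

The second check illustrates that the represented tensors form a cone (for the real field and odd *d*, every real multiple of **v** can be reached by scaling the matrices).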

## 2 Notations and Definitions

### 2.1 Tensor Representation

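A tensor format is described by a map *ρ* from a parameter set into the tensor space. We suppose that the following conditions (4a–4d) hold: *P* is a finite-dimensional vector space over \(\mathbb{K}\); the domain \(\mathcal{D}\subset P\) of *ρ* is closed and contains 0; the map \(\rho:\mathcal{D}\rightarrow\mathbf{V}\) is continuous with *ρ*(0) = 0; and the range of *ρ* is a cone.

In the sequel, (4a–4d) are assumed to be valid.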

By definition, the tensor subset \(\mathcal {F}\) is the range of *ρ*. The cone structure ensures that with **v** also *λ***v** belongs to \(\mathcal {F}\) for all \(\lambda \in \mathbb {K}\). In most of the examples, *ρ* is not injective, and \(\mathcal {D}=P\) holds (cf. Section 2.2). The standard representations *ρ* are multilinear so that (4a–4d) is an easy consequence. In the following, we choose some norms on *P* and **V**, both denoted by ∥⋅∥. Because of the finite dimensions, the choice of the norm is not essential for the following considerations.

Let **v** = *ρ*(*p*). The ratio ∥*p*∥/∥**v**∥ = ∥*p*∥/∥*ρ*(*p*)∥ may be considered as a *stability measure* for the representation of **v** by *p* (cf. Section 1). Since *ρ* is not necessarily injective, there might be many *p* with **v** = *ρ*(*p*). Therefore, we define

\[\sigma(\mathbf{v}):=\inf\{\|p\|:p\in\mathcal{D},\ \mathbf{v}=\rho(p)\}\qquad\text{for }\mathbf{v}\in\mathcal{F}.\]

### *Remark 2*

Each \(\mathbf {v}\in \mathcal {F}\) has at least one \(p_{\mathbf {v}}\in \mathcal {D}\) with **v** = *ρ*(*p*_{v}) and *σ*(**v**) = ∥*p*_{v}∥.

Since, in general, there is no scale invariance (cf. Section 2.2), we shall consider the ratio ∥*p*∥/∥**v**∥ only for normalised tensors so that ∥*p*∥/∥**v**∥ = ∥*p*∥.

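We also want to describe the behaviour of *σ* in a neighbourhood of **v**. For this purpose we define the *ε*-neighbourhood of some \(\mathbf {v}\in \overline {\mathcal {F}}\) by

\[U_{\mathcal{F},\varepsilon}(\mathbf{v}):=\{\mathbf{w}\in\mathcal{F}:\|\mathbf{v}-\mathbf{w}\|\leq\varepsilon\}\]

and set

\[\sigma_{\varepsilon}(\mathbf{v}):=\sup\{\sigma(\mathbf{w}):\mathbf{w}\in U_{\mathcal{F},\varepsilon}(\mathbf{v})\}.\]

Note that *σ*_{ε} is defined for all \(\mathbf {v}\in \overline {\mathcal {F}}\) and that \(\sigma _{\varepsilon } (\mathbf {v})=\infty \) may happen.

Since *σ*_{ε}(**v**) is weakly decreasing as *ε* ↘ 0 and bounded from below by zero, the improper limit

\[\sigma_{0}(\mathbf{v}):=\lim_{\varepsilon\searrow 0}\sigma_{\varepsilon}(\mathbf{v})\]

exists (it is infinite if \(\sigma_{\varepsilon}(\mathbf{v})=\infty\) for all *ε* > 0).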

### 2.2 Standard Example \(\mathcal {F}=\mathcal {R}_{r}\)

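In the case of the *r*-term format \(\mathcal {F}=\mathcal {R}_{r}\) we may choose the vector space *P* = (*V*_{1} × ⋯ × *V*_{d})^{r}, the domain \(\mathcal {D}=P\), and the mapping

\[\rho\left((v_{i}^{(j)})_{1\leq i\leq r,\,1\leq j\leq d}\right):=\sum_{i=1}^{r}\bigotimes_{j=1}^{d}v_{i}^{(j)}.\]

A possible norm on *P* is \(\|(v_{i}^{(j)})\|=\sqrt {{\sum }_{i=1}^{r}{\sum }_{j=1}^{d}\|v_{i}^{(j)}\|^{2}}\). Since *ρ*(*λp*) = *λ*^{d}*ρ*(*p*), the ratio ∥*p*∥/∥*ρ*(*p*)∥ is not invariant with respect to scaling.

An alternative is the vector space *P* := **V**^{r}, in which we restrict to the domain of *r*-tuples of elementary tensors:

\[\mathcal{D}:=\{(\mathbf{e}_{1},\ldots,\mathbf{e}_{r})\in\mathbf{V}^{r}:\mathbf{e}_{i}\in\mathcal{R}_{1}\},\qquad\rho(\mathbf{e}_{1},\ldots,\mathbf{e}_{r}):=\sum_{i=1}^{r}\mathbf{e}_{i}.\]

Since *ρ*(*λp*) = *λρ*(*p*), the ratio ∥*p*∥/∥*ρ*(*p*)∥ is scale invariant.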
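The scaling behaviour of the first parametrisation can be checked with a few lines. This is a sketch under stated assumptions: *d* = 3, \(V_{j}=\mathbb{R}^{4}\), *r* = 2, the Euclidean norm on *P*, and the names `rho`, `param_norm` and `ratio` are introduced here only for illustration.

```python
import numpy as np

def rho(p):
    """r-term map for d = 3: p has shape (r, 3, n) and holds the vectors v_i^(j)."""
    return sum(np.einsum("i,j,k->ijk", *term) for term in p)

def param_norm(p):
    """Euclidean norm on P = (V_1 x V_2 x V_3)^r."""
    return np.sqrt(np.sum(p ** 2))

def ratio(p):
    """Stability ratio ||p|| / ||rho(p)||."""
    return param_norm(p) / np.linalg.norm(rho(p))

rng = np.random.default_rng(1)
p = rng.standard_normal((2, 3, 4))   # r = 2, d = 3, n = 4
lam = 2.0

# Homogeneity of degree d: rho(lam * p) = lam**d * rho(p).
print(np.allclose(rho(lam * p), lam**3 * rho(p)))

# Consequently the ratio is not scale invariant: it changes by the factor lam**(1 - d).
print(ratio(p), ratio(lam * p))
```

For this parametrisation the ratio shrinks by the factor \(\lambda^{1-d}\) under the scaling \(p\mapsto\lambda p\), which is why the text considers the ratio only for normalised tensors.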

## 3 Nonclosed Formats

As mentioned above, we assume throughout the article that (4a–4d) hold.

### 3.1 Instability Properties

The set

\[{\mathscr{B}}:=\overline{\mathcal{F}}\backslash\mathcal{F}\]

is called the *border set* since in the case of \(\mathcal {F}=\mathcal {R}_{r}\) it consists of tensors with border rank ≤ *r*, while the usual rank is > *r*.

Any tensor \(\mathbf {v}\in {\mathscr{B}}\) is the limit of tensors **v**_{i} in \(\mathcal {F}\). The next statement is a simple observation, but fundamental for the following.

### **Lemma 1**

*Let* \(\mathbf {v}_{i}\in \mathcal {F}\) *with* **v**_{i} := *ρ*(*p*_{i}) *be a convergent sequence with the limit* \(\mathbf {v}=\lim \mathbf {v}_{i}\). *Then* \(\sup _{i}\|p_{i}\| <\infty \) *implies* \(\mathbf {v}\in \mathcal {F}\). *The condition* \(\sup _{i}\|p_{i}\| <\infty \) *can be replaced by* \(\sup _{i}\sigma (\mathbf {v}_{i})<\infty \).

### *Proof*

Since the set \(\{p\in \mathcal {D}:~\|p\| \leq C\}\) with \(C:=\sup _{i}\|p_{i}\|\) is compact (cf. (4d)), there is a subsequence—again denoted by (*p*_{i})—with \(p_{i}\rightarrow p\in \mathcal {D}\). Continuity of *ρ* (cf. (4d)) implies that \(\mathbf {v}=\lim \rho (p_{i})=\rho (p)\). Hence **v** belongs to the range \(\mathcal {F}\) of *ρ*, i.e., \(\mathbf {v}\in \mathcal {F}\). □

By negation we conclude the following result.

### **Lemma 2**

*Let* \(\mathbf {v}_{i}:=\rho (p_{i})\in \mathcal {F}\) *converge to* \(\mathbf {v}\in {\mathscr{B}}\). *Then* \(\|p_{i}\| \rightarrow \infty \).

Note that *σ*_{ε}(**v**) ≥ 0 is well-defined for \(\mathbf {v}\in {\mathscr{B}}\) and *ε* > 0 since \(U_{\mathcal {F},\varepsilon }(\mathbf {v})\) is a nonempty subset of \(\mathcal {F}\). A consequence of Lemma 2 is

### **Conclusion 1**

If \(\mathbf {v}\in {\mathscr{B}}\), then \(\sigma _{\varepsilon }(\mathbf {v})=\infty \) holds for all *ε* > 0 and leads to \(\sigma _{0}(\mathbf {v})=\infty \).

### *Proof*

If \(\sigma _{\varepsilon }(\mathbf {v})=:C<\infty \) we can choose \(\mathbf {v}_{i}\in U_{\mathcal {F},\varepsilon }(\mathbf {v})\) with \(\mathbf {v}_{i}\rightarrow \mathbf {v}\) and parameters \(p_{i}\in \mathcal {D}\) with **v**_{i} = *ρ*(*p*_{i}) and ∥*p*_{i}∥≤ *C* (cf. Remark 2). Lemma 2 yields the contradiction \(\mathbf {v}\notin {\mathscr{B}}\). □

In Section 3.2 we shall comment on the continuity of *σ*. A general negative result follows.

### *Remark 3*

If \({\mathscr{B}}\neq \emptyset \) (i.e., if the format is nonclosed), *σ* is discontinuous at 0 ∈**V**.

### *Proof*

(4d) implies *σ*(0) = 0. Assume that *σ* is continuous at 0. There is a neighbourhood \(U_{\mathcal {F},\varepsilon }(0)=:U_{\mathcal {F},\varepsilon }\) for some *ε* > 0 with *σ*(**v**) ≤ 1 for all \(\mathbf {v}\in U_{\mathcal {F},\varepsilon }\). Hence, *σ*_{ε−∥w∥}(**w**) ≤ 1 holds for all \(\mathbf {w}\in U_{\mathcal {F},\varepsilon }\). Conclusion 1 implies that \(\overline {U_{\mathcal {F},\varepsilon }}\cap {\mathscr{B}}=\emptyset \). On the other hand, there is some \(0\neq \mathbf {v}\in {\mathscr{B}}\). The cone property (4d) implies that \(\lambda \mathbf {v}\in {\mathscr{B}}\) for all *λ*≠ 0. For sufficiently small *λ*≠ 0, \(\lambda \mathbf {v}\in \overline {U_{\mathcal {F},\varepsilon }}\) yields the contradiction. □

### 3.2 Discussion of \(\mathcal {F}=\mathcal {R}_{r}\)

Let \(\mathbb {K}=\mathbb {C}\) be the underlying field. An interesting question is whether the quantity *σ*(**v**) is continuous. As known from algebraic geometry *general tensors*^{2} in \(\mathcal {R}_{r}\) admit only finitely many (essentially different) decompositions (cf. Section 1) and these decompositions depend continuously on the tensor, at least for *r* not too large. Then \(\sigma _{\varepsilon }(\mathbf {v})<\infty \) holds for sufficiently small *ε* > 0 and has the limit *σ*_{0}(**v**) = *σ*(**v**).

A particular positive result holds if the representation \(\rho :{\mathcal {D}}_{0}\subset \mathcal {D}\rightarrow \mathbf {V}\) is injective for a certain subset \(\mathcal {D}_{0}\) and the inverse map (the decomposition) \(\rho ^{-1}:\mathcal {F}_{0}:=\rho (\mathcal {D}_{0})\rightarrow P\) is continuous. Then *σ*_{ε}(**v**) is bounded for \(\mathbf {v}\in \mathcal {F}_{0}\) and *σ*(**v**) = *σ*_{0}(**v**). This situation occurs under the conditions studied in Sørensen et al. [14, 15, 16] and Domanov–De Lathauwer [6, 7, 8].

Above we require that *r* be not too large. In the case of *d* = 3 and \(\mathbf {V}=\mathbb {K}^{n}\otimes \mathbb {K}^{m}\otimes \mathbb {K}^{p}\) the concrete condition is as follows. For *r* ≤ (*n* − 1)(*m* − 1) general tensors have a *unique* decomposition as stated in Domanov–De Lathauwer [7, Corollary 1.7]. However, if *r* ≤ (*n* − 1)(*m* − 1) + 1 and^{3}\(\mathbb {K}=\mathbb {C}\), general tensors have *finitely* many decompositions (cf. Chiantini–Ottaviani [2, Proposition 5.4]).

The term ‘general tensor’ admits the existence of exceptional tensors. A particular exceptional situation is described in the following example.

### *Example 1*

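Let **V** = ⊗^{3}*V* with \(\dim (V)\geq 2\) and choose any linearly independent vectors *a*, *b* ∈ *V*. In the case of the 2-term format \(\mathcal {F}=\mathcal {R}_{2}\), the tensor^{4}

\[\mathbf{v}(t):=a\otimes a\otimes(a+b)+a\otimes b\otimes a+t\,b\otimes a\otimes a\]

belongs to \({\mathscr{B}}\) for *t* > 0, while \(\mathbf {v}(0)\in \mathcal {F}\).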

### *Proof*

- (a) For *t* = 0 we rewrite **v**(0) as \(a\otimes a\otimes (a+b) + a\otimes b\otimes a\in \mathcal {R}_{2}=\mathcal {F}\).
- (b) Let *t* > 0. For *c* linearly independent of *a*, the tensor **w** := *c* ⊗ *a* ⊗ *a* + *a* ⊗ *c* ⊗ *a* + *a* ⊗ *a* ⊗ *c* is well known to be in \({\mathscr{B}}=\overline {\mathcal {R}_{2}}\backslash \mathcal {R}_{2}\) (cf. Remark 1a). Let *φ* be an isomorphism on *V* with *φ*(*a*) = *a* and \(\varphi (c)=\frac {1}{t}c\), while *ψ* = *t* ⋅ *id*. Then Λ := *ψ* ⊗ *φ* ⊗ *φ* is an isomorphism on **V** with **u** := Λ(**w**) = *t* *c* ⊗ *a* ⊗ *a* + *a* ⊗ *c* ⊗ *a* + *a* ⊗ *a* ⊗ *c*. Hence also \(\mathbf {u}\in {\mathscr{B}}\). Substitution \(c=b+\frac {1}{t+2}a\) yields **u** = **v**(*t*) and proves \(\mathbf {v}(t)\in {\mathscr{B}}\) for *t* > 0.

Hence the tensor **v**(0) defined in Example 1 belongs to \(\partial {\mathscr{B}}:=\overline {{\mathscr{B}}}\backslash {\mathscr{B}}\).
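The substitution used in part (b) of the proof can be verified numerically. The sketch below assumes \(V=\mathbb{R}^{2}\) with unit vectors *a*, *b* and the family \(\mathbf{v}(t)=a\otimes a\otimes(a+b)+a\otimes b\otimes a+t\,b\otimes a\otimes a\), which is the form consistent with parts (a) and (b); the function names are illustrative.

```python
import numpy as np

def elementary(x, y, z):
    """Elementary tensor x ⊗ y ⊗ z stored as a 3-way array."""
    return np.einsum("i,j,k->ijk", x, y, z)

# Assumption for the illustration: V = R^2 with the unit vectors a, b.
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])

def v_of_t(t):
    """Assumed family v(t) = a⊗a⊗(a+b) + a⊗b⊗a + t b⊗a⊗a."""
    return elementary(a, a, a + b) + elementary(a, b, a) + t * elementary(b, a, a)

def u_of_t(t):
    """u = t c⊗a⊗a + a⊗c⊗a + a⊗a⊗c with the substitution c = b + a/(t+2)."""
    c = b + a / (t + 2.0)
    return t * elementary(c, a, a) + elementary(a, c, a) + elementary(a, a, c)

for t in (0.5, 1.0, 3.0):
    print(t, np.allclose(u_of_t(t), v_of_t(t)))
```

The check confirms the algebraic identity **u** = **v**(*t*) only; that **u** is a border tensor rests on the argument given in the proof, not on this computation.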

### 3.3 General Case

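The border set \({\mathscr{B}}\) need not be closed. Its closure is

\[\overline{{\mathscr{B}}}={\mathscr{B}}\cup\partial{\mathscr{B}}\qquad\text{with }\partial{\mathscr{B}}:=\overline{{\mathscr{B}}}\backslash{\mathscr{B}}. \tag{5}\]

If \({\mathscr{B}}\neq\emptyset\), the set \(\partial {\mathscr{B}}\) always contains the zero tensor:

\[0\in\partial{\mathscr{B}} \tag{6}\]

(consider *λ***v** with \(\mathbf {v} \in {\mathscr{B}}\) for \(\lambda \rightarrow 0\) and note that \(0\in \mathcal {F}\) because of (4dd)). Example 1 ensures that \(\partial {\mathscr{B}}\) may also contain *nontrivial* tensors of \(\mathcal {F}=\mathcal {R}_{2}\). We remark that

\[\partial{\mathscr{B}}\subset\mathcal{F} \tag{7}\]

since \(\overline{{\mathscr{B}}}\subset\overline{\mathcal{F}}=\mathcal{F}\cup{\mathscr{B}}\).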

### **Conclusion 2**

Let \(0\neq \mathbf {v}\in \partial {\mathscr{B}}\). Then \(\sigma _{\varepsilon }(\mathbf {v})=\infty \) for all *ε* > 0 and \(\sigma _{0}(\mathbf {v})=\infty \), although \(\sigma (\mathbf {v})<\infty \).

### *Proof*

By definition of \(\partial {\mathscr{B}}\) there is some \(\mathbf {w}\in {\mathscr{B}}\) with 0 < *η* := ∥**v** − **w**∥ < *ε*/2 and \(\sigma _{\eta }(\mathbf {w})=\infty \) (cf. Conclusion 1). Since \(U_{\mathcal {F},\eta }(\mathbf {w})\subset U_{\mathcal {F},\varepsilon }(\mathbf {v})\), *σ*_{ε}(**v**) ≥ *σ*_{η}(**w**) yields the assertion. □

A consequence is the discontinuity of *σ* on \(\partial {\mathscr{B}}\). Conclusion 2 ensures the existence of a sequence \(\mathbf {v}_{i}\rightarrow \mathbf {v}\) with \(\sigma (\mathbf {v}_{i})\rightarrow \infty \), whereas \(\sigma (\mathbf {v})<\infty \). This proves:

### **Conclusion 3**

*σ* is not continuous at \(\mathbf {v}\in \partial {\mathscr{B}}\backslash \{0\}\).

## 4 On the Strength of Divergence

The numerical instability is caused by the fact that \(\sigma (\mathbf {v}_{i})\rightarrow \infty \) holds for any sequence \(\mathcal {F}\ni \mathbf {v}_{i}\rightarrow \mathbf {v}\in {\mathscr{B}}\). Whether this is a severe problem or not depends on the order of divergence. The introductory example (2) is the classical one-sided difference quotient. Using the step size *h* for \({\tilde {\mathbf {v}}}(h)\), we get \(\sigma ({\tilde {\mathbf {v}}}(h))=\mathcal {O}(1/h)\) and the accuracy \(\varepsilon =\mathcal {O}(h)\). Expressing the quantity *σ* as a function of the accuracy *ε*, \(\sigma =\mathcal {O}(1/\varepsilon )\) shows divergence of first order. However, this is not the general behaviour. We may choose the central difference quotient \((\mathbf{w}(h)-\mathbf{w}(-h))/(2h)\) instead. Since still \(\sigma ({\tilde {\mathbf {v}}}(h))=\mathcal {O}(1/h)\) but \(\varepsilon =\mathcal {O}(h^{2})\), we now have the weaker divergence \(\sigma =\mathcal {O}(1/\sqrt {\varepsilon })\).
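The two accuracy orders quoted above can be estimated numerically. The following sketch assumes \(V=\mathbb{R}^{2}\) with unit vectors *a*, *b*, the rank-1 curve \(\mathbf{w}(t)=(a+tb)\otimes(a+tb)\otimes(a+tb)\) and the symmetric tensor **v** of the introduction; the helper `observed_order` is a hypothetical name that compares the errors for the step sizes *h* and *h*/2.

```python
import numpy as np

def elementary(x, y, z):
    """Elementary tensor x ⊗ y ⊗ z stored as a 3-way array."""
    return np.einsum("i,j,k->ijk", x, y, z)

# Assumption for the illustration: V = R^2 with the unit vectors a, b.
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
v = elementary(a, a, b) + elementary(a, b, a) + elementary(b, a, a)

def w(t):
    """Rank-1 curve w(t) = (a + t b) ⊗ (a + t b) ⊗ (a + t b)."""
    x = a + t * b
    return elementary(x, x, x)

def one_sided(h):
    """One-sided difference quotient (w(h) - w(0)) / h, a sum of two rank-1 terms."""
    return (w(h) - w(0.0)) / h

def central(h):
    """Central difference quotient (w(h) - w(-h)) / (2h), again two rank-1 terms."""
    return (w(h) - w(-h)) / (2.0 * h)

def observed_order(quotient, h):
    """Estimate k in error = O(h**k) from the errors at h and h/2."""
    e1 = np.linalg.norm(quotient(h) - v)
    e2 = np.linalg.norm(quotient(h / 2.0) - v)
    return np.log2(e1 / e2)

print("one-sided order:", observed_order(one_sided, 0.1))
print("central order:  ", observed_order(central, 0.1))
```

Both quotients consist of two elementary terms of size \(\mathcal{O}(1/h)\), so the parameter norm behaves alike; only the accuracy differs, which yields the weaker divergence \(\mathcal{O}(1/\sqrt{\varepsilon})\) for the central quotient.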

This example shows that, given an accuracy *ε* > 0, we have to look for an approximation \(\mathbf {w}\in \mathcal {F}\) with minimal *σ*(**w**). This leads to the following definition.

### **Definition 1**

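Let \(\mathbf{v}\in\overline{\mathcal{F}}\) and *ε* > 0. The instability of the approximation problem in \(\mathcal {F}\) is described by

\[\delta(\mathbf{v},\varepsilon):=\inf\{\sigma(\mathbf{w}):\mathbf{w}\in U_{\mathcal{F},\varepsilon}(\mathbf{v})\}.\]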

Note that *δ*(**v**, *ε*) is the infimum, whereas *σ*_{ε}(**v**) is the supremum over the same set. Again, *δ*(**v**, *ε*) diverges as \(\varepsilon \rightarrow 0\).

### **Proposition 1**

*Weakly monotone divergence*\(\delta (\mathbf {v},\varepsilon )\nearrow \infty \)*holds for all*\(\mathbf {v}\in {\mathscr{B}}\)*as ε* ↘ 0.

### *Proof*

For an indirect proof assume that \(\delta (\mathbf {v},\varepsilon )\leq K<\infty \) for all \(\varepsilon =\frac {1}{n}>0\). Then, for any \(n\in \mathbb {N}\), there are \(\mathbf {w}_{n}\in U_{\mathcal {F},1/n}(\mathbf {v})\) (i.e., \(\mathbf {w}_{n}\in \mathcal {F}\) and ∥**v** − **w**_{n}∥ ≤ 1/*n*) with *σ*(**w**_{n}) ≤ *K* + 1. Since \(\mathbf {w}_{n}\rightarrow \mathbf {v}\), Lemma 1 proves the contradicting statement \(\mathbf {v}\in \mathcal {F}\). □

### 4.1 Uniform Strength of Divergence

#### 4.1.1 Definitions

The function *δ*(**v**,⋅) is the exact description of the kind of divergence at \(\mathbf {v}\in {\mathscr{B}}\). In the case of the model example (1) and \(\mathcal {F}=\mathcal {R}_{2}\) we have seen that the divergence is not stronger than \(\mathcal {O}(1/\sqrt {\varepsilon })\)—so that \(\delta (\mathbf {v},\varepsilon )\lesssim 1/\sqrt {\varepsilon }\)—if we use the central difference quotient. There are difference formulae of higher consistency order, but they involve more than two terms, i.e., such approximations are not in \(\mathcal {R}_{2}\). This leads to the conjecture that \(\delta (\mathbf {v},\varepsilon )\sim 1/\sqrt {\varepsilon }\) holds for **v** in (1). The next question is as to whether this behaviour might hold for all \(\mathbf {v}\in {\mathscr{B}}\). The answer will be negative. There are particular tensors which behave differently. On the algebraic side, one might expand the difference quotient with step size *h* into a power series \(\mathbf {v}+{\sum }_{j}\mathbf {v}_{j}h^{j}\). In general, the central difference leads to **v**_{1} = 0 and some **v**_{2}. However, for certain **v** it might happen that **v**_{2} = **v**_{3} = ⋯ = **v**_{k− 1} = 0 so that \(\delta (\mathbf {v},\varepsilon )\sim \varepsilon ^{-1/k}\). The characterisation of the largest possible *k* seems to be an unsolved problem. In the following we try to treat this problem by analytic tools.

#### 4.1.2 Uniform Divergence

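Uniform divergence means that there is a function *δ*_{0} with

\[\delta_{0}(\varepsilon)\nearrow\infty\quad\text{as }\varepsilon\searrow 0 \tag{8a}\]

such that

\[\delta(\mathbf{v},\varepsilon)\geq\delta_{0}(\varepsilon)\quad\text{for all }\mathbf{v}\in{\mathscr{B}}\text{ with }\|\mathbf{v}\|=1\text{ and all }\varepsilon>0. \tag{8b}\]

The best function *δ*_{0} satisfying (8b) is

\[\delta_{0}(\varepsilon):=\inf\{\delta(\mathbf{v},\varepsilon):\mathbf{v}\in{\mathscr{B}},\ \|\mathbf{v}\|=1\}. \tag{9}\]

As *ε* ↘ 0, *δ*_{0}(*ε*) is weakly increasing. The crucial question is whether \(\lim _{\varepsilon \rightarrow 0}\delta _{0}(\varepsilon )<\infty \) or \(=\infty \).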

### **Proposition 2**

*Uniform divergence as in* (8a–8b) *holds if and only if*\({\mathscr{B}}\cup \{0\}\)*is closed*.

### *Proof*

As remarked in (6), zero does not belong to \({\mathscr{B}}\), but to its closure. Therefore, closedness of \({\mathscr{B}}\cup \{0\}\) means that \(\overline {{\mathscr{B}}}\backslash {\mathscr{B}}\) contains no *nontrivial* tensor. In particular \({\mathscr{B}}_{1}{:=}{\mathscr{B}}\cap \{\mathbf {v}{:}~\|\mathbf {v}\|{=}1\}\) would be closed.

(a) Let \({\mathscr{B}}\cup \{0\}\) be closed. For an indirect proof assume \(\lim _{\varepsilon \rightarrow 0}\delta _{0}(\varepsilon )=:K<\infty \). Then for any *ε* = 1/*n*, \(n\in \mathbb {N}\), there are \(\mathbf {v}_{n}\in {\mathscr{B}}_{1}\) and \(\mathbf {w}_{n}\in \mathcal {F}\) with *σ*(**w**_{n}) ≤ *K* + 1 and ∥**v**_{n} − **w**_{n}∥ ≤ 1/*n*. By compactness we may take subsequences (again denoted by **v**_{n}, **w**_{n}) so that \(\mathbf {v}_{n}\rightarrow \mathbf {v}\) and \(\mathbf {w}_{n}\rightarrow \mathbf {w}\). Since \({\mathscr{B}}_{1}\) is closed, we obtain \(\mathbf {v}\in {\mathscr{B}}_{1}\subset {\mathscr{B}}\). As *σ*(**w**_{n}) is uniformly bounded, the limit belongs to \(\mathcal {F}\) (cf. Lemma 1), i.e., \(\mathbf {w}\in \mathcal {F}\). Now ∥**v**_{n} − **w**_{n}∥ ≤ 1/*n* yields **v** = **w**, which is a contradiction (\(\mathcal {F}\) and \({\mathscr{B}}\) are disjoint!).

(b) If \({\mathscr{B}}\cup \{0\}\) is not closed, there is some \(0\neq \mathbf {w}\in \partial {\mathscr{B}}:=\overline {{\mathscr{B}}}\backslash {\mathscr{B}}\). Thanks to the cone property (4db), we may assume without loss of generality that ∥**w**∥ = 1. Note that \(\partial {\mathscr{B}}\subset \mathcal {F}\) (cf. (7)). Hence **w** has a finite value *ω* := *σ*(**w**). For any *ε* > 0 we find some \(\mathbf {v}\in {\mathscr{B}}_{1}\) with ∥**v** − **w**∥ ≤ *ε*. Now (9) implies that *δ*_{0}(*ε*) ≤ *ω* for all *ε* > 0, i.e., the property (8a) is not valid. □

In the interesting case of \(\mathcal {F}=\mathcal {R}_{r}\) we know that \({\mathscr{B}}\cup \{0\}\) is not closed (cf. Example 1). Hence uniform divergence (8a–8b) does not hold for \(\mathcal {F}=\mathcal {R}_{r}\). Nevertheless it is possible to refine the definition of divergence.

#### 4.1.3 Weaker Form of Uniform Divergence

In the case of \(\mathcal {F}=\mathcal {R}_{r}\), the exceptional set \(\partial {\mathscr{B}}=\overline {{\mathscr{B}}}\backslash {\mathscr{B}}\) is a rather small subset of \(\mathcal {F}\). In the following we formulate an inequality involving the distance from \(\partial {\mathscr{B}}\).

### **Theorem 4**

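*There is a function δ*_{1} *with*

\[\delta_{1}(\varepsilon)\nearrow\infty\quad\text{as }\varepsilon\searrow 0 \tag{10a}\]

*such that*

\[\delta(\mathbf{v},\varepsilon)\geq\text{dist}(\mathbf{v},\partial{\mathscr{B}})\,\delta_{1}(\varepsilon)\quad\text{for all }\mathbf{v}\in{\mathscr{B}}\text{ with }\|\mathbf{v}\|=1. \tag{10b}\]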

### *Proof*

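(a) If \(\text{dist}(\mathbf{v},\partial{\mathscr{B}})=0\), the estimate reduces to *δ*(**v**, *ε*) ≥ 0, which is trivial.

(b) In the following we consider those **v** with \(\mathbf {v}\in {\mathscr{B}}\), ∥**v**∥ = 1, and \(\text {dist}(\mathbf {v},\partial {\mathscr{B}})>0\). In this case the best possible *δ*_{1}(*ε*) is

\[\delta_{1}(\varepsilon):=\inf\left\{\frac{\delta(\mathbf{v},\varepsilon)}{\text{dist}(\mathbf{v},\partial{\mathscr{B}})}:\mathbf{v}\in{\mathscr{B}},\ \|\mathbf{v}\|=1,\ \text{dist}(\mathbf{v},\partial{\mathscr{B}})>0\right\}.\]

Obviously, *δ*_{1} is weakly increasing as \(\varepsilon \rightarrow 0\). For an indirect proof of (10a) we assume that \(\delta _{1}(\varepsilon )\leq K<\infty \). As in the proof of Proposition 2 there are convergent subsequences \(\mathbf {w}_{n}\in \mathcal {F}\), \(\mathbf {v}_{n}\in {\mathscr{B}}\) with

\[\|\mathbf{v}_{n}\|=1,\quad\|\mathbf{v}_{n}-\mathbf{w}_{n}\|\leq\tfrac{1}{n},\quad\sigma(\mathbf{w}_{n})\leq(K+1)\,\text{dist}(\mathbf{v}_{n},\partial{\mathscr{B}}),\quad\mathbf{v}_{n}\rightarrow\mathbf{v},\quad\mathbf{w}_{n}\rightarrow\mathbf{w}.\]

Because of \(0\in\partial{\mathscr{B}}\) and ∥**v**_{n}∥ = 1 we have \(\text{dist}(\mathbf{v}_{n},\partial{\mathscr{B}})\leq 1\), so that *σ*(**w**_{n}) ≤ *K* + 1. Remark 2 provides parameters *p*_{n} with **w**_{n} = *ρ*(*p*_{n}) and *σ*(**w**_{n}) = ∥*p*_{n}∥.

Assume that \(\text{dist}(\mathbf{v}_{n},\partial{\mathscr{B}})\rightarrow 0\). Now \(\sigma (\mathbf {w}_{n})=\|p_{n}\| \rightarrow 0\) proves \(p_{n}\rightarrow 0\), while (4a–4d) show that \(\mathbf {w}=\lim \mathbf {w}_{n}=\lim \rho (p_{n})=\rho (0)=0\). However, since the norm is continuous, ∥**w**∥ = 0 is a contradiction to ∥**w**∥ = 1. The latter equality follows from \(\|\mathbf {v}_{n}-\mathbf {w}_{n}\| \leq \frac {1}{n}\) and ∥**v**_{n}∥ = 1. Hence \(\lim _{n\rightarrow \infty }\text {dist}(\mathbf {v}_{n},\partial {\mathscr{B}})=\text {dist}(\mathbf {v},\partial {\mathscr{B}})>0\) holds and implies that \(\mathbf {v}\notin \partial {\mathscr{B}}\). Since \(\mathbf {v}_{n}\in {\mathscr{B}}\), the limit **v** is in \(\overline {{\mathscr{B}}}={\mathscr{B}}\cup \partial {\mathscr{B}}\) (cf. (5)) and \(\mathbf {v}\notin \partial {\mathscr{B}}\) proves \(\mathbf{v}\in{\mathscr{B}}\). On the other hand, the boundedness of *σ*(**w**_{n}) implies \(\mathbf{w}\in\mathcal{F}\) (cf. Lemma 1). Finally, \(\|\mathbf {v}_{n}-\mathbf {w}_{n}\| \leq \frac {1}{n}\) yields **v** = **w**, which is a contradiction since both tensors are in disjoint sets (cf. (11), (12)). □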

The interpretation of Theorem 4 depends on the topological structure of \(\partial {\mathscr{B}}\) as seen next.

### *Remark 4*

If \(\partial {\mathscr{B}}\) is closed, the distance \(\text {dist}(\mathbf {v},\partial {\mathscr{B}})\) is positive for all \(\mathbf {v}\in {\mathscr{B}}\). This yields a nontrivial estimate (10b) for all \(\mathbf {v}\in {\mathscr{B}}\).

### *Proof*

\(\mathbf {v}\in {\mathscr{B}}\) and \(\partial {\mathscr{B}}\subset \mathcal {F}\) implies \(\mathbf {v}\notin \partial {\mathscr{B}}\). Note that \(\text {dist}(\mathbf {v},\partial {\mathscr{B}})=0\) for a closed set \(\partial {\mathscr{B}}\) is equivalent to \(\mathbf {v}\in \partial {\mathscr{B}}\). □

### *Remark 5*

Let \(\mathcal{C}:=\overline{\partial{\mathscr{B}}}\backslash\partial{\mathscr{B}}\). (a) \(\mathcal {C}\) is a subset of \({\mathscr{B}}\). (b) \(\text {dist}(\mathbf {v},\partial {\mathscr{B}})=0\) holds for \(\mathbf {v}\in {\mathscr{B}}\) if and only if \(\mathbf {v}\in \mathcal {C}\).

### *Proof*

(a) \(\partial {\mathscr{B}}\subset \overline {{\mathscr{B}}}\) implies \(\overline {\partial {\mathscr{B}}}\subset \overline {{\mathscr{B}}}={\mathscr{B}}\cup \partial {\mathscr{B}}\) and \(\mathcal {C}\subset {\mathscr{B}}\cup \partial {\mathscr{B}}\). Since \(\mathcal {C}\cap \partial {\mathscr{B}}=\emptyset \), \(\mathcal {C}\subset {\mathscr{B}}\) is proved.

(b) Note that \(\text {dist}(\mathbf {v},\partial {\mathscr{B}})=0\) is equivalent to \(\text {dist}(\mathbf {v},\overline {\partial {\mathscr{B}}})=0\). In the latter case there is some \(\mathbf {w}\in \overline {\partial {\mathscr{B}}}\) with \(\|\mathbf {v}-\mathbf {w}\|=\text {dist}(\mathbf {v},\overline {\partial {\mathscr{B}}})=0\), i.e., **v** = **w**. Comparing \(\mathbf {v}\in {\mathscr{B}}\) and \(\mathbf {w}\in \overline {\partial {\mathscr{B}}}=\partial {\mathscr{B}}\cup \mathcal {C}\) and noting that \(\partial {\mathscr{B}}\subset \mathcal {F}\), it follows that \(\mathbf {v}=\mathbf {w}\in \mathcal {C}\). □

In case of a nonclosed \(\partial {\mathscr{B}}\), the estimate (10b) degenerates to *δ*(**v**, *ε*) ≥ 0 if and only if \(\mathbf {v}\in \mathcal {C}\).

### *Remark 6*

For \(\mathcal {F}=\mathcal {R}_r\) it is not hard to prove that the divergence behaviour only depends on the border rank and the order *d* of the tensor, but not on \(\dim ({V}_{j})\).

#### 4.1.4 Example \({\otimes }^{3}{\mathbb {R}}^{2}\)

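Let \(\mathbf{V}={\otimes}^{3}\mathbb{R}^{2}\). The maximal rank in **V** is 3 (cf. Kruskal [10]). Hence \(\mathcal {R}_{3}\) coincides with **V** and is obviously closed. As seen by the tensor (1), \(\mathcal {R}_{2}\) is not closed. In fact, (1) describes all border tensors up to tensor space isomorphisms:

\[{\mathscr{B}}=\left\{\left({\bigotimes}_{j=1}^{3}\phi^{(j)}\right)(\mathbf{v}):\phi^{(j)}\in L(\mathbb{R}^{2},\mathbb{R}^{2})\text{ isomorphisms}\right\},\]

where **v** is defined in (1) with {*a*, *b*} being a fixed basis of \(\mathbb {R}^{2}\). Let *ϕ*^{(2)} = *ϕ*^{(3)} be the identity and define *ϕ*^{(1)} by *ϕ*^{(1)}(*a*) = *a*, *ϕ*^{(1)}(*b*) = *a* + *tb*. For *t* ≠ 0, *ϕ*^{(1)} is an isomorphism, whereas for *t* = 0 it is not invertible. Note that with these mappings \(\left ({\bigotimes }_{j=1}^{3}\phi ^{(j)}\right )(\mathbf {v})\) coincides with the tensor in Example 1. For *t* = 0 we obtain the tensor

\[\mathbf{w}_{1}:=a\otimes a\otimes(a+b)+a\otimes b\otimes a.\]

The analogous construction for *j* = 2 and *j* = 3 yields

\[\mathbf{w}_{2}:=a\otimes a\otimes(a+b)+b\otimes a\otimes a,\qquad\mathbf{w}_{3}:=a\otimes(a+b)\otimes a+b\otimes a\otimes a.\]

The set \(\partial {\mathscr{B}}\) is described by the tensors **w**_{i}, *i* ∈ {1,2,3}, and general linear maps \(\psi ^{(j)}\in L(\mathbb {R}^{2},\mathbb {R}^{2})\), i.e., by tensors of the form \(\left({\bigotimes}_{j=1}^{3}\psi^{(j)}\right)(\mathbf{w}_{i})\).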

### **Proposition 3**

\(\partial {\mathscr{B}}\)*is closed*.

## Footnotes

1. Since we only consider finite-dimensional tensor spaces, all tensors are algebraic tensors, i.e., their rank is finite.

2. The term ‘general’ tensor means all tensors for which a polynomial does not vanish. The set of exceptional tensors is of measure zero.

3. The case of \(\mathbb {K}=\mathbb {R}\) is more involved (cf. Angelini–Bocci–Chiantini [1, Theorem 4.2]).

4. Private communication by M. Michałek.

## Notes

### Acknowledgments

Open access funding provided by Max Planck Society. I thank Mateusz Michałek (Leipzig) for many instructive discussions. From him I got better insight into the nature of the set \(\partial {\mathscr{B}}\).

## References

1. Angelini, E., Bocci, C., Chiantini, L.: Real identifiability vs. complex identifiability. Linear Multilinear Algebra **66**, 1257–1267 (2018)
2. Chiantini, L., Ottaviani, G.: On generic identifiability of 3-tensors of small rank. SIAM J. Matrix Anal. Appl. **33**, 1018–1037 (2012)
3. Coppi, R., Bolasco, S. (eds.): Multiway Data Analysis. North-Holland, Amsterdam (1989)
4. De Silva, V., Lim, L. H.: Tensor rank and the ill-posedness of the best low-rank approximation problem. SIAM J. Matrix Anal. Appl. **30**, 1084–1127 (2008)
5. Czapliński, A., Michałek, M., Seynnaeve, T.: Uniform matrix product states from an algebraic geometer’s point of view. arXiv:1904.07563 (2019)
6. Domanov, I., De Lathauwer, L.: Canonical polyadic decomposition of third-order tensors: Reduction to generalized eigenvalue decomposition. SIAM J. Matrix Anal. Appl. **35**, 636–660 (2014)
7. Domanov, I., De Lathauwer, L.: Generic uniqueness conditions for the canonical polyadic decomposition and INDSCAL. SIAM J. Matrix Anal. Appl. **36**, 1567–1589 (2015)
8. Domanov, I., De Lathauwer, L.: Canonical polyadic decomposition of third-order tensors: Relaxed uniqueness conditions and algebraic algorithm. Linear Algebra Appl. **513**, 342–375 (2017)
9. Hackbusch, W.: Tensor Spaces and Numerical Tensor Calculus. Springer, Berlin (2012). 2nd edn. appears in 2020
10. Kruskal, J. B.: Rank, decomposition, and uniqueness for 3-way and *N*-way arrays. In: Coppi, Bolasco (eds.) Multiway Data Analysis, pp 7–18. North-Holland, Amsterdam (1989)
11. Landsberg, J. M.: Tensors: Geometry and Applications. AMS, Providence (2012)
12. Perez-García, D., Verstraete, F., Wolf, M. M., Cirac, J. I.: Matrix product state representations. Quantum Inf. Comput. **7**, 401–430 (2007)
13. Qi, Y., Michałek, M., Lim, L. H.: Complex best *r*-term approximations almost always exist in finite dimensions. Appl. Comput. Harmon. Anal. Available online (2019)
14. Sørensen, M., De Lathauwer, L.: Coupled canonical polyadic decompositions and (coupled) decompositions in multilinear rank-(*L*_{r,n}, *L*_{r,n}, 1) terms – part I: Uniqueness. SIAM J. Matrix Anal. Appl. **36**, 496–522 (2015)
15. Sørensen, M., De Lathauwer, L., Comon, P., Icart, S., Deneire, L.: Canonical polyadic decomposition with a columnwise orthonormal factor matrix. SIAM J. Matrix Anal. Appl. **33**, 1190–1213 (2012)
16. Sørensen, M., Domanov, I., De Lathauwer, L.: Coupled canonical polyadic decompositions and (coupled) decompositions in multilinear rank-(*L*_{r,n}, *L*_{r,n}, 1) terms – part II: Algorithms. SIAM J. Matrix Anal. Appl. **36**, 1015–1045 (2015)

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.