1 Introduction

In this paper, we recapitulate the numerical techniques which are needed to handle high-dimensional problems. As a discussion starter, we use an example from quantum chemistry. The following function h is to be determined:

$$ h(x,z)={\int}_{\mathbb{R}^{3}}f(x,x-y)g(y,z)\mathrm{d}y\qquad(x,z\in\mathbb{R}^{3}) $$
(1)

(for instance, f and g describe the pair amplitude and the pair interaction; cf. Flad–Flad-Harutyunyan [5]). A discretisation by a uniform grid \(\{\mathbf{i}h=(i_{1}h,i_{2}h,i_{3}h):0\leq i_{1},i_{2},i_{3}\leq n-1\}\) (h: grid size) in a cube leads to the discrete problem

$$ h_{\mathbf{ik}}=h^{3}{\sum}_{\mathbf{j}}f_{\mathbf{i},\mathbf{i} - \mathbf{j}}g_{\mathbf{j},\mathbf{k}}\qquad(\mathbf{i}=(i_{1},i_{2},i_{3}),\mathbf{k}=(k_{1},k_{2},k_{3}),0\leq i_{\nu},k_{\nu}\leq n-1). $$
(2)

Equation (2) describes an unusual matrix multiplication of convolution type:

$$ H=F\star G\qquad(H=(h_{\mathbf{ik}}),F=(f_{\mathbf{i},\mathbf{j}}),G=(g_{\mathbf{j},\mathbf{k}})). $$
(3)

The size of the matrices (number of entries) is \(n^{6}\). Taking n of the size \(2^{10}\approx 10^{3}\) to \(2^{20}\approx 10^{6}\), it becomes obvious that naive methods cannot be used to perform the multiplication (3).

In Section 2, we shall consider the matrices in (3) as tensors of the space \(\mathbb {R}^{N}\otimes \mathbb {R}^{N}\) with

$$N=n^{3}. $$

Then, problem (3) reduces to operations on vectors in \(\mathbb {R}^{N}\).

In a second step (Section 3), \(\mathbb {R}^{N}\) is regarded as the tensor space \(\mathbb {R}^{n}\otimes \mathbb {R}^{n}\otimes \mathbb {R}^{n}\). For such tensors, we describe an efficient representation and show how operations are performed. In our example, we need two operations in \(\mathbb {R}^{n}\):

  • the Hadamard product \(v\odot w\) defined by the componentwise product \((v\odot w)_{i}=v_{i}w_{i}\), and

  • the convolution \(v\star w\) defined by \((v\star w)_{i}={\sum }_{\ell }v_{i-\ell }w_{\ell }\).

The convolution \(v\star w\) is a discretisation of the convolution of functions, \({\int }_{\mathbb {R}}v(x-y)w(y)\mathrm {d}y\), provided that \(v_{i}\) (\(w_{i}\)) are the nodal values of v (w) on an equidistant grid. For instance, the convolution in \(\mathbb {R}^{n}\) can be performed by the fast Fourier transform (FFT), requiring \(\mathcal{O}(n\log n)\) operations. However, as explained in Section 4, we can perform the convolution (as well as the Hadamard product) much faster using the tensorisation technique. Here, \(\mathbb {R}^{n}\) for \(n=2^{L}\) is replaced by the isomorphic tensor space \(\otimes ^{L}\mathbb {R}^{2}\). In many cases, grid functions in \(\mathbb {R}^{n}\) (in particular those from quantum chemistry) can be approximated by a tensor representation using only \(\mathcal {O}(\log ^{\ast }n)\) data. Then, the exact convolution \(v\star w\) requires no more than \(\mathcal {O}(\log ^{\ast }n)\) operations.
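
To fix notation, the following minimal NumPy sketch (not part of the original text; the data are arbitrary) shows both operations for small vectors in \(\mathbb{R}^{n}\):

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0, 4.0])
w = np.array([0.5, 0.0, 1.0, 2.0])

hadamard = v * w              # (v ⊙ w)_i = v_i * w_i, again a vector of length n
conv = np.convolve(v, w)      # (v ⋆ w)_i = sum_l v_{i-l} w_l, a vector of length 2n-1

print(hadamard)               # [0.5 0.  3.  8. ]
print(conv)                   # [0.5 1.  2.5 6.  7.  10.  8. ]
```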

The convolution algorithm mentioned above is also of interest outside quantum chemistry. Often, the functions v and w in \({\int }_{\mathbb {R}}v(x-y)w(y)\mathrm {d}y\) are represented by finite elements using locally refined grids or even hp techniques in order to reduce the number of degrees of freedom. If FFT is used for the convolution, one must transfer the finite-element functions to a uniform grid corresponding to the minimal grid size and thus destroys the advantages of the nonuniform finite-element approach. The tensorisation technique is able to represent the data at least as efficiently as the finite-element approach. Then, the operation cost is determined by the data sizes of the representations. Moreover, it yields the optimal representation of the result \(v\star w\).

2 Low-Rank Techniques for Matrices

2.1 Low-Rank Representation

In quantum chemistry, it is more usual to write the integral (1) as

$$ h(x,z)={\int}_{\mathbb{R}^{3}}\tilde{f}(x,y)g(y,z)\mathrm{d}y \qquad(x,z\in\mathbb{R}^{3}) $$
(4)

by introducing \(\tilde {f}(x,y):=f(x,x-y)\) (cf. [5, (1.4)]). Then, the discrete analogue is the standard matrix product \(\tilde {F}G\) instead of (3). However, this notation is less appropriate since the properties of the function f and of the matrix F are swept under the carpet.

The function f has a (representation) rank r if \(f(x,y)={\sum }_{\nu = 1}^{r}a_{\nu }(x)b_{\nu }(y)\), where {aν} and {bν} are linearly independent univariate functions. The latter identity is also written in tensor form as

$$f=\sum\limits_{\nu= 1}^{r}a_{\nu}\otimes b_{\nu}. $$

For instance, the function \(f(x,y)=\varphi(x)/\|y-y_{0}\|\) (y0 the position of a nucleus) has rank r = 1. However, the function \(\tilde {f}(x,y):=\varphi (x)/\|x-y-y_{0}\|\) involved in (4) has infinite rank.

If the matrix \(F\in \mathbb {R}^{N\times N}\) has the rank r, it allows a representation \(F={\sum }_{\nu = 1}^{r}a_{\nu }b_{\nu }^{\mathsf {T}}~(a_{\nu },b_{\nu }\in \mathbb {R}^{N})\). Again, we write

$$ F={\sum}_{\nu= 1}^{r}a_{\nu}\otimes b_{\nu}. $$
(5)

The splitting of the tensor space \(\mathbb {R}^{N}\otimes \mathbb {R}^{N}\cong \mathbb {R}^{N\times N}\) (≅ denotes isomorphy) into the two factors \(\mathbb {R}^{N}\) is depicted in Fig. 1. In general, the tensor product \(\mathbf{v}=v^{(1)}\otimes v^{(2)}\otimes\cdots\otimes v^{(d)}\) with \(v^{(j)}\in \mathbb {R}^{n_{j}}\) is a quantity indexed by d-tuples \(\mathbf{i}=(i_{1},\ldots,i_{d})\) with the values

$$ \mathbf{v[i]}=v^{(1)}[i_{1}]\cdot v^{(2)}[i_{2}]\cdot\ldots\cdot v^{(d)} [i_{d}]\qquad(1\leq i_{j}\leq n_{j}). $$
(6)

Here and in the sequel, we use boldface letters for tensors and tensor spaces, while vectors, matrices, and vector spaces are denoted by standard letters.

Fig. 1  Tensor space \(\mathbb {R}^{N}\otimes \mathbb {R}^{N}\cong \mathbb {R}^{N\times N}\) and its factors \(\mathbb {R}^{N},\mathbb {R}^{N}\)

If r is much smaller than N, (5) describes the low-rank representation of F. Note that the right-hand side of (5) requires only \(2rN\ll N^{2}\) data.
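
As an illustration (a sketch with hypothetical sizes, not taken from the paper), the factored form (5) can be stored and applied to a vector without ever forming the full N × N array:

```python
import numpy as np

N, r = 1000, 5                       # hypothetical sizes with r << N
rng = np.random.default_rng(0)
a = rng.standard_normal((r, N))      # factors a_nu
b = rng.standard_normal((r, N))      # factors b_nu

# the representation (5) needs only the 2*r*N numbers stored in a and b;
# a matrix-vector product F x = sum_nu a_nu (b_nu^T x) costs O(rN) instead of O(N^2)
x = rng.standard_normal(N)
Fx = a.T @ (b @ x)
```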

\(v^{(1)}\otimes v^{(2)}\otimes\cdots\otimes v^{(d)}\) is called an elementary tensor. In general, the \(v^{(j)}\) may be elements of arbitrary vector spaces Vj. The (algebraic) tensor space \(\mathbf {V}=V_{1}\otimes V_{2}\otimes \cdots \otimes V_{d}=\bigotimes _{j = 1}^{d}V_{j}\) is defined as the span of all elementary tensors (cf. [10, Section 3.2]).

Remark 1

As a consequence, linear maps on V are uniquely defined by their values on elementary tensors. The same holds for bilinear maps on Cartesian products V × W of two tensor spaces.

2.2 SVD Truncation

Even if F has maximal rank N, it might be well approximated by a low-rank matrix Fε with rank rε. For the precise analysis, we need the singular-value decomposition (SVD) of F which is

$$F=\sum\limits_{\nu= 1}^{r}\sigma_{\nu}a_{\nu}\otimes b_{\nu},\qquad \{a_{\nu}\},\{b_{\nu}\}\text{ orthonormal systems}, $$

with the singular values \(\sigma_{1}\geq\sigma_{2}\geq\cdots\geq\sigma_{r}>0\). The traditional formulation is \(F=U{\Sigma}V^{\mathsf{T}}\), where the columns of U and V are defined by aν and bν, respectively, and Σ is the diagonal matrix containing the singular values.

If \(\sigma _{r_{\varepsilon }}\leq \varepsilon \) for some rε < r, the truncated matrix \(F_{\varepsilon }:={\sum }_{\nu = 1}^{r_{\varepsilon }}\sigma _{\nu }a_{\nu }\otimes b_{\nu }\) has rank rε and satisfies the spectral norm estimate ∥FFε2ε.
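
A short sketch of this truncation using the SVD routine of NumPy (the test matrix is a generic smooth kernel, chosen only for illustration):

```python
import numpy as np

def svd_truncate(F, eps):
    # keep all singular values larger than eps; then ||F - F_eps||_2 <= eps
    U, s, Vt = np.linalg.svd(F, full_matrices=False)
    r_eps = int(np.sum(s > eps))
    F_eps = U[:, :r_eps] @ np.diag(s[:r_eps]) @ Vt[:r_eps]
    return F_eps, r_eps

x = np.linspace(0.0, 1.0, 200)
F = np.exp(-(x[:, None] - x[None, :]) ** 2)      # analytic kernel: fast singular-value decay
F_eps, r_eps = svd_truncate(F, 1e-8)
print(r_eps, np.linalg.norm(F - F_eps, 2))       # moderate rank, spectral error below 1e-8
```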

Now, we assume

$$F=\sum\limits_{\nu= 1}^{r}a_{\nu}\otimes b_{\nu},\qquad G=\sum\limits_{\mu= 1}^{s}c_{\mu}\otimes d_{\mu} $$

for the matrices in (3). We denote the entries of the vectors aν,bν,… by aν[i],bν[i],…, where i abbreviates the triple (i1,i2,i3). Since \(F_{\mathbf {i},\mathbf {j}}={\sum }_{\nu = 1}^{r}a_{\nu }[\mathbf {i}]b_{\nu }[\mathbf {j}]\) etc., the operation described in (2) becomes

$$h_{\mathbf{ik}}=h^{3}\sum\limits_{\nu= 1}^{r}\sum\limits_{\mu= 1}^{s}\sum\limits_{\mathbf{j}}a_{\nu }[\mathbf{i}] b_{\nu}[\mathbf{i}-\mathbf{j}] c_{\mu}[\mathbf{j}] d_{\mu}[\mathbf{k}]. $$

\({\sum }_{\mathbf {j}}b_{\nu }[\mathbf {i}-\mathbf {j}] c_{\mu }[\mathbf {j}]\) is the component of the convolution \(b_{\nu}\star c_{\mu}\) at index \(\mathbf{i}\). Set \(q_{\nu\mu}:=b_{\nu}\star c_{\mu}\). Then, the expression \({\sum }_{\mathbf {j}}a_{\nu }[\mathbf {i}] b_{\nu }[\mathbf {i}-\mathbf {j}] c_{\mu }[\mathbf {j}]\) is the \(\mathbf{i}\)-component of the Hadamard product \(a_{\nu}\odot q_{\nu\mu}\). Together, we obtain the representation of the matrix H in (3) by

$$ H=\sum\limits_{\mu= 1}^{s}\left( h^{3}\sum\limits_{\nu= 1}^{r}\left[a_{\nu}\odot\left( b_{\nu}\star c_{\mu}\right)\right] \right) \otimes d_{\mu}. $$
(7)

Hence, the following has to be calculated:

  (a) determine the vectors \(q_{\nu \mu }:=b_{\nu }\star c_{\mu }\in \mathbb {R}^{N}\),

  (b) calculate the Hadamard products \(a_{\nu }\odot q_{\nu \mu }\in \mathbb {R}^{N}\),

  (c) determine the sum \(e_{\mu }:=h^{3}{\sum }_{\nu = 1}^{r}a_{\nu }\odot q_{\nu \mu }\).

Then, \(H={\sum }_{\mu = 1}^{s}e_{\mu }\otimes d_{\mu }\) is the representation of the resulting matrix. This shows that H is again a low-rank matrix if G is so. Nevertheless, one may apply a singular-value decomposition and truncate H to a lower rank.
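
The following sketch (illustrative only; tiny hypothetical sizes and random factors) carries out steps (a)–(c) and keeps H in the factored form \(H={\sum}_{\mu}e_{\mu}\otimes d_{\mu}\):

```python
import numpy as np

def conv3(b, c):
    # multivariate convolution of two (n,n,n) arrays via FFT, truncated to
    # the index range 0 <= i_nu <= n-1 used in (2)
    n = b.shape[0]
    m = 2 * n - 1
    full = np.fft.ifftn(np.fft.fftn(b, (m, m, m)) * np.fft.fftn(c, (m, m, m)))
    return np.real(full)[:n, :n, :n]

rng = np.random.default_rng(1)
n, r, s, h = 4, 2, 3, 0.25                     # hypothetical grid size, ranks, mesh width
a = rng.standard_normal((r, n, n, n))          # a_nu, stored as (n,n,n) grids
b = rng.standard_normal((r, n, n, n))          # b_nu
c = rng.standard_normal((s, n, n, n))          # c_mu
d = rng.standard_normal((s, n, n, n))          # d_mu

# (a) q_{nu,mu} = b_nu ⋆ c_mu, (b) Hadamard products, (c) sums e_mu
e = [h**3 * sum(a[nu] * conv3(b[nu], c[mu]) for nu in range(r)) for mu in range(s)]
# H = sum_mu e_mu ⊗ d_mu is kept in this factored form; an entry would be
# H[i, k] = sum_mu e_mu[i] * d_mu[k]
```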

Since \(N=n^{3}\) holds with a large value of n, even the simple Hadamard product in Step (b) is too costly when using the standard vector format. Instead, we shall exploit the tensor structure of \(\mathbb {R}^{N}\).

For later use, we return to the representation (5). Let

$$U:=\operatorname{span}\{a_{\nu}:1\leq\nu\leq r\},\qquad V:=\operatorname{span} \{b_{\nu}:1\leq\nu\leq r\}. $$

Then, the tensor (matrix) F satisfies

$$ F\in U\otimes V\qquad\text{with }\dim(U)=\dim(V)=r. $$
(8)

Comparing (8) with \(F\in \mathbb {R}^{N}\otimes \mathbb {R}^{N}\), we see that the full space \(\mathbb {R}^{N}\) of dimension N is replaced by subspaces of dimension \(r\ll N\).

3 The Hierarchical Tensor Format

3.1 Separation and Bilinear Operations

Here, we make use of the Cartesian product structure of the grid \(\{(i_{1}h,i_{2}h,i_{3}h):0\leq i_{1},i_{2},i_{3}\leq n-1\}\). The tensor product of three vectors \(a,b,c\in \mathbb {R}^{n}\) is defined in (6). These tensors span the tensor space \(\mathbb {R}^{n}\otimes \mathbb {R}^{n}\otimes \mathbb {R}^{n}\), which is isomorphic to \(\mathbb {R}^{N}\) (both spaces have dimension \(N=n^{3}\)).

The analogue of the decomposition (5) would be the representation of \(\mathbf {v}\in \mathbf {V}:=\mathbb {R}^{n}\otimes \mathbb {R}^{n}\otimes \mathbb {R}^{n}\) by

$$ \mathbf{v}=\sum\limits_{\nu= 1}^{r}a_{\nu}\otimes b_{\nu}\otimes c_{\nu}. $$
(9)

The smallest possible value of r is called the rank of the tensor \(\mathbf{v}\). The fact that in general the determination of this rank is NP-hard (cf. Håstad [12]) already shows that the case of tensors of order ≥ 3 is much more involved. In particular, there is no direct analogue of the singular-value decomposition. This leads to difficulties when one wants to truncate a tensor to lower rank (cf. Espig–Hackbusch [4]).

The Hadamard product (componentwise product) ⊙ is a bilinear operation \(\mathbf{V}\times\mathbf{V}\rightarrow\mathbf{V}\). Another bilinear map is the matrix-vector multiplication. For a unified approach, let \(\boxdot \) be the symbol of a general bilinear operation between two tensor spaces. An efficient computation of such a tensor operation \(\boxdot :\mathbf {X}\times \mathbf {Y}\rightarrow \mathbf {Z}\) (with \(\mathbf {X}=\bigotimes _{j = 1}^{d}X_{j}\), etc.) can be based on the following property (10), provided this property holds. Let \(\mathbf {x} = \bigotimes _{j = 1}^{d}x^{(j)}\) and \(\mathbf {y}=\bigotimes _{j = 1}^{d}y^{(j)}\) be elementary tensors with \(x^{(j)}\in X_{j}\), \(y^{(j)}\in Y_{j}\). Then,

$$ \left( \bigotimes_{j = 1}^{d}x^{(j)}\right) \boxdot \left( \bigotimes_{j = 1}^{d} y^{(j)}\right) = \bigotimes_{j = 1}^{d}\left( x^{(j)}\boxdot_{j}y^{(j)}\right) $$
(10)

reduces the operation \(\boxdot \) to simpler bilinear operations \(\boxdot _{j}:X_{j}\times Y_{j}\rightarrow Z_{j}\) on the individual vector spaces.

In the case of the Hadamard product, \(\boxdot =\odot \) is the componentwise product of tensors, while \(\boxdot _{j}=\odot \) is the componentwise product of vectors. In fact, the property

$$ \left( a\otimes b\otimes c\right) \odot \left( a^{\prime}\otimes b^{\prime}\otimes c^{\prime}\right) = \left( a\odot a^{\prime}\right) \otimes \left( b\odot b^{\prime}\right) \otimes\left( c\odot c^{\prime}\right) $$
(11)

follows since \(\{(a\otimes b\otimes c)\odot(a^{\prime}\otimes b^{\prime}\otimes c^{\prime})\}[\mathbf{i}]=(a\otimes b\otimes c)[\mathbf{i}]\cdot(a^{\prime}\otimes b^{\prime}\otimes c^{\prime})[\mathbf{i}]=a[i_{1}]b[i_{2}]c[i_{3}]\,a^{\prime}[i_{1}]b^{\prime}[i_{2}]c^{\prime}[i_{3}]\) and \(\{(a\odot a^{\prime})\otimes(b\odot b^{\prime})\otimes(c\odot c^{\prime})\}[\mathbf{i}]=(a\odot a^{\prime})[i_{1}]\,(b\odot b^{\prime})[i_{2}]\,(c\odot c^{\prime})[i_{3}]=a[i_{1}]a^{\prime}[i_{1}]\,b[i_{2}]b^{\prime}[i_{2}]\,c[i_{3}]c^{\prime}[i_{3}]\) coincide. Note that on the left-hand side of (11) ⊙ acts on \(\mathbf{V}\times\mathbf{V}\), whereas on the right-hand side ⊙ acts on \(\mathbb {R}^{n}\times \mathbb {R}^{n}\).

Another example is the canonical scalar product of a (pre-)Hilbert tensor space X satisfying

$$\left\langle \bigotimes_{j = 1}^{d} x^{(j)},~ \bigotimes_{j = 1}^{d} y^{(j)}\right\rangle = {\prod}_{j = 1}^{d} \left\langle x^{(j)},y^{(j)}\right\rangle. $$

This corresponds to (10) with Y = X and \(\mathbf {Z}=\mathbb {R}\) (the field \(\mathbb {R}\) is considered as a tensor space of order d = 0).

The notation \((\mathbf {x}\star \mathbf {y})[\mathbf {i}]={\sum }_{\mathbf {j}}\mathbf {x}[\mathbf {i}-\mathbf {j}]\mathbf {y}[\mathbf {j}]\) of the multivariate convolution involving multiindices \(\mathbf {i}\in \mathbb {N}_{0}^{d}\) shows that also \(\boxdot =\star \) satisfies (10). For d = 3, we have

$$ \left( a_{\nu}\otimes b_{\nu}\otimes c_{\nu}\right) \star \left( a_{\nu}^{\prime}\otimes b_{\nu}^{\prime}\otimes c_{\nu}^{\prime}\right) = \left( a_{\nu}\star a_{\nu}^{\prime}\right) \otimes\left( b_{\nu}\star b_{\nu}^{\prime}\right) \otimes\left( c_{\nu}\star c_{\nu}^{\prime}\right). $$
(12)

Hence, the Hadamard and convolution operations can be reduced to operations acting on vectors in \(\mathbb {R}^{n}\). If v and w are given in the form (9), all pairs of elementary terms can be treated by (11) or (12), respectively.
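
Both reduction rules are easy to check numerically; the sketch below (not from the paper) builds elementary tensors of order 3 and compares the two sides of (11) and (12):

```python
import numpy as np

def elem(x, y, z):
    # elementary tensor x ⊗ y ⊗ z with entries x[i1]*y[i2]*z[i3]
    return np.einsum('i,j,k->ijk', x, y, z)

def convn(X, Y):
    # full multivariate convolution via FFT with zero padding
    shape = tuple(sx + sy - 1 for sx, sy in zip(X.shape, Y.shape))
    return np.real(np.fft.ifftn(np.fft.fftn(X, shape) * np.fft.fftn(Y, shape)))

rng = np.random.default_rng(2)
a, b, c, a2, b2, c2 = rng.standard_normal((6, 5))

# (11): the Hadamard product acts factorwise
assert np.allclose(elem(a, b, c) * elem(a2, b2, c2), elem(a * a2, b * b2, c * c2))

# (12): the convolution acts factorwise
assert np.allclose(convn(elem(a, b, c), elem(a2, b2, c2)),
                   elem(np.convolve(a, a2), np.convolve(b, b2), np.convolve(c, c2)))
```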

3.2 Introduction of the Hierarchical Format

In the following, we use the hierarchical format, which has the additional advantage that an SVD truncation can be performed (cf. [10, Section 11]). For that purpose, we need tensors of order 2 (matrix case) and rewrite \(\mathbb {R}^{n}\otimes \mathbb {R}^{n}\otimes \mathbb {R}^{n}\) as \((\mathbb {R}^{n}\otimes \mathbb {R}^{n}) \otimes \mathbb {R}^{n}\cong \mathbb {R}^{n^{2}}\otimes \mathbb {R}^{n}\). In a second step, we split \(\mathbb {R}^{n^{2}}\) into \(\mathbb {R}^{n}\otimes \mathbb {R}^{n}\). This leads to the binary tree shown in Fig. 2.

Fig. 2  Decomposition of \(\mathbb {R}^{n}\otimes \mathbb {R}^{n}\otimes \mathbb {R}^{n}\)

In the first step, we regard the components v[i] = v[i1,i2,i3] of \(v\in \mathbb {R}^{N}\) as entries V [(i1,i2),i3] of the matrix \(V\in \mathbb {R}^{n^{2}\times n}\cong \mathbb {R}^{n^{2}}\otimes \mathbb {R}^{n}\). As in Section 2, we may write V as \({\sum }_{\nu = 1}^{s}v_{\nu }^{(12)}\otimes v_{\nu }^{(3)}\) (cf. (5)) with \(v_{\nu }^{(12)}\in \mathbb {R}^{n^{2}}\) and \(v_{\nu }^{(3)}\in \mathbb {R}^{n}\). In the second step, we regard \(v_{\nu }^{(12)}\) as n × n matrices or equivalently as tensors of \(\mathbb {R}^{n}\otimes \mathbb {R}^{n}\) of the form \({\sum }_{\nu = 1}^{r}v_{\nu } ^{(1)}\otimes v_{\nu }^{(2)}\).

Combining the structures of Figs. 1 and 2 yields the splitting depicted in Fig. 3. At the top of the tree, we see the matrix space \(\mathbb {R}^{N\times N}\cong \mathbb {R}^{N}\otimes \mathbb {R}^{N}\) with the sons \(\mathbb {R}^{N}\) on both sides. \(\mathbb {R}^{N}\cong \mathbb {R}^{n^{2}}\otimes \mathbb {R}^{n}\) is split into \(\mathbb {R}^{n^{2}}\) and \(\mathbb {R}^{n}\). Finally, \(\mathbb {R}^{n^{2}}\cong \mathbb {R}^{n}\otimes \mathbb {R}^{n}\) is split in two factors \(\mathbb {R}^{n}\).

Fig. 3  Decomposition of \(\mathbb {R}^{N\times N}\)

Following the construction (8), we associate each vertex of the tree with a subspace. The leaves of the tree correspond to \(\mathbb {R}^{n}\). Therefore, there are six subspaces \(U_{1},\ldots ,U_{6}\subset \mathbb {R}^{n}\). \(\mathbf{U}_{12}\) and \(\mathbf{U}_{45}\) are subspaces of \(\mathbb {R}^{n}\otimes \mathbb {R}^{n}\cong \mathbb {R}^{n^{2}}\), while \(\mathbf{U}_{123}\) and \(\mathbf{U}_{456}\) are subspaces of \(\mathbb {R}^{n}\otimes \mathbb {R}^{n}\otimes \mathbb {R}^{n}\cong \mathbb {R}^{N}\). Also, the root \(\mathbb {R}^{N\times N}\) carries a subspace \(\mathbf{U}_{1\text{-}6}\). The hierarchical structure is given by

$$ \mathbf{U}_{\alpha}\subset\mathbf{U}_{\alpha_{1}}\otimes\mathbf{U}_{\alpha_{2}}\qquad(\alpha_{1},\alpha_{2}\text{ sons of }\alpha), $$
(13)

where α belongs to the index set {12, 123, 45, 456, 1-6}, i.e., \(\mathbf{U}_{12}\subset U_{1}\otimes U_{2}\), \(\mathbf{U}_{123}\subset\mathbf{U}_{12}\otimes U_{3},\ldots,\mathbf{U}_{1\text{-}6}\subset\mathbf{U}_{123}\otimes\mathbf{U}_{456}\) (cf. Fig. 4). The condition (8) becomes

$$ F\in\mathbf{U}_{1\text{-}6}\qquad(1\text{-}6\text{ is the index of the root).} $$
(14)

The subspaces are (in principle) described by a basis (or at least a generating system). The bases of U1,…,U6 corresponding to the leaves must be given explicitly. For the other indices, we avoid an explicit description since the basis vectors of \(\mathbb {R}^{n^{2}}\), \(\mathbb {R}^{N}=\mathbb {R}^{n^{3}}\), etc. are too large. Instead, we make use of (13). Let α be an index of an inner vertex of the tree (no leaf) and α1, α2 its sons. Let \(\{\mathbf {b}_{i}^{(\alpha _{1})}:1\leq i\leq r_{\alpha _{1}}\}\) and \(\{\mathbf {b}_{j}^{(\alpha _{2})}:1\leq j\leq r_{\alpha _{2}}\}\) be the bases of \(\mathbf {U}_{\alpha _{1}}\) and \(\mathbf {U}_{\alpha _{2}}\). Then \(\{\mathbf {b}_{i}^{(\alpha _{1})}\otimes \mathbf {b}_{j}^{(\alpha _{2})}:1\leq i\leq r_{\alpha _{1}},1\leq j\leq r_{\alpha _{2}}\}\) is a basis of \(\mathbf {U}_{\alpha _{1}}\otimes \mathbf {U}_{\alpha _{2}}\). A basis vector \(\mathbf {b}_{\ell }^{(\alpha )}\in \mathbf {U}_{\alpha }\subset \mathbf {U}_{\alpha _{1}}\otimes \mathbf {U}_{\alpha _{2}}\) must have a representation

$$ \mathbf{b}_{\ell}^{(\alpha)}=\sum\limits_{i,j}c_{ij}^{(\alpha,\ell)}\mathbf{b}_{i}^{(\alpha_{1})}\otimes\mathbf{b}_{j}^{(\alpha_{2})} $$
(15)

with coefficients \(c_{ij}^{(\alpha ,\ell )}\) forming an \(r_{\alpha _{1}}\times r_{\alpha _{2}}\) matrix

$$ C^{(\alpha,\ell)}=(c_{ij}^{(\alpha,\ell)}). $$
(16)

It is sufficient to store C(α,ℓ) instead of \(\mathbf {b}_{\ell }^{(\alpha )}\). Note that the necessary memory is independent of the vector size n.
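
A small sketch (hypothetical sizes) of this bookkeeping: only the coefficient matrices C(α,ℓ) and the leaf bases are stored, and the expansion (15) is formed here merely to show what they encode:

```python
import numpy as np

rng = np.random.default_rng(3)
n, r1, r2, r_alpha = 6, 3, 2, 2            # hypothetical dimensions and ranks
B1 = rng.standard_normal((r1, n))          # explicit basis of U_{alpha_1} (a leaf)
B2 = rng.standard_normal((r2, n))          # explicit basis of U_{alpha_2} (a leaf)
C = rng.standard_normal((r_alpha, r1, r2)) # stored coefficient matrices C^{(alpha,l)}

# b_l^{(alpha)} = sum_{i,j} c_{ij}^{(alpha,l)} b_i^{(alpha_1)} ⊗ b_j^{(alpha_2)};
# in the hierarchical format this expansion is never carried out explicitly
B_alpha = np.einsum('lij,ia,jb->lab', C, B1, B2)
print(B_alpha.shape)                       # (r_alpha, n, n)
print(C.nbytes, B_alpha.nbytes)            # stored data vs. size of the explicit expansion
```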

Fig. 4  Corresponding subspaces

If (14) holds, the subspace U1-6 can be reduced to the one-dimensional space Uroot = span{F}. Let \(\mathbf {b}_{1}^{(\text {root})}\) be the only basis vector. Then, only one additional factor \(c_{1}^{(\text {root})}\) is needed to characterise

$$ F=c_{1}^{(\text{root})}\mathbf{b}_{1}^{(\text{root})}. $$
(17)

Remark 2

  (a) In the given example, we have to store the bases of \(U_{1},\ldots,U_{6}\) with the memory size \({\sum }_{j = 1}^{6}n_{j}r_{j}\). The matrices C(α,ℓ) require the memory size \(r_{12}r_{1}r_{2}+r_{45}r_{4}r_{5}+r_{123}r_{12}r_{3}+r_{456}r_{45}r_{6}+1\cdot r_{123}r_{456}\). \(c_{1}^{(\text {root})}\) is only one real number. If \(n_{j}\leq n\) and \(r_{\alpha}\leq r\), the required memory size is bounded by \(6nr+4r^{3}+r^{2}+1\).

  (b) In the general case of tensors of order d (instead of 6 as above), the bound is \(dnr+(d-1)r^{3}+1\).

Below, we shall demonstrate that we can perform the required operations although we only have an indirect access to the bases.

3.3 Matricisation

The above construction gives rise to two questions: Do subspaces with the properties (13), (14) exist and what are their dimensions

$$r_{\alpha}=\dim(\mathbf{U}_{\alpha}) $$

in the best case? The answer is given by the matricisation which maps a tensor isomorphically into a matrix. We explain this isomorphism for the example α = 45. The tensor \(F\in \bigotimes _{j = 1}^{6}\mathbb {R}^{n}\) has six indices (we write F[i1,…,i6] instead of F[i1,i2,i3,j1,j2,j3] = F[i,j]). The matrix M(45) is of the size \(\mathbb {R}^{n^{2}\times n^{4}}\) and has the entries

$$M^{(45)}[(i_{4},i_{5}), (i_{1},i_{2},i_{3},i_{6})] := F[i_{1},i_{2},i_{3},i_{4},i_{5},i_{6}]. $$

The subspace

$$\mathbf{U}_{45}:=\text{range}(M^{(45)})\qquad\text{with }r_{45}= \dim(\mathbf{U}_{45})=\operatorname{rank}(M^{(45)}) $$

is the smallest subspace satisfying (13) and (14). For a more general description of the minimal subspaces see [10, Section 6].

For \(\mathbf {v}\in \bigotimes _{j = 1}^{d}\mathbb {R}^{n_{j}}\) let \(\emptyset \neq \alpha \subsetneqq \{1,\ldots ,d\}\) be a subset with the complement αc := {1,…,d}∖α. In general, the minimal subspace \(\mathbf {U}_{\alpha }^{\min }(\mathbf {v}):=\text {range}(M^{(\alpha )})\) involves the matricisation M(α) = M(α)(v) which is defined by \(M^{(\alpha )}[(i_{j})_{j\in \alpha },(i_{j})_{j\in \alpha ^{c}}]=v[i_{1},\ldots ,i_{d}]\). Note that the index sets need not be ordered, since we only use properties of M(α) which do not depend on the ordering. The (matrix) rank of M(α) is called the α-rank of v (cf. Hitchcock [13]):

$$\operatorname{rank}_{\alpha}(\mathbf{v}):=\operatorname{rank}(M^{(\alpha )}(\mathbf{v})). $$
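
With multidimensional arrays, the matricisation and the α-rank are obtained by a transpose and a reshape; a sketch (random tensor; the ordering inside the two index groups is irrelevant for the rank):

```python
import numpy as np

def matricisation(v, alpha):
    # M^(alpha): rows indexed by (i_j)_{j in alpha}, columns by the complement
    alpha_c = [j for j in range(v.ndim) if j not in alpha]
    M = np.transpose(v, axes=list(alpha) + alpha_c)
    rows = int(np.prod([v.shape[j] for j in alpha]))
    return M.reshape(rows, -1)

rng = np.random.default_rng(4)
v = rng.standard_normal((3, 3, 3, 3))        # a tensor of order d = 4
alpha = [1, 2]
r_alpha = np.linalg.matrix_rank(matricisation(v, alpha))   # rank_alpha(v)
print(r_alpha)                               # at most min(9, 9) = 9
```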

3.4 Hadamard Product and General Bilinear Operations

In the following, the Hadamard product ⊙ can be replaced by a general bilinear operation \(\boxdot \) (cf. (10)).

In (7), we need the Hadamard product \(\mathbf{v}\odot\mathbf{w}\) of two tensors in \(\bigotimes _{j = 1}^{3}\mathbb {R}^{n}\). We assume that both \(\mathbf{v}\) and \(\mathbf{w}\) are represented in the hierarchical format corresponding to the tree depicted in Fig. 2. \(\mathbf{v}\) uses the bases \(\{b_{i}^{(j)}:1\leq i\leq r_{j}\},~1\leq j\leq 3\), at the leaves and the coefficients \(c_{ij}^{(\alpha ,\ell )}\), \(c_{1}^{(\text {root})}\), whereas \(\mathbf{w}\) is represented by \(\{b_{i}^{\prime (j)}\}\), \(c_{ij}^{\prime (\alpha ,\ell )}\), \(c_{1}^{\prime (\text {root})}\). Also, the ranks rα and \(r_{\alpha }^{\prime }\) may be different.

We start at the leaves and determine the Hadamard product of the basis vectors explicitly:

$$b_{(i,i^{\prime})}^{\prime\prime(j)}:=b_{i}^{(j)}\odot b_{i^{\prime}}^{\prime(j)}\qquad(1\leq j\leq3,~ 1\leq i\leq r_{j},~ 1\leq i^{\prime}\leq r_{j}^{\prime}). $$

By induction, we assume that the products \(\mathbf {b}_{(i,i^{\prime })}^{\prime \prime (\alpha _{1})}\) and \(\mathbf {b}_{(j,j^{\prime })}^{\prime \prime (\alpha _{2})}\) are (directly or indirectly) determined. Then, (15) and (11) prove that

$$\begin{array}{@{}rcl@{}} \mathbf{b}_{(\ell,m)}^{\prime\prime(\alpha)} & :=&\mathbf{b}_{\ell}^{(\alpha)}\odot\mathbf{b}_{m}^{\prime(\alpha)}=\left( \sum\limits_{i,j}c_{ij}^{(\alpha,\ell)}\mathbf{b}_{i}^{(\alpha_{1})}\otimes\mathbf{b}_{j}^{(\alpha_{2})}\right) \odot\left( \sum\limits_{i^{\prime},j^{\prime}}c_{i^{\prime}j^{\prime}}^{\prime(\alpha,m)}\mathbf{b}_{i^{\prime}}^{\prime(\alpha_{1})}\otimes\mathbf{b}_{j^{\prime}}^{\prime(\alpha_{2})}\right)\\ &=&\sum\limits_{i,j}\sum\limits_{i^{\prime},j^{\prime}}c_{ij}^{(\alpha,\ell)}c_{i^{\prime}j^{\prime}}^{\prime(\alpha,m)}\left( \mathbf{b}_{i}^{(\alpha_{1})}\odot\mathbf{b}_{i^{\prime}}^{\prime(\alpha_{1})}\right) \otimes\left( \mathbf{b}_{j}^{(\alpha_{2})}\odot\mathbf{b}_{j^{\prime}}^{\prime(\alpha_{2})}\right) \\ &=&\sum\limits_{\left( i,i^{\prime}\right)}\sum\limits_{\left( j,j^{\prime}\right)}c_{ij}^{(\alpha,\ell)}c_{i^{\prime}j^{\prime}}^{\prime(\alpha,m)}\mathbf{b}_{(i,i^{\prime})}^{\prime\prime(\alpha_{1})}\otimes\mathbf{b}_{(j,j^{\prime})}^{\prime\prime(\alpha_{2})}. \end{array} $$
(18)

The result \(\mathbf{x}:=\mathbf{v}\odot\mathbf{w}\) is represented by the generating system \(\{b_{(i,i^{\prime })}^{\prime \prime (j)}\}\), 1 ≤ j ≤ 3, at the leaves. Here, the pairs \((i,i^{\prime})\) are the indices; thus, the index set has the size \(r_{j}^{\prime \prime }:=r_{j}r_{j}^{\prime }\). The equation (15) for the new vector contains the coefficients \(c_{(i,i^{\prime }),(j,j^{\prime })}^{\prime \prime (\alpha ,(\ell ,m))}:=c_{ij}^{(\alpha ,\ell )}c_{i^{\prime }j^{\prime }}^{\prime (\alpha ,m)}\). The coefficient \(c_{1}^{\prime \prime (\text {root})}\) is \(c_{1}^{(\text {root})}c_{1}^{\prime (\text {root})}\), since \(\mathbf {v}\odot \mathbf {w}=\left (c_{1}^{(\text {root})}\mathbf {b}_{1}^{(\text {root})}\right ) \odot \left (c_{1}^{\prime (\text {root})}\mathbf {b}_{1}^{\prime (\text {root})}\right ) =c_{1}^{(\text {root})}c_{1}^{\prime (\text {root})}\mathbf {b}_{1}^{(\text {root})}\odot \mathbf {b}_{1}^{\prime (\text {root})}=c_{1}^{(\text {root})}c_{1}^{\prime (\text {root})}\mathbf {b}_{(1,1)}^{\prime \prime (\text {root})}\).
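
If the index pairs \((i,i^{\prime})\) and \((j,j^{\prime})\) are ordered lexicographically, the new coefficient matrices are Kronecker products of the old ones; a one-line sketch (hypothetical sizes):

```python
import numpy as np

rng = np.random.default_rng(5)
C_l  = rng.standard_normal((3, 2))   # C^{(alpha,l)}  of size r_{alpha_1} x r_{alpha_2}
Cp_m = rng.standard_normal((2, 4))   # C'^{(alpha,m)} of size r'_{alpha_1} x r'_{alpha_2}

# coefficient matrix of b''_{(l,m)}: c''_{(i,i'),(j,j')} = c_{ij} * c'_{i'j'}
Cpp_lm = np.kron(C_l, Cp_m)
print(Cpp_lm.shape)                  # (r_{alpha_1} r'_{alpha_1}, r_{alpha_2} r'_{alpha_2}) = (6, 8)
```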

We call \(\{\mathbf {b}_{(i,i^{\prime })}^{\prime \prime (\alpha )}\}\) a generating system (or frame) since these vectors are not necessarily linearly independent. If they are linearly dependent, the system is larger than necessary and can be shortened. Even if \(\{\mathbf {b}_{(i,i^{\prime })}^{\prime \prime (\alpha )}\}\) forms a basis, the question remains whether we can truncate the basis within a given tolerance. This will be the subject of Section 3.6.

Remark 3

The computation of all \(b_{(i,i^{\prime })}^{\prime \prime (j)}\) requires \(3nr_{j}r_{j}^{\prime }\) multiplications. If all coefficients \(c_{(i,i^{\prime }),(j,j^{\prime })}^{\prime \prime (\alpha ,(\ell ,m))}\) are computed explicitly, we need \(r_{\alpha }r_{\alpha }^{\prime }r_{\alpha _{1}}r_{\alpha _{1}}^{\prime }r_{\alpha _{2}}r_{\alpha _{2}}^{\prime }\) multiplications. The resulting cost is the product of the data sizes of v and w.

In Section 4, the ranks \(r_{\alpha }^{\prime }\), \(r_{\alpha _{1}}^{\prime }\), \(r_{\alpha _{2}}^{\prime }\) will be equal to 2.

3.5 Scalar Product, Orthonormalisation, Transformations

As mentioned above, the linear independence of the new frame \(\{\mathbf {b}_{(i,i^{\prime })}^{\prime \prime (\alpha )}\}\) has to be checked. This can be done by a QR decomposition, provided we are able to determine the scalar products \(\left \langle \mathbf {b}_{(i,i^{\prime })}^{\prime \prime (j)},\mathbf {b}_{(m,m^{\prime })}^{\prime \prime (j)}\right \rangle \) of the vectors determined in (18). We simplify the notation (index i instead of \((\ell,m)\)) and consider the bases \(\{\mathbf {b}_{i}^{(\alpha )}\}\) at the vertex α and their connection by (15). We proceed from the leaves to the root as in Section 3.4.

At the leaves, the bases are explicitly given so that the scalar products

$$ \sigma_{ij}^{(\alpha)}:=\left\langle \mathbf{b}_{i}^{(\alpha)},\mathbf{b}_{j}^{(\alpha)}\right\rangle $$
(19)

can be determined as usual. As soon as \(\sigma _{ij}^{(\alpha _{1})}\) and \(\sigma _{ij}^{(\alpha _{2})}\) are known for the sons of α, \(\sigma _{\ell m}^{(\alpha )}\) can be determined by

$$\begin{array}{@{}rcl@{}} \sigma_{\ell m}^{(\alpha)} & =& \left\langle \mathbf{b}_{\ell}^{(\alpha)},\mathbf{b}_{m}^{(\alpha)}\right\rangle =\left\langle \sum\limits_{i,j} c_{ij}^{(\alpha,\ell)}\mathbf{b}_{i}^{(\alpha_{1})}\otimes\mathbf{b}_{j}^{(\alpha_{2})},\sum\limits_{i^{\prime},j^{\prime}}c_{i^{\prime}j^{\prime}}^{(\alpha,m)}\mathbf{b}_{i^{\prime}}^{(\alpha_{1})}\otimes\mathbf{b}_{j^{\prime}}^{(\alpha_{2})}\right\rangle\\ & =& {\sum}_{i,j}\sum\limits_{i^{\prime},j^{\prime}}c_{ij}^{(\alpha,\ell)}c_{i^{\prime}j^{\prime}}^{(\alpha,m)}\left\langle \mathbf{b}_{i}^{(\alpha_{1})},\mathbf{b}_{i^{\prime}}^{(\alpha_{1})}\right\rangle \left\langle \mathbf{b}_{j}^{(\alpha_{2})},\mathbf{b}_{j^{\prime}}^{(\alpha_{2})}\right\rangle ={\sum}_{i,j}\sum\limits_{i^{\prime},j^{\prime}}c_{ij}^{(\alpha,\ell)}c_{i^{\prime}j^{\prime}}^{(\alpha,m)}\sigma_{ii^{\prime}}^{(\alpha_{1})}\sigma_{jj^{\prime}}^{(\alpha_{2})}, \end{array} $$
(20)

since the Euclidean scalar product satisfies the rule \(\langle v\otimes w,x\otimes y\rangle=\langle v,x\rangle\langle w,y\rangle\). The induction (20) terminates at the vertex α, where the scalar products (19) are desired.
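
The recursion (20) involves only the small coefficient and Gram matrices; the following sketch (toy sizes) also checks the formula against the explicitly assembled basis vectors, which of course is never done in practice:

```python
import numpy as np

def gram_up(C, S1, S2):
    # (20): sigma^{(alpha)}_{lm} = sum c^{(alpha,l)}_{ij} c^{(alpha,m)}_{i'j'} S1_{ii'} S2_{jj'}
    return np.einsum('lij,mpq,ip,jq->lm', C, C, S1, S2)

rng = np.random.default_rng(6)
n, r1, r2, r = 4, 3, 3, 2
B1 = rng.standard_normal((r1, n))            # leaf basis of the first son
B2 = rng.standard_normal((r2, n))            # leaf basis of the second son
C = rng.standard_normal((r, r1, r2))         # coefficient matrices at alpha
S1, S2 = B1 @ B1.T, B2 @ B2.T                # Gram matrices (19) of the sons

B_alpha = np.einsum('lij,ia,jb->lab', C, B1, B2).reshape(r, -1)
assert np.allclose(gram_up(C, S1, S2), B_alpha @ B_alpha.T)
```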

Of particular interest are orthonormal bases: \(\sigma _{ij}^{(\alpha )}=\delta _{ij}\). Using (15), we obtain the following result.

Remark 4

Let α be a non-leaf vertex. The basis \(\{\mathbf {b}_{\ell }^{(\alpha )}\}\) is orthonormal, if (a) the bases \(\{\mathbf {b}_{i}^{(\alpha _{1})}\}\) and \(\{\mathbf {b}_{j}^{(\alpha _{2})}\}\) of the sons α1,α2 are orthonormal and (b) the matrices C(α,) in (16) are orthonormal with respect to the Frobenius scalar product: \(\langle C^{(\alpha ,\ell )},C^{(\alpha ,m)}\rangle _{\mathsf {F}}={\sum }_{ij}c_{ij}^{(\alpha ,\ell )}c_{ij}^{(\alpha ,m)}=\delta _{\ell m}\).

The bases (or frames) can be orthonormalised as follows. Orthonormalise the explicitly given bases at the leaves (e.g., by QR). As soon as \(\{\mathbf {b}_{i}^{(\alpha _{1})}\}\) and \(\{\mathbf {b}_{j}^{(\alpha _{2})}\}\) are orthonormal, orthonormalise the matrices C(α,ℓ). The new matrices \(C_{\text {new}}^{(\alpha ,\ell )}\) define a new orthonormal basis \(\{\mathbf {b}_{\ell ,\text {new}}^{(\alpha )}\}\). The cost is described in [10, Remark 11.32].

The above-mentioned calculations require basis transformations. Here, the following has to be taken into account (cf. [10, Section 11.3.1.4]).

  • Case A1. Let α1 be the first son of α. Assume that the basis \(\{\mathbf {b}_{i}^{(\alpha _{1})}\}\) is transformed into a new basis \(\{\mathbf {b}_{i,\text {new}}^{(\alpha _{1})}\}\) so that \(\textbf {b}_{i}^{(\alpha _{1})}={\sum }_{k}T_{ki} \mathbf {b}_{k,\text {new}}^{(\alpha _{1})}\). Changing C(α,ℓ) into \(C_{\text {new}}^{(\alpha ,\ell )}:=TC^{(\alpha ,\ell )}\), the basis \(\{\mathbf {b}_{\ell }^{(\alpha )}\}\) remains unchanged.

  • Case A2. If \(\textbf {b}_{i}^{(\alpha _{2})}={\sum }_{k}T_{ki}\mathbf {b}_{k,\text {new}}^{(\alpha _{2})}\) is a transformation of the second son of α, C(α,ℓ) must be changed into \(C^{(\alpha,\ell)}T^{\mathsf{T}}\).

  • Case B. Consider a non-leaf vertex α. If the basis \(\{\mathbf {b}_{\ell }^{(\alpha )}\}\) should be transformed into \(\mathbf {b}_{\ell ,\text {new}}^{(\alpha )}:={\sum }_{i}T_{\ell i}\mathbf {b}_{i}^{(\alpha )}\), one has to change the coefficient matrices C(α,ℓ) by \(C_{\text {new}}^{(\alpha ,\ell )}:={\sum }_{i}T_{\ell i}C^{(\alpha ,i)}\). (In addition, this transformation causes changes at the father vertex according to Case A1 or Case A2.)

3.6 SVD Truncation

The example in Section 3.4 shows that the Hadamard product is given by means of a generating system of increased size \(r_{j}^{\prime \prime }:=r_{j}r_{j}^{\prime }\). This size may be larger than necessary and should be truncated. The truncation is prepared by an orthonormalisation as described in Section 3.5.

In principle, the SVD truncation is based on the singular-value decompositions of the matricisations M(α) (cf. Section 3.3). However, the singular values and singular vectors can be determined without the explicit knowledge of the huge matrix M(α).

Having generated orthonormal bases at all nodes, the singular-value decomposition starts at the root and proceeds to the leaves. It produces a basis \(\{\mathbf {b}_{\ell ,\text {new}}^{(\alpha )}\}\) together with singular values \(\sigma _{\ell }^{(\alpha )}\) indicating the importance of \(\mathbf {b}_{\ell ,\text {new}}^{(\alpha )}\). At the start α = root there is only one (normalised) basis vector \(\mathbf {b}_{1}^{(\text {root})}=\mathbf {b}_{1,\text {new}}^{(\text {root})}\) which remains unchanged. The corresponding weight factor is \(\sigma _{1}^{(\text {root})}=|c_{1}^{(\text {root})}|\) (cf. (17)).

Assume that the new basis \(\{\mathbf {b}_{\ell ,\text {new}}^{(\alpha )}\}\) is already computed at the vertex α and that α is not a leaf but has sons α1, α2. The basis \(\{\mathbf {b}_{\ell }^{(\alpha )}\}\) is characterised by the matrices C(α,ℓ). Together with the given values \(\sigma _{\ell }^{(\alpha )}\), we define the matrices

$$\begin{array}{@{}rcl@{}} \mathbf{Z}_{1} & :=& \left[\sigma_{1}^{(\alpha)}C^{(\alpha,1)},\sigma_{2}^{(\alpha)}C^{(\alpha,2)},\ldots,\sigma_{r_{\alpha}}^{(\alpha)} C^{(\alpha,r_{\alpha})}\right] \in\mathbb{R}^{r_{\alpha_{1}}\times(r_{\alpha}r_{\alpha_{2}})},\\ \mathbf{Z}_{2} & :=& \left[\sigma_{1}^{(\alpha)}\left( C^{(\alpha,1)}\right)^{\mathsf{T}},\sigma_{2}^{(\alpha)}\left( C^{(\alpha,2)}\right)^{\mathsf{T}},\ldots,\sigma_{r_{\alpha}}^{(\alpha)}\left( C^{(\alpha,r_{\alpha})}\right)^{\mathsf{T}}\right]^{\mathsf{T}}\in\mathbb{R}^{(r_{\alpha}r_{\alpha_{1}}) \times r_{\alpha_{2}}}. \end{array} $$

The SVD of these matrices yields \(\mathbf {Z}_{1}={\sum }_{i}\sigma _{i}^{(\alpha _{1})}u_{i}^{(\alpha _{1})}\otimes v_{i}^{(\alpha _{1})}\) and \(\mathbf {Z}_{2}={\sum }_{i}\sigma _{i}^{(\alpha _{2})}u_{i}^{(\alpha _{2})}\otimes v_{i}^{(\alpha _{2})}\) with orthonormal vectors \(u_{i}^{(\alpha _{1})}\in \mathbb {R}^{r_{\alpha _{1}}}\) and \(v_{i}^{(\alpha _{2})}\in \mathbb {R}^{r_{\alpha _{2}}}\). Now, we have to transform the bases at the son nodes: \(\{\mathbf {b}_{i,\text {new}}^{(\alpha _{1})}\}:=\{u_{i}^{(\alpha _{1})}\}\) becomes the new basis for α1, and \(\{\mathbf {b}_{i,\text {new}}^{(\alpha _{2})}\}:=\{v_{i}^{(\alpha _{2})}\}\) becomes the new basis for α2. The new bases are called the HOSVD bases (cf. Footnote 5).

The procedure is repeated for the sons of α1, α2 until we reach the leaves. Then, at all vertices, HOSVD bases are introduced together with singular values \(\sigma _{\nu }^{(\alpha )}\). As in Section 2.2, the SVD truncation consists of omitting all basis vectors corresponding to small enough singular values. Let \(\sigma _{\nu }^{(\alpha )}\), \(1\leq\nu\leq r_{\alpha}\), be all singular values at α. Assume that we keep \(\sigma _{\nu }^{(\alpha )}\) for \(1\leq\nu\leq s_{\alpha}\) and omit those for ν > sα. This means that (15) is reduced to \(\mathbf {b}_{\ell }^{(\alpha )}\) with \(\ell\leq s_{\alpha}\) and that the double sum in (15) is taken over \(i\leq s_{\alpha _{1}}\) and \(j\leq s_{\alpha _{2}}\). Let \(\mathbf{v}\) be the input tensor, while \(\mathbf{v}_{\text{HOSVD}}\) denotes the truncated version. Then, the following estimate holds (cf. [10, Theorem 11.58]):

$$\|\mathbf{v}-\mathbf{v}_{\text{HOSVD}}\| \leq\sqrt{\sum\limits_{\alpha}\sum\limits_{\nu\geq s_{\alpha}+ 1}(\sigma_{\nu}^{(\alpha)})^{2}}\leq\sqrt{2d-3}\|\mathbf{v}-\mathbf{v}_{\text{best}}\|. $$

The first inequality allows us to explicitly control the error with respect to the Euclidean norm by the choice of the omitted singular values. The second inequality proves quasi-optimality of this truncation. Here, \(\mathbf{v}_{\text{best}}\) is the best approximation satisfying \(\operatorname{rank}_{\alpha}(\mathbf{v}_{\text{best}})\leq s_{\alpha}\). The parameter d is the order of the tensor, i.e., d = 6 in the case of Fig. 3 and d = 3 for Fig. 2. Only in the (matrix) case d = 2 does \(\mathbf{v}_{\text{HOSVD}}\) coincide with \(\mathbf{v}_{\text{best}}\).

3.7 Convolution

The treatment of Section 3.4 for the Hadamard operation ⊙ holds for any binary operation with the property (10). Because the multivariate convolution satisfies the analogous condition (12), the constructions of Section 3.4 also hold for the convolution ⋆ instead of ⊙. Therefore, we can perform the convolution in \(\mathbb {R}^{n}\otimes \mathbb {R}^{n}\otimes \mathbb {R}^{n}\cong \mathbb {R}^{N}\), provided that we are able to perform the convolution \((v\star w)_{i}={\sum }_{\ell }v_{i-\ell }w_{\ell }\) in \(\mathbb {R}^{n}\).

The standard approach is the use of the FFT (fast Fourier transform): first, the vectors v, w are mapped into their (discrete) Fourier images \(\hat {v},\hat {w}\); then, the Hadamard product \(x:=\hat {v}\odot \hat {w}\) is back-transformed into the convolution result \(\check {x}=v\star w\) (with suitable scaling and zero padding). As is well known, the corresponding work is \(\mathcal {O}(n\log n)\). For large n, this is still expensive. In the next section, we shall describe a much cheaper algorithm for \(v\star w\).
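
A minimal sketch of this FFT route (zero padding to length 2n − 1 yields the non-periodic convolution):

```python
import numpy as np

def conv_fft(v, w):
    # map v, w to their discrete Fourier images, take the Hadamard product,
    # and transform back; O(n log n) operations
    m = len(v) + len(w) - 1
    return np.real(np.fft.ifft(np.fft.fft(v, m) * np.fft.fft(w, m)))

rng = np.random.default_rng(7)
v, w = rng.standard_normal(8), rng.standard_normal(8)
assert np.allclose(conv_fft(v, w), np.convolve(v, w))   # agrees with the direct O(n^2) sum
```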

4 Tensorisation

The tensorisation technique has been introduced by Oseledets [17] (but for matrices instead of vectors). It is more natural to study this technique for vectors. The article by Khoromskij [15] is the first one in this direction and contains several examples of this technique. Tensorisation together with truncation can be considered as an algebraic data compression method which is at least as successful as particular analytical compressions, e.g., by means of wavelets or hp methods. The analysis by Grasedyck [6] shows that, under suitable conditions, a data size \(N(\tilde {\mathbf {v}}_{\varepsilon })=\mathcal {O}(\log n)\) can be expected. Compression by tensorisation can be seen as a quite general multi-scale approach.

Here, we consider operations between vectors. The crucial point is that the computational work of the operations should be related to the data size of the operands. Assuming a data size ≪ n, the cost should also be much smaller than the operation cost in the standard \(\mathbb {R}^{n}\) vector format. In particular, we discuss the Hadamard product and the (one-dimensional) convolution operation \(u:=v\star w\) with \(u_{i}={\sum }_{k}v_{k}w_{i-k}\). We shall show that the convolution procedure can be applied directly to the tensor approximations \(\tilde {\mathbf {v}}_{\varepsilon }\) and \(\tilde {\mathbf {w}}_{\varepsilon }\). The algorithm developed in Section 4.4 has a cost related to the data sizes \(N(\tilde {\mathbf {v}}_{\varepsilon })\), \(N(\tilde {\mathbf {w}}_{\varepsilon })\).

4.1 Grid Functions in \(\mathbb {R}^{n}\)

The following algorithms apply to vectors in \(\mathbb {R}^{n}\) with \(n=2^{L}\). The connection to the previous part is given by the fact that in Section 3 we have to perform various operations with the basis vectors \(b_{i}^{(j)}\in \mathbb {R}^{n}\). More generally, however, the techniques of this section can be used for computations in \(\mathbb {R}^{n}\) without any connection to the tensor problems of Sections 2 and 3.

Tensorisation is an interpretation of a usual \(\mathbb {R}^{n}\) vector as a tensor. Since \(n=2^{L}\), there is a representation of the indices \(0\leq k\leq n-1\) by the binary numeral \((i_{L},i_{L-1},\ldots,i_{1})_{2}\):

$$ k=\sum\limits_{\ell= 1}^{L}i_{\ell}2^{\ell-1},\qquad i_{\ell}\in\{0,1\}. $$
(21)

We map the vector \(v\in \mathbb {R}^{n}\) into the tensor \(\mathbf {v}\in \otimes ^{L}\mathbb {R}^{2}:=\bigotimes _{j = 1}^{L}\mathbb {R}^{2}\) of order L by means of

$$ \mathbf{v}[i_{1},\ldots,i_{L}]=v_{k}\qquad\text{with }k\text{ and }i_{j}\text{ as in (21)}. $$
(22)

Since \(n=\dim (\mathbb {R}^{n})=\dim (\otimes ^{L}\mathbb {R}^{2})= 2^{L}\), (22) describes an isomorphism

$$ {\Phi}:\otimes^{L}\mathbb{R}^{2}\rightarrow\mathbb{R}^{n},\quad\mathbf{v}\mapsto v. $$
(23)

On the side of tensors, we shall introduce a hierarchical tensor representation (cf. Section 3). This allows a simple truncation procedure \(\mathbf{v}\mapsto\mathbf{v}_{\varepsilon}\) (cf. Section 3.6). Often, the data size \(N(\mathbf{v}_{\varepsilon})\) of \(\mathbf{v}_{\varepsilon}\) is much smaller than n (see Example 2). As a consequence, the tensorisation together with the truncation yields a black-box compression method for vectors in \(\mathbb {R}^{n}\).
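
In NumPy, the isomorphism Φ from (22)–(23) is a mere reshape; a sketch (Fortran ordering makes i1 the fastest index, i.e., the least significant bit of (21)):

```python
import numpy as np

L = 4
n = 2 ** L
v = np.arange(n, dtype=float)                # an arbitrary vector in R^n

V = v.reshape((2,) * L, order='F')           # Phi^{-1}(v): tensor of order L
assert V[1, 0, 1, 0] == v[1 + 4]             # k = 1*1 + 0*2 + 1*4 + 0*8 = 5
assert np.array_equal(V.reshape(n, order='F'), v)   # Phi maps back to the vector
```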

4.2 TT Format

The underlying tree of the hierarchical representation is the linear tree depicted in Fig. 5. Hierarchical representations based on a linear tree were introduced by Oseledets [17] as the TT format (cf. Oseledets–Tyrtyshnikov [18]). In principle, the hierarchical format requires subspaces at the leaves. Since \(\mathbb {R}^{2}\) is extremely low-dimensional, we take the full space \(\mathbb {R}^{2}\) and fix the basis by \(b_{1}^{(j)}=\binom {1}{0}\) and \(b_{2}^{(j)}=\binom {0}{1}\). Figure 5 corresponds to L = 4 (i.e., n = 16). We replace the index α = {1,2,…,μ} for the inner vertices by μ ∈ {2,…,L}. The subspaces \(\mathbf{U}_{\mu}\) belong to \(\otimes ^{\mu }\mathbb {R}^{2}\cong \mathbb {R}^{2^{\mu }}\) (in particular \(\mathbf {U}_{1}=\mathbb {R}^{2})\).

Fig. 5  Linear tree for the TT format

Since the TT-rank rμ = rank(M(μ)) is the minimal dimension of the required subspace \(\mathbf {U}_{\mu }\subset \otimes ^{\mu }\mathbb {R}^{2}\), the matricisation M(μ) of a tensor v is of interest. In fact, M(μ) can be expressed by means of the corresponding vector v = Φ(v):

$$ M^{(\mu)}=\left[ \begin{array}[c]{cccc} v_{0} & v_{2^{\mu}} & {\ldots} & v_{2^{L-1}}\\ v_{1} & v_{2^{\mu}+ 1} & {\ldots} & v_{2^{L-1}+ 1}\\ {\vdots} & {\vdots} & {\ddots} & \vdots\\ v_{2^{\mu}-1} & v_{2^{\mu+ 1}-1} & {\ldots} & v_{2^{L}-1} \end{array} \right] $$
(24)
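
In terms of the vector v, (24) is again a reshape; the sketch below computes all TT-ranks for an exponential grid function (cf. Example 1 below), where every rank equals 1:

```python
import numpy as np

L = 6
n = 2 ** L
v = 1.25 ** np.arange(n)                     # grid values zeta^k of an exponential

def M(v, mu):
    # matricisation (24): M^{(mu)}[r, c] = v[r + 2^mu * c]
    return v.reshape((2 ** mu, -1), order='F')

print([np.linalg.matrix_rank(M(v, mu)) for mu in range(1, L)])   # [1, 1, 1, 1, 1]
```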

Since we use the spaces \(\mathbb {R}^{2}\) at the leaves, condition (13) becomes

$$ \mathbf{U}_{\mu+ 1}\subset\mathbf{U}_{\mu}\otimes\mathbb{R}^{2}\qquad(1\leq\mu\leq L-1), $$
(25)

while (15) is

$$ \mathbf{b}_{\ell}^{(\mu+ 1)}=\sum\limits_{i = 1}^{r_{\mu}}\left[c_{i1}^{(\mu+ 1,\ell)}\mathbf{b}_{i}^{(\mu)}\otimes\binom{1}{0}+c_{i2}^{(\mu+ 1,\ell)}\mathbf{b}_{i}^{(\mu)}\otimes\binom{0}{1}\right] \quad\text{for }1\leq \ell\leq r_{\mu+ 1}. $$
(26)

Before we discuss the operations, we want to show that grid functions appearing in practice may have ranks of the order \(\mathcal {O}(L)=\mathcal {O}(\log n)\ll n\).

Remark 5

Let f be an analytic function in (0,1] with a singularity at x = 0. An efficient approximation is given by the hp finite-element approach. In a simplified version, one uses polynomials of degree g to interpolate f in \([1/2,1],[1/4,1/2],\ldots,[2^{-L},2\cdot 2^{-L}],[0,2^{-L}]\). The data size is D = (L + 1)(g + 1) since there are L + 1 intervals and the polynomials have g + 1 coefficients. For the typical asymptotically smooth functions (cf. [11, Appendix E]), one obtains an error estimate decaying exponentially in D. Let F be the piecewise interpolation polynomial and evaluate F at the equidistant grid points: \(v_{i}:=F(i\cdot 2^{-L})\) for \(0\leq i\leq n-1\). Inspection of the matrix M(μ) shows that all columns except the first one contain grid values of a polynomial of degree g. Hence this part has at most the rank g + 1. The first column can increase the rank only by one, so that rμ = rank(M(μ)) ≤ g + 2. Therefore, the TT format representing \(\mathbf{v}={\Phi}^{-1}(v)\) has the same data size as the hp approach. The optimal approximation of f by the TT format with rank(M(μ)) ≤ g + 2 yields an error which is at most as large as the hp error, i.e., it is exponentially decreasing with g. More details can be found in Grasedyck [6].

Example 1

A particular function is the exponential \(z^{x}\), where z ≠ 0 may be any complex number. The grid values \(v_{i}\) are \(\zeta^{i}\) with \(\zeta =z^{2^{-L}}\). For this vector, the columns of M(μ) in (24) are linearly dependent so that rank(M(μ)) = 1. In fact, \(\mathbf{v}={\Phi}^{-1}(v)\) is the elementary tensor \(\mathbf {v}=\bigotimes _{j = 1}^{L}\left (\begin {array}[c]{c} 1\\ \zeta ^{2^{j-1}} \end {array} \right )\). Since \(\sin (ax)=\frac {\exp (\mathrm {i}ax)-\exp (-\mathrm {i}ax)}{2\mathrm {i}}\), any trigonometric function leads to rank(M(μ)) = 2.

This example (mentioned in [15]) implies the next remark.

Remark 6

All functions with a limited number of exponential terms lead to a constant bound of rank(M(μ)) (e.g., \(f(x)={\sum }_{\nu = 1}^{r}\alpha _{\nu }\exp (-\beta _{\nu }x)\) yields rank(M(μ)) ≤ r). A similar result holds for functions involving a fixed number of trigonometric terms (band-limited functions).

An example of a band-limited function can be found in Khoromskij–Veit [16].

The next example again shows that exponential sums can approximate functions with point singularities (Remark 5 is another approach to this problem). This fact is important for applications in quantum chemistry where singularities appear at the positions of the nuclei. This is an indication that the basis vectors appearing in Uj (1 ≤ j ≤ 6) for the problem (1) allow a tensorisation with moderate ranks.

Example 2

For \(n=2^{L}\) set \(v=(f(k\cdot 2^{-L}))_{k = 0}^{n-1}\in \mathbb {R}^{n}\) for the function f(x) = 1/(1 − x) in [0,1). For any \(r\in \mathbb {N}\), there is an approximation \(v_{(r)}\in \mathbb {R}^{n}\) such that \(\mathbf{v}_{(r)}:={\Phi}^{-1}(v_{(r)})\) yields ranks rμ = rank(M(μ)) ≤ r and satisfies the componentwise error estimate

$$\left| v[k]-v_{(r)}[k]\right| \leq C_{1}n\exp(-C_{2}r)\qquad\text{ with }C_{1},C_{2}>0\text{ for all }0\leq k<n. $$

Hence, for a given error bound ε > 0, the choice \(r=\mathcal {O}(\log (n)+\log \frac {1}{\varepsilon })\) is sufficient. The storage size of the tensor v(r) is \(\mathcal {O}(\log ^{2}(n)+\log (n)\log \frac {1}{\varepsilon })\).

Proof

The function 1/t can be approximated in \([2^{-L},1]\) by an expression of the form \({\sum }_{\nu = 1}^{r}\alpha _{\nu }\exp (-\beta _{\nu }t)\). The error estimates follow from Braess–Hackbusch [1]. □

4.3 Hadamard Product in \(\mathbb {R}^{n}\)

Since it does not matter whether the componentwise multiplication is realised via \(v_{k}w_{k}\) or \(\mathbf{v}[i_{1},\ldots,i_{L}]\cdot\mathbf{w}[i_{1},\ldots,i_{L}]\), the property (10) holds also in the case of the artificial tensor product \(\otimes ^{L}\mathbb {R}^{2}\); more precisely,

$${\Phi}\left( \bigotimes_{j = 1}^{L}v^{(j)}\right) \odot {\Phi}\left( \bigotimes_{j = 1}^{L}w^{(j)}\right) = {\Phi}\left( \bigotimes_{j = 1}^{L}\left( v^{(j)}\odot w^{(j)}\right)\right) = {\Phi}(\mathbf{v}\odot\mathbf{w}). $$

Conclusion 1

Assume \(v={\Phi}(\mathbf{v})\) and \(w={\Phi}(\mathbf{w})\). Let \(\mathbf{v},\mathbf{w}\) be represented by the TT format. Then the Hadamard product \(\mathbf{v}\odot\mathbf{w}\) can be computed as explained in Section 3.4. Since \({\Phi}(\mathbf{v}\odot\mathbf{w})=v\odot w\), the result is the tensorisation of \(v\odot w\). The computational cost is discussed in Section 3.4.
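
A quick numerical check of the identity displayed above for elementary tensorised vectors (toy data, L = 3):

```python
import numpy as np

def Phi(factors):
    # Phi: assemble the vector of length 2^L from factors in R^2, k = sum_j i_j 2^{j-1}
    v = factors[0]
    for f in factors[1:]:
        v = np.multiply.outer(v, f).reshape(-1, order='F')
    return v

vf = [np.array([1., 2.]), np.array([3., 4.]), np.array([5., 6.])]
wf = [np.array([.5, 1.]), np.array([2., 0.]), np.array([1., 3.])]

assert np.allclose(Phi(vf) * Phi(wf), Phi([x * y for x, y in zip(vf, wf)]))
```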

We return to the hierarchical format for true tensors as in Figs. 2 or 3. The subspaces at the leaves are described by bases containing \(\mathbb {R}^{n}\) vectors. The application of the tensorisation to these vectors corresponds to an extended tree as sketched in Fig. 6.

Fig. 6  Extended tree

The combination of the tree in Fig. 2 with the TT tree corresponds to \(\mathbb {R}^{N}\cong \otimes ^{3}(\otimes ^{L}\mathbb {R}^{2}) \cong \otimes ^{3L}\mathbb {R}^{2}\). For tensors represented in this format, we can again apply the algorithm in Section 3.4 to compute \(\mathbf{v}\odot\mathbf{w}\) for \(\mathbf {v},\mathbf {w}\in \mathbb {R}^{N}\).

4.4 Convolution in \(\mathbb {R}^{n}\)

4.4.1 Definition of the Convolution

We take a closer look at the convolution operation. The sum in \((v\star w)_{i}={\sum }_{\ell }v_{i-\ell }w_{\ell }\) is restricted to those ℓ with \(0\leq i-\ell,\ell\leq n-1\), i.e.,

$$ (v\star w)_{i}=\sum\nolimits_{\ell=\max\{0,i + 1-n\}}^{\min\{n-1,i\}}v_{i-\ell}w_{\ell}. $$
(27)

If i varies in \([0,n-1]\cap \mathbb {Z}\), the sum can be written as \({\sum }_{\ell = 0}^{i}\). For i < 0, the empty sum yields \((v\star w)_{i}=0\), but for \(n\leq i\leq 2n-2\), the sum in (27) is not empty. This shows the following remark.

Remark 7

The convolution of two \(\mathbb {R}^{n}\) vectors yields an \(\mathbb {R}^{2n-1}\) vector.

The notation becomes simpler if we replace the vector \(v\in \mathbb {R}^{n}\) by the infinite sequence \(v=(v_{i})_{i\in \mathbb {N}_{0}}\) with \(\mathbb {N}_{0}=\mathbb {N}\cup \{0\}\) and \(v_{i}:=0\) for all \(i\geq n\). The set \(\ell _{0}=\ell _{0}(\mathbb {N}_{0})\) consists of all sequences with only finitely many nonzero components. Now, the sum becomes

$$(v\star w)_{i}=\sum\limits_{\ell= 0}^{i}v_{i-\ell}w_{\ell}\qquad\text{for all }i\in\mathbb{N}_{0}\text{ and all }v,w\in\ell_{0}. $$

Remark 8

The n-periodic convolution is \((v\star _{\text {per}}w)_{i}={\sum }_{\ell = 0}^{n-1}v_{i-\ell }w_{\ell }~(0\leq i\leq n-1)\), where all indices are understood modulo n. These values can be obtained from the non-periodic convolution by \((v\star_{\text{per}}w)_{i}=(v\star w)_{i}+(v\star w)_{n+i}\) for \(0\leq i\leq n-1\).
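
The relation of Remark 8 can be verified directly; the circular convolution computed via the DFT serves as a reference (a sketch with random data):

```python
import numpy as np

rng = np.random.default_rng(8)
n = 6
v, w = rng.standard_normal(n), rng.standard_normal(n)

full = np.convolve(v, w)                          # non-periodic result, length 2n-1
per = full[:n] + np.append(full[n:], 0.0)         # (v ⋆_per w)_i = (v⋆w)_i + (v⋆w)_{n+i}

ref = np.real(np.fft.ifft(np.fft.fft(v) * np.fft.fft(w)))   # n-periodic convolution
assert np.allclose(per, ref)
```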

4.4.2 Principal Idea of the Algorithm

For multivariate (grid) functions, the definition of the convolution implies the property (10): the convolution of elementary tensors can be reduced to the tensor product of one-dimensional convolutions.

Since now the vector v is replaced by the tensor \(\mathbf {v}\in \otimes ^{L}\mathbb {R}^{2}\), an obvious question is whether the product of \(\mathbf {v}=\otimes _{j = 1}^{L}v^{(j)}\) and \(\mathbf {w}=\otimes _{j = 1}^{L}w^{(j)}\) can be expressed by \(\mathbf {x}:=\otimes _{j = 1}^{L}(v^{(j)}\star w^{(j)})\) corresponding to (10), i.e., whether the corresponding vectors satisfy \({\Phi}(\mathbf{v})\star{\Phi}(\mathbf{w})={\Phi}(\mathbf{x})\). In the naive sense, this cannot be true for the simple reason that \(v^{(j)}\star w^{(j)}\) is a vector with three nontrivial components (cf. Remark 7). Therefore, the result does not belong to \(\otimes ^{L}\mathbb {R}^{2}\). Furthermore, we must expect a result in \(\otimes ^{L + 1}\mathbb {R}^{2}\) since \(v\star w\) has the length 2n − 1, which is \(>2^{L}\) and \(<2^{L+1}\).

4.4.3 Extension to \(\otimes ^{L}\mathbb {\ell }_{0}\)

According to Section 4.4.1, \(\mathbb {R}^{2}\) can be considered as a subspace of \(\mathbb {\ell }_{0}\). Hence, \(\otimes ^{L}\mathbb {R}^{2}\) is contained in \(\otimes ^{L}\mathbb {\ell }_{0}\). The linear map Φ defined in (23) can be extended to \({\Phi }:\otimes ^{L}\mathbb {\ell }_{0}\rightarrow \mathbb {\ell }_{0}\) by

$$ a={\Phi}\left( \bigotimes_{j = 1}^{L}v^{(j)}\right) \in\mathbb{\ell}_{0}\qquad\text{with }a_{k}=\underset{k={\sum}_{j = 1}^{L}i_{j}2^{j-1}}{\sum\limits_{i_{1},\ldots,i_{L}\in \mathbb{N}_{0}}}{\prod}_{j = 1}^{L}v^{(j)}[i_{j}] $$
(28)

(cf. Remark 1). In the case of \(v^{(j)}\in \mathbb {R}^{2}\), the sum on the right-hand side of (28) contains only one term for \(0\leq k\leq n-1\) and the product \({\prod }_{j = 1}^{L}v^{(j)}[i_{j}]\) coincides with \(\mathbf{v}[i_{1},\ldots,i_{L}]\) for \(\mathbf {v}:=\bigotimes _{j = 1}^{L}v^{(j)}\) (cf. (22)).

For a better understanding, we look at the case of L = 2.

Remark 9

Let \(e_{i}\in \mathbb {\ell }_{0}\) be the ith unit vector, i.e., \(e_{i}[j]=\delta_{ij}\) (\(i,j\in \mathbb {N}_{0}\)). Then, \(b:={\Phi}(a\otimes e_{i})\) is the vector \(a\in \mathbb {\ell }_{0}\) shifted by 2i positions: \(b_{k}:=0\) for \(0\leq k<2i\) and \(b_{k}=a_{k-2i}\) for \(k\geq 2i\).

The shift by p positions is denoted by \(S^{p}\). Thus, we can write \(b=S^{2i}a\).

4.4.4 Polynomials

Next, we use the isomorphism between \(\mathbb {\ell }_{0}\) and the space \(\mathbb {P}\) of polynomials described by

$$\pi:\ell_{0}\rightarrow\mathbb{P}\quad\text{ with }~v\mapsto\pi[v](x):=\sum\limits_{k\in\mathbb{N}_{0}}v_{k}x^{k}. $$

The connection with the convolution is given by the property that the product of two polynomials has the coefficients of the convolution product:

$$ \pi[v] \pi[w]=\pi[v\star w]\qquad\text{for }v,w\in\ell_{0}. $$
(29)

We define an extension of \(\pi :\ell _{0}\rightarrow \mathbb {P}\) to \(\hat {\pi }:\otimes ^{L}\mathbb {\ell }_{0}\rightarrow \mathbb {P}\) by

$$ \hat{\pi}:\otimes^{L}\mathbb{\ell}_{0}\rightarrow\mathbb{P}\qquad\text{with }~\hat{\pi}\left[\bigotimes_{j = 1}^{L}v^{(j)}\right](x):={\prod}_{j = 1}^{L}\pi[v^{(j)}](x^{2^{j-1}}). $$
(30)

A shift of v by i positions corresponds to the product \(\pi[S^{i}v](x)=\pi[v](x)\cdot x^{i}\). This result together with Remark 9 shows that

$$ \hat{\pi}\left[\bigotimes_{j = 1}^{L}v^{(j)}\right] =\pi\left[{\Phi}\left( \bigotimes_{j = 1}^{L}v^{(j)}\right) \right]. $$
(31)

The extended map \({\Phi }:\otimes ^{L}\mathbb {\ell }_{0}\rightarrow \mathbb {\ell }_{0}\) is not injective. Two tensors \(\mathbf {v}^{\prime },\mathbf {v}^{\prime \prime }\in \otimes ^{L}\mathbb {\ell }_{0}\) are called equivalent (denoted by \(\mathbf{v}^{\prime}\sim\mathbf{v}^{\prime\prime}\)) if they represent the same vector: \({\Phi}(\mathbf{v}^{\prime})={\Phi}(\mathbf{v}^{\prime\prime})\). From (31), we learn that the equivalence of \(\mathbf{v}^{\prime},\mathbf{v}^{\prime\prime}\) can also be expressed by \(\hat {\pi }[\mathbf {v}^{\prime }]=\hat {\pi }[\mathbf {v}^{\prime \prime }]\).
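
Both (29) and (31) are easily verified numerically; in the sketch below (toy data, L = 2), π is evaluated with np.polyval, and Φ of an elementary tensor with factors in \(\mathbb{R}^{2}\) is assembled as a Kronecker product:

```python
import numpy as np

def pi(u, x):
    # pi[u](x) = sum_k u_k x^k  (np.polyval expects the highest coefficient first)
    return np.polyval(u[::-1], x)

x = 0.7
v, w = np.array([1., 2., 0., 3.]), np.array([4., 0., 5.])
# (29): the product of the polynomials carries the convolution coefficients
assert np.isclose(pi(v, x) * pi(w, x), pi(np.convolve(v, w), x))

# (30)/(31) for L = 2: pi_hat[v1 ⊗ v2](x) = pi[v1](x) * pi[v2](x^2) = pi[Phi(v1 ⊗ v2)](x)
v1, v2 = np.array([1., 2.]), np.array([3., 4.])
phi = np.kron(v2, v1)                        # Phi(v1 ⊗ v2): entry k = i1 + 2*i2
assert np.isclose(pi(v1, x) * pi(v2, x ** 2), pi(phi, x))
```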

By comparing the values under the map \(\hat {\pi }\), we obtain the following result.

Lemma 1

\({\Phi }\left (\bigotimes _{j = 1}^{L}S^{m_{j}}v^{(j)}\right ) = S^{m}{\Phi }\left (\bigotimes _{j = 1}^{L}v^{(j)}\right )\) holds for \(m={\sum }_{j = 1}^{L}m_{j}2^{j-1}\) .

According to (10), we define the convolution of two (elementary) tensors in \(\otimes ^{L}\mathbb {\ell }_{0}\) by

$$ \left( \bigotimes_{j = 1}^{L}v^{(j)}\right) \star\left( \bigotimes_{j = 1}^{L}w^{(j)}\right) := \bigotimes_{j = 1}^{L}\left( v^{(j)}\star w^{(j)}\right). $$
(32)

Now, the product \(v^{(j)}\star w^{(j)}\) makes sense since it belongs to \(\mathbb {\ell }_{0}\). Next, we have to prove that the convolution introduced in (32) is consistent with the usual convolution of vectors.

Lemma 2

Let \(v={\Phi }\left (\bigotimes _{j = 1}^{L}v^{(j)}\right )\) and \(w={\Phi }\left (\bigotimes _{j = 1}^{L}w^{(j)}\right )\) be vectors in \(\mathbb {\ell }_{0}\). Then, (32) implies

$${\Phi}\left( \bigotimes_{j = 1}^{L}\left( v^{(j)}\star w^{(j)}\right)\right)=v\star w. $$

Proof

Since \(\pi :\ell _{0}\rightarrow \mathbb {P}\) is an isomorphism, the statement is equivalent to \(\pi [{\Phi }(\bigotimes _{j = 1}^{L}(v^{(j)}\star w^{(j)}))]=\pi [v\star w]\). The left-hand side of this equation is

$$\begin{array}{@{}rcl@{}} \pi\left[{\Phi}\left( \bigotimes_{j = 1}^{L}\left( v^{(j)}\star w^{(j)}\right)\right)\right](x) & \underset{(31)}{=}&\hat{\pi}\left[\bigotimes_{j = 1}^{L}\left( v^{(j)}\star w^{(j)}\right) \right](x)\\ &\underset{(30)}{=}& {\prod}_{j = 1}^{L}\pi[v^{(j)}\star w^{(j)}](x^{2^{j-1}})\\ &\underset{(29)}{=}&{\prod}_{j = 1}^{L}\pi[v^{(j)}](x^{2^{j-1}})\cdot\pi[w^{(j)}](x^{2^{j-1}})\\ &=&\left( {\prod}_{j = 1}^{L}\pi[v^{(j)}](x^{2^{j-1}})\right) \cdot \left( {\prod}_{j = 1}^{L}\pi[w^{(j)}](x^{2^{j-1}})\right)\\ &\underset{(30)}{=}&\hat{\pi}\left[\bigotimes_{j = 1}^{L}v^{(j)}\right] (x)\cdot\hat{\pi}\left[\bigotimes_{j = 1}^{L}w^{(j)}\right](x)\\ &\underset{(31)}{=}&\pi\lbrack v](x)\cdot\pi[w](x)\underset{(29)}{=}\pi[v\star w](x). \end{array} $$

4.5 Carry-over Procedure

The result \(\bigotimes _{j = 1}^{L}(v^{(j)}\star w^{(j)})\) is still unsatisfactory because \(v^{(j)},w^{(j)}\in \mathbb {R}^{2}\) produce \(v^{(j)}\star w^{(j)}\in \mathbb {R}^{3}\). A solution can be as follows. Let L = 2 as in Remark 9. Consider \(a\otimes b\) with \(a,b\in\ell_{0}\). We want to find an equivalent tensor with factors in \(\mathbb {R}^{2}\). Assume that \(a_{K}\neq 0\), but \(a_{i}=0\) for i > K, which implies \(a\in \mathbb {R}^{K + 1}\). If K = 1, a belongs to \(\mathbb {R}^{2}\) and nothing has to be done. If K > 1, set \(a^{\prime }\in \mathbb {R}^{2}\) with \(a_{i}^{\prime }=a_{i}\) for i = 0,1 and \(a^{\prime\prime}\in\ell_{0}\) with \(a_{i}^{\prime \prime }=a_{i + 2}\) for \(i\in \mathbb {N}_{0}\). Using Remark 9, one checks that \(a\otimes b\) represents the same vector as \(a^{\prime}\otimes b+a^{\prime\prime}\otimes Sb\), where Sb is the shifted version of b:

$${\Phi}(a\otimes b)={\Phi}(a^{\prime}\otimes b+a^{\prime\prime}\otimes Sb). $$

\(a^{\prime }\in \mathbb {R}^{2}\) is already of the desired form. \(a^{\prime\prime}\) belongs to \(\mathbb {R}^{K-1}\). This procedure can again be applied to \(a^{\prime\prime}\otimes Sb\) until all first factors belong to \(\mathbb {R}^{2}\).

In the case of a general tensor \(\bigotimes _{j = 1}^{L}v^{(j)}\), this procedure is applied to the first factor \(v^{(1)}\) and yields sums of elementary tensors of the form \(w^{(1)}\otimes \bigotimes _{j = 2}^{L}w^{(j)}\) with \(w^{(1)}\in \mathbb {R}^{2}\). Then, the procedure is repeated with the second factor resulting in sums of the terms \(x^{(1)}\otimes x^{(2)}\otimes \bigotimes _{j = 3}^{L}x^{(j)}\) with \(x^{(1)},x^{(2)} \in \mathbb {R}^{2}\), etc. In the case of the last factor, we may have to add an (L + 1)-th factor. Since we know that \(v\star w\) belongs to \(\mathbb {R}^{2n-1}\), the (L + 1)-th factor must belong to \(\mathbb {R}^{2}\).
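
A sketch of one carry-over step for L = 2 (here a has K = 3, so a single splitting suffices); Φ is assembled by hand for the check and S denotes the shift by one position:

```python
import numpy as np

def shift(b, p=1):
    # S^p b: prepend p zeros (b is a finite section of a sequence in l_0)
    return np.concatenate([np.zeros(p), b])

def phi2(a, b):
    # Phi(a ⊗ b) for L = 2: position k = i + 2*j receives a[i] * b[j]
    out = np.zeros(len(a) + 2 * (len(b) - 1))
    for j, bj in enumerate(b):
        out[2 * j: 2 * j + len(a)] += bj * a
    return out

def padded_sum(x, y):
    m = max(len(x), len(y))
    return np.pad(x, (0, m - len(x))) + np.pad(y, (0, m - len(y)))

a = np.array([1., 2., 3., 4.])          # a_K != 0 with K = 3 > 1
b = np.array([5., 6., 7.])
a1, a2 = a[:2], a[2:]                   # a' in R^2 and the carry part a''
assert np.allclose(phi2(a, b), padded_sum(phi2(a1, b), phi2(a2, shift(b))))
```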

4.6 Convolution Algorithm

We recall Remark 7: if \(\mathbf {v},\mathbf {w}\in \bigotimes _{j = 1}^{L}\mathbb {R}^{2}\), the result is a tensor \(\mathbf{u}:=\mathbf{v}\star\mathbf{w}\) in \(\bigotimes _{j = 1}^{L + 1}\mathbb {R}^{2}\). Lemma 3 describes the start at δ = 1, while Lemma 4 characterises the recursion. In the following, the vector notation \(v=\genfrac {[}{]}{0pt}{1}{\alpha }{\beta }\) means \(v_{0}=\alpha\), \(v_{1}=\beta\), i.e., the components must be read from top to bottom. By \(\mathbf{v}\sim\mathbf{w}\), we denote the equivalence \({\Phi}(\mathbf{v})={\Phi}(\mathbf{w})\).

Lemma 3

The convolution of \(v=\genfrac {[}{]}{0pt}{1}{\alpha }{\beta }\) and \(w=\genfrac {[}{]}{0pt}{1}{\gamma }{\delta } \in \mathbb {R}^{2}=\bigotimes _{j = 1}^{1}\mathbb {R}^{2}\) yields

$$ \genfrac{[}{]}{0pt}{1}{\alpha}{\beta} \star \genfrac{[}{]}{0pt}{1}{\gamma}{\delta} =\left[ \begin{array}[c]{c} \alpha\gamma\\ \alpha\delta+\beta\gamma\\ \beta\delta\\ 0\end{array} \right] \sim \genfrac{[}{]}{0pt}{1}{\alpha\gamma}{\alpha\delta+\beta\gamma} \otimes \genfrac{[}{]}{0pt}{1}{1}{0} + \genfrac{[}{]}{0pt}{1}{\beta\delta}{0} \otimes \genfrac{[}{]}{0pt}{1}{0}{1} \in\bigotimes_{j = 1}^{2}\mathbb{R}^{2}. $$
(33a)

Furthermore, the shifted vector has the tensor representation

$$ S\left[ \begin{array}[c]{c} \alpha\gamma\\ \alpha\delta+\beta\gamma\\ \beta\delta\\ 0\end{array} \right] =\left[ \begin{array}[c]{c} 0\\ \alpha\gamma\\ \alpha\delta+\beta\gamma\\ \beta\delta \end{array} \right] \sim \genfrac{[}{]}{0pt}{1}{0}{\alpha\gamma} \otimes \genfrac{[}{]}{0pt}{1}{1}{0} + \genfrac{[}{]}{0pt}{1}{\alpha\delta+\beta\gamma}{\beta\delta} \otimes \genfrac{[}{]}{0pt}{1}{0}{1} \in\bigotimes_{j = 1}^{2}\mathbb{R}^{2}. $$
(33b)

The basic identity is given in the next lemma.

Lemma 4

For given \(\mathbf {v},\mathbf {w}\in \bigotimes \nolimits _{j = 1}^{\delta -1}\mathbb {R}^{2}\) let the convolution result be

$$ \mathbf{v\star w}\sim\mathbf{a}=\mathbf{a}^{\prime}\otimes \genfrac{[}{]}{0pt}{1}{1}{0} +\mathbf{a}^{\prime\prime}\otimes \genfrac{[}{]}{0pt}{1}{0}{1} \in\bigotimes_{j = 1}^{\delta}\mathbb{R}^{2}. $$
(34a)

Then, the convolution of the tensors \(\mathbf{v}\otimes x\) and \(\mathbf{w}\otimes y\) with \(x=\genfrac {[}{]}{0pt}{1}{\alpha }{\beta }\), \(y= \genfrac {[}{]}{0pt}{1}{\gamma }{\delta } \in \mathbb {R}^{2}\) yields

$$\begin{array}{@{}rcl@{}} (\mathbf{v}\otimes x)\mathbf{\star}(\mathbf{w}\otimes y)\sim\mathbf{u}&=&\mathbf{u}^{\prime}\otimes \genfrac{[}{]}{0pt}{1}{1}{0} +\mathbf{u}^{\prime\prime}\otimes \genfrac{[}{]}{0pt}{1}{0}{1} \in\bigotimes_{j = 1}^{\delta+ 1}\mathbb{R}^{2}\\ \text{with }\quad\mathbf{u}^{\prime} & =& \mathbf{a}^{\prime}\otimes \genfrac{[}{]}{0pt}{1}{\alpha\gamma}{\alpha\delta+\beta\gamma} +\mathbf{a}^{\prime\prime}\otimes \genfrac{[}{]}{0pt}{1}{0}{\alpha\gamma} \in\bigotimes_{j = 1}^{\delta}\mathbb{R}^{2}\\ \text{and }\quad\mathbf{u}^{\prime\prime} & =& \mathbf{a}^{\prime}\otimes \genfrac{[}{]}{0pt}{1}{\beta\delta}{0} +\mathbf{a}^{\prime\prime}\otimes \genfrac{[}{]}{0pt}{1}{\alpha\delta+\beta\gamma}{\beta\delta} \in\bigotimes_{j = 1}^{\delta}\mathbb{R}^{2}. \end{array} $$
(34b)

Proof

Lemma 2 implies that

$$(\mathbf{v}\otimes x)\mathbf{\star}(\mathbf{w}\otimes y) \sim (\mathbf{v\star w}) \otimes z\qquad\text{ with }~z:=x \mathbf{\star}y\in\mathbb{R}^{3}\subset\ell_{0}. $$

Assumption (34a) yields

$$(\mathbf{v\star w}) \otimes z\sim\left( \mathbf{a}^{\prime}+S^{2^{\delta-1}}\mathbf{a}^{\prime\prime}\right)\otimes z. $$

Lemma 1 shows that

$$S^{2^{\delta-1}}\mathbf{a}^{\prime\prime}\otimes z=S^{2^{\delta-1}}(\mathbf{a}^{\prime\prime}\otimes z)\sim\mathbf{a}^{\prime\prime}\otimes(Sz). $$

Using (33a) and (33b), we obtain

$$\begin{array}{@{}rcl@{}} \mathbf{a}^{\prime}\otimes z & \sim& \mathbf{a}^{\prime}\otimes \genfrac{[}{]}{0pt}{1}{\alpha\gamma}{\alpha\delta+\beta\gamma} \otimes \genfrac{[}{]}{0pt}{1}{1}{0} +\mathbf{a}^{\prime}\otimes \genfrac{[}{]}{0pt}{1}{\beta\delta}{0} \otimes \genfrac{[}{]}{0pt}{1}{0}{1},\\ (S^{2^{\delta-1}}\mathbf{a}^{\prime\prime})\otimes z & \sim& \mathbf{a}^{\prime\prime}\otimes(Sz)\sim\mathbf{a}^{\prime\prime}\otimes \genfrac{[}{]}{0pt}{1}{0}{\alpha\gamma} \otimes \genfrac{[}{]}{0pt}{1}{1}{0} +\mathbf{a}^{\prime\prime}\otimes \genfrac{[}{]}{0pt}{1}{\alpha\delta+\beta\gamma}{\beta\delta} \otimes \genfrac{[}{]}{0pt}{1}{0}{1}. \end{array} $$

Summation of both identities yields the assertion of the lemma. □

If the vectors x,y in Lemma 4 belong to \(\{\genfrac {[}{]}{0pt}{1}{1}{0},\genfrac {[}{]}{0pt}{1}{0}{1}\}\), the vectors \(\genfrac {[}{]}{0pt}{1}{\alpha \gamma }{\alpha \delta +\beta \gamma }\), \(\genfrac {[}{]}{0pt}{1}{0}{\alpha \gamma }\), \(\genfrac {[}{]}{0pt}{1}{\beta \delta }{0}\), \(\genfrac {[}{]}{0pt}{1}{\alpha \delta +\beta \gamma }{\beta \delta }\) from (34b) belong to \(\{\genfrac {[}{]}{0pt}{1}{0}{0},\genfrac {[}{]}{0pt}{1}{1}{0},\genfrac {[}{]}{0pt}{1}{0}{1}\}\).

Lemma 3 proves assumption (34a) for δ = 2, while Lemma 4 shows that v ⊗ x and w ⊗ y satisfy the requirement (34a) (for δ + 1 instead of δ).
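To make the recursion of Lemmas 3 and 4 concrete, the following sketch applies (33a)–(34b) to elementary tensors \(\mathbf {v}=\bigotimes _{j}v^{(j)}\), \(\mathbf {w}=\bigotimes _{j}w^{(j)}\). It is an illustration only, under the following assumptions: the parts a′, a″ are stored as full vectors (so the cost is not logarithmic as in the actual algorithm), the j-th factor corresponds to the j-th binary digit as in (23), and the final comparison with numpy.convolve is merely a sanity check. All helper names are ours, not from the paper.

```python
import numpy as np

def conv_elementary(v_factors, w_factors):
    """Convolution of two elementary tensors with factors in R^2 (Lemmas 3 and 4).
    Returns the result as a full vector of length 2**(L+1); for illustration the
    parts a', a'' are kept as full vectors instead of tensors."""
    (al, be), (ga, de) = v_factors[0], w_factors[0]
    a1 = np.array([al * ga, al * de + be * ga])   # a'  from (33a)
    a2 = np.array([be * de, 0.0])                 # a'' from (33a)
    for (al, be), (ga, de) in zip(v_factors[1:], w_factors[1:]):
        # (34b); appending a new last factor = np.kron(new_factor, old_part)
        u1 = np.kron([al * ga, al * de + be * ga], a1) + np.kron([0.0, al * ga], a2)
        u2 = np.kron([be * de, 0.0], a1) + np.kron([al * de + be * ga, be * de], a2)
        a1, a2 = u1, u2
    return np.concatenate([a1, a2])               # u = u' x [1,0] + u'' x [0,1]

def full_vector(factors):
    """Expand an elementary tensor into its vector of length 2**L
    (factor 1 = least significant binary digit)."""
    out = np.ones(1)
    for f in factors:
        out = np.kron(f, out)
    return out

rng = np.random.default_rng(0)
L = 4
v_factors = [rng.standard_normal(2) for _ in range(L)]
w_factors = [rng.standard_normal(2) for _ in range(L)]

u = conv_elementary(v_factors, w_factors)
exact = np.convolve(full_vector(v_factors), full_vector(w_factors))  # length 2**(L+1) - 1
assert np.allclose(u, np.pad(exact, (0, 1)))                         # zero-padded to 2**(L+1)
```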

4.7 Convolution of Tensors in Hierarchical Format

We recall that the subspaces \(\mathbf {U}_{\delta }\subset \otimes ^{\delta }\mathbb {R}^{2}\) satisfy (25): \(\mathbf {U}_{\delta + 1}\subset \mathbf {U}_{\delta }\otimes \mathbb {R}^{2}\). The essential observation is that the results of the convolution again yield subspaces with this property.

Note that three different tensors v, w, u := v ⋆ w are involved, represented by three different subspace families \(\mathbf {U}_{\delta }^{\prime }\), \(\mathbf {U}_{\delta }^{\prime \prime }\), Uδ (1 ≤ δ ≤ L). The bases spanning these subspaces consist of the vectors \(\mathbf {b}_{i}^{\prime (\delta )}\), \(\mathbf {b}_{i}^{\prime \prime (\delta )}\), \(\mathbf {b}_{i}^{(\delta )}\); the dimensions of the subspaces are \(r_{\delta }^{\prime }\), \(r_{\delta }^{\prime \prime }\), rδ.

Any tensor \(\mathbf {a}\in \otimes ^{\delta }\mathbb {R}^{2}\) (δ ≥ 1) can be written as \(\mathbf {a}=\mathbf {a}^{\prime }\otimes \genfrac {[}{]}{0pt}{1}{1}{0}+\mathbf {a}^{\prime \prime }\otimes \genfrac {[}{]}{0pt}{1}{0}{1}\). Define the linear maps \(\phi _{\delta }^{\prime }\), \(\phi _{\delta }^{\prime \prime }:\otimes ^{\delta }\mathbb {R}^{2}\rightarrow \otimes ^{\delta -1}\mathbb {R}^{2}\) by \(\phi _{\delta }^{\prime }(\mathbf {a})=\mathbf {a}^{\prime }\), \(\phi _{\delta }^{\prime \prime }(\mathbf {a})=\mathbf {a}^{\prime \prime }\).
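For reference, in the full-vector picture (using the index convention from (23), where the last tensor factor carries the most significant binary digit), the maps φ′ and φ″ simply return the two halves of the vector. A minimal sketch of this splitting, with our own helper name:

```python
import numpy as np

def phi_split(a_full):
    """a = a' x [1,0] + a'' x [0,1]: since the last factor is the most significant
    bit, phi'(a) and phi''(a) are the first and second half of the full vector."""
    m = len(a_full) // 2
    return a_full[:m], a_full[m:]

a_prime, a_dprime = phi_split(np.arange(8.0))   # a_prime = [0,1,2,3], a_dprime = [4,5,6,7]
```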

Theorem 2

Let the tensors \(\mathbf {v},\mathbf {w}\in \bigotimes _{j = 1}^{L}\mathbb {R}^{2}\) be represented by (possibly different) hierarchical formats using the respective subspaces \(\mathbf {U}_{\delta }^{\prime }\) and \(\mathbf {U}_{\delta }^{\prime \prime }\), 1 ≤ δ ≤ L, satisfying

$$ \begin{array}[c]{lll} \mathbf{U}_{1}^{\prime}=\mathbb{R}^{2},\qquad & \mathbf{U}_{\delta}^{\prime}\subset\mathbf{U}_{\delta-1}^{\prime}\otimes\mathbb{R} ^{2},\qquad & \mathbf{v}\in\mathbf{U}_{L}^{\prime},\\ \mathbf{U}_{1}^{\prime\prime}=\mathbb{R}^{2}, & \mathbf{U}_{\delta} ^{\prime\prime}\subset\mathbf{U}_{\delta-1}^{\prime\prime}\otimes \mathbb{R}^{2}, & \mathbf{w}\in\mathbf{U}_{L}^{\prime\prime}. \end{array} $$
(35a)

The subspaces

$$ \mathbf{U}_{\delta}:=\operatorname*{span}\{\phi_{\delta+ 1}^{\prime} (\mathbf{x}\star\mathbf{y}),\phi_{\delta+ 1}^{\prime\prime}(\mathbf{x}\star\mathbf{y}):\mathbf{x}\in\mathbf{U}_{\delta}^{\prime},~\mathbf{y}\in\mathbf{U}_{\delta}^{\prime\prime}\}\qquad(1\leq\delta\leq L) $$
(35b)

satisfy

$$ \mathbf{U}_{1}=\mathbb{R}^{2},\quad\mathbf{U}_{\delta}\subset \mathbf{U}_{\delta-1}\otimes\mathbb{R}^{2},\quad\mathbf{v} \star \mathbf{w}\in\mathbf{U}_{L + 1}. $$
(35c)

The dimension of Uδ can be bounded by

$$ \dim(\mathbf{U}_{\delta})\leq\min\left\{2\dim(\mathbf{U}_{\delta}^{\prime})\dim(\mathbf{U}_{\delta}^{\prime\prime}),2^{\delta},2^{L + 1-\delta}\right\}. $$
(35d)

Proof

(i) \(\mathbf {U}_{1}=\mathbb {R}^{2}\) can be concluded from Lemma 3.

(ii) Write \(\mathbf {x}\in \mathbf {U}_{\delta }^{\prime }\subset \mathbf {U}_{\delta -1}^{\prime }\otimes \mathbb {R}^{2}\) and \(\mathbf {y}\in \mathbf {U}_{\delta }^{\prime \prime }\subset \mathbf {U}_{\delta -1}^{\prime \prime }\otimes \mathbb {R}^{2}\) as \(\mathbf {x}=\mathbf {x}^{\prime }\otimes \genfrac {[}{]}{0pt}{1}{1}{0}+\mathbf {x}^{\prime \prime }\otimes \genfrac {[}{]}{0pt}{1}{0}{1}\) and \(\mathbf {y}=\mathbf {y}^{\prime }\otimes \genfrac {[}{]}{0pt}{1}{1}{0}+\mathbf {y}^{\prime \prime }\otimes \genfrac {[}{]}{0pt}{1}{0}{1}\) with \(\mathbf {x}^{\prime },\mathbf {x}^{\prime \prime }\in \mathbf {U}_{\delta -1}^{\prime }\) and \(\mathbf {y}^{\prime },\mathbf {y}^{\prime \prime }\in \mathbf {U}_{\delta -1}^{\prime \prime }\). Expansion of the sums yields \(\mathbf {x}\star \mathbf {y}=(\mathbf {x}^{\prime }\otimes \genfrac {[}{]}{0pt}{1}{1}{0})\star (\mathbf {y}^{\prime }\otimes \genfrac {[}{]}{0pt}{1}{1}{0})+\cdots \) For each term z of this expansion, Lemma 4 (with v, w renamed x′, y′, etc.) states that \(\phi _{\delta + 1}^{\prime }(\mathbf {z}) =\mathbf {u}^{\prime }\) and \(\phi _{\delta + 1}^{\prime \prime }(\mathbf {z})=\mathbf {u}^{\prime \prime }\) belong to \(\mathbf {U}_{\delta -1}\otimes \mathbb {R}^{2}\) (cf. (34b)), since the tensors a′, a″ in (34a) belong to \(\mathbf {U}_{\delta -1}\) by the definition (35b). Hence, \(\phi _{\delta + 1}^{\prime }(\mathbf {x}\star \mathbf {y}),\phi _{\delta + 1}^{\prime \prime }(\mathbf {x}\star \mathbf {y})\in \mathbf {U}_{\delta -1}\otimes \mathbb {R}^{2}\) holds, and the definition of Uδ implies the inclusion \(\mathbf {U}_{\delta }\subset \mathbf {U}_{\delta -1}\otimes \mathbb {R}^{2}\).

(iii) \(\mathbf {v}\in \mathbf {U}_{L}^{\prime }\) and \(\mathbf {w}\in \mathbf {U}_{L}^{\prime \prime }\) together with the definition of UL lead to \(\mathbf {v}\star \mathbf {w}\in \mathbf {U}_{L}\otimes \mathbb {R}^{2}=:\mathbf {U}_{L + 1}\).

(iv) The first bound of dim(Uδ) follows directly from (35b). The bound \(\min \{2^{\delta },2^{L + 1-\delta }\}\) holds for the rank of any matricisation \(M^{(1,\ldots ,\delta )}(\mathbf {v})\) of \(\mathbf {v}\in \otimes ^{L + 1}\mathbb {R}^{2}\). □

The bound \(2\dim (\mathbf {U}_{\delta }^{\prime })\dim (\mathbf {U}_{\delta }^{\prime \prime })\) corresponds to the product mentioned in Remark 3.

For δ = 1,…,L, the numerical scheme has

  1. to introduce an orthonormal basis \(\{\mathbf {b}_{1}^{(\delta )},\ldots ,\mathbf {b}_{r_{\delta }}^{(\delta )}\}\) of Uδ, where rδ := dim(Uδ) (cf. Section 3.5),

  2. to represent the convolution \(\mathbf {b}_{i}^{\prime (\delta )}\star \mathbf {b}_{j}^{\prime \prime (\delta )}\) by

    $$ \mathbf{b}_{i}^{\prime(\delta)}\star\mathbf{b}_{j}^{\prime\prime(\delta)}= \sum\limits_{k = 1}^{r_{\delta}}\sum\limits_{m = 1}^{2}\beta_{ij,km}^{(\delta)}\mathbf{b}_{k}^{(\delta)}\otimes b_{m}. $$
    (36)

As soon as the β-coefficients from (36) are known, general products x ⋆ y of \(\mathbf {x}\in \mathbf {U}_{\delta }^{\prime }\) and \(\mathbf {y}\in \mathbf {U}_{\delta }^{\prime \prime }\) can be evaluated easily as shown in the next remark.

Remark 10

Let \(\mathbf {x}={\sum }_{i = 1}^{r_{\delta }^{\prime }}\xi _{i}\mathbf {b}_{i}^{\prime (\delta )}\in \mathbf {U}_{\delta }^{\prime }\) and \(\mathbf {y}={\sum }_{j = 1}^{r_{\delta }^{\prime \prime }}\eta _{j}\mathbf {b}_{j}^{\prime \prime (\delta )}\in \mathbf {U}_{\delta }^{\prime \prime }\). Then, convolution yields

$$\begin{array}{@{}rcl@{}} &&\mathbf{x}\star\mathbf{y}=\mathbf{z}=\mathbf{z}^{\prime}\otimes \genfrac{[}{]}{0pt}{1}{1}{0} +\mathbf{z}^{\prime\prime}\otimes \genfrac{[}{]}{0pt}{1}{0}{1} \quad\text{ with}\quad\mathbf{z}^{\prime}=\sum\limits_{k = 1}^{r_{\delta}}\zeta_{k}^{\prime}\mathbf{b}_{k}^{(\delta)},~~\mathbf{z}^{\prime\prime}=\sum\limits_{k = 1}^{r_{\delta}}\zeta_{k}^{\prime\prime}\mathbf{b}_{k}^{(\delta)},\\ \text{where}\quad&&\zeta_{k}^{\prime}=\sum\limits_{i = 1}^{r_{\delta}^{\prime}}\sum\limits_{j = 1}^{r_{\delta}^{\prime\prime}}\xi_{i}\eta_{j}\beta_{ij,k1}^{(\delta)}\quad\text{ and }\quad\zeta_{k}^{\prime\prime}=\sum\limits_{i = 1}^{r_{\delta}^{\prime}}\sum\limits_{j = 1}^{r_{\delta}^{\prime\prime}}\xi_{i}\eta_{j}\beta_{ij,k2}^{(\delta)} \end{array} $$

with \(\beta _{ij,km}^{(\delta )}\) from (36). The computation of \(\zeta _{k}^{\prime }\), \(\zeta _{k}^{\prime \prime }~~(1\leq k\leq r_{\delta })\) requires \(4r_{\delta }r_{\delta }^{\prime }(r_{\delta }^{\prime \prime }+ 1)\) operations.
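The following sketch illustrates, for a single level δ, how an orthonormal basis of Uδ can be obtained from the spanning set (35b), how the β-coefficients of (36) can be computed by orthogonal projection, and how x ⋆ y is then evaluated via the formulas of Remark 10. It is a simplification for illustration only: it works with full vectors of length 2^δ instead of the recursive hierarchical data, and the bases of \(\mathbf {U}_{\delta }^{\prime }\), \(\mathbf {U}_{\delta }^{\prime \prime }\) are chosen randomly.

```python
import numpy as np

d = 3                                            # level delta; vectors have length 2**d
rng = np.random.default_rng(1)

def conv_pad(x, y):
    """Full convolution, zero-padded to length 2 * len(x)."""
    c = np.convolve(x, y)
    return np.pad(c, (0, 2 * len(x) - len(c)))

# Orthonormal bases b'_i of U'_d and b''_j of U''_d (columns), chosen randomly here.
Bp  = np.linalg.qr(rng.standard_normal((2**d, 2)))[0]
Bpp = np.linalg.qr(rng.standard_normal((2**d, 2)))[0]

# (35b): U_d is spanned by the phi'- and phi''-parts (the two halves) of all b'_i * b''_j.
spanning = []
for i in range(Bp.shape[1]):
    for j in range(Bpp.shape[1]):
        c = conv_pad(Bp[:, i], Bpp[:, j])
        spanning += [c[:2**d], c[2**d:]]
U_svd, s, _ = np.linalg.svd(np.array(spanning).T, full_matrices=False)
r = int(np.sum(s > 1e-12 * s[0]))
B = U_svd[:, :r]                                 # orthonormal basis b_k of U_d

# (36): beta[i, j, k, m], where m = 0 is the [1,0]-part and m = 1 the [0,1]-part.
beta = np.zeros((Bp.shape[1], Bpp.shape[1], r, 2))
for i in range(Bp.shape[1]):
    for j in range(Bpp.shape[1]):
        c = conv_pad(Bp[:, i], Bpp[:, j])
        beta[i, j, :, 0] = B.T @ c[:2**d]
        beta[i, j, :, 1] = B.T @ c[2**d:]

# Remark 10: convolve x = sum_i xi_i b'_i and y = sum_j eta_j b''_j via the beta's.
xi, eta = rng.standard_normal(Bp.shape[1]), rng.standard_normal(Bpp.shape[1])
zeta1 = np.einsum('i,j,ijk->k', xi, eta, beta[:, :, :, 0])
zeta2 = np.einsum('i,j,ijk->k', xi, eta, beta[:, :, :, 1])
z = np.concatenate([B @ zeta1, B @ zeta2])       # z' part followed by z'' part

assert np.allclose(z, conv_pad(Bp @ xi, Bpp @ eta))
```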

The total cost is described in [9, p. 482]. It is the sum

$$ 8r_{\delta}^{\prime\prime}r_{\delta-1}^{\prime}r_{\delta-1}\left( r_{\delta-1}^{\prime\prime}+r_{\delta}^{\prime}\right) + 8\left( r_{\delta}^{\prime}r_{\delta}^{\prime\prime}\right)^{2}r_{\delta-1}+\frac{4}{3}\left( r_{\delta}^{\prime}r_{\delta}^{\prime\prime}\right)^{3} + 2r_{\delta-1}r_{\delta}^{2}\qquad\text{ for }2\leq\delta\leq L. $$
(37)

A rough estimate using \(r_{\delta }^{\prime },r_{\delta }^{\prime \prime }\leq r\) and \(r_{\delta }\leq 2r^{2}\) yields the asymptotic bound \(\frac {100}{3}(L-1)r^{6}\). The highest-order terms are caused by the orthonormalisation.

5 Toeplitz Matrices

5.1 Notation

A matrix (aij) is called a Toeplitz matrix if aij only depends on ij. A multiplication by a Toeplitz matrix and a convolution are almost equivalent (cf. Kazeev et al. [14]).

If we fix the vector x in x ⋆ y, this expression defines a linear map y ↦ x ⋆ y which may be expressed by a matrix T = Tx, i.e., Ty := x ⋆ y. In the case of \(x,y\in \mathbb {R}^{n}\) and \(x\star y\in \mathbb {R}^{2n-1}\), T is the (rectangular) Toeplitz matrix of size (2n − 1) × n with Ti0 = xi (0 ≤ i ≤ n − 1), Tn− 1 + i,0 = T0i = 0 (1 ≤ i ≤ n − 1).

A general n × n Toeplitz matrix is uniquely determined by the coefficient vector a = [a0,…,a2n− 2]:

$$ T(a):=\left[ \begin{array}{llll} a_{n-1} & a_{n-2} & {\cdots} & a_{0}\\ a_{n} & {\ddots} & {\ddots} & \vdots\\ {\vdots} & {\ddots} & {\ddots} & a_{n-2}\\ a_{2n-2} & {\cdots} & a_{n} & a_{n-1} \end{array} \right],\qquad \begin{array}{ll} \text{i.e., } T(a)_{i,j}=a_{n-1+i-j}\\ \text{for } 0\leq i,j\leq n-1. \end{array} $$
(38)

The product z := a ⋆ y belongs to \(\mathbb {R}^{3n-2}\). The part \(\hat {z}\) with \(\hat {z}_{i}:=z_{n-1+i}\) (0 ≤ i ≤ n − 1) coincides with \(T(a)y\in \mathbb {R}^{n}\).
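As a quick numerical check of this relation (a sketch with our own helper names; the construction of T(a) follows (38)):

```python
import numpy as np

def T_of_a(a, n):
    """Toeplitz matrix (38): T(a)[i, j] = a[n-1+i-j] for 0 <= i, j <= n-1."""
    return np.array([[a[n - 1 + i - j] for j in range(n)] for i in range(n)])

rng = np.random.default_rng(2)
n = 8
a = rng.standard_normal(2 * n - 1)               # coefficient vector a_0, ..., a_{2n-2}
y = rng.standard_normal(n)

z = np.convolve(a, y)                            # a * y, length 3n - 2
assert np.allclose(z[n - 1 : 2 * n - 1], T_of_a(a, n) @ y)   # middle part equals T(a) y
```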

5.2 Tensorisation for Matrices

The matrix space \(\mathbb {R}^{n\times n}\) for n = 2L is isomorphic to \(\bigotimes _{j = 1}^{L}\mathbb {R}^{2\times 2}\). As in (23), the isomorphism \(\mathbf {M}\in \bigotimes _{j = 1}^{L}\mathbb {R}^{2\times 2}\mapsto M\in \mathbb {R}^{n\times n}\) is defined by M[i,j] = M[(i1,j1),…,(iL,jL)] where \(i={\sum }_{\ell = 1}^{L}i_{\ell }2^{\ell -1},~j={\sum }_{\ell = 1}^{L}j_{\ell }2^{\ell -1},~i_{\ell },j_{\ell }\in \{0,1\}\) (cf. [17]). In particular, a block matrix \(\left [ \begin {array}[c]{cc} M_{11} & M_{12}\\ M_{21} & M_{22} \end {array} \right ]\) corresponds to the tensor \(M_{11}\otimes \left [ \begin {array}[c]{cc} 1 & 0\\ 0 & 0 \end {array} \right ] +M_{12}\otimes \left [ \begin {array}[c]{cc} 0 & 1\\ 0 & 0 \end {array} \right ] +M_{21}\otimes \left [ \begin {array}[c]{cc} 0 & 0\\ 1 & 0 \end {array} \right ] +M_{22}\otimes \left [ \begin {array}[c]{cc} 0 & 0\\ 0 & 1 \end {array} \right ]\).
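A small sketch verifying the block-matrix correspondence in the Kronecker picture: since the last tensor factor carries the most significant binary digits of i and j, it appears as the leftmost Kronecker factor. The helper names and the 4×4 block size are our own choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
M11, M12, M21, M22 = (rng.standard_normal((4, 4)) for _ in range(4))
M = np.block([[M11, M12], [M21, M22]])

def E(p, q):
    """2x2 matrix with a single 1 at position (p, q)."""
    out = np.zeros((2, 2))
    out[p, q] = 1.0
    return out

# Block matrix <-> M11 x E(0,0) + M12 x E(0,1) + M21 x E(1,0) + M22 x E(1,1),
# with the 2x2 factor (the last one) as the leftmost Kronecker factor.
M_tensor = (np.kron(E(0, 0), M11) + np.kron(E(0, 1), M12)
            + np.kron(E(1, 0), M21) + np.kron(E(1, 1), M22))
assert np.allclose(M, M_tensor)
```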

In the case of a Toeplitz matrix, all submatrices are again Toeplitz. In the block splitting above, M11 = M22 follows. Therefore, a suitable subspace U of \(\mathbb {R}^{2\times 2}\) is spanned by \(b_{1}:=\left [ \begin {array}[c]{cc} 0 & 1\\ 0 & 0 \end {array} \right ]\), \(b_{2}:=\left [ \begin {array}[c]{cc} 1 & 0\\ 0 & 1 \end {array} \right ]\), \(b_{3}:=\left [ \begin {array}[c]{cc} 0 & 0\\ 1 & 0 \end {array} \right ]\). For the hierarchical representation, we use the linear tree of Fig. 5 with \(\mathbb {R}^{2}\) replaced by U.

The TT-rank rμ = dim(Uμ) is described next. Let \(T=T(a)\in \mathbb {R}^{n\times n}\) be a Toeplitz matrix defined by the coefficient vector \(a\in \mathbb {R}^{2n-1}\) (cf. (38)). Consider a regular block structure of T with blocks of size 2μ × 2μ. Denote these blocks by \(T^{\alpha \beta }=(T_{ij})_{\alpha 2^{\mu }\leq i\leq (\alpha + 1) 2^{\mu }-1,~ \beta 2^{\mu }\leq j\leq (\beta + 1) 2^{\mu }-1}\) for 0 ≤ α,β ≤ 2Lμ − 1. Then, the matricisation yields Uμ = span{Tαβ : 0 ≤ α,β ≤ 2Lμ − 1} and rμ = dim(Uμ).

A simpler description follows from the fact that

$$T^{\alpha\beta}=T\left( \left[a_{n+(\alpha-\beta-1)2^{\mu}},\dots,a_{n-2+(\alpha-\beta+ 1) 2^{\mu}}\right]\right)=T(a^{(\alpha-\beta)}), $$

where \(a^{(\gamma )}=[a_{n+(\gamma -1)2^{\mu }},\dots ,a_{n-2+(\gamma + 1) 2^{\mu }}]\in \mathbb {R}^{2^{\mu + 1}-1}\) is a part of the vector a defining T = T(a). Since the linear map aT(a) is an isomorphism, we obtain the TT-ranks

$$\begin{array}{@{}rcl@{}} r_{\mu} &=& \dim(\mathbf{U}_{\mu})=\dim\operatorname{span}\{a^{(\gamma)}:1-2^{L-\mu}\leq\gamma\leq2^{L-\mu}-1\}\\ & =&\operatorname{rank}\left[ \begin{array}[c]{cccc} a_{0} & a_{2^{\mu}} & {\ldots} & a_{2n-2\cdot2^{\mu}}\\ a_{1} & a_{2^{\mu}+ 1} & {\ldots} & a_{2n-2\cdot2^{\mu}+ 1}\\ {\vdots} & {\vdots} & {\ddots} & \vdots\\ a_{2\cdot2^{\mu}-2} & a_{3\cdot2^{\mu}-2} & {\ldots} & a_{2n-2} \end{array} \right]. \end{array} $$
(39)

The latter matrix looks similar to the matricisation \(M^{(\mu )}\) in (24). It can be used for the following bound (cf. [14]).

Lemma 5

The TT-rank rμ of T = T(a) is bounded by 2rμ(a), where rμ(a) is the TT-rank of the tensorisation of the vector \(a\in \mathbb {R}^{2n}\) (here a2n− 1 can be defined arbitrarily).

Proof

Split the matrix in (39) into the upper part \(\left [ \begin {array}[c]{ccc} a_{0} & {\ldots } & a_{2n-2\cdot 2^{\mu }}\\ {\vdots } & {\ddots } & \vdots \\ a_{2^{\mu }-1} & {\ldots } & a_{2n-2^{\mu }-1} \end {array} \right ]\) and the lower part \(\left [ \begin {array}[c]{ccc} a_{2^{\mu }} & {\ldots } & a_{2n-2^{\mu }}\\ {\vdots } & {\ddots } & \vdots \\ a_{2\cdot 2^{\mu }-1} & {\ldots } & a_{2n-1} \end {array}\right ]\), where the last row (containing the additional entry a2n− 1) is added. The rank in (39) is bounded by the sum of the ranks of the latter two matrices. These, however, are submatrices of the matricisation \(M^{(\mu )}\) belonging to the vector a. This proves the assertion. □
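The rank formula (39) and the bound of Lemma 5 can be checked numerically. The sketch below forms the matrix in (39) explicitly and compares its rank with twice the rank of the matricisation of the zero-padded vector a; the banded example vector is our own choice for illustration.

```python
import numpy as np

def tt_rank_toeplitz(a, L, mu):
    """Rank r_mu of T(a) via (39); a has length 2n-1 with n = 2**L, 1 <= mu <= L-1."""
    n = 2 ** L
    cols = [a[n + (g - 1) * 2**mu : n + (g - 1) * 2**mu + 2**(mu + 1) - 1]
            for g in range(1 - 2**(L - mu), 2**(L - mu))]
    return np.linalg.matrix_rank(np.array(cols).T)

def tt_rank_vector(a_padded, L, mu):
    """Rank r_mu(a) of the tensorised vector a in R^{2n}: rank of the matricisation
    separating the mu finest binary digits from the remaining ones."""
    return np.linalg.matrix_rank(a_padded.reshape(2 ** (L + 1 - mu), 2 ** mu))

L = 4
n = 2 ** L
a = np.zeros(2 * n - 1)
a[n - 2 : n + 2] = [1.0, -2.0, 1.0, 0.5]          # a banded Toeplitz matrix T(a)
a_padded = np.append(a, 0.0)                      # a_{2n-1} chosen arbitrarily (here 0)

for mu in range(1, L):
    assert tt_rank_toeplitz(a, L, mu) <= 2 * tt_rank_vector(a_padded, L, mu)   # Lemma 5
```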

5.3 Matrix-Vector Multiplication

For the evaluation of the product Ty, we assume that the Toeplitz matrix T is expressed by the tensorised analogue \(\mathbf {T}\in \bigotimes _{j = 1}^{L}\mathbb {R}^{2\times 2}\). Here, it is important that for the tensorised quantities \(\mathbf {T}=\bigotimes _{j = 1}^{L}T^{(j)}\) and \(\mathbf {y}=\bigotimes _{j = 1}^{L}y^{(j)}\) the directionwise product \(\mathbf {z}:=\bigotimes _{j = 1}^{L}(T^{(j)}y^{(j)})\) is the tensorisation of z = Ty.
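This directionwise rule is the standard Kronecker-product identity (A ⊗ B)(x ⊗ y) = (Ax) ⊗ (By). A minimal check for elementary tensors, expanded to full matrices and vectors only for the test (names are ours; under the index convention (23) the last factor is the leftmost Kronecker factor):

```python
import numpy as np
from functools import reduce

rng = np.random.default_rng(4)
L = 3
Ts = [rng.standard_normal((2, 2)) for _ in range(L)]     # factors T^(j)
ys = [rng.standard_normal(2) for _ in range(L)]          # factors y^(j)

T_full = reduce(lambda acc, A: np.kron(A, acc), Ts, np.eye(1))
y_full = reduce(lambda acc, v: np.kron(v, acc), ys, np.ones(1))
z_full = reduce(lambda acc, Tv: np.kron(Tv[0] @ Tv[1], acc), zip(Ts, ys), np.ones(1))

# (x_j T^(j)) (x_j y^(j)) = x_j (T^(j) y^(j)): directionwise products tensorise T y
assert np.allclose(T_full @ y_full, z_full)
```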

The hierarchical representation of T uses the bases \({~}_{T}\mathbf {b}_{\ell }^{(\mu )}\) (1 ≤ ℓ ≤ rμ) of Uμ, while the leaves j are associated with the subspaces Uj = U spanned by the fixed basis \({b_{1}^{U}}:=\left [ \begin {array}[c]{cc} 0 & 1\\ 0 & 0 \end {array} \right ]\), \({b_{2}^{U}}:=\left [ \begin {array}[c]{cc} 1 & 0\\ 0 & 1 \end {array} \right ]\), \({b_{3}^{U}}:=\left [ \begin {array}[c]{cc} 0 & 0\\ 1 & 0 \end {array} \right ]\). The coefficient matrices are \({~}_{T}C^{(\mu ,\ell )}=\left ({~}_{T}c_{ij}^{(\mu ,\ell )}\right )\), i.e., \(_{T}\mathbf {b}_{\ell }^{(\mu )}={\sum }_{i = 1}^{r_{\mu -1}}{\sum }_{j = 1}^{3} {{~}_{T}c_{ij}^{(\mu ,\ell )}}{{~}_{T}\mathbf {b}_{i}^{(\mu -1)}} \otimes {b_{j}^{U}}\).

Let \(y\in \mathbb {R}^{n}\) have the tensorised analogue \(\mathbf {y}\in \bigotimes _{j = 1}^{L}\mathbb {R}^{2}\) represented via (26) with data \(_{y}c_{ij}^{(\mu + 1,\ell )}\) and \(_{y}\mathbf {b}_{i}^{(\mu )}\). At the leaves, the basis vectors \(b_{1}:=\genfrac {[}{]}{0pt}{1}{1}{0}\), \(b_{2}:=\genfrac {[}{]}{0pt}{1}{0}{1}\) are fixed.

Then, the product \(z:=Ty\in \mathbb {R}^{n}\) has the tensorised analogue \(\mathbf {z}\in \bigotimes _{j = 1}^{L}\mathbb {R}^{2}\) with data \(_{z}c_{(\ell ,m),j}^{(\mu + 1,\ell )}\) and \(_{z}\mathbf {b}_{(\ell ,m)}^{(\mu )}\) which are obtained as follows. The recursion

$$\begin{array}{@{}rcl@{}} {~}_{z}\mathbf{b}_{(\ell,m)}^{(\mu)} & :=&{{~}_{T}\mathbf{b}_{\ell}^{(\mu)}}~{{~}_{y}\mathbf{b}_{m}^{(\mu)}} =\left( \sum\limits_{i,j}~ {{~}_{T}c_{ij}^{(\mu,\ell)}} ~{{~}_{T}\mathbf{b}_{i}^{(\mu-1)}} \otimes {b_{j}^{U}}\right) \left( \sum\limits_{i^{\prime},j^{\prime}}~ {{~}_{y}c_{i^{\prime}j^{\prime}}^{(\mu,m)}}~ {{~}_{y}\mathbf{b}_{i^{\prime}}^{(\mu-1)}} \otimes b_{j^{\prime}}\right) \\ & =&\sum\limits_{i,j,i^{\prime},j^{\prime}} ~{{~}_{T}c_{ij}^{(\mu,\ell)}} {_{y}c_{i^{\prime}j^{\prime}}^{(\mu,m)}} \left( {{~}_{T}\mathbf{b}_{i}^{(\mu-1)}}~ {{~}_{y}\mathbf{b}_{i^{\prime}}^{(\mu-1)}}\right) \otimes\left( {b_{j}^{U}}b_{j^{\prime}}\right) \\ & =&\sum\limits_{i,i^{\prime}}\sum\limits_{(j,j^{\prime})\in\{(1,2),(2,1)\}} ~{{~}_{T}c_{ij}^{(\mu,\ell)}} ~{{~}_{y}c_{i^{\prime}j^{\prime}}^{(\mu,m)}} \left( {{~}_{T}\mathbf{b}_{i}^{(\mu-1)}}~{{~}_{y}\mathbf{b}_{i^{\prime}}^{(\mu-1)}}\right) \otimes b_{1}\\ &&+\sum\limits_{i,i^{\prime}}\sum\limits_{(j,j^{\prime})\in\{(2,2),(3,1)\}}~ {{~}_{T}c_{ij}^{(\mu,\ell)}} ~{{~}_{y}c_{i^{\prime}j^{\prime}}^{(\mu,m)}} \left( {{~}_{T}\mathbf{b}_{i}^{(\mu-1)}}~{{~}_{y}\mathbf{b}_{i^{\prime}}^{(\mu-1)}}\right) \otimes b_{2} \end{array} $$

corresponds to (18). Here, we use that at the leaves the products \({b_{i}^{U}}b_{j}\) (i = 1,2,3; j = 1,2) are either b1 or b2 or zero. At the root, we obtain the result \(\mathbf {z}=\mathbf {Ty}= {{~}_{T}c_{1}^{(L)}} ~{{~}_{y}c_{1}^{(L)}}~ {{~}_{z}\mathbf {b}_{(1,1)}^{(L)}}\).

The required number of operations is \(8{\sum }_{\mu = 1}^{L}r_{\mu }(T)r_{\mu }(y)r_{\mu -1}(T)r_{\mu -1}(y)\). Using Lemma 5 for T = T(a) and the bound r := maxμ{rμ(y),rμ(a)}, we obtain the work bound \(8{\sum }_{\mu = 1}^{L}r_{\mu }(T)r_{\mu }(y)r_{\mu -1}(T)r_{\mu -1}(y)\leq 32Lr^{4}\). As in (37), the main cost is caused by the orthonormalisation.

6 Additional Remarks

As mentioned above, the convolution can be computed via Fourier forward and backward transforms. As explained in [10, Section 14.4], the Fourier transform \(v\mapsto \hat {v}\) can be realised by using the TT format of the tensorisation of v. The algorithm in Section 4.4 yields the exact convolution. The exact Fourier transform of the tensorised v may produce intermediate results with increasing rank. Therefore, a statement as in (35d) cannot be obtained. Nevertheless, practical examples with intermediate truncation seem to give satisfactory results (cf. Dolgov et al. [3]).