1 Introduction

In this paper, we present an extension of the non-Hermitian Lanczos algorithm (see, e.g., [25]) where the inputs are a 4-mode tensor \(\boldsymbol {\mathcal {A}} \in \mathbb {C}^{N \times N \times M \times M}\) and vectors \(\mathbf {w}, \mathbf {v} \in \mathbb {C}^{N}\) such that \(\mathbf{w}^{H}\mathbf{v} \neq 0\). We aim to use the introduced algorithm to approximate the bilinear form \(\mathbf{w}^{H}\boldsymbol{\mathsf{U}}(t)\mathbf{v}\), where \(\boldsymbol {\mathsf {U}}(t) \in \mathbb {C}^{N \times N}\) is the so-called time-ordered exponential, i.e., the solution of the ordinary differential equation

$$ \frac{d}{dt}\boldsymbol{\mathsf{U}}(t) = \boldsymbol{\mathsf{A}}(t) \boldsymbol{\mathsf{U}}(t), \quad \boldsymbol{\mathsf{U}}({a})=I_{N}, \quad t \in I = [{a},b], $$
(1)

where \(I_{N}\in \mathbb {R}^{N\times N}\) is the identity matrix and \(\boldsymbol {\mathsf {A}}(t) \in \mathbb {C}^{N \times N}\) is a smooth matrix-valued function defined on the real interval I. Equation (1) emerges in a variety of applications. For example, its solution is crucial in quantum physics, where the matrix \(\boldsymbol{\mathsf{A}}(t)\) corresponds to the Hamiltonian operator. Situations where \(\boldsymbol{\mathsf{U}}(t)\) has no accessible expression are frequent in the literature, see, e.g., [1, 7, 38, 49]. For instance, in Nuclear Magnetic Resonance (NMR) experiments, the associated bilinear form \(\mathbf{w}^{H}\boldsymbol{\mathsf{U}}(t)\mathbf{v}\) represents the measurement of changes in an applied magnetic field caused by nuclear spins that are excited with electromagnetic waves, i.e., spectroscopy [29, 39]. Other applications are found in control theory, filter design, and model reduction problems [4, 5, 14, 37, 47]. In the mentioned applications, the matrix \(\boldsymbol{\mathsf{A}}(t)\) is often large, or even huge, and sparse. The introduced algorithm is motivated and theoretically supported by a new expression for the bilinear form, obtained by combining the two symbolic methods known as Path-sum and the ⋆-Lanczos algorithm [21,22,23,24]. Given the matrix-valued function \(\boldsymbol{\mathsf{A}}(t)\) and the vectors \(\mathbf{w},\mathbf{v}\), the two symbolic methods produce an expression for the bilinear form \(\mathbf{w}^{H}\boldsymbol{\mathsf{U}}(t)\mathbf{v}\) composed of a finite and tractable number of integrals and scalar integral equations. To our knowledge, no other symbolic method can express the bilinear form with a tractable, finite number of integral subproblems. Two commonly used alternative expressions are given by the Magnus series, i.e., an infinite series of nested integrals (e.g., [40]), and by Floquet theory, where the solution of an infinite system of coupled linear differential equations is required (e.g., [7]).

The integrals and the integral equations generated by the ⋆-Lanczos and Path-sum methods do not always have an easily accessible solution. As a consequence, a numerical approach is needed. A possible strategy for the numerical approximation of the mentioned integrals and the integral equations is outlined in [23] and it is based on the discretization of the interval I into M − 1 equispaced subintervals. The algebraic objects resulting from the discretization strategy are the 4-mode tensor \(\boldsymbol {\mathcal {A}}\) (corresponding to A(t)) and the 3-mode tensors V,W (corresponding to v,w).

The outputs obtained by combining the ⋆-Lanczos algorithm with the mentioned discretization strategy are mathematically equivalent to the outputs of the tensor Lanczos algorithm presented here with, as inputs, \(\boldsymbol {\mathcal {A}}, \mathbf {v}, \mathbf {w}\). The main goal of this paper is to show that, in fact, the tensor Lanczos algorithm converges to the outcome of the ⋆-Lanczos method with an accuracy of the same order as the discretization strategy. Moreover, the reported numerical experiments will show that the approximation of \(\mathbf{w}^{H}\boldsymbol{\mathsf{U}}(t)\mathbf{v}\) obtained by combining tensor Lanczos with the discretized Path-sum approach also converges to the solution at the order of the discretization. Naturally, many numerical methods for the solution of non-autonomous ODEs can be found in the literature, see, for instance, [2, 6, 7, 10, 13, 15, 30, 34,35,36]. For large matrices, these numerical methods are known to be highly demanding both in terms of computational cost and storage. This motivates the search for novel approaches suitable for large-scale problems. In order to be competitive with the most advanced techniques, tensor Lanczos needs to be used in combination with more accurate discretization schemes. The development of suitable, faster converging discretization schemes is ongoing research and outside the scope of this work. At the same time, it is important to note that the algorithm proposed here is part of a wider class of tensor extensions of Krylov subspace methods that recently appeared in the literature, see, e.g., [18, 26, 33, 46].

The Lanczos-type process we introduce can also be equivalently written as a block Lanczos method, since the 4-mode tensor \(\boldsymbol {\mathcal {A}}\) can be seen as a block matrix; information about block Krylov subspace methods can be found, e.g., in [20]. Despite this fact, we prefer to interpret such a block structure in a tensorial fashion. Indeed, the tensorial approach has a direct translation in terms of a discretized ⋆-Lanczos algorithm. Moreover, as we will experimentally show, this interpretation is motivated by observing that several tensors from real-world examples related to (1) admit a low-parametric approximation known as the Tensor Train (TT) decomposition [41, 42]. Such a low-parametric approximation allows one to efficiently manipulate and store the tensors. This paves the way for further improvements of our proposal, where the TT structure is fully exploited in the Lanczos-type procedure. Examples of tensor Krylov subspace methods combined with the TT decomposition can be found in [18, 48], further motivating our tensor-based point of view.

In more detail, this work is organized as follows. Preliminaries and definitions of tensor operations are given in Section 2. In Section 3 we discuss how to construct the non-Hermitian Lanczos procedure for tensors and we prove several crucial properties. In Section 4 we discuss the breakdown issue which typically arises when working with non-Hermitian Lanczos approaches. Numerical experiments are presented in Section 5, where we also give several examples exposing the low-rank TT structure of the considered tensors \(\boldsymbol {\mathcal {A}}\). Section 6 concludes the paper and Appendix A contains several proofs.

2 Preliminaries

In this work, we use a notation borrowed from Matlab®. Fixing \(i_{1}\in \{1,\dots ,N_{1}\}\) and \(i_{2}\in \{1,\dots ,N_{2}\}\), if \(\boldsymbol {\mathcal {A}} \in \mathbb {C}^{N_{1} \times N_{2} \times M \times M }\), then \(\boldsymbol {\mathcal {A}}_{i_{1},i_{2},:,:}\) stands for the matrix

$$ \boldsymbol{\mathcal{A}}_{i_{1},i_{2},:,:}:=\left[\boldsymbol{\mathcal{A}}_{i_{1},i_{2},j_{1},j_{2}}\right]_{j_{1},j_{2}= 1}^{M}. $$

This notation similarly applies to 3-mode tensors, matrices, and vectors. Table 1 summarizes the notation used in the paper.

Table 1 Summary of notation

In the following, we define several tensorial operations, which can be seen as generalizations of the usual products involving matrices and vectors. We summarize them in Table 2.

Table 2 Each of the considered products involves two tensors with n and m modes and gives as an outcome a tensor with k modes

In the following definitions we consider the tensors \(\boldsymbol {\mathcal {A}} \in \mathbb {C}^{N_{1}\times N_{2} \times M \times M },\boldsymbol {{\mathscr{B}}} \in \mathbb {C}^{N_{2} \times N_{3} \times M \times M }\), \(A \in \mathbb {C}^{N_{2} \times M \times M }\), \(B \in \mathbb {C}^{N_{1} \times M \times M }\), \(\boldsymbol {\alpha } \in \mathbb {C}^{M \times M}\). Moreover, the indices \(i_{1}\in \{1,\dots ,N_{1}\}\) and \(i_{2}\in \{1,\dots ,N_{2}\}\) are fixed.

Definition 1 (∗-Tensor product)

The product \((\boldsymbol {\mathcal {A}}*\boldsymbol {{\mathscr{B}}})\in \mathbb {C}^{N_{1} \times N_{3} \times M \times M } \) is defined as

$$ (\boldsymbol{\mathcal{A}}*\boldsymbol{\mathcal{B}})_{i_{1},i_{2},:,:}:= {\sum}_{k=1}^{N_{2}}{\boldsymbol{\mathcal{A}}_{i_{1},k,:,:}} \boldsymbol{\mathcal{B}}_{k,i_{2},:,:} . $$

Definition 2 (Tensor-Hypervector product)

The product \((\boldsymbol {\mathcal {A}}*A) \in \mathbb {C}^{N_{1} \times M \times M }\) is defined as

$$ (\boldsymbol{\mathcal{A}}*A)_{i_{1},:,:}:= {\sum}_{k=1}^{N_{2}}{\boldsymbol{\mathcal{A}}_{i_{1},k,:,:}} A_{k,:,:}. $$

We also need to define the action of a 3-mode tensor from the left. Every tensor with three modes that acts, will act, or is the outcome of a ∗-product from the left will be denoted with a “D” (dual) superscript and, in the remainder of this work, we will use \(B^{D}_{k,:,:}\) to denote \((B^{D})_{k,:,:}\). We define \( (B^{D}*\boldsymbol {\mathcal {A}})^{D}\in \mathbb {C}^{N_{2} \times M \times M }\) as

$$ (B^{D}*\boldsymbol{\mathcal{A}})^{D}_{i_{2},:,:}:= {\sum}_{k=1}^{N_{1}}{{B^{D}_{k,:,:}} \boldsymbol{\mathcal{A}}_{k,i_{2},:,: }}. $$

Note that the following 4-mode tensor is the identity for ∗-products introduced above

$$ \mathbb{C}^{N_{1} \times N_{1} \times M \times M} \ni (\boldsymbol{\mathcal{I}}_{*})_{i_{1},i_{2},: , :}:= \left \{\begin{array}{ll} I_{M}, & \text{ if } i_{1}=i_{2} \\ 0_{M}, & \text{otherwise} \end{array} \right. . $$

Definition 3 (Hypervector inner-product)

The product \((B^{D}*A) \in \mathbb {C}^{M\times M}\) is defined as

$$ (B^{D}*A)_{:,:}:= {\sum}_{k=1}^{N_{1}}{B^{D}_{k,:,:} A_{k,:,: }}. $$

Definition 4 (Tensor-matrix product)

The products \((\boldsymbol {\mathcal {A}} \times \boldsymbol {\alpha }), (\boldsymbol {\alpha } \times \boldsymbol {\mathcal {A}} ) \in \mathbb {C}^{N_{1} \times N_{2} \times M \times M }\) are defined as

$$ (\boldsymbol{\mathcal{A}} \times \boldsymbol {\alpha})_{i_{1},i_{2},:,:}:= {\boldsymbol{\mathcal{A}}_{i_{1},i_{2},:,: }} \boldsymbol {\alpha} \quad \text{ and } \quad (\boldsymbol {\alpha} \times \boldsymbol{\mathcal{A}} )_{i_{1},i_{2},:,:}:= {\boldsymbol {\alpha} \boldsymbol{\mathcal{A}}_{i_{1},i_{2},:,: }}. $$

Definition 5 (Hypervector-matrix product)

The products \(({A} \times \boldsymbol {\alpha }), (\boldsymbol {\alpha } \times {A} ) \in \mathbb {C}^{N_{2} \times M \times M }\) are defined as

$$ ({A} \times \boldsymbol {\alpha})_{i_{1},:,:}:= {{A}_{i_{1},:,: }}\boldsymbol {\alpha} \quad \text{ and } \quad (\boldsymbol {\alpha} \times {A} )_{i_{1},:,:}:= {\boldsymbol {\alpha}{A}_{i_{1},:,: }}. $$

Definition 6 (Vector-to-Hypervector)

Given \(\mathbf {a} \in \mathbb {C}^{N}\) we define the product \( A=\mathbf {a}\otimes I_{M} \in \mathbb {C}^{N \times M \times M}\) as

$$A_{i_{1},:,:}=\mathbf{a}_{i_{1}}I_{M} \quad i_{1} \in \{1,\dots,N\}.$$

Note that, rearranging A as a block matrix, we recover the usual Kronecker product. All the products are clearly distributive with respect to the usual addition. On the other hand, the associativity of some of the products is less obvious. Therefore, we state it in the following Lemma 1, postponing its proof to Appendix A.

Lemma 1

The tensor-tensor and tensor-hypervector ∗-products are associative. In particular:

  • Given \(\boldsymbol {\mathcal {A}} \in \mathbb {C}^{N_{1} \times N_{1} \times M \times M}\), \(A \in \mathbb {C}^{N_{1} \times M \times M}\) we have

    $$ (\boldsymbol{\mathcal{A}}*\boldsymbol{\mathcal{A}})*{A}= \boldsymbol{\mathcal{A}}*(\boldsymbol{\mathcal{A}}*{A}){.} $$
  • Given \(B^{D} \in \mathbb {C}^{N_{1} \times M \times M}\), \({A} \in \mathbb {C}^{N_{2} \times M \times M }\), \(\boldsymbol {\mathcal {A}} \in \mathbb {C}^{N_{1}\times N_{2} \times M \times M }\), then

    $$ (B^{D} * \boldsymbol{\mathcal{A}})^{D} *A=B^{D} * (\boldsymbol{\mathcal{A}} *A). $$
  • Given \(\boldsymbol {\mathcal {A}} \in \mathbb {C}^{N_{1}\times N_{2} \times M \times M }, \boldsymbol {{\mathscr{B}}} \in \mathbb {C}^{N_{2} \times N_{3} \times M \times M }, \boldsymbol {\mathcal {C}} \in \mathbb {C}^{N_{3}\times N_{1} \times M \times M }\), then

    $$ (\boldsymbol{\mathcal{C}}*\boldsymbol{\mathcal{A}})*\boldsymbol{\mathcal{B}}=\boldsymbol{\mathcal{C}}*(\boldsymbol{\mathcal{A}}*\boldsymbol{\mathcal{B}}). $$

Having introduced the required products and their basic properties, we are ready to derive the tensor non-Hermitian Lanczos algorithm.
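Before doing so, we give a minimal NumPy sketch of the ∗-products of Definitions 1–3 and 6, together with a numerical spot-check of the first statement of Lemma 1 (the function names are ours and the data are random; this is an illustration, not library code).

```python
import numpy as np

def star(A, B):
    """*-product of a 4-mode tensor A (N1 x N2 x M x M) with either a
    4-mode tensor B (N2 x N3 x M x M) or a 3-mode hypervector B (N2 x M x M)."""
    if B.ndim == 4:                                   # Definition 1
        return np.einsum('ikab,kjbc->ijac', A, B)
    return np.einsum('ikab,kbc->iac', A, B)           # Definition 2

def star_left(Bd, A):
    """Action of a dual hypervector B^D (N1 x M x M) from the left."""
    if A.ndim == 4:                                   # (B^D * A)^D
        return np.einsum('kab,kjbc->jac', Bd, A)
    return np.einsum('kab,kbc->ac', Bd, A)            # Definition 3 (inner product)

def vec_to_hyper(a, M):
    """Definition 6: a in C^N  ->  A = a (x) I_M in C^{N x M x M}."""
    return np.einsum('i,ab->iab', a, np.eye(M))

# spot-check of associativity (first item of Lemma 1) on random data
N, M = 4, 3
rng = np.random.default_rng(0)
A4 = rng.standard_normal((N, N, M, M)) + 1j * rng.standard_normal((N, N, M, M))
V = vec_to_hyper(rng.standard_normal(N), M)
assert np.allclose(star(star(A4, A4), V), star(A4, star(A4, V)))
```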

3 The Lanczos-type process

Using the operations given in Table 2, we propose a sensible generalization of Krylov subspaces where, instead of the usual matrix-vector product, the tensor-hypervector product is used to generate the subspaces. Section 3.1 describes these tensor Krylov-type subspaces in detail and defines biorthogonal bases for them. A Lanczos-type algorithm which generates these biorthogonal bases is proposed in Section 3.2. In Section 3.3 two important properties of the classical Lanczos algorithm are generalized, namely, the tensor representation of the three-term recurrence relations for the biorthogonal bases and the matching moment property. The computational cost and storage requirements of the algorithm are discussed in Section 3.4.

3.1 Krylov-type tensor subspaces

Consider the tensor \(\boldsymbol {\mathcal {A}}\in \mathbb {C}^{N \times N \times M \times M}\). We define the polynomials of degree \(\ell\) of \(\boldsymbol {\mathcal {A}}\) as

$$ \begin{array}{@{}rcl@{}} p(\boldsymbol{\mathcal{A}}) &:= \sum\limits_{k=0}^{{\ell}} \boldsymbol{\mathcal{A}}^{k_{*}} \times \boldsymbol {\alpha}_{k} ,\\ p^{D}(\boldsymbol{\mathcal{A}}) &:= \sum\limits_{k=0}^{{\ell}} \boldsymbol {\alpha}_{k}^{H} \times \boldsymbol{\mathcal{A}}^{k_{*}} , \end{array} $$

where \(\boldsymbol {\mathcal {A}}^{k_{*}}\) stands for k ∗-multiplications of \(\boldsymbol {\mathcal {A}}\) by itself, and \(\boldsymbol {\alpha }_{k}^{H}\) is the conjugate transpose of \(\boldsymbol{\alpha}_{k}\). Given the tensors \({A}\in \mathbb {C}^{N \times M \times M}\), \(B\in \mathbb {C}^{N \times M \times M}\) we can define the Krylov-type subspaces

$$ \begin{array}{@{}rcl@{}} \mathcal{K}_{n}(\boldsymbol{\mathcal{A}},A)&:=& \{p(\boldsymbol{\mathcal{A}})*A \text{ s.t. } deg(p) \leq n-1 \},\\ \mathcal{K}_{n}^{D}(B^{D},\boldsymbol{\mathcal{A}})&:=& \{B^{D} * p^{D}(\boldsymbol{\mathcal{A}}) \text{ s.t. } deg(p^{D}) \leq n-1 \}. \end{array} $$

Every element in \(\mathcal {K}_{n}(\boldsymbol {\mathcal {A}},A)\) is a tensor in \(\mathbb {C}^{N \times M \times M}\) and can be written as

$$ p(\boldsymbol{\mathcal{A}})* A = {\sum}_{k=0}^{n-1} (\boldsymbol{\mathcal{A}}^{k_{*}}\times \boldsymbol {\alpha}_{k} ) *A. $$

From now on we will assume that A is of the form \(A=\mathbf{a}\otimes I_{M}\) for some \(\mathbf {a} \in \mathbb {C}^{N}\). In this case, the matrices αk commute with A, giving

$$ p(\boldsymbol{\mathcal{A}})* A = {\sum}_{k=0}^{n-1} (\boldsymbol{\mathcal{A}}^{k_{*}} *A)\times \boldsymbol {\alpha}_{k}. $$

An analogous result holds for \(B^{D}*p^{D}(\boldsymbol {\mathcal {A}})\) when \(B^D=\overline {\mathbf {b}} \otimes I_M\) for any \(\mathbf {b} \in \mathbb {C}^{N}\), where \(\overline {\mathbf {b}}\) denotes the entrywise conjugate of \(\mathbf{b}\).

Driven by the analogy with the matrix case, our aim is to build two “biorthonormal bases” for the Krylov-type subspaces \(\mathcal {K}_{n}(\boldsymbol {\mathcal {A}},A)\) and \( \mathcal {K}_{n}^{D}(B^{D}, \boldsymbol {\mathcal {A}})\). The following Definition 7 allows us to characterize spaces spanned by 3-mode tensors.

Definition 7

Given \(V_{1},\dots ,V_{n} \in \mathbb {C}^{N \times M \times M}\), \({W_{1}^{D}},\dots ,{W^{D}_{n}} \in \mathbb {C}^{N \times M \times M}\), we define the subspaces

$$ \begin{array}{@{}rcl@{}} \langle V_{1},\dots, V_{n} \rangle&:=& \left\{V= \sum\limits_{{k}=1}^{n} V_{{k}} \times \boldsymbol{\eta}_{{k}}, \textrm{ for } \boldsymbol{\eta}_{1},\dots, \boldsymbol{\eta}_{n} \in \mathbb{C}^{M \times M}\right\}; \\ \langle {W_{1}^{D}},\dots, {W_{n}^{D}} \rangle&:=& \left\{W^{D}= \sum\limits_{{k}=1}^{n} \boldsymbol{\eta}_{{k}} \times W^{D}_{{k}}, \textrm{ for } \boldsymbol{\eta}_{1},\dots, \boldsymbol{\eta}_{n} \in \mathbb{C}^{M \times M} \right\}. \end{array} $$

We say that \(V_{1},\dots , V_{n}\) is a basis for the subspace \(\langle V_{1},\dots , V_{n} \rangle \) and \({W_{1}^{D}},\dots , {W_{n}^{D}}\) is a basis for the subspace \(\langle {W_{1}^{D}},\dots , {W_{n}^{D}} \rangle \).

Biorthonormal bases for Krylov-type subspaces are represented by the tensors \(\boldsymbol {\mathcal {V}}_{n} \in \mathbb {C}^{N\times n \times M \times M }\) and \(\boldsymbol {\mathcal {W}}_{n}\in \mathbb {C}^{n\times N \times M \times M }\) satisfying

$$ \boldsymbol{\mathcal{W}}_{n}*\boldsymbol{\mathcal{V}}_{n}=\boldsymbol{\mathcal{I}}_{*} \in \mathbb{R}^{n \times n \times M \times M}, $$
(2)

with the hypervectors \(V_{k}:= (\boldsymbol {\mathcal {V}}_{n})_{:,k,:,:}\) and \({W^{D}_{k}}:= (\boldsymbol {\mathcal {W}}_{n})_{k,:,:,:}\), for \(k=1,\dots ,n\), forming, respectively, bases for \(\mathcal {K}_{n}(\boldsymbol {\mathcal {A}},A)\) and \( \mathcal {K}_{n}^{D}(B^{D}, \boldsymbol {\mathcal {A}})\), i.e.,

$$ \begin{array}{@{}rcl@{}} \langle V_{1},\dots, V_{n} \rangle = \mathcal{K}_{n}(\boldsymbol{\mathcal{A}},A), \quad \langle {W_{1}^{D}},\dots, {W_{n}^{D}} \rangle = \mathcal{K}_{n}^{D}(B^{D}, \boldsymbol{\mathcal{A}}). \end{array} $$

In the following section we derive such bases by constructing the tensor non-Hermitian Lanczos Algorithm.

3.2 The tensor Lanczos process

Given the inputs \(\boldsymbol {\mathcal {A}}\in \mathbb {C}^{N\times N \times M \times M}\) and \(\mathbf {v}, \mathbf {w} {\in \mathbb {C}^{N}}\), Algorithm 1 constructs, when no breakdown occurs, the bases \(\boldsymbol {\mathcal {V}}_{n}\) and \(\boldsymbol {\mathcal {W}}_{n}\), for \(\mathcal {K}_{n}(\boldsymbol {\mathcal {A}},A)\) and \( \mathcal {K}_{n}^{D}(B^{D}, \boldsymbol {\mathcal {A}})\), respectively, which satisfy the ∗-biorthogonality conditions (2).

Algorithm 1 (the tensor non-Hermitian Lanczos process; pseudocode given as a figure in the original)

Details on how the algorithm constructs these bases using three-term recurrences are described below.

  • By definition, the first hypervectors of the bases are \({W_{1}^{D}}, V_{1}\) satisfying \({W_{1}^{D}}*V_{1}=I_{M}\);

  • Consider the vector \(\widehat {W}^{D}_{2} \in \mathcal {K}^{D}_{2}(W^{D},\boldsymbol {\mathcal {A}})\) given by

    $$ \widehat{W}^{D}_{2}:={W_{1}^{D}}*\boldsymbol{\mathcal{A}}-\boldsymbol {\alpha}_{1} \times {W_{1}^{D}}. $$

    Imposing that \(\widehat {W}_{2}^{D}\) satisfies the ∗-biorthogonal condition \(\widehat {W}_{2}^{D}*V_{1}=0\), we have \(\boldsymbol {\alpha }_{1}={W_{1}^{D}} * \boldsymbol {\mathcal {A}} *V_{1}\).

  • Analogously, define the vector \(\widehat {V}_{2} \in \mathcal {K}_{2}(\boldsymbol {\mathcal {A}},V)\) given by

    $$ \widehat{V}_{2} := \boldsymbol{\mathcal{A}}*V_{1}-V_{1}\times \boldsymbol {\alpha}_{1}. $$

    Imposing the ∗-biorthogonality condition, we find the ∗-biorthogonal vectors

    $$ V_{2}:=\widehat{V}_{2}\times \boldsymbol {\beta}_{2}^{-1} \text{ where } \boldsymbol {\beta}_{2}=\widehat{W}_{2}^{D}*\widehat{V}_{2}=\widehat{W}_{2}^{D}*\boldsymbol{\mathcal{A}}*V_{1} \text{ and } W_{2}=\widehat{W}_{2}. $$
  • Clearly \(\mathcal {K}_{2}({\boldsymbol {\mathcal {A}},V})=\langle V_{1},V_{2} \rangle \) and \(\mathcal {K}_{2}^{D}(W^{D},{\boldsymbol {\mathcal {A}}})=\langle {W^{D}_{1}},{W^{D}_{2}} \rangle \).

  • Now, assume the ∗-biorthonormal bases \(V_{1},\dots ,V_{{k}}\) and \({W^{D}_{1}},\dots ,W^{D}_{{k}}\) are available. Consider the hypervector

    $$ \widehat{W}^{D}_{{k}+1}:={W}_{{k}}^{D}*\boldsymbol{\mathcal{A}}-{\sum}_{i=1}^{{k}}\boldsymbol{\boldsymbol{\eta}}_{i} \times {W}_{i}^{D}. $$

    The matrices ηi are determined by the conditions \(\widehat {W}_{{k}+1}^{D}*V_{i}=0\), for \(i~=~1,\dots ,~k\), which give

    $$ \boldsymbol{\eta}_{i}=W_{{k}}^{D}*\boldsymbol{\mathcal{A}}*V_{i}, \quad \text{ for } i=1,\dots,{k}. $$

    In particular, since \(\boldsymbol {\mathcal {A}}*V_{i} \in \mathcal {K}_{i+1}(\boldsymbol {\mathcal {A}},V )\), we get ηi = 0 for \(i=1,\dots ,{k}-2\). An analogous argument is valid for \(\widehat {V}_{k+1}\). This leads to the following three-term recurrences

    $$ \begin{array}{@{}rcl@{}} W_{k+1}^{D}={W_{k}^{D}}*\boldsymbol{\mathcal{A}}-\boldsymbol {\alpha}_{k}\times {W_{k}^{D}}-\boldsymbol {\beta}_{k}\times W_{k-1}^{D}, \end{array} $$
    (3a)
    $$ \begin{array}{@{}rcl@{}} V_{k+1}\times \boldsymbol {\beta}_{k+1}=\boldsymbol{\mathcal{A}} *V_{k}- V_{k}\times \boldsymbol {\alpha}_{k}-V_{k-1}, \end{array} $$
    (3b)

    with coefficients

    $$ \boldsymbol {\alpha}_{k}= {W_{k}^{D}}*\boldsymbol{\mathcal{A}}*V_{k}, \boldsymbol {\beta}_{k+1}=W_{k+1}^{D}*\boldsymbol{\mathcal{A}}*V_{k}. $$
    (4)
  • To prove that \(\langle V_{1}, \dots , V_{n} \rangle =\mathcal {K}_{n}(\boldsymbol {\mathcal {A}},V)\) and \(\langle {W_{1}^{D}}, \dots , {W_{n}^{D}} \rangle =\mathcal {K}_{n}^{D}(W^{D},\boldsymbol {\mathcal {A}})\), it is enough to use induction and the fact that \(V_{{k}} \in \mathcal {K}_{{k}}(\boldsymbol {\mathcal {A}},V)\) and \(W_{{k}}^{D} \in \mathcal {K}^{D}_{{k}}(W^{D},\boldsymbol {\mathcal {A}})\) for all \({k}=1,\dots ,n\).

Let us finally observe that, should \(\boldsymbol{\beta}_{k+1}\) not be invertible, we would get a breakdown in the algorithm.

Different rescaling strategies are possible by setting an invertible coefficient \(\boldsymbol{\gamma}_{k+1}\) and noticing that

$$ (\boldsymbol{\gamma}_{k+1})^{-1} \times W_{k+1}^{D} \ast V_{k+1} \times \boldsymbol{\gamma}_{k+1} = {I_{M}}. $$

This last observation completes the construction of Algorithm 1.
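Since the pseudocode of Algorithm 1 is given as a figure, the following NumPy sketch reconstructs the iteration from the recurrences (3a)–(3b) and the coefficients (4), with no rescaling (\(\boldsymbol{\gamma}_{k+1}=I_{M}\)) and no breakdown handling; it is meant as an illustration of the process, not as the authors' reference implementation.

```python
import numpy as np

def tensor_lanczos(A4, v, w, n):
    """Sketch of the tensor non-Hermitian Lanczos process (gamma_k = I_M).
    A4: (N, N, M, M) tensor; v, w: vectors in C^N with w^H v != 0.
    Returns alpha_1..alpha_n and beta_2..beta_n (all M x M matrices)."""
    M = A4.shape[2]
    IM = np.eye(M)
    txh   = lambda T, X: np.einsum('ikab,kbc->iac', T, X)    # tensor * hypervector
    dual  = lambda Xd, T: np.einsum('kab,kjbc->jac', Xd, T)  # (W^D * A)^D
    inner = lambda Xd, Y: np.einsum('kab,kbc->ac', Xd, Y)    # W^D * V

    V  = np.einsum('i,ab->iab', v / (w.conj() @ v), IM)      # V_1 (so that W_1^D * V_1 = I_M)
    Wd = np.einsum('i,ab->iab', w.conj(), IM)                # W_1^D
    V_old, Wd_old = np.zeros_like(V), np.zeros_like(Wd)
    alphas, betas = [], []
    for k in range(n):
        AV = txh(A4, V)
        alpha = inner(Wd, AV)                                           # (4): alpha_k
        V_hat = AV - np.einsum('iab,bc->iac', V, alpha) - V_old         # (3b), before scaling
        Wd_hat = dual(Wd, A4) - np.einsum('ab,ibc->iac', alpha, Wd) \
                 - (np.einsum('ab,ibc->iac', betas[-1], Wd_old) if betas else 0)  # (3a)
        beta = inner(Wd_hat, V_hat)                                     # (4): beta_{k+1}
        alphas.append(alpha)
        betas.append(beta)
        V_old, Wd_old = V, Wd
        V  = np.einsum('iab,bc->iac', V_hat, np.linalg.inv(beta))       # V_{k+1}
        Wd = Wd_hat                                                     # W_{k+1}^D
    return alphas, betas[:-1]
```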

3.3 Main properties of the tensor Lanczos algorithm

It is important to note that the coefficients in the three-term recurrences (3a)–(3b) can be represented by a sparse 4-mode tensor. To this aim, let us consider the tensor \(\boldsymbol {\mathcal {T}}_{n} \in \mathbb {C}^{n \times n \times M \times M}\) defined as

$$ (\boldsymbol{\mathcal{T}}_{n})_{i_{1},i_{2}, :, :}:= \left \{\begin{array}{lll} \boldsymbol {\alpha}_{i_{1}}, & \text{ if } i_{1}=i_{2} & \text{ and } 1 \leq i_{1} \leq n \\ \boldsymbol{\gamma}_{i_{1}}, & \text{ if } i_{2}=i_{1}+1 & \text{ and } 1 \leq i_{1} \leq n-1 \\ \boldsymbol {\beta}_{i_{1}}, & \text{ if } i_{2}=i_{1}-1 & \text{ and } 2 \leq i_{1} \leq n\\ \boldsymbol{0}, & \text{ otherwise } \end{array} \right. . $$
(5)

where \(\boldsymbol {\alpha }_{i_{1}}, \boldsymbol {\beta }_{i_{1}}, \boldsymbol {\gamma }_{i_{1}}\) are the matrices in Algorithm 1. The tensor \(\boldsymbol {\mathcal {T}}_{n}\) is a generalization of the so-called (complex) Jacobi matrix associated with the non-Hermitian Lanczos algorithm; see, e.g., [45] and references therein. By using \(\boldsymbol {\mathcal {T}}_{n}\), Theorem 1 provides a compact representation of the three-term recurrences constructing the biorthogonal bases.
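Before stating Theorem 1, we note that \(\boldsymbol{\mathcal{T}}_{n}\) is straightforward to assemble from the coefficients returned by a run of the process; the following small helper (ours, shown only to fix the indexing, under the no-rescaling convention \(\boldsymbol{\gamma}_{k}=I_{M}\)) does exactly that.

```python
import numpy as np

def assemble_Tn(alphas, betas, M):
    """Assemble the tridiagonal 4-mode tensor T_n of (5) with gamma_k = I_M:
    alphas = [alpha_1..alpha_n] on the diagonal, betas = [beta_2..beta_n] on
    the subdiagonal, identity matrices on the superdiagonal."""
    n = len(alphas)
    Tn = np.zeros((n, n, M, M), dtype=complex)
    for i in range(n):
        Tn[i, i] = alphas[i]
        if i < n - 1:
            Tn[i, i + 1] = np.eye(M)      # gamma = I_M (no rescaling)
            Tn[i + 1, i] = betas[i]       # beta_{i+2}
    return Tn
```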

Theorem 1

The three-term recurrences (3a)–(3b) can be written in the compact form

$$ \begin{array}{@{}rcl@{}} \boldsymbol{\mathcal{A}}*\boldsymbol{\mathcal{V}}_{n}=\boldsymbol{\mathcal{V}}_{n}*\boldsymbol{\mathcal{T}}_{n}+ \widetilde{\boldsymbol{\mathcal{V}}}_{n} \end{array} $$
(6a)
$$ \begin{array}{@{}rcl@{}} \boldsymbol{\mathcal{W}}_{n}*\boldsymbol{\mathcal{A}}=\boldsymbol{\mathcal{T}}_{n}*\boldsymbol{\mathcal{W}}_{n}+\widetilde{\boldsymbol{\mathcal{W}}}_{n} \end{array} $$
(6b)

where \( \widetilde {\boldsymbol {\mathcal {V}}}_{n} \in \mathbb {C}^{N \times n \times M \times M}\) is

$$ (\widetilde{\boldsymbol{\mathcal{V}}}_{n})_{:,k,:,:} := \left \{\begin{array}{ll} V_{n+1} \times \boldsymbol {\beta}_{n+1}, & \text{ if } k=n \\ \boldsymbol{0}, & \text{ otherwise } \end{array} \right., $$

and \(\widetilde {\boldsymbol {\mathcal {W}}}_{n} \in \mathbb {C}^{n \times N \times M \times M}\) is

$$ (\widetilde{\boldsymbol{\mathcal{W}}}_{n})_{k,:,:,:} := \left \{\begin{array}{ll} \boldsymbol{\gamma}_{n+1} \times W^{D}_{n+1}, & \text{ if } k=n \\ \boldsymbol{0}, & \text{ otherwise } \end{array} \right. . $$

Proof

By direct inspection. We have, for all \(i_{1} \in \{1,\dots ,N \}\), \(i_{2} \in \{1,\dots ,n-1\}\)

$$ (\boldsymbol{\mathcal{A}}*\boldsymbol{\mathcal{V}}_{n})_{i_{1},i_{2},:,:}={\sum}_{k=1}^{N}\boldsymbol{\mathcal{A}}_{i_{1},k,:,:}(\boldsymbol{\mathcal{V}}_{n})_{k,i_{2},:,:}={\sum}_{k=1}^{N}\boldsymbol{\mathcal{A}}_{i_{1},k,:,:}({V}_{i_{2}})_{k,:,:}=(\boldsymbol{\mathcal{A}}*{V}_{i_{2}})_{i_{1},:,:} $$
(7)

and

$$ \begin{array}{llllll} (\boldsymbol{\mathcal{V}}_{n}*\boldsymbol{\mathcal{T}}_{n}+ \widetilde{\boldsymbol{\mathcal{V}}}_{n})_{i_{1},i_{2},:,:}&= \sum\limits_{k=1}^{n}(\boldsymbol{\mathcal{V}}_{n})_{i_{1},k,:,:}(\boldsymbol{\mathcal{T}}_{n})_{k,i_{2},:,:} \\ & = (V_{i_{2}})_{i_{1}}\boldsymbol {\alpha}_{i_{2}}+(V_{i_{2}+1})_{i_{1}} \boldsymbol {\beta}_{i_{2}+1}+(V_{i_{2}-1})_{i_{1}} \\ & = (V_{i_{2}}\times \boldsymbol {\alpha}_{i_{2}}+V_{i_{2}+1}\times \boldsymbol {\beta}_{i_{2}+1}+V_{i_{2}-1})_{i_{1}} . \end{array} $$
(8)

The equality between (7) and (8) follows using (3b) and proves (6a). The remaining part of the theorem can be proved analogously. □

If we ∗-multiply (6a) by \(\boldsymbol {\mathcal {W}_{n}}\) from the left and use the ∗-biorthogonality (2), noting that \(\boldsymbol{\mathcal{W}}_{n}*\widetilde{\boldsymbol{\mathcal{V}}}_{n}=\boldsymbol{0}\), we obtain the expression

$$ \boldsymbol{\mathcal{T}}_{n}=\boldsymbol{\mathcal{W}}_{n}* \boldsymbol{\mathcal{A}}*\boldsymbol{\mathcal{V}}_{n}. $$

The tensor \(\boldsymbol {\mathcal {T}}_{n}\) satisfies a generalization of the matching moment property which is stated in Theorem 2.

Theorem 2 (Matching Moment Property)

Let \(\boldsymbol {\mathcal {A}}, V, W\) and \(\boldsymbol {\mathcal {T}}_{n}\) be as described above, then

$$ W^{D} * (\boldsymbol{\mathcal{A}}^{k_{*}}) * V = {E_{1}^{D}} * (\boldsymbol{\mathcal{T}}_{n})^{k_{*}}* E_{1}, \quad \text{ for } \quad k=0,\dots, 2n-1, $$

where \(E_{1} = \mathbf{e}_{1}\otimes I_{M}\) and \(\mathbf{e}_{1}\) is the first vector of the canonical basis of \(\mathbb {C}^{n}\).

The proof of Theorem 2 can be found in Appendix A.
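As an illustration, the matching moment property can be checked numerically with the helpers sketched above (a sanity check under our no-rescaling convention, not a substitute for the proof). Here Wd1 and V1 denote the normalized starting hypervectors \(W_{1}^{D}\) and \(V_{1}\).

```python
import numpy as np

def moment(T, Xd, X, k):
    """Compute X^D * (T^{k_*}) * X for a 4-mode tensor T and hypervectors X^D, X."""
    Y = X
    for _ in range(k):
        Y = np.einsum('ikab,kbc->iac', T, Y)      # T * (T^{j_*} * X)
    return np.einsum('kab,kbc->ac', Xd, Y)        # X^D * (...)

def check_matching_moments(A4, Tn, Wd1, V1, n):
    """Compare W^D * A^{k_*} * V with E_1^D * T_n^{k_*} * E_1 for k = 0..2n-1."""
    M = A4.shape[2]
    E1 = np.einsum('i,ab->iab', np.eye(Tn.shape[0])[:, 0], np.eye(M))   # e_1 (x) I_M
    for k in range(2 * n):
        lhs = moment(A4, Wd1, V1, k)
        rhs = moment(Tn, E1, E1, k)
        print(k, np.linalg.norm(lhs - rhs) / max(np.linalg.norm(lhs), 1e-300))
```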

3.4 Numerical properties

The tensor \(\boldsymbol {\mathcal {A}}\) is obtained by discretizing A(t) and stores in \(\boldsymbol {\mathcal {A}}_{k,l,:,:}\) the coefficients representing the (k,l)-th element of A(t). Different methods of discretization are possible. In this paper, following [23], we discretize the interval I obtaining the mesh

$$ \tau_{i} = h (i-1) + a, \quad i = 1,\dots,M, \quad h = \frac{b-a}{M-1}. $$
(9)

For this mesh the discretization of \(\boldsymbol {\mathsf {A}}(t) = \left [\boldsymbol {\mathsf {A}}_{k,\ell }(t) \right ]_{k,\ell }^{N}\) is the tensor

$$ \boldsymbol{\mathcal{A}}_{k,\ell,:,:} := \boldsymbol{\nu}_{k,\ell}, \quad k,\ell =1,\dots,N, $$
(10)

where the matrices \(\boldsymbol {\nu }_{k,\ell }\in \mathbb {C}^{M \times M}\) are lower triangular matrices defined as

$$ \left( \boldsymbol{\nu}_{k,\ell} \right)_{i,j} = \left\{\begin{array}{ll} \boldsymbol{\mathsf{A}}_{k,\ell}(\tau_{i}) h, & i \geq j \\ 0 & i < j \end{array}\right . . $$

This discretization scheme has an accuracy of order \(\mathcal {O}(h) = \mathcal {O}(1/M)\) and, indeed, in Section 5, we show that when this discretization scheme is used, the approximation of the bilinear form of interest also has an accuracy of \(\mathcal {O}(1/M)\).
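A small sketch of this discretization, building the 4-mode tensor of (10) on the mesh (9) from a function handle returning \(\boldsymbol{\mathsf{A}}(t)\) (the helper name is ours):

```python
import numpy as np

def discretize(A_fun, a, b, M):
    """Build the 4-mode tensor (10) on the mesh (9) from A_fun(t), which
    returns the N x N matrix A(t); first-order accurate, O(h) = O(1/M)."""
    h = (b - a) / (M - 1)
    tau = a + h * np.arange(M)                      # tau_i = a + h (i - 1)
    samples = np.stack([A_fun(t) for t in tau])     # shape (M, N, N)
    lower = np.tril(np.ones((M, M)))                # 1 if i >= j, else 0
    # (nu_{k,l})_{i,j} = A_{k,l}(tau_i) * h  for i >= j, and 0 otherwise
    A4 = h * np.einsum('ikl,ij->klij', samples, lower)
    return A4, tau
```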

The computational cost of Algorithm 1 depends on the chosen number of discretization points M and on the number of iterations n. In this algorithm the dominant cost is the multiplication of a 4-mode tensor with a 3-mode tensor, i.e., \(\boldsymbol {\mathcal {A}}\ast V_{k}\) and \({W_{k}^{D}}\ast \boldsymbol {\mathcal {A}}\). The worst case complexity of one such product is \(\mathcal {O}(M^{3} N^{2})\), for a total cost of \(\mathcal {O}(2 n M^{3} N^{2})\). However, since A(t) is sparse in all practical applications, the computational cost can be much lower. For example, if there are Nnz < N nonzeros in each column of A(t), then the cost reduces to \(\mathcal {O}(2 n M^{3} N N_{\text {nz}} )\). It is important to note, moreover, that the term M3 arises from the matrix-matrix multiplication between Vk and the blocks in \(\boldsymbol {\mathcal {A}}\). Since these blocks arise from a discretization strategy, it is likely that they will exhibit a particular structure that can be exploited for efficient computations. E.g., in the discretization used in this work, these blocks are lower triangular matrices for which the matrix-matrix multiplication has a cost of \(\frac {M^{3}}{2}\).

Finally, the storage cost of Algorithm 1 amounts to three basis hypervectors Vi, three basis hypervectors \(W^{D}_{i}\), and the 3n − 1 nonzero blocks of \(\boldsymbol {\mathcal {T}_{n}}\), for a total of \(\mathcal {O}(6M^{2} N+ 3M^{2} n)\). Only three basis hypervectors per sequence must be kept in memory thanks to the underlying three-term recurrence relation.

Let us conclude this section by observing that, as highlighted by the previous discussion, both the computational cost and the storage requirements depend strongly on the number M of discretization points used. For the discretization scheme described above, we expect that a large number of discretization points is required since its accuracy is \(\mathcal {O}(1/M)\). This justifies the search for more accurate discretization schemes, for example Legendre polynomial approximation. However, other discretization schemes will not be discussed here since they are the subject of future research and since the discretization scheme introduced above suffices to illustrate the potential of Algorithm 1.

4 Breakdowns

If the matrix \(\boldsymbol{\beta}_{k+1}\) is singular, then line 11 in Algorithm 1 cannot be performed and the algorithm breaks down. This breakdown issue is inherited from the (usual) non-Hermitian Lanczos algorithm; see, e.g., [19, 27, 28, 43, 52]. There are two different kinds of breakdowns. The first one, the so-called lucky breakdown, occurs when one of the Krylov-type subspaces \(\mathcal {K}_{k}(\boldsymbol {\mathcal {A}},A)\) or \(\mathcal {K}_{k}^{D}(B^{D},\boldsymbol {\mathcal {A}})\) becomes invariant under ∗-multiplication with \(\boldsymbol{\mathcal {A}}\) from the left or right, respectively. Suppose that \(\mathcal {K}_{k}(\boldsymbol {\mathcal {A}},A)\) is an invariant subspace; this results, in exact arithmetic, in \(\widehat {V}_{k+1} = \boldsymbol {0}\) in Line 5 of Algorithm 1. In finite precision, \(\widehat {V}_{k+1}\in \mathbb {C}^{N\times M \times M}\) will never be exactly zero. Therefore, the Frobenius norm

$$ \Vert \widehat{V}_{k+1}\Vert_{F} := \left( {\sum}_{i=1}^{N} {\sum}_{j=1}^{M} {\sum}_{{\ell=1}}^{M} \left\vert\left( \widehat{V}_{k+1}\right)_{i,j,{\ell}}\right\vert^{2} \right)^{1/2} $$

is used to define the following criterion to detect a lucky breakdown:

$$ \frac{\Vert \widehat{V}_{k+1}\Vert_{F}}{\Vert{V}_{k}\Vert_{F}} < \epsilon, $$

with \(\epsilon \ll 1\) a user-defined threshold close to machine precision. The same applies to the case \(\widehat {W}_{k+1}^{D} = \boldsymbol {0}\). The second kind of breakdown occurs when both \(\widehat {V}_{k+1} \neq \boldsymbol {0}\) and \(\widehat {W}_{k+1}^{D} \neq \boldsymbol {0}\), but \(\boldsymbol {\beta }_{k+1}\in \mathbb {C}^{M\times M}\) is still singular; then the algorithm breaks down. This case is known as a serious breakdown. In numerical computations, the condition number of \(\boldsymbol{\beta}_{k+1}\) is monitored to decide if Line 11 can be computed sufficiently accurately. A user-defined threshold \(\epsilon_{s} \gg 1\) specifies an upper bound on the allowed condition number of \(\boldsymbol{\beta}_{k+1}\). That is, if the ratio of its largest and smallest singular values exceeds \(\epsilon_{s}\), i.e., \(\sigma _{\max \limits }(\boldsymbol {\beta }_{k+1})/\sigma _{\min \limits }(\boldsymbol {\beta }_{k+1}) > \epsilon _{s}\), then the algorithm breaks down. Note that the choice of \(\boldsymbol{\gamma}_{k+1}\) will influence the condition number of \(\boldsymbol{\beta}_{k+1}\). A sketch of both tests is given below.
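The following is a minimal sketch of the two tests; the threshold values are illustrative user choices, not values prescribed by the paper.

```python
import numpy as np

def detect_breakdown(V_hat, V_prev, beta, eps=1e-14, eps_s=1e12):
    """Return 'lucky' if the unnormalized hypervector V_hat is negligible
    relative to V_prev, 'serious' if beta_{k+1} is numerically singular,
    and None otherwise."""
    if np.linalg.norm(V_hat) < eps * np.linalg.norm(V_prev):   # Frobenius norms
        return 'lucky'
    s = np.linalg.svd(beta, compute_uv=False)
    if s[0] / max(s[-1], np.finfo(float).tiny) > eps_s:        # condition number test
        return 'serious'
    return None
```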

In the usual non-Hermitian Lanczos algorithm, a serious breakdown can be treated by using a so-called look-ahead strategy; see, e.g., [8, 9, 19, 27, 28, 43, 51]. Connections between serious breakdowns, (formal) orthogonal polynomials, and the matching moment property can be found in [16, 44]. If needed, an analogous look-ahead strategy may be implemented for the tensor Lanczos algorithm. At the moment, an easier strategy to deal with serious breakdowns is to reformulate the problem so as to change the input vectors v,w. For instance, when w = ei and v = ej, a serious breakdown is likely to happen due to the sparsity of \(\boldsymbol {\mathcal {A}}\). However, we can rewrite the bilinear form of the time-ordered exponential U(t) as

$$\boldsymbol{e}_{i}^{H}\boldsymbol{\mathsf{U}}(t)\boldsymbol{e}_{j} = (\boldsymbol{e} + \boldsymbol{e}_{i})^{H} \boldsymbol{\mathsf{U}}(t) \boldsymbol{e}_{j} - \boldsymbol{e}^{H} \boldsymbol{\mathsf{U}}(t) \boldsymbol{e}_{j}, $$

with \(\boldsymbol {e} = (1, \dots , 1)^{H}\). Then one can approximate \((\boldsymbol {e} + \boldsymbol {e}_{i})^{H} \boldsymbol {\mathsf {U}}(t) \boldsymbol {e}_{j}\) and \(\boldsymbol {e}^{H} \boldsymbol {\mathsf {U}}(t) \boldsymbol {e}_{j}\) separately, which are less likely to suffer a breakdown thanks to the fact that e is a full vector; see, e.g., [25, Section 7.3] and [23].

5 Numerical examples

Let us consider the following smooth matrix-valued function defined on a real interval I = [a,b]:

$$ \boldsymbol{\mathsf{A}}(t): I \subset \mathbb{R} \rightarrow \mathbb{C}^{N \times N}. $$

As anticipated in the Introduction, the time-ordered exponential of A(t) is the unique matrix-valued function \(\boldsymbol {\mathsf {U}}(t) \in \mathbb {C}^{N\times N}\) defined on I that is the solution of the system of linear ordinary differential equations

$$ \frac{d}{dt}\boldsymbol{\mathsf{U}}(t) = \boldsymbol{\mathsf{A}}(t) \boldsymbol{\mathsf{U}}(t), \quad \boldsymbol{\mathsf{U}}(a)=I_{N}, \quad t \in I, $$

see [17]. In this section, we aim to approximate the bilinear form \(\mathbf{w}^{H}\boldsymbol{\mathsf{U}}(t)\mathbf{v}\) by using the tensor non-Hermitian Lanczos algorithm. If the matrix-valued function \(\boldsymbol{\mathsf{A}}(t)\) is such that \(\boldsymbol{\mathsf{A}}(\tau_{1})\boldsymbol{\mathsf{A}}(\tau_{2}) - \boldsymbol{\mathsf{A}}(\tau_{2})\boldsymbol{\mathsf{A}}(\tau_{1}) = 0\) for all \(\tau_{1},\tau_{2}\in I\), then \(\boldsymbol {\mathsf {U}}(t)=\exp \left ({{\int \limits }_{a}^{t}} \boldsymbol {\mathsf {A}}(\tau ) \mathrm {d}\tau \right ).\) Unfortunately, \(\boldsymbol{\mathsf{U}}(t)\), and the related bilinear forms, cannot be expressed by an analogously simple formula in the general case. Indeed, even for small matrices, \(\boldsymbol{\mathsf{U}}(t)\) may be given by complicated special functions [32, 53].
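In the commuting case just described, a reference solution is cheap to obtain; the sketch below (our own helper, used only for comparison purposes) evaluates \(\boldsymbol{\mathsf{U}}(t)=\exp(\int_{a}^{t}\boldsymbol{\mathsf{A}}(\tau)\,\mathrm{d}\tau)\) entrywise with a quadrature followed by a matrix exponential.

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import quad

def U_commuting(A_fun, a, t):
    """U(t) = exp(int_a^t A(tau) dtau); valid only when A(t) commutes with
    itself at all times (e.g., constant or diagonal A(t))."""
    N = A_fun(a).shape[0]
    integral = np.empty((N, N), dtype=complex)
    for k in range(N):
        for l in range(N):
            re = quad(lambda s: A_fun(s)[k, l].real, a, t)[0]
            im = quad(lambda s: A_fun(s)[k, l].imag, a, t)[0]
            integral[k, l] = re + 1j * im
    return expm(integral)
```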

A new approach for the approximation of a time-ordered exponential bilinear form was introduced in [22,23,24] and it is based on ⋆-Lanczos, which is a symbolic algorithm. This method is able to approximate the bilinear form

$$\mathbf{w}^{H} \boldsymbol{\mathsf{U}}(t) \mathbf{v}, \quad t \in I$$

for the given vectors w,v, with \(\mathbf{w}^{H}\mathbf{v}\neq 0\). The matrices \(\boldsymbol {\alpha }_{1},\dots ,\boldsymbol {\alpha }_{n}\), \(\boldsymbol {\beta }_{2},\dots ,\boldsymbol {\beta }_{n}\), and \(\boldsymbol {\gamma }_{2},\dots ,\boldsymbol {\gamma }_{n}\), which compose the 4-mode tensor \(\boldsymbol {\mathcal {T}}_{n}\) in (5), are obtained by running n iterations of Algorithm 1 with, as inputs, the 4-mode tensor \(\boldsymbol {\mathcal {A}}\) in (10) and the vectors v,w.

Sampling the true solution \(\mathbf{w}^{H}\boldsymbol{\mathsf{U}}(t)\mathbf{v}\) on the discretization nodes τi gives the vector \(\hat {\mathbf {s}}\) defined as

$$ {\hat{\mathbf{s}}:= \begin{bmatrix} \mathbf{w}^{H} \boldsymbol{\mathsf{U}}(\tau_{1}) \mathbf{v} & \mathbf{w}^{H} \boldsymbol{\mathsf{U}}(\tau_{2}) \mathbf{v} & {\dots} & \mathbf{w}^{H} \boldsymbol{\mathsf{U}}(\tau_{M}) \mathbf{v} \end{bmatrix}^{\top}.} $$

Exploiting the results described in [23], the sampled solution vector \(\hat {\mathbf {s}}\) can be approximated by

$$ {\mathbf{s}_{n}} := \frac{1}{h}\left( \boldsymbol{\theta} \times \left( R_{\ast}(\boldsymbol{\mathcal{T}}_{n})\right)_{1,1,:,:}\right) \mathbf{e}_{1} \approx \hat{\mathbf{s}}, $$
(11)

where \(R_{\ast}\) is the ∗-resolvent, i.e., the tensor

$$ R_{\ast}(\boldsymbol{\mathcal{T}}_{n}) := \boldsymbol{\mathcal{I}}_{\ast} + {\sum}_{k=1}^{\infty} \left( \boldsymbol{\mathcal{T}}_{n} \right)^{k_{\ast}}, $$

and

$$ \boldsymbol{\theta} := h \begin{bmatrix} 1 & 0& 0 &{\dots} & 0\\ 1 & 1 & 0 & {\dots} & 0\\ {\vdots} & {\vdots} & {\ddots} & {\ddots} & \vdots\\ 1 & 1 & {\dots} & 1 & 0\\ 1 & 1 & {\dots} & 1 & 1 \end{bmatrix} \in \mathbb{C}^{M\times M}. $$

Overall, the accuracy of the approximation in (11) cannot be better than \(\mathcal {O}(h)\). This is due to the fact that, as explained in [23], the discretization (10) is based on a rectangular quadrature rule. Finally, using the Path-sum method [21] we get the following explicit expression for the ∗-resolvent in terms of a continued fraction

$$ \begin{array}{llllll} &R_{\ast}(\boldsymbol{\mathcal{T}_{n}})_{1,1,:,:} = \\ & \left( \widetilde{\boldsymbol {\alpha}}_{1} - \boldsymbol {\beta}_{2} \left( \widetilde{\boldsymbol {\alpha}}_{2} - \boldsymbol {\beta}_{3} \left( {\cdots} \boldsymbol {\beta}_{n-1} \widetilde{\boldsymbol {\alpha}}_{n}^{-1} \boldsymbol{\gamma}_{n-1} {\cdots} \right)^{-1} \boldsymbol{\gamma}_{3} \right)^{-1} \boldsymbol{\gamma}_{2} \right)^{-1}, \end{array} $$
(12)

with \(\widetilde {\boldsymbol {\alpha }}_{i} = I_{{M}} - \boldsymbol {\alpha }_{i}\). Equation (12) is computed from the innermost term moving outward, where the inversion operation is performed using the backslash operator in Matlab®. Note that the ∗-resolvent and all inverses appearing in (12) are expected to exist for h small enough, since their continuous counterparts exist under certain regularity conditions on A(t); see [22, 24].

The rest of the section is structured as follows. Section 5.1 describes the measures that will be used to quantify the errors of the final solution and of the computed biorthonormal bases for Krylov subspaces. In Section 5.2 two examples are discussed for which an analytical solution is available. This allows us to compare the approximation to an exact solution and to show that it converges with the expected rate of convergence. Small-scale examples from NMR spectroscopy are discussed in Section 5.3. Finally, in Section 5.4, we analyze the approximability of the previously considered tensors by the Tensor Train representation.

5.1 Error measures

In this section we define a series of error measures which quantify the quality of the generated biorthogonal bases and the accuracy of the approximation (11). These measures use the Frobenius norm, which, for a 4-mode tensor, is defined as

$$ \Vert \boldsymbol{\mathcal{A}}\Vert_{F} := \left( {\sum}_{i=1}^{N} {\sum}_{j=1}^{N} {\sum}_{k=1}^{M} {\sum}_{l=1}^{M} \left\vert\left( \boldsymbol{\mathcal{A}}\right)_{i,j,k,l}\right\vert^{2} \right)^{1/2}. $$

The main goal is to analyze the rate of convergence as the number of discretization points M is increased. To stress the dependence on M of computed quantities we use the superscript “(M)”.

Generalizations of the usual error measures for Krylov subspace methods are used. As a measure for the biorthonormality of the bases \(\boldsymbol {\mathcal {V}}_{n} \in \mathbb {C}^{N\times n \times M \times M }\) and \(\boldsymbol {\mathcal {W}}_{n}\in \mathbb {C}^{n\times N \times M \times M }\) generated by n steps of the algorithm, we use

$$ \text{err}_{\mathrm{o}} := \frac{\Vert \boldsymbol{\mathcal{W}}_{n}^{(M)}*\boldsymbol{\mathcal{V}}_{n}^{(M)} - \boldsymbol{\mathcal{I}}_{*}\Vert_{F}}{\max (\Vert\boldsymbol{\mathcal{V}}_{n}^{(M)} \Vert_{F},\Vert\boldsymbol{\mathcal{W}}_{n}^{(M)} \Vert_{F})}. $$

For a robust algorithm, it is paramount that the term \({\max \limits } (\Vert \boldsymbol {\mathcal {V}}_{n}^{(M)} \Vert _{F},\Vert \boldsymbol {\mathcal {W}}_{n}^{(M)} \Vert _{F})\) remains small as n increases. This can be obtained by employing an appropriate strategy to rescale the basis hypervectors in \(\boldsymbol {\mathcal {V}}_{n}^{(M)}\) and \(\boldsymbol {\mathcal {W}}_{n}^{(M)}\), i.e., by choosing \(\boldsymbol{\gamma}_{k+1}\) in Algorithm 1. In this section we choose \(\boldsymbol{\gamma}_{k+1} = I_{M}\) for all k, i.e., no rescaling. An effective rescaling strategy can improve on the numerical results reported below, but developing such a strategy is the subject of future research.

To measure the quality of the recurrences (6a)–(6b), we use

$$ \begin{array}{@{}rcl@{}} \text{err}_{\mathrm{V}} := \frac{\Vert \boldsymbol{\mathcal{A}}*\boldsymbol{\mathcal{V}}_{n}^{(M)} - \boldsymbol{\mathcal{V}}_{n}^{(M)}*\boldsymbol{\mathcal{T}}_{n}^{(M)} - \widetilde{\boldsymbol{\mathcal{V}}}_{n}^{(M)}\Vert_{F}}{\max(\Vert \boldsymbol{\mathcal{A}}*\boldsymbol{\mathcal{V}}_{n}^{(M)} \Vert_{F}, \Vert\boldsymbol{\mathcal{V}}_{n}^{(M)}*\boldsymbol{\mathcal{T}}_{n}^{(M)} + \widetilde{\boldsymbol{\mathcal{V}}}_{n}^{(M)}\Vert_{F})},\\ \text{err}_{\mathrm{W}} :=\frac{\Vert \boldsymbol{\mathcal{W}}_{n}^{(M)}*\boldsymbol{\mathcal{A}} - \boldsymbol{\mathcal{T}}_{n}^{(M)}*\boldsymbol{\mathcal{W}}_{n}^{(M)} - \widetilde{\boldsymbol{\mathcal{W}}}_{n}^{(M)}\Vert_{F}}{\max(\Vert \boldsymbol{\mathcal{W}}_{n}^{(M)}*\boldsymbol{\mathcal{A}} \Vert_{F}, \Vert \boldsymbol{\mathcal{T}}_{n}^{(M)}*\boldsymbol{\mathcal{W}}_{n}^{(M)} + \widetilde{\boldsymbol{\mathcal{W}}}_{n}^{(M)}\Vert_{F})}. \end{array} $$

As a measure for the Matching Moment Property, see Theorem 2, we use

$$ \text{err}_{\mathrm{M}}(k) := \frac{\Vert W^{D} * (\boldsymbol{\mathcal{A}}^{k_{*}}) * V - {E_{1}^{D}} * (\boldsymbol{\mathcal{T}}_{n}^{(M)})^{k_{*}}* E_{1} \Vert_{F}}{\max(\Vert W^{D} * (\boldsymbol{\mathcal{A}}^{k_{*}}) * V \Vert_{F}, \Vert {E_{1}^{D}} * (\boldsymbol{\mathcal{T}}_{n}^{(M)})^{k_{*}}* E_{1}\Vert_{F}) }, $$

which should be close to zero for \(k=0,\dots , 2n-1\).

Finally, to quantify the quality of the solution, we consider as error measure for (11) the quantity

$$ \text{err}_{\text{sol}} := \frac{\Vert \hat{\mathbf{s}}-\mathbf{s}^{(M)}_{n} \Vert_{2}}{\Vert \hat{\mathbf{s}} \Vert_{2}}, $$

where, if no analytic expression is available, an approximation of \(\hat{\mathbf{s}}\) is obtained by using ode45 in Matlab®. In the formula above, ∥⋅∥2 stands for the usual Euclidean norm. The rate at which errsol decreases as M increases is expected to be \(\mathcal {O}(h)= \mathcal {O}(1/M)\), i.e., the accuracy of the discretization used here.
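When no analytic expression for \(\hat{\mathbf{s}}\) is available, a reference can be obtained by integrating (1) numerically; the following sketch uses SciPy's solve_ivp in place of ode45 (an assumption on our part, since the paper works in Matlab).

```python
import numpy as np
from scipy.integrate import solve_ivp

def err_sol(A_fun, w, v, a, tau, s_n):
    """Relative error between the approximation s_n of (11) and the samples
    w^H U(tau_i) v obtained by integrating dU/dt = A(t) U numerically."""
    N = len(v)
    rhs = lambda t, u: (A_fun(t) @ u.reshape(N, N)).ravel()
    sol = solve_ivp(rhs, (a, tau[-1]), np.eye(N, dtype=complex).ravel(),
                    t_eval=tau, rtol=1e-10, atol=1e-12)
    s_hat = np.array([w.conj() @ sol.y[:, i].reshape(N, N) @ v
                      for i in range(len(tau))])
    return np.linalg.norm(s_hat - s_n) / np.linalg.norm(s_hat)
```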

5.2 Proof of concept

As a proof of concept, we test our proposal on two problems which originally appeared in [23]. In both experiments a discretization with M points is used and we run n iterations of Algorithm 1 with \(\boldsymbol{\gamma}_{k+1}=I_{M}\). This produces the tensor \(\boldsymbol {\mathcal {T}}_{n}^{(M)}\) defined in (5) with coefficients \(\boldsymbol {\alpha _{1}}^{(M)},\dots , \boldsymbol {\alpha _{n}}^{(M)}\) and \(\boldsymbol {\beta _{2}}^{(M)},\dots , \boldsymbol {\beta _{n}}^{(M)}\), depending on M. For the two experiments considered here the result of the ⋆-Lanczos algorithm [23] is known. The coefficients resulting from the latter algorithm are bivariate functions \(\alpha _{1}(t,s),\dots ,\alpha _{n}(t,s)\) and \(\beta _{2}(t,s),\dots ,\beta _{n}(t,s)\), because ⋆-Lanczos is a symbolic algorithm. The tensor Lanczos algorithm is a discretization of the ⋆-Lanczos algorithm, which means that \(\boldsymbol {\alpha _{i}}^{(M)}\) and \(\boldsymbol {\beta _{i}}^{(M)}\) can be seen as discretizations of the functions αi(t,s) and βi(t,s), respectively. Consider the evaluation of these functions on the mesh τi:

$$ \boldsymbol{ \hat{\alpha}}_{i} := \begin{bmatrix} \alpha_{i}(\tau_{k},\tau_{\ell}) \end{bmatrix}_{k,\ell=1}^{M}, \qquad \boldsymbol{ \hat{\beta}}_{i} := \begin{bmatrix} \beta_{i}(\tau_{k},\tau_{\ell}) \end{bmatrix}_{k,\ell=1}^{M}, $$

then, we can define the errors \(\frac {\Vert \boldsymbol {\hat {\alpha }}_{i} - \boldsymbol {\alpha }_{i}^{(M)}\Vert _{2}}{\Vert \boldsymbol {\hat {\alpha }}_{i} \Vert _{2}}\), \(i=1,\dots ,n\), and \(\frac {\Vert \boldsymbol {\hat {\beta }}_{i} - \boldsymbol {\beta }_{i}^{(M)}\Vert _{2}}{\Vert \boldsymbol { \hat {\beta }}_{i} \Vert _{2}}\), \(i=2,\dots ,n\). These errors will be used as a measure for the accuracy of the computed tensor \(\boldsymbol {\mathcal {T}}_{n}^{(M)}\). The number of iterations n is chosen equal to the problem size N, which allows us to compare all the available functions \(\alpha _{1}(t,s),\dots , \alpha _{N}(t,s)\) and \(\beta _{2}(t,s),\dots , \beta _{N}(t,s)\) with the elements in \(\boldsymbol {\mathcal {T}}_{N}^{(M)}\) and track the convergence rate with M of the latter.

5.2.1 Time-independent matrix

Consider a constant matrix and starting vectors

$$ {\boldsymbol{\mathsf{A}}}(t) = \begin{bmatrix} -1 & \phantom{0.}1 & \phantom{0.}1\\ \phantom{0.}1 & \phantom{0.}0 & \phantom{0.}1\\ \phantom{0.}1 & \phantom{0.}1 & -1 \end{bmatrix},\qquad \mathbf{v},\mathbf{w} = \begin{bmatrix} 1\\ 0\\ 0 \end{bmatrix}, $$

and the interval I = [0,1]. The inputs of Algorithm 1 are the starting hypervectors \(\mathbf{v}\otimes I_{M}, \mathbf{w}\otimes I_{M}\) and the tensor \(\boldsymbol {\mathcal {A}}\) whose components \(\boldsymbol {\mathcal {A}}_{i_{1},i_{2},:,:}\) are defined as

$$ \boldsymbol{\mathcal{A}}_{{i_{1},i_{2}},:,:} = \begin{cases} -\boldsymbol{\theta}, &\text{ if } {i_{1}=i_{2}} = 1 \text{ or } {i_{1}=i_{2}}=3\\ \boldsymbol{0}, &\text{ if } {i_{1}=i_{2}}=2 \\ \boldsymbol{\theta}, &\text{ otherwise} \end{cases}, $$

where \(\boldsymbol {0}\in \mathbb {R}^{M\times M}\) is the null matrix. The tensor \(\boldsymbol {\mathcal {A}}\) is obtained by sampling the matrix-valued function A(t) on the M point mesh (9) and following the definition in (10). The output for n = N = 3 iterations is \(\boldsymbol {\mathcal {T}}_{N}^{(M)}\). Table 3 reports the Krylov error measures; the recurrence measures behave as expected. The loss of biorthogonality observed for increasing values of M is presumably due to the fact that no rescaling is used in the algorithm.

Table 3 Krylov error measures for time-independent matrix

On the other hand, this loss of biorthogonality does not compromise the moment matching capabilities of \(\boldsymbol {\mathcal {T}}_{N}^{(M)}\), as it becomes evident from Table 4 where we report errM(k) for k ≤ 5 = 2n − 1.

Table 4 Measure for moment matching errM(k) for time-independent matrix. Entries for k = 0,1,2 are omitted since they are equal to zero

Moreover, as the values reported in Table 5 confirm, we can observe that the elements βi converge at the expected rate of \(\mathcal {O}(1/M)\). The elements αi are very accurate for M = 10 whereas for larger M, the accuracy of αi decreases: this decrease is, presumably, the result of error propagation in the numerical algorithm. This error is still smaller than the expected order of \(\mathcal {O}(1/M)\).

Table 5 Error on the elements of the tridiagonal tensor for time-independent matrix

Moreover, in this particular case, the analytical solution to the ODE is known; see [23]:

$$ \hat{\mathbf{s}} = \begin{bmatrix} \left( \exp(A\tau_{1}) \right)_{11} & \left( \exp(A\tau_{2}) \right)_{11} & {\dots} & \left( \exp(A\tau_{M}) \right)_{11} \end{bmatrix}^{\top}, $$

with \(\left (\exp (At) \right )_{11} = -\frac {1}{2} \sinh (2t) + \frac {1}{2}\cosh (2t) + \frac {1}{2} \cosh (\sqrt {2}t)\). Hence, it is possible to compare this exact solution with (11). Note that for n = 3 the ∗-resolvent is given by the continued fraction:

$$ R_{\ast}(\boldsymbol{\mathcal{T}_{3}})_{1,1,:,:} = \left( I_{M} - \boldsymbol {\alpha}_{1} - \left( I_{M} - \boldsymbol {\alpha}_{2} - \left( I_{M} - \boldsymbol {\alpha}_{3}\right)^{-1} \boldsymbol {\beta}_{3} \right)^{-1} \boldsymbol {\beta}_{2} \right)^{-1}. $$
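A small sketch of how (11) can be evaluated from such a continued fraction, generalizing the n = 3 pattern above to arbitrary n (with \(\boldsymbol{\gamma}_{k}=I_{M}\), as in these experiments), is the following.

```python
import numpy as np

def approx_bilinear_form(alphas, betas, h):
    """Evaluate s_n of (11): build R_*(T_n)_{1,1,:,:} by the continued fraction
    (innermost term first), then apply theta and e_1.
    alphas = [alpha_1..alpha_n], betas = [beta_2..beta_n], h = mesh width."""
    M = alphas[0].shape[0]
    I = np.eye(M)
    X = np.linalg.inv(I - alphas[-1])                     # innermost: (I - alpha_n)^{-1}
    for k in range(len(alphas) - 2, -1, -1):              # k = n-2, ..., 0 (0-based)
        X = np.linalg.inv(I - alphas[k] - X @ betas[k])   # (I - alpha_{k+1} - X beta_{k+2})^{-1}
    theta = h * np.tril(np.ones((M, M)))                  # the matrix theta of Section 5
    e1 = np.zeros(M)
    e1[0] = 1.0
    return (theta @ X @ e1) / h                           # s_n, cf. (11)
```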

Table 6 shows the error measure errsol for increasing M, which converges at the expected rate \(\mathcal {O}(1/M)\).

Table 6 Error of approximation to the quantity of interest wHU(t)v for time-independent matrix

5.2.2 Time-dependent matrix

Consider the time-dependent matrix

$$ \tilde{\boldsymbol{\mathsf{A}}}(t) = \begin{bmatrix} \cos(t) & 0 & 1 & 2 & 1\\ 0 & \cos(t)-t & 1-3t & t & 0\\ 0 & t & 2t+\cos(t) & 0 & 0\\ 0 & 1 & 2t+1 & t + \cos(t) & t\\ t & -t-1 & -6t-1 & 1-2t & \cos(t)-2t \end{bmatrix}, $$

the starting vectors \(\mathbf {v} = \mathbf {w} = \begin {bmatrix} 1& 0& 0& 0 & 0 \end {bmatrix}^{\top }\), and the interval \(I = [10^{-4},1]\). As becomes apparent from the results reported in Table 7, for this particular experiment the recurrence measures are small whereas the biorthogonality measure is large.

Table 7 Krylov error measures for time-dependent matrix

The loss of orthogonality is an inherent feature of Lanczos-like algorithms, and it does not necessarily compromise the algorithm’s capability to produce an approximation to the bilinear form. Indeed, Table 8 confirms that the matching moment property is not affected by the loss of ∗-biorthogonality.

Table 8 Measure for moment matching errM(k) for time-dependent matrix

On the other hand, the results presented in Table 9, where we report the error measures for the coefficients computed by Algorithm 1, show that the loss of ∗-biorthogonality of the computed bases has a limited impact on the convergence of the algorithm to the solution. Indeed, in this case, for all βi the expected convergence rate is observed whereas, for αi, only i = 1,2,3 show the expected decrease in the error measure.

Table 9 Error on the elements of the tridiagonal tensor for time-dependent matrix

Table 10 shows that the approximation to the quantity of interest converges at the rate \(\mathcal {O}(1/M)\). Hence, the loss of biorthogonality and the inaccurate coefficients of the tridiagonal tensor did not compromise this approximation.

Table 10 Error of approximation to the quantity of interest wHU(t)v for time-dependent matrix

5.3 NMR experiments

Nuclear magnetic resonance (NMR) spectroscopy studies the structure and dynamics of molecules by looking at nuclear spins [31, 39]. Computer simulations of NMR experiments are important because they can improve the design and analysis of laboratory experiments [50]. In this section, three small, realistic examples arising from NMR spectroscopy [3] are discussed. The ODE that governs the dynamics of nuclear spins during NMR spectroscopy is the Schrödinger equation

$$ \frac{d}{dt} \phi(t) = -\imath 2\pi H(t) \phi(t), \quad t\in\left[0,\tau_{\exp}\right] $$

where H(t) is the so-called Hamiltonian, ϕ(t) the wave function, \(\tau _{\exp }\) the duration of the experiment, and \(\imath = \sqrt {-1}\). The size of the Hamiltonian is related to the number of nuclear spins present in the system: for l nuclear spins, H(t) is of size \(2^{l} \times 2^{l}\). Hence, H(t) grows exponentially with the number of spins, but it is a sparse matrix, making it an ideal candidate for a Lanczos-like algorithm.

The experiments discussed in this section use M = 500 discretization points because of memory constraints. It is important to note that such memory constraints may be overcome using the Tensor Train approximation presented in Section 5.4. The number of iterations of the tensor Lanczos algorithm is chosen to obtain the maximal attainable accuracy, which is determined by the discretization scheme with M = 500. Since the discretization used here has an accuracy of \(\mathcal {O}(1/M)\), the smallest number of iterations n such that errsol is of order \(10^{-3}\) suffices. Choosing a larger n will not decrease errsol further for a fixed M and will increase the computational cost.

5.3.1 Experiment 1: Weak coupling

Consider four nuclear spins with heteronuclear dipolar couplings. In this framework, the Hamiltonian for a magic angle spinning (MAS) experiment [29] is the diagonal matrix

$$ \begin{array}{@{}rcl@{}} H(t) = \text{diag}\left[\{f_{k}(t)\}_{k=1}^{16}\right],\quad f_{k}(t) = \alpha_{k} + \beta_{k} \cos(2\pi \nu t) + \gamma_{k} \cos(4\pi \nu t), \end{array} $$

with \(\alpha _{k},\beta _{k},\gamma _{k}\in \mathbb {R}\) and \(\nu = 10^{4}\). The diagonal matrix A(t) = −ı2πH(t) commutes with itself at all times and thus the solution U(t) can be computed explicitly:

$$ \begin{array}{@{}rcl@{}} U(t) &= \text{diag}\left[\left\{\exp\left( -\imath \alpha_{k} t -\imath \frac{\beta_{k}}{2\pi\nu}\sin(2\pi \nu t)-\imath \frac{\gamma_{k}}{4\pi \nu} \sin(4\pi \nu t)\right) \right\}_{k=1}^{16}\right]. \end{array} $$

The starting vectors are chosen to excite and measure the lowest oscillatory components in U(t): \(\mathbf {w} =\mathbf {v} = \begin {bmatrix} 0 & 1 & 1 & 0 & 1 & 1& {\dots } & 0& 1& 1 \end {bmatrix}^{\top }\). A typical experiment would run for a time of the order of \(10^{-2}\) seconds. Since the problem is (highly) oscillatory and the current discretization requires many points to accurately compute a solution, we choose to restrict the experiment time to \(\tau _{\exp }=5\times 10^{-5}\). This is a valid approach since the total time interval of \(10^{-2}\) can be split into subintervals of length \(5 \times 10^{-5}\) and the solutions on the subintervals can be combined to obtain the solution on the whole interval.

Algorithm 1 is run for n = 3 iterations and the corresponding Krylov error measures are shown in Table 11. A first observation is that, going from M = 5 to M = 50, a large decrease of the biorthogonality measure is observed. This is due to the fact that, when discretizing with fewer discretization points, e.g., M = 5, the original matrix in the ODE, −ı2πH(t), is translated into a simpler (and inaccurate) discretized input, for which the tensor Lanczos iteration converges fast. The discretization with M = 50 represents the original input better, as is suggested by the stagnation of erro going from M = 50 to M = 500.

Table 11 Error measures for Experiment 1

The error errsol is computed using the analytical solution \(\mathbf{w}^{H}\boldsymbol{\mathsf{U}}(t)\mathbf{v}\) evaluated at the discretization points and decays at the expected rate \(\mathcal {O}(1/M)\). Figure 1 shows \(\hat {\mathbf {s}}\) and the approximation \(\mathbf {s}^{(M)}_{n}\) as a function of time; the approximation clearly converges for increasing M.

Fig. 1 Quantity of interest \(\hat {\mathbf {s}}\) and approximation \(\mathbf {s}^{(M)}_{n}\) for Experiment 1. Real part on the left and imaginary part on the right. The x-axis reports time

5.3.2 Experiment 2: Strong coupling

MAS with four nuclear spins with homonuclear dipolar couplings leads to the Hamiltonian

$$ \begin{array}{@{}rcl@{}} H(t) = \text{diag}\left[\{\alpha_{k}\}_{k=1}^{16}\right]+ B \cos(2\pi \nu t) + C \cos(4\pi \nu t), \end{array} $$

where \(\alpha _{k}\in \mathbb {R}\) are scalars and \(B,C\in \mathbb {R}^{16\times 16}\) are matrices with a sparsity structure as shown in Fig. 2.

Fig. 2 Sparsity structure of B and C in Experiment 2; × denotes a nonzero element

A typical experiment time is \(10^{-2}\) seconds and \(\nu = 10^{4}\). The simulated experiment time is \(\tau _{\exp } = 5 \times 10^{-6}\), the size of the Krylov subspace is k = 4, and \(\mathbf {v}=\mathbf {w}=\begin {bmatrix} 0 & 1 & 1 & 0 & 1 & 1& {\dots } & 0& 1& 1 \end {bmatrix}^{\top }\). The corresponding error measures are shown in Table 12 and exhibit a behavior similar to that of Experiment 1. The measure errsol is computed with the reference \(\hat {\mathbf {s}}\) obtained by ode45.

Table 12 Error measures for Experiment 2

5.3.3 Experiment 3: Uncoupled spins under a pulse wave

The Hamiltonian for four uncoupled spins under a pulse wave is

$$ \begin{array}{@{}rcl@{}} H(t) &= &\text{diag}\left[\{\alpha_{k}\}_{k=1}^{16}\right]+ B(0.5+\cos(4t) + \sin(10t) - 0.4 \sin(16t))\\ &\quad+& C(\sin(4t) + \cos(8t) + 2 \sin(12t)), \end{array} $$

with \(\alpha _{k}\in \mathbb {R}\) and \(B\in \mathbb {R}^{16\times 16}\), \(C\in \mathbb {C}^{16 \times 16}\) have a structure as shown in Fig. 3.

Fig. 3 Structure of B and C in Experiment 3; × denotes a nonzero element

A practical experiment time ranges from \(10^{-6}\) to \(10^{-3}\) seconds; here \(\tau _{\exp } = 10^{-3}\) is used. The starting vectors are \(\mathbf {v}= \mathbf {w}= \begin {bmatrix} 1 & {\dots } & 1 \end {bmatrix}^{\top }\) and n = 4 iterations of the tensor Lanczos algorithm are run. The Krylov error measures shown in Table 13 behave similarly to the measures observed for Experiments 1 and 2. The measure errsol is obtained via ode45 and shows a convergence rate slightly slower than \(\mathcal {O}(1/M)\). The slower convergence rate can, in part, be explained by the fact that the comparison is made with the ode45 solution. Additional errors are incurred when comparing \(\mathbf{s}^{(M)}_{n}\) with \(\hat {\mathbf {s}}\), because the former is available only at the points τi whereas the latter is available only at points determined by ode45.

Table 13 Error measures for Experiment 3

5.4 Tensor Train approximations

As briefly mentioned in the Introduction, despite the fact that the block matrix and the tensor formulations of the problem (1) are mathematically equivalent, the tensor formulation introduced and analyzed in this work allows the exploitation of particular low-parametric representations. The aim of this section is indeed to show that, for all the examples previously presented, the resulting tensors can be accurately and conveniently approximated using a low-parametric representation called the Tensor Train (TT) format [41, 42].

As a matter of fact, multilinear algebra, tensor analysis, and the theory of tensor approximations play increasingly important roles in today's computational mathematics and numerical analysis, thereby attracting tremendous interest in recent years [12]. In this panorama, Tensor Train (TT) approximations are a powerful technique for dealing with the curse of dimensionality, i.e., the particularly unpleasant feature where the number of unknowns and the computational complexity grow exponentially as the dimension of the problem increases.

Before presenting the computational results, we briefly survey the main features of the TT representation, referring the interested reader to the surveys [11, 12]. We consider the Tensor Train (TT) format [41] for the tensors of interest in this work. Specifically, a 4-mode tensor \(\boldsymbol {\mathcal {A}} \in \mathbb {C}^{N_{1} \times N_{2} \times M \times M }\) is expressed in TT format when

$$ \boldsymbol{\mathcal{A}}_{i_{1},i_{2},i_{3},i_{4}}=G_{1}(i_{1})G_{2}(i_{2})G_{3}(i_{3})G_{4}(i_{4}) $$

where \(G_{k}(i_{k})\) is a matrix of dimension \(r_{k-1} \times r_{k}\) and \(r_{0} = r_{4} = 1\). The numbers \(r_{k}\) are called TT-ranks, and the \(G_{k}(i_{k})\) are the cores of the TT decomposition. If \(r_{k}\leq r\) and \(n_{k}\leq n\), where \(n_{k}\) denotes the k-th mode size, then storing the TT representation requires memorizing at most \(4nr^{2}\) numbers. If r is small, then the memory requirement is much smaller than storing the full tensor, i.e., storing \(n^{4}\) numbers.
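For completeness, we include a minimal TT-SVD sketch in NumPy, following the standard construction of [41]; the experiments below use the Matlab TT-toolbox, so this is only an illustration of how such a decomposition is obtained.

```python
import numpy as np

def tt_svd(A4, tol=1e-10):
    """TT-SVD of a 4-mode tensor: returns cores G_1..G_4 (each of shape
    r_{k-1} x n_k x r_k) and the TT-ranks [r_1, r_2, r_3]."""
    dims = A4.shape
    delta = tol * np.linalg.norm(A4) / np.sqrt(3)        # per-step truncation tolerance
    cores, ranks = [], [1]
    C = A4
    for k in range(3):
        C = C.reshape(ranks[-1] * dims[k], -1)
        U, s, Vh = np.linalg.svd(C, full_matrices=False)
        tail = np.sqrt(np.cumsum(s[::-1] ** 2))[::-1]    # tail[r] = ||s[r:]||_2
        r = max(1, int(np.argmax(tail <= delta)) if np.any(tail <= delta) else len(s))
        cores.append(U[:, :r].reshape(ranks[-1], dims[k], r))
        ranks.append(r)
        C = s[:r, None] * Vh[:r]                         # carry Sigma V^H forward
    cores.append(C.reshape(ranks[-1], dims[3], 1))
    return cores, ranks[1:]

def tt_full(cores):
    """Reconstruct the full tensor from its TT cores (for checking the error)."""
    T = cores[0]
    for G in cores[1:]:
        T = np.einsum('...a,ajb->...jb', T, G)
    return T.squeeze(axis=(0, -1))
```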

It is important to note that the TT representation allows various tensor operations to be performed efficiently, see, e.g., [41, Sec. 4]. In this paper, we do not propose a low-parametric TT version of Algorithm 1. To be efficient, such a TT version would need a TT representation of the tensor products used in Algorithm 1. This paper aims to show that Algorithm 1 works; further enhancements are postponed to future investigations.

In Tables 14, 15, and 16 we present the TT ranks for all the tensors considered in Section 5.3. In particular, the tables present the details of the TT approximations obtained using the TT-toolbox [41] when the required accuracy for the approximation is set to \(10^{-5}\) and \(10^{-10}\). As becomes evident from the presented results, all the considered tensors are amenable to a low-parametric representation provided by the TT format and, indeed, for all the presented results the Compression Factor (C.F.), which is defined as \(({\sum }_{k=1}^{4} r_{k-1} \times n_{k} \times r_{k})/nnz(\boldsymbol {\mathcal {A}})\), with \(nnz(\boldsymbol {\mathcal {A}})\) the number of nonzero elements of \(\boldsymbol {\mathcal {A}}\), lies in the interval \((10^{-3},0.5)\). It is important to observe that when increasing the accuracy from \(10^{-5}\) to \(10^{-10}\) the C.F. does not change significantly, suggesting that, for the considered tensors, the TT format is closer to an exact representation than to an approximation. Finally, it is important to note that the ranks of the TT approximations are robust across the choices of the parameter ν (cf. the TT ranks in Tables 14 and 15) and that, for some of the considered problems, the number of parameters needed for the approximation can be two orders of magnitude smaller than \(nnz(\boldsymbol {\mathcal {A}})\); see Tables 15 and 16.

Table 14 TT ranks and compression factor for Experiment 1
Table 15 TT ranks and compression factor for Experiment 2
Table 16 TT ranks and compression factor for Experiment 3

6 Conclusions

In this work we introduced a non-Hermitian Lanczos algorithm for tensors and we provided the corresponding theoretical analysis. In particular, after introducing all the necessary theoretical background, we are able to interpret such a Lanczos-type process in terms of tensor polynomials and to prove the related matching moment property. A series of numerical experiments performed on real-world problems confirms the effectiveness of our approach. Using a linearly converging approximation for the inputs, the algorithm produces a linearly converging approximation of the bilinear form \(\mathbf{w}^{H}\boldsymbol{\mathsf{U}}(t)\mathbf{v}\), where \(\boldsymbol{\mathsf{U}}(t)\) is the solution of the ODE (1). More accurate approximation schemes for the inputs are currently being developed by some of the paper’s authors, possibly leading to faster convergence. Moreover, in all the considered examples, the related tensors show a low-parametric structure in terms of the Tensor Train representation. This important feature paves the way for future efficiency improvements of our proposal, where this representation is fully exploited in the Lanczos-type procedure.