Rank-adaptive tensor methods for high-dimensional nonlinear PDEs

We present a new rank-adaptive tensor method to compute the numerical solution of high-dimensional nonlinear PDEs. The new method combines functional tensor train (FTT) series expansions, operator splitting time integration, and a new rank-adaptive algorithm based on a thresholding criterion that limits the component of the PDE velocity vector normal to the FTT tensor manifold. This yields a scheme that can add or remove tensor modes adaptively from the PDE solution as time integration proceeds. The new algorithm is designed to improve computational efficiency, accuracy and robustness in numerical integration of high-dimensional problems. In particular, it overcomes well-known computational challenges associated with dynamic tensor integration, including low-rank modeling errors and the need to invert the covariance matrix of the tensor cores at each time step. Numerical applications are presented and discussed for linear and nonlinear advection problems in two dimensions, and for a four-dimensional Fokker-Planck equation.

In this paper, we build upon our recent work on dynamical tensor approximation [18,17], and develop new rankadaptive temporal integrators to compute the numerical solution of high-dimensional initial/boundary value problems of the form where x ∈ Ω ⊆ R d (Ω compact, d ≥ 1), and G is a nonlinear operator which may take into account boundary conditions. A well-known challenge of dynamic tensor approximations to (1) is that the curvature of the tensor manifold in which we compute the PDE solution is inversely proportional to the energy of the tensor modes. This means that the smaller the energy of the tensor modes the higher the curvature. Hence, to integrate a solution characterized by tensor modes with a wide range of energies one has to consider time stepping schemes that can effectively handle geometric features associated with the curvature of the manifold. In projection-based approaches [32,41,18,17] the computational challenge posed by the curvature of the tensor manifold translates into the need to invert the positive semi-definite covariance matrix of the tensor cores at each time step. A time-integration scheme constructed in this way may become numerically unstable in the presence of tensor modes with small energy, or even singular when modes with zero energy are present (e.g., at a time instant in which we increase the tensor rank by adding a mode with zero energy). To mitigate this problem, Babaee et al. [3] introduced a matrix pseudo-inverse approximation method that can handle potential singularities in the covariance matrices of the tensor cores, in particular when adding modes with zero energy to the tensor series expansion of the PDE solution.
A mathematically rigorous framework to integrate dynamical tensors over manifolds with arbitrary curvature was developed by Lubich et. al in [35,30,36]. The key idea is to integrate the evolution equation generating the tensor dynamics using operator splitting schemes, e.g., the Lie-Trotter or the Strang time integrators (see [30,36] for details). This results in a scheme that does not suffer from the curvature of the tensor manifold, and even provides an exact representation in the presence of tensor modes with zero energy. The numerical method presented in this work combines all these features, i.e., functional tensor train (FTT) series expansions, operator splitting time integration, and a new rank-adaptive algorithm to add and remove tensor modes from the PDE solution based on a thresholding criterion that limits the component of the velocity vector normal to the FTT tensor manifold.
This paper is organized as follows. In Section 2 we briefly review finite-rank functional tensor train (FTT) expansions of high-dimensional functions. In Section 3 we discuss dynamic tensor approximation of nonlinear PDEs of the form (1) and develop robust temporal integration schemes based on operator splitting methods. We also discuss steptruncation algorithms [50,49] and prove that dynamic tensor approximation and step-truncation are at least order one consistent to one another. In Section 4 we develop new rank-adaptive time integrators on rank-structured FTT tensor manifolds and prove that the resulting scheme is consistent. In Section 5 we present and discuss various numerical applications of the proposed rank-adaptive tensor method, and demonstrate its accuracy and computational efficiency. The main findings are summarized in Section 6.

The manifold of fixed-rank FTT tensors
Let us consider the weighted Hilbert space 1 where Ω ⊆ R d is a separable domain such as a d-dimensional torus T d or a Cartesian product of d real intervals and µ is a finite product measure on Ω Let τ be the counting measure on N. Each element u ∈ L 2 µ (Ω) admits functional tensor train (FTT) expansion [7,18] of the form where each basis set for each i = 1, . . . , d − 1, and {ψ d (α d−1 ; x d ; 1)} α d−1 is orthonormal with respect to the inner product in L 2 µ d (Ω d ). The sequence of real numbers λ(α d−1 ) have a single accumulation point at zero. By truncating (5) so that only the largest singular values λ(α d−1 ) are retained yields the following approximation of u(x) where r = (r 0 , r 1 , . . . , r d−1 , r d ) is the TT rank. Let M ri−1×ri (L 2 µi (Ω i )) denote the set of r i−1 × r i matrices with entries in L 2 µi−1 (Ω i ). The truncated expansion (7) can be written compactly as where Ψ i (x i ) ∈ M ri−1×ri L 2 µi (Ω i ) are called FTT cores and Λ is a diagonal matrix with entries λ(α d−1 ) (α d−1 = 1, . . . , r d−1 ). We will often write Ψ i in place of Ψ i (x i ) suppressing dependence of the spatial variable as such dependence is clear from the matrix subscript.

Recursive orthgonalization of FTT tensors
For any tensor core Ψ i ∈ M ri−1×ri (L 2 µi (Ω i )) we define the matrix with entries 2 The FTT representation (8) is given in terms of orthonormal cores, i.e., Other orthogonal representations can be computed, e.g., based on recursive QR decompositions. To describe this method, let Ψ i ∈ M ri−1×ri (L 2 µi (Ω i )) and consider each column of Ψ i as a vector in Performing an orthogonalization process (e.g. Gram-Schmidt) on the columns of the FTT core Ψ i relative to the inner product (9) yields a QR-type decomposition of the form where Q i is an r i−1 × r i matrix with elements in L 2 µi (Ω i ) satisfying Q T i Q i i = I ri×ri , and R i is an upper triangular r i × r i matrix with real entries. Next, consider an arbitrary FTT tensor u = Ψ 1 Ψ 2 · · · Ψ d , where each core Ψ i is not necessarily full rank. For notational convenience define the partial products One way to orthogonalize u is by performing QR decompositions recursively from left to right as we will now describe. We begin by decomposing Ψ 1 as Now we may express u = Q 1 R 1 Ψ 2 · · · Ψ d . Next, we perform another QR decomposition 2 The averaging operation in (10) is a standard inner product in Proceeding recursively in this way we obtain a representation for u of the form where each Q i ∈ M ri−1×ri (L 2 µi (Ω i )) satisfies Q T i Q i i = I ri×ri . We shall call such a representation left orthogonalization of u. We may also stop the orthogonolization process at any step in the recursive process to obtain the partial left orthogonalization Similar to orthogonalizing from the left we may also orthogonalize from the right. To do so we begin by performing a QR decomposition A substitution of (18) into (8) yields the expansion Proceeding recusively in this way we obtain the right orthogonalization We may have stopped the orthogonalization process at any point to obtain the partial right orthogonalization It is also useful to perform orthogonalizions from the left and right to obtain expansions of the form The matrix R i R T i+1 has rank r i . Another important operation is truncation of FTT tensors to smaller rank. Efficient algorithm to perform this operation for TT tensors can be found in [43,Section 3] and in [16]. Such algorithm is easily adapted to FTT tensors by replacing QR decompositions of matrices with the QR of FTT cores given in (12) and SVD decomposition of matrices with Schmidt decompositions. In numerical implementations this amounts to introducing appropriate quadrature weight matrices into the algorithm.

Tangent and normal spaces of fixed-rank FTT manifolds
Denote by V (i) ri−1×ri the set of all tensor cores Ψ i ∈ M ri−1×ri (L 2 µi (Ω i )) with the property that the autocovariance matrices i.e., the set of fixed-rank FTT tensors with invertible cores, is a smooth Hilbert submanifold of L 2 µ (Ω) (see [17]). We define the tangent space of M r at the point u ∈ M r as equivalence classes of continuously differentiable curves on M r passing through u A sketch of M r and T u M r is provided in Figure 1. Denote by T M r the tangent bundle of M r which consists of the disjoint union of tangent spaces over all points in M r , i.e., Since L 2 µ (Ω) is an inner product space, for each u ∈ L 2 µ (Ω) the tangent space T u L 2 µ (Ω) is canonically isomorphic to L 2 µ (Ω). Moreover, for each u ∈ M r the normal space to M r at the point u, denoted by N u M r , consists of all vectors in L 2 µ (Ω) that are orthogonal to T u M r with respect to the inner product in L 2 µ (Ω) Since the tangent space T u M r is closed, for each point u ∈ M r the space L 2 µ (Ω) admits a decomposition into tangential and normal components, i.e.,

Dynamic tensor approximation of nonlinear PDEs
The idea of dynamic tensor approximation is to project the time derivative of a low-rank tensor onto the tangent space of the corresponding low-rank tensor manifold at each time. Such a projection results in evolution equations on the low-rank tensor manifold, and can be used to solve initial/boundary value problem of the form (1). This approximation technique is known in the quantum physics community as Dirac-Frenkel/Mclachlan variational principle [44,25,38]. Dynamic approximation has been recently studied by Lubich et al. [32,41,33,35] for finite-dimensional rank-structured manifolds embedded in Euclidean spaces. There have also been extensions to the Tucker format on tensor Banach spaces [20] and tree-based tensor formats on tensor Banach spaces [21].

Dynamic tensor approximation on low-rank FTT manifolds
Let us briefly describe the method of dynamic tensor approximation for the low-rank FTT manifold (23). First we define a projection onto the tangent space of M r at u by For fixed u, the map P u is linear and bounded. Since the tangent space T u M r ⊂ H is closed, each v ∈ H admits a unique representation as v = v t + v n where v t ∈ T u M r and v n ∈ N u M r (see equation (27)). From such a representation it is clear that P u is an orthogonal projection onto the tangent space T u M r . Using this orthogonal projection we define the operator If the initial condition u(x, 0) is on the manifold M r then the solution to the initial/boundary value problem remains on the manifold M r for all t ≥ 0. The solution to (30) is known as a dynamic approximation to the solution of (1). In the context of Hilbert spaces the dynamic approximation problem (30) can be solved using dynamically orthogonal or bi-orthogonal constraints on tensor modes [18,17]. Such constraints, also referred to as gauge conditions, provide the unique solution of the minimization problem (28) with different FTT cores. However, in the presence of repeated eigenvalues the bi-orthogonal constraints result in singular equations for the tangent space projection (28).
Hereafter we recall the equations which allow us to compute (28) with FTT cores subject to dynamically orthogonal (DO) constraints. First, expand u ∈ M r in terms of FTT cores as With this ansatz, an arbitrary element of the tangent space T u M r can be expressed asu whereu = ∂u/∂t andΨ i = ∂Ψ i /∂t. The DO constraints are given by which ensure that Ψ i (t), i = 1, . . . , d − 1, remain orthonormal for all t ≥ 0. We have shown in [17] that under these constraints the convex minimization problem (28) admit a unique minimum for vectors in the tangent space (31) satisfying the PDE systeṁ where u = Ψ 1 Ψ 2 · · · Ψ d ∈ M r , and the averaging operator · k1,...,kp is defined as This operator maps any matrix Ψ(x) with entries in L 2 µ (Ω) into a matrix Ψ k1,...,kp of the same size of Ψ(x) but with entries in L 2 µ Ω \ Ω k1 × · · · × Ω kp . The DO-FTT system (33) involves several inverse covariance matrices Ψ ≥k Ψ T ≥k −1 k,...,d , which can become poorly conditioned in the presence of tensor modes with small energy (small singular values). This phenomenon has shown to be related to the curvature of the tensor manifold and can be avoided by using projector-splitting integrators [35,30,36].
A slight improvement to the numerical stability of (33) can be obtained by right orthogonalizing the partial products Using the orthogonality of Q k it can easily be verified that R k = Ψ ≥k Ψ T ≥k 1/2 k,...,d . Using these right orthogonalized cores the DO-FTT system (33) can be written aṡ Here Ψ ≥k Ψ T ≥k −1/2 k,...,d denotes the inverse of the matrix square root. Since the condition number of Ψ ≥k Ψ T ≥k k,...,d is larger than the condition number of Ψ ≥k Ψ T ≥k 1/2 k,...,d we have that the inverse covariances at the right hand side of (36) can be computed more accurately than the ones in (33) in the presence of small singular values.

Temporal integration using operator splitting methods
One of the challenges of dynamic approximation of PDEs on low-rank tensor manifolds relates to the curvature of the manifold, which is proportional to the inverse of the smallest singular value of Ψ ≥k Ψ T ≥k k,...,d [32,Section 4]. Such curvature appears naturally at the right hand side of the DO-FTT system (33) in the form of inverse covariances Ψ ≥k Ψ T ≥k −1 k,...,d . Clearly, if tensor modes with low energy are present in the solution then the covariances Ψ ≥k Ψ T ≥k k,...,d tend to be ill-conditioned and therefore not easily invertible. Moreover, it is desirable to add and remove tensor modes adaptively during temporal integration, and adding a mode with zero energy immediately yields singular covariance matrices (see [18]). The problem of inverting the covariance matrices Ψ ≥k Ψ T ≥k k,...,d when integrating (33) or (36) can be avoided by using projector-splitting methods. These methods were originally proposed by Lubich et. al in [35,30,36]. The key idea is to apply an exponential operator splitting scheme, e.g., the Lie-Trotter scheme, directly to the projection operator onto the tangent space defining the dynamic approximation (see equation (28)). To describe the method, we begin by introducing a general framework for operator splitting of dynamics on the FTT tangent space. To this end, we first rewrite the right hand side of (30) as where in the second line we used the right orthogonalizations in equation (35). A substitution of the expressions foṙ Ψ k we obtained in (36) into (37) yields where we defined the following projection operators from L 2 and we set Ψ 0 = 1. The key point in (38) is that inverse covariance matrices no longer appear . To establish a general operator splitting framework let us assume that there exists an evolution operator E Gr for the solution of the initial/boundary value problem (30), where G r is defined in (38). Such evolution operator E Gr : satisfies a semi-group property and it maps the initial condition u 0 (x) into the solution to (30) at a later time We write such an evolution operator formally as an exponential operator with generator D Gr (see e.g. [42]) where D Gr is the Lie derivative associated with G r . We now discretize the temporal domain of interest [0, T ] into N + 1 evenly-spaced time instants, An approximation to the exact solution of (30) is then obtained by the recurrence relation where S is an exponential operator splitting that approximates the exact evolution operator Setting s = 1, γ 1,j = 1, ∀j = 1, . . . , d in (44) yields the well-known Lie-Trotter splitting, which is first-order in time (see [35]). One step of the splitting integrator solves the following initial value problems in consecutive order over the Such differential equations can be solved exactly as follows.
which is left orthogonalized. We orthogonalize the right partial product Ψ ≥k+1 to rewrite the FTT expansion as Differentiating with respect to t and using the definition of P + j the differential equation to be solved iṡ ChoosingΨ ≤k−1 =Q ≥k+1 = 0 this differential equation becomes Using orthonormality of the cores Ψ ≤k−1 and Q T ≥k+1 we find which can be integrated exactly over the time interval Since Ψ ≤k−1 and Q ≥k+1 are constant with respect to time, multiplying the above equation on the left by Ψ ≤k−1 (t i ) and on the right by Q ≥k+1 (t i ) proves the first half of the Proposition. The second half is done in a similar way with P + j replaced by −P − j .
Note that when using the Lie-Trotter splitting integrator (45) to solve (30) one can replace ∆u with ∆tG r (u).

3.2.
Step-truncation methods Another methodology to integrate nonlinear PDEs on fixed-rank tensor manifolds M r is step-truncation [31,50,49]. The idea is to integrate the solution off of M r for short time, e.g., by performing one time step of the full equation with a conventional time-stepping scheme, followed by a truncation operation back onto M r . To describe this method further let us define the truncation operator which provides the best approximation of u on M r . Such a map is known as a metric projection or closest point function and in general it may be multivalued, i.e., the set of u r ∈ M r which minimize u − u r H is not a singleton set. However, since M r is a smooth submanifold of H we have by [52, Proposition 5.1] that for each u 0 ∈ M r there exists an open neighborhood U of u 0 such that T r is well-defined and smooth on U . Let be a convergent one-step time integration scheme 3 approximating the solution to the initial value problem (1). Assume that the solution u(x, t 0 ) at time t 0 is on M r . 4 In order to guarantee the solution u(x, t k ) at time step t k is an element of the manifold M r for each k = 1, 2, . . . we apply the truncation operator to the right hand side (52). This yields the following step-truncation method

Consistency between dynamic approximation and step-truncation methods
Next we ask what happens in the step-truncation algorithm in the limit of time step ∆t approaching zero. The result of such a limiting procedure results in a scheme which keeps the solution u(x, t) on the manifold M r for all time t ≥ t 0 in an optimal way. We now show that this limiting procedure in fact results in precisely the dynamic approximation method described in Section 3.1. In other words, by sending ∆t to zero in (53) we obtain a solution of (30). For similar discussions connecting these two approximation methods in closely related contexts see [22,23,31]. To prove consistency between step-truncation and dynamic approximation methods we need to compute T r (u(x, t)) for t infinitesimally close to t 0 . Such quantity depends on the derivative The following proposition provides a representation of the derivative ∂T r (u(x, t))/∂t in terms of G(u(x, t)) and the Fréchet derivative [54] of the operator T r (u).
Proposition 3.2. If the solution u 0 = u(x, t 0 ) to (1) at time t 0 is on the manifold M r then where (T r ) u0 is the Fréchet derivative of the nonlinear operator T r at the point u 0 .
Proof: Express the solution of (1) at time t ≥ t 0 as where Expanding T r (u(x, t)) in a Taylor series around u 0 (x) we obtain [40, Theorem 6.1] Differentiating (58) with respect to t and evaluating at t = t 0 we obtain where we assumed that ∂/∂t commutes with (T r ) u0 and used the fact that ∂h(x, t)/∂t = G(u(x, t)) for the first order term. All of the higher order terms are seen to zero by commuting ∂/∂t with (T r ) (n) u0 and using chain rule.
Since T r (u(x, t)) is an element of M r for all t ≥ t 0 it follows that (55) is an element of T u0 M r . Arguing on the optimality of the tangent space element (T r ) u0 G(u(x, t 0 )) it is seen that (59) is the same problem as dynamic approximation (30), i.e., (T r ) u0 = P u0 . Now consider the scheme (53) and use a Taylor expansion of T r around u r (x, t k ) on the right hand side Discarding higher order terms in ∆t yields Moreover if the increment function Φ is the one defining the Euler forward, i.e., then the scheme (61) is exactly the scheme in (30). Thus, we just proved the following lemma.
Step-truncation and dynamic approximation methods are consistent at least to first-order in ∆t.
This Lemma applies to any first-order time integrator, including the Lie-Trotter splitting integrator we discussed in section 3.1.1. Higher-order consistency results can be obtained by generalizing Proposition 3.2 to higher-order time derivatives, and linking such results with higher-order increment functions Φ(G, u, ∆t).

Adaptive integration on rank-structured tensor manifolds
The solution to the initial/boundary value problem (1) is often not accurately represented on a tensor manifold with fixed rank, even for short integration times. In this section we discuss effective methods to adaptively add and remove tensor modes from the solution based on appropriate criteria.
In the context of step-truncation algorithms, if the solution rank naturally decreases in time then the operator T r in (53) is no longer well-defined. In this situation, replacing the operator T r with T s for an appropriate 5 s ≤ r allows for integration to continue. On the other hand if the solution rank increases in time then the operator T r will still be well-defined for small enough ∆t but the approximation on M r will not retain accuracy. To address this problem of constant rank integration we shall introduce a criterion for rank increase of the FTT solution. Both decreasing and increasing rank are based on FTT truncation (see Section 2.1). For the remainder of this section let u(x, t) be the solution to (1) and u r (x, t) ∈ M r an approximation of u(x, t) obtained by either the solution of the dynamical approximation problem (30) or step-truncation methods (see Section 3.2).

Decreasing tensor rank
For decreasing tensor rank at time t we are interested in determining if u r (x, t) ∈ M r is close to an element u s (x, t) ∈ M s for s ≤ r. This can be achieved by simply performing a FTT truncation on u r (x, t) with small threshold dec . Since the splitting integrator described in Section 3.1.1 is robust to over approximation by tensor rank it may not be strictly necessary to decrease rank during integration. However, it is desirable to have solutions of the lowest rank possible (while retaining accuracy) when solving high dimensional problems. For these reasons it is advisable not perform a FTT truncation at each time step (as this would be unnecessary and inefficient when using an operator splitting integrator) but only every once and a while. One may choose a criterion for when to check for rank decrease based on the problem, step size, current rank, and dimension. If one is using a step-truncation method with a tolerance based FTT truncation algorithm such as the one described in Section 2.1 then rank decrease is already built into each time step.

Increasing tensor rank
As a general heuristic one would like to increase rank at the time when the error between the low-rank approximation u r (x, t) and the PDE solution u(x, t) will become large after the subsequent time step. Such critical time instant for rank increase can be determined by examining the normal component of the dynamics, i.e., N u (G(u)) = G(u) − P u (G(u)). Suppose we are integrating one time step forward from t i to t i+1 . The error at t i+1 is given by For small ∆t the above integral can be approximated by the left endpoint where N ur (x,τ ) denotes the orthogonal projection onto the normal space of M r at the point u r (x, t). Hence, up to first-order in ∆t we have that From this approximation we see that a reasonable criterion for increasing rank at time t i is when the norm of the normal component of G(u r (x, t i )) is larger than some threshold inc (see Figure 2) To efficiently compute the normal component N ur (x,ti) (G(u r (x, t i ))) at each time instant t i we use the formula where N r (G(u r )) and T ur (G(u r )) represent the normal and tangential components of G(u r ). The tangential component can be approximated at a low computational cost via backward differentiation formulas (BDF) as With a p-point backward difference approximation of the tangent space projection available at t i we easily obtain an approximation of the normal component of G(u r ) at t i N ur (x,ti) (G(u r (x, t i ))) = G(u r (x, t i )) − P (p) ur (x,ti) (G(u r (x, t i ))) + O(∆t p+1 ), We emphasize that this method of using a finite difference stencil based on the temporal grid for approximating the tangential component of the dynamics (and thus the normal component) creates a lower bound for the choice of normal vector threshold inc . In particular, we must have that K 1 (∆t) p ≥ inc for some constant K 1 otherwise the error incurred from our approximation of the normal component may trigger unnecessary mode addition. This approximation of the normal component is cheap but only informs on whether or not it is appropriate to add modes at time instant t i . The subsequent question is which entries of the rank vector r need to be increased. In order to make such a determination we expand the approximate solution at time t as where Γ 1 (t) · · · Γ d (t) = 0 for all t ∈ [0, T ]. Differentiating (73) with respect to time yields Subtracting off the tangential component (31) we have the normal component at time t Next, orthogonalize the partial product Γ ≤i−1 (t) from the left and the partial product Γ ≥i (t) from the right to obtain where C i = 0 ri−1×ri and Γ T i , Γ i i = I for all i = 1, 2, . . . , d. Expand (76) using a product rule and evaluate at From the previous equation we see that the FTT auto-correlation matrices of the normal component at time instant t i are the time derivatives of the zero energy modes in the current solution. Thus, if the normal component has FTT rank n then then the solution u r (x, t) at time t i should be represented by an FTT tensor of rank r + n. Certainly, the solution will be over represented at t i with rank r + n. However, after one step of the splitting integrator the additional ranks will ensure that the low-rank solution u r+n (x, t) ∈ M r+n retains its accuracy.
Algorithm 1: One step integration with adaptive rank increase Input: u r (x, t i ), u r (x, t i−1 ), . . . , u r (x, t i−p ) → time snapshots of the PDE solution with rank r, G(u r (x, t i )) → velocity vector defined by the right hand side of the PDE (1) at time t i , ∆t → time step, inc → threshold for the norm of normal component N ur (x,ti) (G(u r (x, t i ))).
Output: u r+n (x, t i+1 ) → PDE solution with rank r + n at time t i+1 Initialization: • Approximate the constant rank velocity vector via the BDF formula: P Runtime: Compute the FTT decomposition of normal component: Initialize to zero additional tensor modes in u r (x, t i ), as many as the rank of N TT (say n): • Use one step of Lie-Trotter splitting integrator to map u r+n (x, t i ) into u r+n (x, t i+1 ) The main steps of the algorithm we propose to adaptively increase the tensor rank are summarized in Algorithm 1. The operation " * " appearing within the conditional statement if/end denotes scalar times FTT tensor, and is meant to indicate that the multiplication is done by scaling the first core of the tensor with the scalar 0 and leaving the remainder of the cores unchanged [43]. As we will demonstrate in Section 5, Algorithm 1 is robust and it yields accurate results that do no require ad-hoc approximations such the matrix pseudo-inverse approximation introduced in [3].

Order of the rank-adaptive tensor scheme
Let us choose the threshold inc in (67) to satisfy and assume that the condition is satisfied for all t ∈ [0, T ]. Then we have the following bound for the local truncation error In particular, we have that the continuous-time adaptive rank scheme is consistent with order one in ∆t if the normal vector threshold is set as in (78). When implementing the adaptive scheme we usually discretize the time domain [0, T ] into a mesh of time instants t 1 , t 2 , . . .. Therefore, we do not necessarily have control over the normal vector for all t ∈ [0, T ] but rather only at a finite number of time instants. However, an analogous argument as we have made for order one consistency in the continuous time adaptive rank scheme holds for the discrete time adaptive rank scheme by considering the first-order approximation of the local truncation error given in (65). In particular by using the equality in (65) and discrete time thresholding of the normal component, i.e., we have that This proves that the discrete time adaptive rank scheme with normal threshold given by (81) is consistent with order one in ∆t. Higher-order consistency results can be obtained with higher-order time integration methods and higherorder estimators for the component of G(u r ) normal to the FTT tensor manifold.

Numerical examples
In this Section we demonstrate the proposed rank-adaptive FTT tensor method on linear and nonlinear PDEs. In all examples the rank-adaptive scheme relies on first-order Lie-Trotter operator splitting time integration (45), and the thresholding criterion (67). For each PDE we rigorously assess the accuracy of the proposed rank-adaptive tensor method by comparing it with benchmark solutions computed with well-established numerical methods.

Two-dimensional variable coefficient advection equation
Let us begin with the two-dimensional variable coefficient advection problem on the torus Ω = T 2 . We have shown in previous work [18] that the tensor solution to the PDE (83) increases in rank as time increases. As is well known, the PDE (83) can be reduced to the trivial ODE du/dt = 0 along the flow generated by the dynamical system (see, e.g., [47]) With the flow {x 1 (t, x 01 , x 02 ), x 1 (t, x 01 , x 02 )} available, we can write the analytical solution to (83) as where {x 01 (x 1 , x 2 , t), x 02 (x 1 , x 2 , t)} denotes the inverse flow generated by (84). We obtain a semi-analytical solution to the PDE (83) by solving the characteristic system (84) numerically for different initial conditions and then evaluating (85). A few time snapshots of the semi-analytical solution (85) are plotted in Figure 3 (middle row).
We also solve the PDE (83) using the proposed rank-adaptive tensor method with first-order Lie-Trotter operator splitting and thresholding criterion (67) with inc = 10 −2 . The initial condition is approximated by an FTT tensor u TT (x 1 , x 2 , 0) with multivariate rank r = 1 15 1 where Ψ 1 (x 1 ) = ψ 1 (1; x 1 ; 1) · · · ψ 1 (1; x 1 ; 15) , (87) Each tensor mode ψ i is discretized on a grid of 81 evenly-spaced points in the interval Ω i = [0, 2π]. One-dimensional Fourier pseudo-spectral quadrature rules and differentiation matrices [28] are used to compute inner products and derivatives when needed. We run three simulations with the initial tensor decomposition (86) and time step ∆t = 10 −4 . In the first simulation we do not use any rank adaptation, in the second simulation we set the normal vector threshold to inc = 10 −1 and in the third simulation we set inc = 10 −2 . At each time step the component of G(u TT (x, t i )) normal to the tensor manifold is approximated with the two-point BDF formula (Section 4.2). In Figure 5 we plot a few time snapshots of the singular values of the adaptive rank FTT solution with inc = 10 −2 . Figures 4(a)-(c) summarize the performance and accuracy of the proposed rank-adaptive FTT solver. In particular, in Figure 4(a) we plot the time-dependent L 2 (Ω) error between the rank-adaptive FTT solution and the reference solution we obtained with method of characteristics. It is seen that decreasing the threshold inc on the norm of the component of G(u TT ) normal to the FTT tensor manifold (Figure 4(b)) yields addition of more tensor mores to the FTT solution ( Figure  4(c)). This, in turn, results in better accuracy as demonstrated in Figure 4

Two-dimensional Kuramoto-Sivashinsky equation
In this section we demonstrate the rank-adaptive FTT integrator on the two-dimensional Kuramoto-Sivashinsky equation [2]      ∂ ∂t u(x, y, t) + 1 2 |∇ ν u(x, y, t)| 2 + ∆ ν u(x, y, t) + ν 1 ∆ 2 ν u(x, y, t) = 0, u(x, y, 0) = sin(x + y) + sin(x) + sin(y), where Here, ν 1 , ν 2 are bifurcation parameters and ν = ν 2 /ν 1 . For our demonstration we set ν 1 = 0.25, ν 2 = 0.04 and solve (88) on the two-dimensional torus T 2 . We compute a benchmark solution by using a Fourier pseudospectral method [28] on a 33 grid of evenly-spaced points on the torus [0, 2π] 2 (1089 total points). Derivatives and integrals are approximated with well-known pseudo-spectral differentiation matrices and Gauss quadrature rules. The resulting ODE system is integrated forward in time using an explicit fourth-order Runge-Kutta method with time step ∆t = 10 −5 . As before, we performed multiple simulations using the proposed rank-adaptive FTT algorithm with different thresholds for the component of G(u TT ) normal to the tensor manifold. Specifically, we ran one simulation with no mode addition and three simulations with adaptive mode addition based on Algorithm 1, and thresholds set to inc = 10, inc = 10 −1 , and inc = 10 −2 . We used the two-point BDF formula (69) to approximate the component of the solution normal to the tensor manifold at each time step and the Lie-Trotter operator splitting scheme (45) with time step ∆t = 10 −5 to integrate in time the rank-adaptive FTT solution. In Figure 6 we compare the time snapshots of the rank-adaptive FTT solution with inc = 10 −2 with the benchmark solution obtained by the Fourier pseudo-spectral method. As before, Figures 7(a)-(c) demonstrate that the rank-adaptive FTT algorithm is effective in controlling the L 2 (Ω) error of the FTT solution. Interestingly, the solution to the PDE (88) has the property that any tensor approximation with sufficient rank yields a normal component that does not grow in time. In fact, as seen in Figure 7(b) the tensor rank becomes constant for each threshold inc after a transient of approximately 0.5 dimensionless time units. Note also that the numerical solution of constant rank 2 turns out to be unstable. This can be justified by using simple Fourier analysis. In fact, suppose our rank 2 FTT solution to (88) is of the form for all t ∈ [0, T ]. Plugging this expansion into the PDE (88) and projecting onto e iα1x e iβ1y yields which is unstable for large α 1 and β 1 . An analogous equation can be derived for the time derivative of c 2 . This instability can be seen in our numerical results by examining the L 2 (Ω) error of the constant rank 2 FTT solution (Figure 7 (a)).

Four-dimensional Fokker-Planck equation
Lastly, we demonstrate the proposed rank-adaptive FTT integrator on a four-dimensional Fokker-Planck equation with non-constant drift and diffusion coefficients. As is well known [48], the Fokker-Planck equation describes the evolution of the probability density function (PDF) of the state vector solving the Itô stochastic differential equation Here, X t is the d-dimensional state vector, µ(X t , t) is the d-dimensional drift, σ(X t , t) is an d × m matrix and W t is an m-dimensional standard Wiener process. The Fokker-Planck equation that corresponds to (92) has the form where p 0 (x) is the PDF of the initial state X 0 , L is a second-order linear differential operator defined as and D(x, t) = σ(x, t)σ(x, t) T /2 is the diffusion tensor. For our numerical demonstration we set where g(x) = 1 + k sin(x). With the drift and diffusion matrices chosen in (95) the operator (94) takes the form Clearly L is a linear, time-independent separable operator of rank 9, since it can be written as Full tensor product and all other unspecified L (j) i are identity operators. We set the parameters in (95) as α = 0.1, β = 2.0, k = 1.0 and solve (93) on the four-dimensional torus T 4 . The initial PDF is set as p 0 (x) = sin(x 1 ) sin(x 2 ) sin(x 3 ) sin(x 4 ) sin(x 1 ) sin(x 2 ) sin(x 3 ) sin(x 4 ) L 1 (Ω) , which is a four-dimensional FTT tensor with multivariate rank r = 1 1 1 1 1 .
To obtain a benchmark solution with which to compare the rank-adaptive FTT solution, we solve the PDE (93) using a Fourier pseudo-spectral method on the torus T 4 with 21 4 = 194481 evenly-spaced points. As before, the operator L is represented in terms of pseudo-spectral differentiation matrices [28], and the resulting semi-discrete approximation (ODE system) is integrated with an explicit fourth-order Runge Kutta method using time step ∆t = 10 −4 . The numerical solution we obtained in this way is denoted by p ref (x, t). We also solve the Fokker-Planck using the proposed rank-adaptive FTT method with first-order Lie-Trotter time integrator (Section 3.1.1) and normal vector thresholding (Section 4.2). We run three simulations all with time step ∆t = 10 −4 : one with no rank adaption, and two with rank-adaptation and normal component thresholds set to inc = 10 −2 and inc = 10 −3 . In Figure 8 we plot three time snapshots of the two-dimensional solution marginal computed with the rank-adaptive FTT integrator ( inc = 10 −3 ) and the full tensor product pseudo-spectral method (reference solution). In Figure 9(b) we plot the component of Lp TT normal to the tensor manifold, which is approximated using the two-point BDF formula (69). Note that in the rank-adaptive FTT solution with threshold inc = 10 −3 the solver performs both mode addition as well as mode removal. This is documented in Figure 10. The reason for tensor mode removal is to ensure that the tensor rank is not unnecessarily large (see Section 4.1). In fact, the diffusive nature of the Fokker-Plank equation on the torus T 4 yields relaxation to a statistical equilibrium state that depends on the drift and diffusion coefficients in (93). Such equilibrium state may be well-approximated by a low-rank FTT tensor.

Summary
We presented a new rank-adaptive tensor method to integrate high-dimensional initial-boundary value problems for nonlinear PDEs. The new method is based on functional tensor train (FTT) expansions [17,43,7], operator splitting time integration [30,36], and a new rank-adaptive algorithm to add and remove tensor modes from the PDE solution based on thresholding the component of the velocity vector normal to the FTT tensor manifold. We tested the proposed new algorithm on three different initial/boundary value problems involving a 2D variable-coefficient first-order linear PDE, the 2D Kuramoto-Sivashinsky equation, and a 4D Fokker-Planck equation. In all cases the adaptive FTT solution was compared to a benchmark numerical solution constructed with well-established numerical methods. The numerical results we obtained demonstrate that the proposed rank-adaptive tensor method is effective in controlling the temporal integration error, and outperforms previous dynamical tensor methods for PDEs in terms of accuracy, robustness and computational cost. We also proved that the new method is consistent with recently proposed step-truncation algorithms [31,50,49] in the limit of small time steps.