Numerical solution of a class of third order tensor linear equations

We propose a new dense method for computing the numerical solution to a class of third order tensor linear equations. The approach does not require the coefficient matrix in Kronecker form, thus allowing the treatment of very large structured problems. A particular version of the method for symmetric matrices is also discussed. Numerical experiments illustrate the properties of the proposed algorithm.


Introduction
We are interested in the numerical computation of the unique solution X ∈ R^{n×n×n} to the nonsingular system Ax = b, written in the tensor form (1.1), where all coefficient matrices are real and have the same n × n dimensions. Here ⊗ denotes the Kronecker product (to be recalled later) and vec(X) stacks the components of the tensor X one after the other. In particular, in (1.1) two terms share the same matrices, either M or H (purposely in bold face), while all the other matrices A_i, i = 1, 2, 3, and H_3, M_1 bear no relation to each other. The only assumption on the coefficient matrices, in addition to the nonsingularity of A, is that M, H and M_1, H_3 be nonsingular. The unknown solution tensor is also highlighted in bold face, to emphasize that this is the array to be determined.

This tensor equation is representative of a large class of problems that can be described by means of tensors and formulated as a linear array equation. For instance, the discretization of three-dimensional partial differential equations by means of a tensor basis, as is the case for finite differences on parallelepipedal domains or for certain spectral methods, can lead to tensor equations of type (1.1). Tensor equations have become a fundamental algebraic ingredient in the numerical treatment of mathematical models depending on many parameters, as is the case in uncertainty quantification and in parameter-dependent model order reduction methodologies; see, e.g., [1,4,5,15,20,23]. In typical situations, tensor equations with many terms occur, and each term may involve a number of Kronecker products. In a simplified context, for instance, the following tensor equation is of interest (see, e.g., [17]); this is a special case of our problem (1.1), in which all matrices except A_i, i = 1, 2, 3, are equal to the identity matrix I.
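As a concrete illustration of the special case just mentioned, the Kronecker-form operator acting with A_i along mode i can be assembled and cross-checked against direct mode products. The following NumPy sketch is ours (the helper vec and the ordering of the Kronecker factors follow one common column-major convention; conventions vary in the literature):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A1, A2, A3 = (rng.standard_normal((n, n)) for _ in range(3))

# Column-major vec of the mode-1 unfolding of a third order tensor.
vec = lambda T: T.reshape(-1, order="F")
I = np.eye(n)

# Kronecker-form operator of the Laplace-like special case:
# A1 acts along mode 1, A2 along mode 2, A3 along mode 3.
K = (np.kron(np.kron(I, I), A1)
     + np.kron(np.kron(I, A2), I)
     + np.kron(np.kron(A3, I), I))

# Cross-check the Kronecker slots against direct mode products.
X = rng.standard_normal((n, n, n))
Y = (np.einsum("ip,pjk->ijk", A1, X)
     + np.einsum("jp,ipk->ijk", A2, X)
     + np.einsum("kp,ijp->ijk", A3, X))
assert np.allclose(K @ vec(X), vec(Y))
```

For distinct matrices A_i, the position of each factor inside the Kronecker products matters; the assertion above pins down the convention used here.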
Finally, we explicitly remark that if the right-hand side in (1.1) were the sum of rank-one tensors, then the solution could be expressed as the sum of solutions to tensor equations with right-hand sides of rank one.
The literature on tensors, their analysis, and the associated approximation methods has grown tremendously in the past twenty years. Different tensor representations and decompositions have been analyzed. We refer the reader to [16] for an introductory and historical account, and to [11] for a literature survey up to 2013. These decompositions have enabled the development of various problem-dependent strategies, see, e.g., [7,12,14,18,24].
More recently, methods for solving linear equations in tensor format have been proposed and analyzed. In most cases, the authors have been interested in the presence of many summands and many Kronecker products, for which iterative methods appear to be mandatory. In this context, most approaches try to take into account the Kronecker structure and the possible low rank of the involved iteration matrices, see, e.g., [2,3,6,13,19,20,21]. However, little has been said about "direct" dense methods for low order tensor equations, without the explicit use of the Kronecker form. Here we close this gap for the special case of (1.1), which nonetheless appears to be a feasible algebraic formulation of a quite large set of differential problems.

A closed form solution
The numerical solution to (1.1) can be given in closed form by unfolding the 3-mode tensor in one of the three directions. In particular, a tensor X ∈ R^{n_1×n_2×n_3} can be written using the mode-1 unfolding as X^(1) = [X_1, X_2, . . . , X_{n_3}] (see, e.g., [16]); each X_j is called a slice, and X^(1) is a matrix in R^{n_1×n_2 n_3}. Some additional standard notation needs to be recalled. The Kronecker product of two matrices X, Y is defined in the standard block form X ⊗ Y = (X_{i,j} Y), where X_{i,j} denotes an element of X. Moreover, vec(X) is the operator stacking all columns of the matrix X one after the other. In the case of third order tensors, we apply the vec operator to the mode-1 unfolding. The reverse operation, for known dimensions of the vector x, will be denoted by mat(x, n_1, n_2), so that x = vec(X) and X = mat(x, n_1, n_2). Similarly, X = tensor^(1)(x, n_1, n_2, n_3) folds a long vector into a tensor via the mode-1 unfolding. A standard property of the Kronecker product that will be used repeatedly is

vec(A X B^T) = (B ⊗ A) vec(X),    (2.1)

where B^T denotes the real transpose of B; this allows one to go back and forth between the vector and matrix notations. Other properties used in the sequel are the mixed-product rule (A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD) and (A ⊗ B)^{-1} = A^{-1} ⊗ B^{-1} for nonsingular A, B. The following result holds. Here Q^* denotes the conjugate transpose of the complex matrix Q, while H^{-T} = (H^{-1})^T and T denotes transposition.
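The notation just introduced can be checked numerically. The following Python/NumPy sketch (the helper names vec and mat are ad hoc) verifies the layout of the mode-1 unfolding and property (2.1):

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2, n3 = 3, 4, 5

# Mode-1 unfolding: the slices X[:, :, j] placed side by side
# into an n1 x (n2*n3) matrix; column-major reshape matches this layout.
X = rng.standard_normal((n1, n2, n3))
X1 = X.reshape(n1, n2 * n3, order="F")
assert np.allclose(X1[:, n2:2 * n2], X[:, :, 1])   # second slice

# vec stacks the columns of a matrix; mat is its inverse for known dimensions.
vec = lambda M: M.reshape(-1, order="F")
mat = lambda v, m, n: v.reshape(m, n, order="F")

# The Kronecker identity (2.1): vec(A Y B^T) = (B kron A) vec(Y).
A = rng.standard_normal((n1, n1))
B = rng.standard_normal((n2, n2))
Y = rng.standard_normal((n1, n2))
lhs = vec(A @ Y @ B.T)
rhs = np.kron(B, A) @ vec(Y)
assert np.allclose(lhs, rhs)
assert np.allclose(mat(vec(Y), n1, n2), Y)
```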
Theorem 2.1 Using the mode-1 unfolding, the solution X to (1.1) is given slice by slice, where for j = 1, . . . , n the matrix Ẑ_j solves the generalized Sylvester equation (2.2); here R_{j,j} denotes the (j, j) element of the upper triangular matrix R, and R_{1:j-1, j} the first j − 1 components of its jth column; we define mat(Ẑ_{j-1} R_{1:j-1, j}, n, n) H_3^T to be an empty array for j = 1.
Proof Using (2.1) for the unfolded tensor we obtain an equation in terms of H X^(1). For later readability, let us transpose both sides and set Y = (X^(1))^T. Using the Schur decomposition (H^{-1} A_3)^T = Q R Q^* and multiplying the equation by Q from the right, we obtain a transformed system with unknown columns ẑ_1, . . . , ẑ_n. Thanks to the upper triangular form of R, the first column ẑ_1 can be computed independently. For the subsequent columns j = 2, . . . , n, taking into account once again the triangular form of R, we set w_{j-1} = [ẑ_1, . . . , ẑ_{j-1}] R_{1:j-1, j}, so that each column can be obtained in sequence by further unraveling the Kronecker product as follows. Let us reshape each ẑ_j so that ẑ_j = vec(Ẑ_j). By using (2.1) in (2.2) for j = 1, and equivalently in (2.5) for j = 2, . . . , n, and multiplying both sides by M_1 (from the right) and by M^T (from the left), the result follows.
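The column-by-column substitution driven by the triangular factor R mirrors the classical Bartels-Stewart mechanism for matrix Sylvester equations. As a simpler two-dimensional illustration (our own sketch, not the tensor algorithm itself), the equation A X + X B^T = C can be solved by triangularizing one coefficient and back-substituting one column at a time:

```python
import numpy as np
from scipy.linalg import schur, solve

def sylvester_bs(A, B, C):
    """Solve A X + X B^T = C by reducing B^T to complex Schur form and
    computing the transformed columns in sequence (Bartels-Stewart style)."""
    R, Q = schur(B.T, output="complex")    # B^T = Q R Q^*
    F = C @ Q                              # transformed right-hand side
    m, n = C.shape
    Y = np.zeros((m, n), dtype=complex)
    for j in range(n):
        # Columns already computed enter through the strictly upper part of R.
        rhs = F[:, j] - Y[:, :j] @ R[:j, j]
        Y[:, j] = solve(A + R[j, j] * np.eye(m), rhs)
    return (Y @ Q.conj().T).real           # undo the unitary transformation

rng = np.random.default_rng(1)
n = 6
A = rng.standard_normal((n, n)) + n * np.eye(n)  # shifts avoid shared eigenvalues
B = rng.standard_normal((n, n)) + n * np.eye(n)
Xref = rng.standard_normal((n, n))
C = A @ Xref + Xref @ B.T
assert np.allclose(sylvester_bs(A, B, C), Xref)
```

In the tensor setting of the proof, each column solve is itself a matrix equation rather than a linear system, but the triangular recursion is the same.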
We notice that the use of the mode-1 unfolding is related to the specific location of the repeated matrices H, M. Different unfoldings could be used if these matrices occupied different positions. We also remark that the same procedure could be applied if the data were complex, without any particular change in the proof or in the algorithm below, except that one should keep in mind that property (2.1) uses real transposition even when the data are complex.

The new algorithm
The proof of Theorem 2.1 is constructive, as it provides an explicit way to generate the tensor solution, one slice at a time. The complete procedure is described in the algorithm below, hereafter called the Three-Term-Tensor Sylvester (T^3-Sylv) method. Algorithm T^3-Sylv.
Input: the coefficient matrices A_i, i = 1, 2, 3, H, H_3, M, M_1, and the right-hand side of (1.1). Output: X = tensor^(1)(x, n, n, n), the solution to (1.1). In practice, using appropriate transformations, the method is a nested Sylvester solver, which treats one slice at a time and updates the corresponding coefficient matrix and right-hand side F. The solvability of the Sylvester equations is related to that of the original problem, and in particular to the nonsingularity of A.
The algorithm relies on the initial Schur decomposition, which provides robust unitary transformations. Moreover, for each slice, a matrix Sylvester equation needs to be solved, whose solution also involves the Schur decompositions of the coefficient matrices, see, e.g., [26]; its sensitivity has been analyzed in [8, sec. 4.1]. Stability of the overall algorithm is also affected by the presence of several inverses, which can be harmful for ill-conditioned matrices. Indeed, if some of the involved matrices are severely ill-conditioned, the solution may lose accuracy. This behavior was observed experimentally; some results are reported in Example 4.3.
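Since each inner step of the method solves a generalized Sylvester equation, a brute-force Kronecker-form solver is useful as a correctness reference for small dimensions. The following sketch is ours and is only viable for tiny n; the actual algorithm never forms the Kronecker matrix:

```python
import numpy as np

def gen_sylvester_dense(A, B, C, D, F):
    """Reference solver for the generalized Sylvester equation
    A X B^T + C X D^T = F via its Kronecker form (small n only),
    using vec(A X B^T) = (B kron A) vec(X)."""
    n, m = F.shape
    K = np.kron(B, A) + np.kron(D, C)
    x = np.linalg.solve(K, F.reshape(-1, order="F"))
    return x.reshape(n, m, order="F")

# Sanity check on random data with a known solution.
rng = np.random.default_rng(2)
n = 5
A, B, C, D = (rng.standard_normal((n, n)) + n * np.eye(n) for _ in range(4))
Xref = rng.standard_normal((n, n))
F = A @ Xref @ B.T + C @ Xref @ D.T
assert np.allclose(gen_sylvester_dense(A, B, C, D, F), Xref)
```

A solver of this kind is handy for validating a structured implementation on small test cases before scaling up.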

Numerical experiments
In this section we report on some numerical experiments with the T 3 -Sylv method. All experiments were performed using Matlab [22].

Example 3.1
To test the efficiency of the method, we consider dense matrices with random entries (taken from a uniform distribution in the interval (0, 1); Matlab function rand) of increasing size n. Random entries are likewise used for the vectors b_1, b_2, b_3. We stress that the Kronecker form of the problem would involve a dense matrix A of size n^3 × n^3, which could not even be stored. We readily observe that the method is able to solve a (random) structured dense problem of size n^3 = 16,777,216 in about 34 seconds on a standard laptop. The CPU times in Table 1 show that the computational cost of the method grows by a factor of roughly six to ten as the dimension n doubles. Going from n to 2n, however, the problem dimension in the full space grows from n^3 to 2^3 n^3 = 8 n^3. Hence, the actual cost appears to grow linearly with n^3. Since the data are dense, Gaussian elimination on A would instead require O((n^3)^3) = O(n^9) floating point operations.
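The storage argument can be made concrete with a quick back-of-the-envelope computation (sizes taken from the largest test above, at 8 bytes per double precision entry):

```python
# Storage comparison for n = 256, the largest test size reported above.
n = 256
tensor_entries = n**3          # unknowns actually stored by the method
kron_entries = (n**3)**2       # entries of the dense Kronecker-form matrix A
tensor_bytes = 8 * tensor_entries   # roughly 134 MB: fits in laptop memory
kron_bytes = 8 * kron_entries       # roughly 2.3 PB: cannot be stored
assert tensor_entries == 16_777_216
```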

The symmetric case
If all matrices are symmetric and positive definite, the solution is also symmetric and positive semidefinite, in the appropriate tensor representation; see, e.g., [10] for a general discussion. With these hypotheses, the derivation of the solution procedure simplifies accordingly, as shown in the following result. We stress that many problems can be brought to this setting. Consider for instance the differential equation −Δu = f on the unit cube with zero Dirichlet boundary conditions. By discretizing using linear finite elements in each direction (this may be seen as a linear finite element discretization using Q_1 brick elements), we obtain a tensor equation of the form (1.1) with M = tridiag(−1, 4, −1) ∈ R^{n×n} and A = tridiag(−1, 2, −1) ∈ R^{n×n} (the middle number corresponds to the diagonal entry of the two symmetric tridiagonal matrices). If f can be well approximated by a function that is separable in the spatial dimensions, then F will have the desired Kronecker form, see, e.g., [10,17] for a similar description.
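For the simplest such setting, the finite-difference analogue with A = tridiag(−1, 2, −1) and identity matrices elsewhere, the Kronecker form of the discrete three-dimensional Laplacian can be assembled directly and its symmetry and positive definiteness verified (a small NumPy sketch; tridiag is our helper):

```python
import numpy as np

def tridiag(a, b, c, n):
    """n x n tridiagonal matrix with constant sub-, main and super-diagonals."""
    return a * np.eye(n, k=-1) + b * np.eye(n) + c * np.eye(n, k=1)

# 3D finite-difference Laplacian on the unit cube in Kronecker form:
#   L3 = A (x) I (x) I + I (x) A (x) I + I (x) I (x) A.
n = 4
A = tridiag(-1.0, 2.0, -1.0, n)
I = np.eye(n)
L3 = (np.kron(np.kron(A, I), I)
      + np.kron(np.kron(I, A), I)
      + np.kron(np.kron(I, I), A))

assert L3.shape == (n**3, n**3)
assert np.allclose(L3, L3.T)                  # symmetric
assert np.all(np.linalg.eigvalsh(L3) > 0)     # positive definite
```

Symmetry and positive definiteness of each one-dimensional factor carry over to the full Kronecker-form operator, which is the structural property exploited by the symmetric variant below.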
Analogously, one could consider the equation L(u) = f(x, y, z), (x, y, z) ∈ Ω ⊂ R^3, for a second order operator L with separable coefficients. Finite difference discretization leads to the Kronecker form in (1.1), where M is a diagonal matrix containing the coefficient m(z_k) at the interior nodes z_k in the z-direction; similarly for the other coefficients. The matrices A_i, i = 1, 2, 3, contain the three-point stencil of the discretized second order derivative in each direction, together with the coefficient φ_i; see, e.g., [25].
We do not report numerical results with data stemming from these discretizations, as they would not significantly differ, for dense data, from those shown in our examples.
We now state the specialized result for symmetric and positive definite matrices; this leads to better stability properties of the algorithm (see Example 4.3). Using the mode-1 unfolding, the solution X to (1.1) is again given slice by slice, where for j = 1, . . . , n, Ẑ_j = Z_j L_1^{-1} and the matrix Z_j solves a generalized Sylvester equation.

Proof Consider the Cholesky factorizations of the given matrices. Using (2.1) for the unfolded tensor we obtain an equation in terms of H X^(1), and the argument then proceeds as in the proof of Theorem 2.1, with the Cholesky factors playing the role of the Schur decompositions.

In Example 4.3 the relative errors of the computed solutions are displayed as the conditioning κ of the data grows, with x_nonsym obtained by algorithm T^3-Sylv and x_sym by T^3-Sym-Sylv; here x_* is a reference solution, obtained by using the Matlab direct solver "\" on the Kronecker form of the problem, with a coefficient matrix of size 125 × 125. The figure also displays the quantities 10^{-15} κ^{3/2} and 10^{-15} κ^{5/2}, which appear to match the dependence of the error on the matrix conditioning for the two methods, respectively. Clearly, the possibility of using the symmetric solver significantly improves the accuracy of the obtained solution with respect to the problem condition number. The sensitivity of the method deserves a deeper analysis, which will be performed in future work.
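The way Cholesky factors exploit symmetry can be illustrated on a two-dimensional analogue. The following sketch (ours, not the paper's T^3-Sym-Sylv; it assumes A, M symmetric positive definite and F symmetric) solves A X M + M X A = F by a congruence with the Cholesky factor of M, which reduces the problem to one with a single symmetric coefficient:

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular, eigh

def sym_sylvester(A, M, F):
    """Solve A X M + M X A = F with A, M symmetric positive definite and
    F symmetric, via M = L L^T and the eigendecomposition of L^{-1} A L^{-T}."""
    L = cholesky(M, lower=True)
    # Congruence: At = L^{-1} A L^{-T} and Ft = L^{-1} F L^{-T}, both symmetric.
    At = solve_triangular(L, solve_triangular(L, A, lower=True).T, lower=True)
    Ft = solve_triangular(L, solve_triangular(L, F, lower=True).T, lower=True)
    w, Q = eigh(At)                      # At = Q diag(w) Q^T, w > 0
    G = Q.T @ Ft @ Q
    Y = Q @ (G / (w[:, None] + w[None, :])) @ Q.T   # At Y + Y At = Ft
    # Undo the congruence: X = L^{-T} Y L^{-1}.
    return solve_triangular(L.T, solve_triangular(L.T, Y, lower=False).T,
                            lower=False)

rng = np.random.default_rng(3)
n = 6
B0, B1 = rng.standard_normal((n, n)), rng.standard_normal((n, n))
A = B0 @ B0.T + n * np.eye(n)            # symmetric positive definite
M = B1 @ B1.T + n * np.eye(n)
Xref = rng.standard_normal((n, n)); Xref = Xref + Xref.T
F = A @ Xref @ M + M @ Xref @ A
assert np.allclose(sym_sylvester(A, M, F), Xref)
```

All transformations involved (Cholesky solves, orthogonal eigenvectors, positive eigenvalue sums) are well conditioned relative to the data, which is the source of the improved stability observed for the symmetric variant.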

Conclusions
We have proposed a new method for solving order-3 tensor linear equations. We derived a general approach relying on the Schur decomposition, and a specialized one that effectively exploits the symmetric positive definiteness of the coefficient matrices.
Although the considered class of tensor equations is restricted by the role and position of the two matrices H and M, our presentation shows that this setting is sufficiently general to represent a good variety of practical problems. On the other hand, the repeated presence of M and H forced us to use the same dimensions in all modes. We will try to relax this constraint in future work. We also remark that the algebraic problem could be formulated with these two repeated matrices located in other (different) positions in the tensor equation, so that a similar solution derivation could be devised.
Since the right-hand side has low numerical tensor rank, we expect X to have low tensor rank [10], [20, Th. 3.6]. Tensor-based truncation strategies could be employed to satisfactorily approximate the obtained solution. This would avoid storing the whole dense tensor if dimensions become large.
The proposed strategy allows us to solve, with an essentially direct method, structured problems of large dimensions. Nonetheless, if n is required to be significantly larger and the coefficient matrices are sparse, then an iterative variant of the proposed method could be considered. As an alternative, the new method can serve as a workhorse for solving the reduced equation in projection-type procedures for large and sparse third order tensor equations; see similar strategies in [26] for linear matrix equations.