Fine spectral estimates with applications to the optimally fast solution of large FDE linear systems

In the present article we consider a class of matrices arising in the numerical approximation of distributed-order fractional differential equations (FDEs). On the one hand they may look standard, since they are real, symmetric, and positive definite. On the other hand they pose specific difficulties which prevent the successful use of classical tools. In particular, the associated matrix-sequence is ill-conditioned with respect to the matrix size, and a single generating function does not exist: instead, we face the problem of dealing with a sequence of generating functions with an intricate expression. Nevertheless, we obtain a real interval containing the smallest eigenvalue, together with its asymptotic behavior. We observe that the new bounds improve those already present in the literature and give more accurate spectral information, which is in fact used in the design of fast numerical algorithms for the associated large linear systems approximating the given distributed-order FDEs. Very satisfactory numerical results are presented and critically discussed, while a section with conclusions and open problems ends the work.


Introduction
When considering the numerical approximation of fractional differential equations (FDEs), the resulting matrices are intrinsically dense, due to the nonlocal nature of the involved operators, even when the approximation technique is of local nature (Finite Differences, Finite Elements, Finite Volumes, Isogeometric Analysis, etc.); see [14,18,19,20,22,25,34,35,36,37] and references therein. The same situation is present in the context of distributed-order FDEs (see [21] and references therein). When employing an equispaced grid, the resulting structures have the advantage of being of Toeplitz type, so that the cost of a matrix-vector multiplication is almost linear, i.e., $O(n \log n)$ arithmetic operations, where $n$ is the matrix order and the constant hidden in the big $O$ is very moderate (refer to [8,23] and to the references reported there). These computational features, together with the usually large dimensions of the considered linear systems, lead to the search for suitable iterative solvers: typically the most successful iterative procedures belong to the class of preconditioned Krylov methods, to the algorithms of multigrid type, or to clever combinations of them [8,10,11,12,23,29,31].
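The almost-linear cost of the Toeplitz matrix-vector product mentioned above comes from embedding $T_n$ into a circulant matrix of order $2n$ and diagonalizing the latter with the FFT. A minimal NumPy sketch (the helper names are ours, not taken from the cited references):

```python
import numpy as np

def toeplitz_matvec(col, row, x):
    """O(n log n) product T x for the Toeplitz matrix with first column
    col = (t_0, t_1, ..., t_{n-1}) and first row row = (t_0, t_{-1}, ...):
    embed T into a 2n x 2n circulant and use the FFT."""
    n = len(x)
    # First column of the circulant embedding: [t_0..t_{n-1}, 0, t_{-(n-1)}..t_{-1}].
    c = np.concatenate([col, [0.0], row[:0:-1]])
    y = np.fft.ifft(np.fft.fft(c) * np.fft.fft(np.concatenate([x, np.zeros(n)])))
    return y[:n].real

# Sanity check against the dense product.
rng = np.random.default_rng(0)
n = 64
col = rng.standard_normal(n)
row = np.concatenate([col[:1], rng.standard_normal(n - 1)])
T = np.array([[col[i - j] if i >= j else row[j - i] for j in range(n)]
              for i in range(n)])
x = rng.standard_normal(n)
err = np.linalg.norm(toeplitz_matvec(col, row, x) - T @ x)
```

The $2n$-point FFTs make the product exact up to rounding, at a cost far below the $O(n^2)$ dense multiplication.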
A crucial piece of information for the appropriate design of such an efficient solver is precise knowledge of the asymptotic behavior of the minimal eigenvalue, which in the positive definite case also governs the asymptotic spectral conditioning: we emphasize that in our setting the considered matrices are real, symmetric, and positive definite.
In this work, starting from [5,21], we improve the bounds present in the literature. We take into account the techniques already developed in [4,6]: the additional difficulty in the present setting lies in the fact that the generating function is not fixed, since it depends on the matrix order $n$, which is a challenging novelty with respect to the classical works on the subject [4,6,27].
In technical terms, we consider the real-valued symbol $f_n(\theta) := \sum_{j=0}^{n-1} h^{jh}\,|\theta|^{2-jh}$ with $h := \frac{1}{n}$. In this note we obtain a real interval containing the smallest eigenvalue of the Toeplitz matrix $A_n := T_n(f_n)$, with $f_n$ being the generating function of $A_n$, for any fixed $n$.
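As a numerical illustration, $T_n(f_n)$ can be assembled by approximating the Fourier coefficients of $f_n$ with an FFT on a fine grid. The sketch below assumes the symbol expression exactly as transcribed above ($f_n(\theta) = \sum_{j=0}^{n-1} h^{jh}|\theta|^{2-jh}$, $h = 1/n$), so it should be checked against the paper's formula before being relied upon:

```python
import numpy as np

def f_n(theta, n):
    # Symbol as transcribed above: sum_{j=0}^{n-1} h^{jh} |theta|^{2-jh}, h = 1/n.
    h = 1.0 / n
    j = np.arange(n)
    return np.sum(h ** (j * h) * np.abs(theta)[:, None] ** (2 - j * h), axis=1)

def T_n(symbol, n, m=8192):
    """Assemble the n x n Toeplitz matrix of `symbol` using FFT-approximated
    Fourier coefficients (trapezoidal rule on m uniform points)."""
    theta = 2 * np.pi * np.arange(m) / m
    theta = np.where(theta > np.pi, theta - 2 * np.pi, theta)  # move to (-pi, pi]
    a = np.fft.fft(symbol(theta, n)).real / m   # real coefficients: even symbol
    k = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
    return a[k]

n = 32
A = T_n(f_n, n)
lam1 = np.linalg.eigvalsh(A)[0]   # smallest eigenvalue of A_n
```

Since $f_n$ is nonnegative and even, the assembled matrix is real, symmetric, and positive definite, and `lam1` gives a numerical value for $\lambda_1(A_n)$ at fixed $n$.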
Based on this spectral information, a few algorithmic proposals are also discussed, starting from those presented in [12,13,21]; they are of the type mentioned before: preconditioned conjugate gradient (PCG) algorithms, multigrid solvers (typically the V-cycle), and combinations of them, that is, a V-cycle where the smoothing iterations (one step) incorporate the proposed PCG choices.
The work is organized as follows. Section 2 contains the setting of the problem regarding the minimal eigenvalue of $A_n$, together with its study and solution. Section 3 is devoted to numerical experiments concerning the solution of large linear systems with coefficient matrix $A_n$, including some evidence of the spectral clustering at $1$ of the proposed preconditioned matrix-sequences.

The problem and its solution
Let $h := \frac{1}{n}$ and consider the function $f_n(\theta) := \sum_{j=0}^{n-1} h^{jh}\,|\theta|^{2-jh}$.
Note that $f_n$ coincides with the function $F_n$ used in [5]. With the notation just introduced, $f_n$ is the so-called generating function, or symbol, of $A_n$. To simplify the resulting expression, we consider an auxiliary function $g$ and infer the corresponding relation between $f_n$ and $g$. We now establish a connection with the simple-loop method [3,4]. Consider the Toeplitz matrix $T_n(a)$ with symbol $a \colon \mathbb{T} \to \mathbb{R}$, where $t = e^{\mathrm{i}\theta}$. For a symbol $b$, let $\lambda_j(T_n(b))$ and $\psi_j(T_n(b))$ be its $j$th eigenvalue and eigenfunction, respectively. Since $a$ is real-valued, its eigenvalues must be real and we arrange them in nondecreasing order, that is $\lambda_1(T_n(a)) \le \cdots \le \lambda_n(T_n(a))$. If $\psi_j(T_n(a))$ is the eigenfunction corresponding to $\lambda_j(T_n(a))$, then it is well known (see for example [4,7]) that one is led to eigenvalue equations in terms of $s := \frac{\pi}{n+1}$ and of a quantity $c_n$ which remains bounded as $n \to \infty$. The constant $c_n$ can be calculated from the normalization $\|\psi_1(T_n(a))\|_2 = 1$; see Figure 1.
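The individual-eigenvalue notation above can be made concrete on the model symbol $a(\theta) = 2 - 2\cos\theta$, for which $T_n(a)$ has the classical closed-form eigenpairs $\lambda_j = 2 - 2\cos(js)$ and $\psi_j(k) \propto \sin(jks)$ with $s = \pi/(n+1)$. A quick numerical check (this model case is only an illustration, not the paper's symbol $a$):

```python
import numpy as np

n = 50
s = np.pi / (n + 1)
# T_n(2 - 2cos(theta)): the tridiagonal discrete Laplacian.
T = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
lam = np.linalg.eigvalsh(T)                         # ascending order
lam_exact = 2 - 2 * np.cos(np.arange(1, n + 1) * s) # closed-form eigenvalues
err = np.abs(lam - lam_exact).max()
```

For general simple-loop symbols no such closed form exists, and only asymptotic expansions in $s$ are available, which is exactly what the cited method provides.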

Upper bound for λ 1 (A n )
Theorem 2.1. For every sufficiently large $n$ we have $n\lambda_1(A_n) \le k_1$, where the constant $k_1$ can be numerically approximated as $k_1 \approx 12.9301$.
Proof. Let $\langle\cdot,\cdot\rangle$ be the inner product of the Hilbert space $L^2(\mathbb{T})$. Note that the essential ranges of the symbols $n^2 a$ and $n f_n$ have a similar behavior on an interval of the type $\bigl(0, O(\frac{1}{n})\bigr)$. Hence we can use the well-known formula (2.3); here the notation $f \sim g$ means that the ratio of the two sides tends to $1$ as $n \to \infty$. Note that the last integral does not depend on $n$; its value can be found numerically, producing $c_n \approx 2.2214$, which agrees with Figure 1. For the second integral in (2.3) we recall an estimate which, combined with (2.5), produces (2.6). Finally, combining (2.3), (2.4), (2.5), and (2.6) we obtain the thesis.

Lower bound for λ 1 (A n )
In this part we implement the trick used in [6], which works as follows. Let $b \in C(\mathbb{T})$ and let $q_n$ be a symbol whose Fourier coefficients of order less than $n$ all vanish. Since $T_n(q_n)$ is the zero matrix, we clearly have $T_n(a) = T_n(a + q_n)$; thus, instead of working with $T_n(a)$, we can use $T_n(a + q_n)$, which, under a suitably selected symbol $q_n$, can be advantageous.
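The key point is that a symbol whose Fourier support lies entirely outside the band $|k| \le n-1$ generates the zero $n \times n$ Toeplitz matrix, so it can be added "for free". A small check with the illustrative choice $q_n(\theta) = \cos(n\theta)$ (our example, not the $q_n$ used in the proof below):

```python
import numpy as np

def toeplitz_from_symbol(b, n, m=2048):
    """n x n Toeplitz matrix with FFT-approximated Fourier coefficients of b."""
    theta = 2 * np.pi * np.arange(m) / m
    bhat = np.fft.fft(b(theta)).real / m
    k = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
    return bhat[k]

n = 16
# q_n(theta) = cos(n*theta) has Fourier support {-n, +n}, i.e. entirely
# outside the band |k| <= n-1 sampled by T_n, hence T_n(q_n) = 0.
Q = toeplitz_from_symbol(lambda t: np.cos(n * t), n)
```

Up to rounding, every entry of `Q` vanishes, confirming that $T_n(a)$ and $T_n(a + q_n)$ coincide.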
Theorem 2.2. For every sufficiently large $n$ we have $n\lambda_1(A_n) \ge k_2$, where the constant $k_2$ can be numerically approximated as $k_2 \approx 2.2945$.
Proof. In our case, instead of working with the symbol $n f_n$, we use a perturbed symbol involving $p(n|\theta|)$ and the function $r_n$ given in (2.1), where $p(\sigma) := \sum_{j=1}^{\infty} p_j \cos(\sigma j)$ and the $p_j$ are real constants. Letting $t = e^{\mathrm{i}\theta}$, we can rewrite the perturbed symbol so that it clearly shows the form (2.7). Additionally, the function $p$ is $2\pi$-periodic, even, and satisfies $\int_0^{\pi} p(\sigma)\, d\sigma = 0$. We thus deduce $T_n(p(n|\theta|)) = 0$, and hence the matrix is unchanged. Keeping in mind that, for any real-valued symbol $\beta$, the smallest eigenvalue of $T_n(\beta)$ must be greater than or equal to the infimum of $\beta$, we obtain the desired lower bound, where in the last line we used the change of variable $\sigma = n|\theta|$. Then we obtain the stated estimate for a sufficiently large constant $M$. In order to obtain a neat lower bound for $\lambda_1(A_n)$, we need to choose the coefficients $p_j$ in such a way that the infimum of $g + p$ is maximal. Since $p$ is $2\pi$-periodic and $g$ is strictly increasing with $g(0) = 0$ and $g(\infty) = \infty$, the infimum of $g + p$ must be attained in the interval $[0, \pi]$. Consider the integral defining $m$, which satisfies $k_2 \ge m$, and take $p$ accordingly,

A few selected numerical experiments
The current section is divided into two parts. In the first we discuss our theoretical results regarding the bounds on the minimal eigenvalue of $A_n$, as a function of the matrix order $n$, also in comparison with the bounds present in the literature. Regarding [21], based on [26], the only relevant observation is that the minimal eigenvalue of $T_n^{-1}(|\theta|^2)\, h^{\alpha}\, T_n(|\theta|^{2-\alpha})$ is well separated from zero for any choice of $\alpha \in (0,2)$, and this provides a qualitative indication that the minimal eigenvalue of $A_n/n$ converges to zero with a speed of $\frac{1}{n^2}$. Concerning [5], the latter claim is indeed proved formally, but the constants are not computed, while in the present note we improve the findings by determining quite precise lower and upper bounds (see Figure 2).
The second part is devoted to exploiting the spectral information for the use and design of specialized iterative algorithms when solving large linear systems having $A_n/n$ as coefficient matrix. In Table 1 we employ the standard conjugate gradient (CG) method, since the coefficient matrix is positive definite, but we do not expect optimality, due to the ill-conditioning of the related matrix-sequence. We then use the preconditioned conjugate gradient (PCG) method with the Strang and Frobenius-optimal preconditioners (see [8,9,23,28]), taking the preconditioners in the algebra of circulant matrices and in the $\tau$ algebra of sine transforms, which contains the Finite Difference discrete Laplacian $T_n(2-2\cos\theta)$. Also in this case it is nontrivial to obtain optimality, due to the ill-conditioning of the involved matrices, which grows to infinity as the matrix size tends to infinity.
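As an illustration of the PCG solvers just described, the sketch below implements PCG with a Strang-type circulant preconditioner applied via two FFTs per iteration. The discrete Laplacian stands in for $A_n$, and replacing the zero circulant eigenvalue by $1$ is a common rank-one correction (our choice here, not prescribed by the paper):

```python
import numpy as np

def strang_eigs(a, n):
    """Eigenvalues of the Strang circulant built from the symmetric Toeplitz
    coefficients a_0..a_{n-1}: copy the central diagonals periodically."""
    c = np.array([a[j] if j <= n // 2 else a[n - j] for j in range(n)])
    return np.fft.fft(c).real

def pcg(A, b, apply_prec, tol=1e-10, maxit=1000):
    """Standard preconditioned conjugate gradient for SPD A."""
    x = np.zeros_like(b)
    r = b.copy()
    z = apply_prec(r)
    p = z.copy()
    rz = r @ z
    for it in range(1, maxit + 1):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        z = apply_prec(r)
        rz, rz_old = r @ z, rz
        p = z + (rz / rz_old) * p
    return x, it

n = 256
a = np.zeros(n); a[0], a[1] = 2.0, -1.0            # coefficients of 2 - 2cos(theta)
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
lam = strang_eigs(a, n)
lam[np.abs(lam) < 1e-12] = 1.0                     # fix the singular circulant mode
prec = lambda v: np.fft.ifft(np.fft.fft(v) / lam).real
b = np.ones(n)
x, its = pcg(A, b, prec)
```

A $\tau$ (sine-transform) preconditioner is applied in exactly the same way, with the DST-I in place of the FFT.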
A few observations are in order: the considered matrix $A_n$ is real, symmetric, and Toeplitz, and the $\tau$ algebra is likewise inherently real, symmetric, and with Toeplitz generator [9]. The latter statement provides a qualitative explanation of the fact that the $\tau$ preconditioners perform substantially better than their analogs in the circulant algebra (see Table 1): the theoretical ground of the preceding qualitative remark lies in the notion of good algebras developed in [30].
Notice that the tridiagonal $\tau$ discrete Finite Difference Laplacian $\Delta_n$ is optimal: the related linear system solution is extremely cheap, both in a sequential model (via the Thomas algorithm) and in a parallel model of computation (via, e.g., classical multigrid methods [17]). We also notice the remarkable difference between the performance of the optimal $\tau$ preconditioner and that of the optimal circulant preconditioner. The reason is spectral (plus the good-algebra argument [30]).
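For completeness, a sketch of the $O(n)$ Thomas algorithm used to apply the $\Delta_n$ preconditioner; this is the standard textbook tridiagonal solver, written here for a generic system:

```python
import numpy as np

def thomas(lower, diag, upper, b):
    """Solve a tridiagonal system (sub-, main and super-diagonal) in O(n)."""
    n = len(diag)
    c = np.zeros(n)   # modified super-diagonal
    d = np.zeros(n)   # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = b[0] / diag[0]
    for i in range(1, n):                      # forward elimination
        denom = diag[i] - lower[i - 1] * c[i - 1]
        if i < n - 1:
            c[i] = upper[i] / denom
        d[i] = (b[i] - lower[i - 1] * d[i - 1]) / denom
    x = np.zeros(n)
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):             # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x

# Apply it to Delta_n = T_n(2 - 2cos(theta)).
n = 200
b = np.ones(n)
x = thomas(-np.ones(n - 1), 2 * np.ones(n), -np.ones(n - 1), b)
D = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
res = np.linalg.norm(D @ x - b)
```

Since $\Delta_n$ is symmetric positive definite and tridiagonal, no pivoting is needed and the preconditioner solve costs a few $n$ flops per PCG iteration.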
The minimal eigenvalue of the optimal circulant preconditioner is averaged, due to a Cesàro sum effect [9,28], and hence it behaves as $\frac{1}{n}$ (instead of $\frac{1}{n^2}$); this explains why the number of iterations grows as $\sqrt{n}$, in the light of the classical results on conjugate gradient methods [1]. On the other hand, the optimal $\tau$ preconditioner matches very well the extremal eigenvalues of $A_n$ (see, e.g., [9]).
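The Cesàro averaging effect can be seen directly on the discrete Laplacian, used here as a stand-in for $A_n$: the Frobenius-optimal circulant approximation of a Toeplitz matrix has $k$-th diagonal $\bigl((n-k)a_k + k\,a_{k-n}\bigr)/n$, and for $T_n(2-2\cos\theta)$ its smallest eigenvalue is exactly $2/n$, while $\lambda_1(T_n) = 2 - 2\cos\bigl(\pi/(n+1)\bigr) \approx \pi^2/n^2$:

```python
import numpy as np

def frobenius_circulant_eigs(a, n):
    """Eigenvalues of the Frobenius-optimal circulant approximation of the
    Toeplitz matrix with diagonals a(k), k = -(n-1)..n-1: average the
    diagonals, then diagonalize with one FFT."""
    c = np.array([((n - k) * a(k) + k * a(k - n)) / n for k in range(n)])
    return np.fft.fft(c).real

n = 512
a = lambda k: {0: 2.0, 1: -1.0, -1: -1.0}.get(k, 0.0)  # symbol 2 - 2cos(theta)
lam_circ = frobenius_circulant_eigs(a, n).min()        # averaged minimum: 2/n
lam_toep = 2 - 2 * np.cos(np.pi / (n + 1))             # true minimum ~ (pi/n)^2
```

The averaged circulant minimum overestimates the true one by a factor of order $n$, which is precisely the spectral mismatch behind the $\sqrt{n}$ growth of the circulant-PCG iteration count.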
Thus, as a conclusion, we deduce that the preconditioners $C_{S,n}$ (Strang-Circulant), $\tau_{N,n}$ (Natural-$\tau$), $\tau_{F,n}$ (Frobenius-Optimal $\tau$), and $\Delta_n$ (Discrete FD Laplacian) are all optimal, in the sense that the iteration count is bounded by constants independent of the matrix size $n$ and the cost per iteration is that of the Fast Fourier Transform, which amounts to $O(n \log n)$ arithmetic operations (for a formal notion of optimality see [2]). As the data indicate, the preconditioners $\tau_{N,n}$ and $\tau_{F,n}$ are the best, but $\Delta_n$ is also of interest, given its sparsity.
In Table 2-Table 4 the minimal and maximal eigenvalues of the considered preconditioned matrices are reported, to highlight the efficiency of the proposed preconditioners. The data regarding the preconditioners $\tau_{N,n}$ and $\tau_{F,n}$ certainly deserve further attention. The outlier analysis reported in Tables 5 and 6, respectively, seems to show a strong cluster at $1$. At the very least, a weak clustering is observed, since the percentage of outliers decreases as $n$ increases: we notice that the weak clustering can be deduced theoretically using the GLT theory [16], while the strong clustering is nontrivial, given the ill-conditioning of $A_n$.
The last choice is a multigrid method designed with the use of the available spectral information. In fact, the prolongation and restriction operators are those classically used when dealing with the standard discrete Laplacian $\Delta_n = T_n(2-2\cos\theta)$. Indeed, even if $A_n/n$ is dense and seemingly shares no similarities with $\Delta_n$, from a spectral point of view both matrix-sequences have minimal eigenvalue collapsing to zero as $\frac{1}{n^2}$, up to moderate constants.
More precisely, the transfer operators are designed, in a very standard way, as the product of the Toeplitz matrix generated by the symbol $2 + 2\cos\theta$ and a proper cutting matrix (see, e.g., [15,17]). A purely algebraic multigrid is considered, so that the matrices at the coarser levels are obtained by projection via the transfer operators, according to the Galerkin condition. When setting the smoothers, we tested several choices, considering a standard Gauss-Seidel iteration, or PCG with sine transform preconditioners according to the natural approach and the Frobenius-optimal choice: $\nu_{\mathrm{pre}}$ steps are applied for the presmoother and $\nu_{\mathrm{post}}$ steps for the postsmoother, respectively.
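The transfer construction just described can be sketched as follows; the discrete Laplacian again stands in for $A_n/n$, and the $1/4$ scaling of the full-weighting stencil is our normalization choice, not taken from the paper:

```python
import numpy as np

def restriction(n):
    """R = (1/4) K T_n(2 + 2cos theta): the cutting matrix K keeps the
    odd-indexed fine-grid points, producing the full-weighting stencil."""
    k = (n - 1) // 2
    K = np.zeros((k, n))
    K[np.arange(k), 2 * np.arange(k) + 1] = 1.0
    B = 2 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)  # T_n(2 + 2cos theta)
    return K @ B / 4.0

n = 31
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # fine-level stand-in
R = restriction(n)
Ac = R @ A @ R.T          # Galerkin coarse matrix (prolongation = R^T)
# Ac is again a scaled discrete Laplacian, so the construction can recurse.
m = Ac.shape[0]
L = 2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)
err = np.abs(8 * Ac - L).max()
```

The Galerkin coarse matrix reproduces the fine-level structure (here exactly $\frac{1}{8}$ of the coarse Laplacian), which is what makes the purely algebraic recursion across levels well defined.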
As can be seen in Table 7, the numerical results with the multigrid choice are fully satisfactory as well: the method is optimal [2], in the sense that the iteration count is bounded by a constant independent of $n$ and the cost per iteration is proportional to that of the matrix-vector product.
A last test deserves special mention: we set as presmoother the PCG with Finite Difference discrete Laplacian preconditioner with $\nu_{\mathrm{pre}} = 1$, and as postsmoother the PCG with Natural-$\tau$ or Frobenius-Optimal $\tau$ preconditioner with $\nu_{\mathrm{post}} = 1$, but only at the finest level. At all coarser levels the smoothers are simply chosen as one iteration of the standard Gauss-Seidel method, thus further reducing the computational cost. The number of required iterations is 3, as in case γ.
Though in the present unilevel setting the efficiency is comparable with that of the PCG methods in Table 1, multigrid could become the most promising choice in the multilevel one, due to the theoretical barriers studied in [24,32,33] regarding the non-optimality of PCG with matrix-algebra preconditioners in the multilevel context.

Conclusions
In the current note we have considered a type of matrix stemming from the numerical approximation of distributed-order FDEs (see [5,21] for example). The main contribution consists of precise bounds for the minimal eigenvalue of the involved matrices. In fact, the newly presented bounds improve those already present in the literature and give more accurate spectral information. The latter knowledge has been used in the design of fast numerical algorithms for the associated linear systems approximating the given distributed-order FDEs. An interesting challenge is to consider a $d$-dimensional version of the considered FDE (see [21]), in order to show how the presented techniques and numerical methods can be adapted and extended to the $d$-level setting with $d \ge 2$.

Figure 2: The normalized minimum eigenvalue $n\lambda_1(A_n)$ of $A_n$ for different values of $n$. The left image includes the lower and upper bounds given by Theorems 2.2 and 2.1, respectively, while the right image shows a standard data range.
which combined with (2.8) proves the theorem. Finally, combining Theorems 2.1 and 2.2 we obtain $2.2945 \approx k_2 \le n\lambda_1(A_n) \le k_1 \approx 12.9301$ for every sufficiently large $n$.
Remark 2.3. In this remark we give a specific account of a comparison of the obtained results with respect to those available in the literature, which amount so far to only two works [5,21]: our bounds are more precise, as emphasized in the next section.

Table 2: Minimal and maximal eigenvalues of the preconditioned matrices with preconditioner $C_{S,n}$ (Strang-Circulant).

Table 4: Minimal and maximal eigenvalues of the preconditioned matrices with preconditioner $\Delta_n$ (Discrete FD Laplacian).

Table 5: Number of left ($n^{l}_{\mathrm{out}}$) and right ($n^{r}_{\mathrm{out}}$) outliers (eigenvalues not belonging to $(1-\varepsilon, 1+\varepsilon)$ with $\varepsilon = 10^{-1}$ and $10^{-2}$) and their percentage with respect to the dimension, in the case of the Natural-$\tau$ preconditioner $\tau_{N,n}$.