An $\alpha$-robust and second-order accurate scheme for a subdiffusion equation

We investigate a second-order accurate time-stepping scheme for solving a time-fractional diffusion equation with a Caputo derivative of order~$\alpha \in (0,1)$. The basic idea of our scheme is based on local integration followed by linear interpolation. It reduces to the standard Crank--Nicolson scheme in the classical diffusion case, that is, as $\alpha\to 1$. Using a novel approach, we show that the proposed scheme is $\alpha$-robust and second-order accurate in the $L^2(L^2)$-norm, assuming a suitable time-graded mesh. For completeness, we use the Galerkin finite element method for the spatial discretization and discuss the error analysis under reasonable regularity assumptions on the given data. Some numerical results are presented at the end.


Introduction
We shall approximate the solution of the time-fractional diffusion equation ∂ α t u(x, t) + Au(x, t) = f (x, t) for (x, t) ∈ Ω × (0, T ], (1.1) subject to homogeneous Dirichlet boundary conditions, that is, u(x, t) = 0 on ∂Ω × (0, T ], with u(x, 0) = u 0 (x) at the initial time level t = 0.The spatial domain Ω ⊂ R d (with d = 1, 2, 3) is a convex polyhedron, 0 < α < 1, the time fractional Caputo derivative , where v ′ = ∂v/∂t and Γ denotes the gamma function.We use the notation Iv(t) for the standard time integral of v from 0 to t.In (1.1),A is an elliptic operator in the spatial variables, defined by Aw(x) = −∇ • (κ(x)∇w)(x).The diffusivity κ ∈ L ∞ (Ω) satisfies 0 < κ min ≤ κ on Ω, for some constant κ min .For the error analysis, we also require that κ ∈ W 1,∞ (Ω).The presence of the nonlocal time fractional (Caputo) derivative in (1.1) and the fact that the solution u suffers from a weak singularity near t = 0 have a direct impact on the accuracy, and consequently the convergence rates, of numerical methods.To overcome this difficulty, different approaches have been applied including corrections, graded meshes, and convolution quadrature [6,8,14,25,30,31].Indeed, the numerical solutions for model problems of the form (1.1), including a priori and a posteriori error analyses and fast algorithms, were studied by various authors over the past fifteen years using multiple approaches [1,3,5,7,9,10,12,13,18,28,32,33,34,36,37].For more references and details, see the recent monograph by Jin and Zhou [11].
In this work, we investigate rigorously the error from approximating the solution of the initialboundary value problem (1.1) using a uniform second-order accurate time-stepping method.The latter is defined via a local time-integration of problem (1.1) on each subinterval of the time mesh combined with continuous piecewise linear interpolation.The proposed scheme is identical to the piecewise-linear case of a discontinuous Petrov-Galerkin method proposed in [22].Therein, with τ being the maximum time mesh step size, a suboptimal convergence rate of order O(τ (3−α)/2 ) was proved.A time-graded mesh (2.1) was employed to compensate for the singular behaviour of the continuous solution at t = 0.In the limiting case as α → 1, the problem (1.1) reduces to the classical diffusion equation, and the considered numerical scheme reduces to the classical Crank-Nicolson method.In this case, O(τ (3−α)/2 ) reduces to O(τ ) which is far from the optimal O(τ 2 ) rate achieved in practice.
By using an innovative approach that relies on interesting implicit polynomial interpolations and duality arguments, we show O(τ 2 ) convergence, whilst at the same time relaxing the imposed regularity assumptions from the earlier analysis [22].This convergence rate is α-robust in the sense that the constant in the error bound remains bounded as α → 1. Implementation wise, although the proposed scheme is uniformly second-order accurate, the computational cost is comparable to the well-known backward Euler or L1 [16,25,29] methods, which are not even first-order accurate.
For completeness, we discretize the problem (1.1) over the spatial domain Ω using the standard Galerkin finite element method (FEM), thereby defining a fully discrete approximation to u.An additional error of order O(h 2 ) is anticipated under certain regularity assumptions on the continuous solution, where h is the maximum spatial finite element mesh size.This is proved via a concise approach that relies on the discrete version of the earlier error analysis.To make this feasible, the solution of the semidiscrete Galerkin finite element solution of problem (1.1) plays the role of the comparison function.
Outline of the paper.In the next section, we define our time-stepping scheme, introduce some notations and technical lemmas, and summarize the convergence results in Theorem 2.1.The required regularity properties are also highlighted.Section 3 proves some error bounds for the implicit piecewise-linear interpolant u defined in (2.6).Section 4 is devoted to showing the second order of accuracy of the proposed time-stepping scheme via a duality argument.In Section 5, we discretize in space via the Galerkin finite element method and discuss the convergence of the fully discrete solution.To support our theoretical findings, we present some numerical results in Section 6.Finally, a short technical appendix derives an α-robust interpolation estimate.

Time-stepping scheme
This section is devoted to discretizing the model problem (1.1) over the time interval [0, T ] through a second-order accurate method, and stating our main convergence results.We begin by introducing some notations.
For ℓ ≥ 0, the norm on H ℓ (Ω) is denoted by • ℓ .The Sobolev spaces H ℓ (Ω) and H 1 0 (Ω) are defined as usual, and the norm • Ḣr (Ω) in the (fractional-order) Sobolev space Ḣr (Ω) is defined in the usual way via the Dirichlet eigenfunctions of the self-adjoint elliptic operator A on Ω.The inner product in L 2 (Ω) is denoted by •, • , and the associated norm by • .The generic constant C remains bounded for 0 < α ≤ 1, and is independent of the time mesh and the finite element mesh, but may depend on Ω, T , and other quantities, including κ, u 0 and f .Define the time mesh 0 = t 0 < t 1 < t 2 < . . .< t N = T by and let τ n = t n − t n−1 .Such a time-graded mesh has the properties [19] t n ≤ 2 γ t n−1 and γτ t Integrating problem (1.1) over I n := (t n−1 , t n ) and then dividing by τ n yields where fn = τ −1 n In f (t) dt denotes the average value of a function f over the time interval I n .Motivated by (2.3), for t ∈ I n and for 1 ≤ n ≤ N, our semidiscrete approximate solution U (t) ≈ u(t is defined by requiring that where implying that our scheme reduces to the Crank-Nicolson scheme for the classical diffusion equation.
Our convergence analysis relies on decomposing the error as where u is a continuous piecewise-linear function in time satisfying Alternatively, u can be defined via I u(t n ) = Iu(t n ) for 1 ≤ n ≤ N , with u(0) = u 0 , and we say that u interpolates u implicitly.The decomposition (2.5) of the error η follows a well-known pattern, but the novel choice of the piecewise linear function u makes possible our improved error analysis under reasonable regularity assumption.The continuous average of u equals both the continuous and the discrete average of u on each time subinterval I n .For comparison, let u I denote the usual continuous piecewise-linear interpolant to u, that is, and observe that u I and u have the same discrete average 1 2 (u(t n ) + u(t n−1 )) on each I n , but their continuous averages will differ unless u is linear on I n .Subtracting (2.4) from (2.3) and using (2.6), we obtain Taking the L 2 (Ω)-inner product with a test function ϕ ∈ H 1 0 (Ω), and applying the divergence theorem, it follows that Choosing ϕ = θ ′ and summing over n yields To proceed in our analysis, we make use of the following technical lemma.For the proof, we refer to Mustapha and Schötzau [23,Lemma 3.1 (iii)].
Lemma 2.1 For 0 < α ≤ 1 and ǫ > 0, For later use, by expanding I 1−α (v + w), v + w) then applying Lemma 2.1 with ǫ = 1/(2α) we deduce the inequality in the next lemma.Lemma 2.2 For 0 < α ≤ 1, We now apply Lemma 2.1 to the right-hand side of (2.9) with ǫ = 1/(2α 2 ).Multiplying through by 2, and then cancelling the similar terms, leads to the estimate below that will be used later in our convergence analysis.
Under reasonable regularity assumptions, a novel error analysis involving implicit interpolations and a duality argument leads to the convergence results in the next theorem.With J = (0, T ), an optimal O(τ 2 )-rate of convergence is achieved in the L 2 (J; L 2 (Ω))-norm.Our numerical results illustrate this in the stronger L ∞ (J; L 2 (Ω))-norm.Moreover, our numerical results suggest that the condition on the graded mesh exponent can be further relaxed.More precisely, instead of γ > max{2/σ, (3 − α)/(2σ − α)} it suffices to impose γ > 2/σ.

Errors from implicit interpolations
In preparation for our convergence analysis, we now study the error from approximating u by u, and proceed to estimate ψ and I( I 1−α ψ ′ , ψ ′ ).These estimates assume that the regularity property (2.11) holds.For ease of reference, we here introduce the parameter δ = σ − 2 γ which will subsequently appear repeatedly.We start this section with the following representation of the implicit interpolation error in the approximation u ≈ u at t = t n .
Proof.Since In u I dt = 1 2 τ n (u n + u n−1 ), we find using (2.6) that The formula for ψ n then follows by induction on n, after noting that ψ 0 = 0. Recalling the Peano kernel for the trapezoidal rule, we see that implying the second expression for ∆ j .✷ Lemma 3.2 For n ≥ 1, Proof.For t ∈ I j , we have the identity Multiply both sides by (t j − t)(t − t j−1 ) and integrate to obtain where Thus, by Lemma 3.1, Shifting the summation index, so and the bound for ψ n follows after canceling the common terms.The interpolant ψ I , defined as in (2.7), satisfies ψ I = u I − u, leading to the representation which implies the bound for ψ because ψ I ≤ max ψ n , ψ n−1 on I n .✷ Corollary 3.1 Under the regularity assumption in (2.11) and for a time mesh of the form (2.1) with grading parameter γ ≥ 1 we have, for n ≥ 1, Proof.First we show that for γ ≥ 1, u − u I In ≤ Cτ min(γσ,2) t max(0,δ) n . (3.2) Since the interpolation error u − u I vanishes if u is a polynomial of degree 1, by computing the Peano kernel one finds that for t ∈ I n , The right-side is bounded by and so, by using the time mesh properties in (2.2), we get , the proof of (3.2) is completed after noting that Turning to the estimate for ψ n , Lemma 3.2 and (2.11) imply that ≤ Cτ γσ , and we again bound τ 2 n t σ−2 n using (3.4).For the sum over j, and the estimate for ψ n follows.✷ The next target is to estimate Preceding this, we need to bound ψ ′ in the next lemma.
Proof.Differentiating (3.1) and (3.3), we see for t ∈ I n that Thus, if t ∈ I 1 then, by (2.11), Corollary 3.1, and the fact that ψ 0 = 0, If δ > 0, n ≥ 2 and t ∈ I n then, recalling (3.4) and using again (2.11), Proof.For n = 1, the Cauchy-Schwarz inequality, Lemma 3.3, and the assumption σ > α/2 give To deal with the case n ≥ 2, we make the splitting for t ∈ I j and j ≥ 2. Using Lemma 3.3 and (2.2), we observe that For estimating T 1 (t), integrate by parts recalling ψ 0 = 0, where We apply Corollary 3.1 to conclude that Lemma 3.3 and above estimates for T 1 and T 2 yield By using this and (3.5), and noting that γ(2σ − α) > 3 − α, we reach and therefore the desired bound holds.✷

Errors from the time discretizations
This section is devoted to estimating the error η = u − U from the time discretization in the norm of L 2 (J; L 2 (Ω)).To achieve an optimal convergence rate, we employ a duality argument in addition to the usage of the time graded meshes.By reversing the order of integration, we find that and integrating by parts yield and since We remark that Zhang et al. [35,Equation (89)] have recently exploited this dual operator w → −(J 1−α T w) ′ in the error analysis of a discontinuous Galerkin scheme for (1.1).Suppose that ϕ satisfies the final-value problem subject to homogeneous Dirichlet boundary conditions.Let y(t) = ϕ(0) + t 0 ϕ(s) ds so that y solves the initial-value problem and with y I denotes the continuous piecewise-linear function that interpolates y at the time levels t j , put Y = y − y I .Proof.Using (4.1), (4.3), η(0) = 0 and (4.4), At the same time, (2.3) and (2.4) imply that the inequality follows at once.✷ We will show below in Theorem 4.2 that the interpolation error Y satisfies Assuming this fact for now, we can derive an estimate for η in terms of ψ ′ .For convenience, we use the notations: Proof.It suffices to estimate the right-hand side of the inequality in Lemma 4.1.By Lemma 2.1, After using (2.10) and η ′ = ψ ′ − θ ′ , we conclude that Since Y (t j ) = 0 for 0 ≤ j ≤ N , integrating by parts, using AY = IAY ′ = I 1−α (I α AY ′ ), and applying Lemma 2.1, The same estimate holds with θ replaced by ψ, so because η = ψ − θ, and using (2.10) again, Since (G(Y )) 2 + (F (AY )) 2 is just the left-hand side of (4.6), the desired inequality followed after cancelling the common factor η 2 L 2 (J) .✷ It remains to prove (4.6).We start by showing preliminary bounds for I( Y ′ , I 1−α Y ′ )(T ) and I( AY, I 1−α AY ′ )(T ) in the next two lemmas.
Proof.For t ∈ I j with j ≥ 2, we write where Applying the Cauchy-Schwarz inequality, integrating, changing the order of integration, and integrating again, yields ds and S 2 (t) = 0. Following the steps above, we find that S 1 2 , and consequently Turning to the second term S 2 (t), we integrate by parts (noting that Y j−2 = 0 = Y 0 ) and obtain and, remembering that ω −α (t) = −α t −1 ω 1−α (t), , and thus, and in the second term, noting that 1/Γ(1 By interchanging the order of the double sum, which, combined with (4.9), yields the desired estimate.✷ Lemma 4.3 Proof.Integrating by parts, where we used the splitting I α AY = S 3 + S 4 with For the estimate involving S 4 , we reverse the order of integration and then integrate by parts, to obtain and thus, applying the Cauchy-Schwarz inequality and using AY I i AY I j ≤ AY 2 The proof is concluded by inserting this and (4.11) into the splitting (4.10).✷ By using the achieved estimates in the previous two lemmas, we are now able in the next theorem to provide the missing part in the proof of Theorem 4.1.Proof.Recall that Y = y − y I where y I is the piecewise linear polynomial that interpolates y at the time nodes, and y ′ = ϕ.Thus, if t ∈ I j then so Y (t) ≤ I j ϕ ds.Similarly, replacing u with y in (3.3), we have Consider the linear operator B defined by (Bϕ)(t) = τ −α/2 j Y I j for t ∈ I j and 1 ≤ j ≤ N .The estimates (4.13) give and, since ϕ(T ) = 0 and (τ 1−α ) 1−α (τ 3−α ) α = τ 1+α , we may apply Corollary 7.1 from the Appendix to deduce that Furthermore, by differentiating (4.12) and (3.3), After summing over j and once again applying Corollary 7.1, we arrive at Now take the inner product of (4.3) with −(J 1−α T ϕ) ′ in L 2 (Ω), and then integrate in time to obtain .
and therefore (J Combining this with (4.14), (4.16), we conclude that and so, applying Lemma 4.2, By taking the inner product of (4.3) with Aϕ and proceeding as above, we deduce that Repeating the arguments leading to the first estimate in (4.15) but with Y ′ replaced by AY ′ , we see that Likewise, AY 2 I j ≤ τ j Aϕ 2 L 2 (I j ) by the arguments leading to (4.12), so Hence, by Lemma 4.3 and (4.19), Together, (4.18) and (4.20) imply the desired estimate (4.6).✷

A fully discrete scheme and error analysis
In this section, we discretize the time-stepping scheme (2.4) in space using the continuous piecewise-linear Galerkin FEM and hence define a fully-discrete method.Thus, we introduce a family of regular (conforming) triangulations T h of the domain Ω indexed by h = max K∈T h (h K ), where h K denotes the diameter of the element K. Let V h denote the space of continuous, piecewise-linear functions with respect to T h that vanish on ∂Ω.Let W(V h ) ⊂ C([0, T ]; V h ) denote the space of linear polynomials on I n for 1 ≤ n ≤ N , with coefficients in V h .Motivated by the weak formulation of (2.4), our fully-discrete solution and for 1 ≤ n ≤ N , where ).For the discrete initial data, we choose In the next theorem, we prove that the numerical solution defined by (5.1) is second order accurate in both time and space, provided that the time mesh exponent γ is chosen appropriately.In comparison, under heavier regularity assumptions and stronger graded meshes, convergence of order h 2 + τ (3−α)/2 was proved by Mustapha et al. [22].Furthermore, the proof therein is more technical and lengthy.Use of the piecewise-linear polynomial function u, see (2.6), and a duality argument allowed us to improve the convergence rate, simplify the proof and also relax the regularity assumptions.In addition to the regularity assumption in (2.11), for t > 0, we impose u ′ (t) 2 + t u ′′ (t) 2 ≤ C t υ−1 , for some υ > 0. (5.2) Theorem 5.1 Let u be the solution of (1.1) and let U n h be the approximate solution defined by (5.1).Assume that the regularity assumptions in (2.11) and (5.2) are satisfied for σ, υ > α/2, and choose the mesh grading exponent γ > max{2/σ, 1/υ, (3 − α)/(2σ − α)}.Then, Proof.Decompose the error as u − U h = (u − u h ) + (u h − U h ), where u h is the Galerkin finite element solution of (1.1) defined by for each fixed t > 0, with u h (0) = U 0 h = R h u 0 .From this, the weak formulation of (1.1), and the orthogonality property of the Ritz projection, we have From (5.1) and (5.3), and with χ h = u h − R h u, we have Using the orthogonality property of the Ritz projection and the definition of u, in addition to (5.3),

Numerical results
In this section, we illustrate numerically the theoretical finding in Theorem 2.1.An O(h 2 ) convergence of the finite element solution was confirmed for various choices of the given data [8,10,24].In time, some numerical convergence results (piecewise linear discontinuous Petrov-Galerkin method) were also delivered [22].However, we illustrate the errors and convergence rates in the stronger L ∞ ((0, T ), L 2 (Ω))-norm on more realistic examples.We choose κ = 1, Ω = (0, 1) and a uniform spatial grid T h .In both examples, we choose h so that the error from the time discretization dominates.
In all tables and figures, we evaluated the series solution by truncating the infinite series after 60 terms.The empirical convergence rate CR is calculated by halving τ , that is, CR = log 2 (E τ /E τ /2 ). Figure 1 plots the nodal errors U n h − u(t n ) against t n ∈ [0, 1] for different values of N in the cases γ = 1 and γ = 4.The practical benefit of the mesh grading is evident.Since u 0 ∈ Ḣ1.5 − (Ω), the regularity property (2.11) is satisfied for σ = 3 4 α.As in Example 1, the numerical results in Table 2 exhibit convergence of order τ σγ for 1 ≤ γ ≤ 2/σ in the stronger • J -norm.For a graphical illustration of the impact of the graded mesh on the pointwise error, we fixed N = 80 in Figure 2  We define an unbounded operator D 0 on X 0 with domain X 1 , by setting D 0 u = u ′ on the interval (0, T ).Let ℜz denote the real part of a complex number z.Since ℜ u(t), D 0 u(t) Y =

Figure 1 :
Figure 1: Errors for Example 1 as functions of t for different choices of N when α = 0.5, taking γ = 1 in the left figure and γ = 4 in the right figure.

Figure 2 :
Figure 2: Error at t n for 1 ≤ n ≤ N in Example 2, for a fixed N = 80 and different choices of the mesh exponent γ with α = 0.7.

Table 1 :
Errors and empirical convergence rates for Example 1 with α = 0.5, using different choices of the time mesh-grading exponent γ.