## 1 Introduction

In this paper, we consider numerical methods for solving the following time fractional partial differential equation

\begin{aligned}&_0^C D_{t}^{\alpha } u (x, t) - \varDelta u (x, t) = f(x, t), \; x \in \varOmega , \; t \in (0, T], \end{aligned}
(1)
\begin{aligned}&u(x, 0) = u_{0}, \quad x \in \varOmega , \end{aligned}
(2)
\begin{aligned}&u(x, t) = q(x, t), \quad x \in \partial \varOmega , \; t \in (0, T], \end{aligned}
(3)

where $$\varOmega \subset \mathbb {R}^{d}, d=1, 2, 3$$ is a convex polygonal/polyhedral domain and $$\varDelta$$ denotes the Laplacian. Here f and q are two given functions and $$_0^C D_{t}^{\alpha } x(t), 0< \alpha <1$$ denotes the Caputo fractional order derivative. Many application problems can be modelled by (1)–(3), for example, thermal diffusion in media with fractal geometry [28], highly heterogeneous aquifers [1], underground environmental problems [10], random walks [9, 25], etc.

To solve (1)–(3) numerically, one needs to approximate the time fractional order derivative. There are three predominant approximations in the literature: the finite difference method (L1 scheme) [13, 20], the Grünwald–Letnikov method [35, 36], and Diethelm’s method [4, 7]. The L1 scheme is obtained by approximating the first order derivative in the definition of the fractional derivative with finite difference quotients. The Grünwald–Letnikov method is based on convolution quadrature, and Diethelm’s method is based on the approximation of the Hadamard finite-part integral.

Langlands and Henry [13] considered the L1 scheme for the Riemann–Liouville derivative and proved that the convergence order is $$O(\tau ^{2- \alpha })$$ if $$u \in C^{2}[0, T]$$. Lin and Xu [20] studied the L1 scheme for the Caputo fractional derivative and proved that the convergence order is also $$O(\tau ^{2- \alpha })$$ if $$u \in C^{2}[0, T]$$, see also [17, 21, 31]. Li and Ding [14] obtained a finite difference method with order $$O(\tau ^{2})$$ if $$\, _{-\infty }^{R} D_{t}^{3- \alpha } u \in L^{1}(0, T)$$, see also [12, 24, 26, 27, 29, 30], etc.

Yuste and Acedo [35] considered a Grünwald–Letnikov discretization of the Riemann–Liouville derivative and provided a von Neumann type stability analysis. Zeng et al. [36] introduced two fully discrete schemes with convergence order $$O(\tau ^{2- \alpha })$$ if $$u \in C^{2}[0, T]$$ by using fractional linear multistep method in time to approximate the convolution integral, see also [2, 11, 18, 22, 34], etc.

Diethelm [4] introduced a finite difference scheme to approximate the Riemann–Liouville fractional derivative by using the Hadamard finite-part integral and showed that the truncation error is $$O(\tau ^{2- \alpha })$$ if $$u \in C^{2}[0, T]$$. The scheme in [4] is obtained by approximating the Hadamard finite-part integral with linear interpolation polynomials. This scheme is actually equivalent to the L1 scheme [13, 20] since the weights are the same. Ford et al. [6] applied Diethelm’s method to a time fractional partial differential equation and proved that the convergence order is $$O(\tau ^{2- \alpha })$$ if $$u \in C^{2}[0, T]$$. Higher order Diethelm schemes are also available in the literature, see [5, 7, 19, 33], etc.

Recently, Gao et al. [8] obtained a high order numerical differentiation formula with $$O(\tau ^{3- \alpha }), 0<\alpha <1$$ for the Caputo fractional derivative by discretizing the fractional derivative directly and applied this formula to a time fractional diffusion equation. However, no error estimates are given in [8]. Li et al. [15] also introduced a high order $$O(\tau ^{3- \alpha }), 0<\alpha <1$$ numerical method to approximate the Caputo fractional derivative and applied this method to time fractional advection–diffusion equations, see also [3, 16]. The error estimates and stability analysis in [15], however, are provided only for $$\alpha \in (0, \alpha _{0})$$ with some positive $$\alpha _{0} \in (0, 1)$$. More recently, Lv and Xu [23] proposed a higher order numerical method, slightly different from the method in Gao et al. [8], for solving the time fractional diffusion equation and proved that the scheme has the convergence order $$O(\tau ^{3- \alpha })$$ for all $$\alpha \in (0, 1)$$.

Yan et al. [33] introduced a numerical method for solving linear fractional differential equation with convergence order $$O(\tau ^{3- \alpha }), 0< \alpha <1$$ by approximating the Hadamard finite-part integral with the quadratic interpolation polynomials following Diethelm’s idea in [4]. They obtained an asymptotic expansion of the error, but there are no error estimates proved in [33]. Recently, Li et al. [19] gave the detailed and thorough error estimates for the numerical method in [33] for solving the linear fractional differential equation.

In this paper, we will consider the numerical method for solving time fractional partial differential equation. The time discretization is based on the numerical method in [33] and the space discretization is based on the standard finite element method. The error estimates with convergence order $$O( \tau ^{3- \alpha } + h^2)$$ are proved in detail by using the argument developed in [23] (see also [19]). The assumption of the regularity for the exact solution in our paper is $$u \in C^{3}([0, T], H^{2}(\varOmega )) = C^{3}(H^2)$$ which is much weaker than the assumption for the solution u in Li et al. [15] where $$u \in C^{5}(H^{2})$$. In [15], the exact solution u needs to satisfy $$\frac{\partial u(x, 0)}{\partial t} = \frac{\partial ^2 u(x, 0)}{\partial t^2} =0$$ in order to obtain the required convergence order $$O(\tau ^{3- \alpha }), 0< \alpha <1$$. Our numerical method has no such requirements for the solution u.

The paper is organized as follows. In Sect. 2, we consider the time discretization of the time fractional partial differential equations and prove that the numerical method has the convergence order $$O(\tau ^{3- \alpha })$$ for all $$0< \alpha <1$$. In Sect. 3, we consider the error estimates for solving time fractional partial differential equation in the fully discrete case where the spatial variables are discretized by using standard Galerkin finite element method. Finally in Sect. 4, we give some numerical examples in both one-dimensional and two-dimensional cases.

By C we denote a positive constant independent of the functions and parameters concerned, but not necessarily the same at different occurrences. By $$c_{0}, c_{1}, c_{2}$$ we denote some particular positive constants independent of the functions and parameters concerned.

## 2 Time Discretization

In this section, we will consider the time discretization of (1)–(3). Recall that the Riemann–Liouville fractional derivative is defined by, with $$0< \alpha <1$$,

\begin{aligned} _0^R D_{t}^{\alpha } x(t)= \frac{1}{\varGamma (1-\alpha )} \frac{d}{dt} \int _{0}^{t} (t- \tau )^{-\alpha } x(\tau ) \, d \tau . \end{aligned}
(4)

Let $$\mathbb {N}$$ denote the set of all natural numbers. For $$p\notin \mathbb {N}$$ with $$p >1$$, the Hadamard finite-part integral on a general interval [a, b] is defined in [4] as follows:

\begin{aligned}&\oint ^b_a (x-a)^{-p} f(x) dx \nonumber \\&\qquad :=\sum ^{\lfloor {p}\rfloor -1}_{k=0}\frac{f^{(k)}(a)(b-a)^{k+1-p}}{(k+1-p) k!}+\int ^b_a(x-a)^{-p}R_{{\lfloor {p}\rfloor }-1} (x,a)dx, \end{aligned}
(5)

where

\begin{aligned} R_{\mu }(x,a):=\frac{1}{\mu !}\int ^x_a(x-y)^{\mu }f^{(\mu +1)}(y)dy \end{aligned}
(6)

and $$\oint$$ denotes the Hadamard finite-part integral. $$\lfloor {p}\rfloor$$ denotes the largest integer not exceeding p, where $$p\not \in \mathbb {N}$$.

It is easy to show that the Riemann–Liouville fractional derivative in (4) can be written as, [4]

\begin{aligned} _0^R D_{t}^{\alpha } x(t)= \frac{1}{\varGamma (-\alpha )} \oint _{0}^{t} (t- \tau )^{-1-\alpha } x(\tau ) \, d \tau . \end{aligned}
(7)
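As a quick sanity check of (5) and (7), the following minimal Python sketch evaluates the finite-part integral for a polynomial integrand. It assumes $$1< p <2$$ and that the integrand is written as a polynomial in the distance s from the singular endpoint, in which case (5) reduces to the termwise rule $$\oint _{0}^{b} s^{-p} s^{m} \, ds = b^{m+1-p}/(m+1-p)$$. The function names are illustrative only and are not taken from [4].

```python
import math

def finite_part_poly(coeffs, b, p):
    """Hadamard finite-part integral of s**(-p) * sum(c_m * s**m) over [0, b].

    Assumes 1 < p < 2 and a polynomial integrand, so that definition (5)
    reduces to the termwise rule b**(m + 1 - p) / (m + 1 - p).
    """
    return sum(c * b ** (m + 1 - p) / (m + 1 - p) for m, c in enumerate(coeffs))

def rl_derivative_linear(t, alpha):
    """Riemann-Liouville derivative of x(t) = t evaluated through (7).

    With s = t - tau the integrand becomes s**(-1 - alpha) * (t - s).
    """
    return finite_part_poly([t, -1.0], t, 1.0 + alpha) / math.gamma(-alpha)

alpha, t = 0.4, 0.7
approx = rl_derivative_linear(t, alpha)
exact = t ** (1 - alpha) / math.gamma(2 - alpha)  # classical closed form
print(abs(approx - exact))
```

The comparison value $$t^{1-\alpha }/\varGamma (2-\alpha )$$ is the classical Riemann–Liouville derivative of $$x(t)=t$$; the two numbers should agree up to rounding.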

We will approximate the Hadamard finite-part integral by using a piecewise quadratic interpolation polynomial. Let $$n=2M$$, where M denotes a fixed positive integer. Let $$0 = t_{0}< t_{1}< t_{2}< \dots< t_{2j}< t_{2j+1}<\dots < t_{2M} =T$$ be a partition of [0, T] and $$\tau$$ the step size. For simplicity of notation, we assume that $$T=1$$ below. At the point $$t_{2j} = \frac{2j}{2M} T = \frac{2j}{2M}$$, Eq. (1) can be written as

\begin{aligned} _0^R D_{t}^{\alpha } [ u(x,t_{2j}) - u_{0}] - \varDelta u(x,t_{2j}) = f(x,t_{2j}), \quad j=1,2,\dots , M, \end{aligned}
(8)

and at the point $$t_{2j+1} = \frac{2j+1}{2M}$$, the equation can be written as

\begin{aligned} _0^R D_{t}^{\alpha } [ u(x,t_{2j+1}) - u_{0}] - \varDelta u(x,t_{2j+1}) = f(x,t_{2j+1}), \quad j=1,2,\dots , M-1. \end{aligned}
(9)

Let us first consider the discretization of (8). Note that

\begin{aligned} _0^R D_{t}^{\alpha } u(x,t_{2j})&= \frac{1}{\varGamma (-\alpha )} \oint _{0}^{t_{2j}} (t_{2j}- \tau )^{-1-\alpha } u(x,\tau ) \, d \tau \nonumber \\&= \frac{t_{2j}^{-\alpha }}{\varGamma (-\alpha )} \oint _{0}^{1} w^{-1-\alpha } u(x,t_{2j} - t_{2j} w) \, dw, \end{aligned}
(10)

where the integral denotes the Hadamard finite-part integral.

We will approximate the integral by a piecewise quadratic interpolation polynomial with the equispaced nodes $$0, \frac{1}{2j}, \frac{2}{2j}, \dots , \frac{2j}{2j}, \, j=1, 2, \dots , M$$. More precisely, for the sufficiently smooth function g(w), we have

\begin{aligned} \oint _{0}^{1} w^{-1-\alpha } g(w) \, dw = \oint _{0}^{1} w^{-1-\alpha } g_{2}(w) \, d w + E_{2j}(g), \end{aligned}
(11)

where $$g_{2}(w)$$ is the piecewise quadratic interpolation polynomial of g(w) defined on the nodes $$0< \frac{1}{2j}< \frac{2}{2j}< \dots < \frac{2j}{2j} =1, \, j=1, 2, \dots , M$$, and $$E_{2j} (g)$$ is the remainder term.

### Lemma 1

[19, Lemma 3.1] Let $$0< \alpha <1$$. Assume that $$g \in C^{3}[0, 1]$$. Then, with $$j=1, 2, \dots , M$$,

\begin{aligned} \oint _{0}^{1} w^{-1-\alpha } g(w) \, dw = \sum _{k=0}^{2j} \alpha _{k,2j} g \Big (\frac{k}{2j} \Big ) + R_{2j}(g), \end{aligned}
(12)

where, with $$j=1$$,

\begin{aligned}&(-\alpha ) (- \alpha +1) (-\alpha +2) (2j)^{-\alpha } \alpha _{l, 2j} = {\left\{ \begin{array}{ll} 2^{-\alpha } ( \alpha +2), \quad \text{ for } \; l=0, \nonumber \\ (-\alpha ) 2^{2-\alpha }, \quad \text{ for } \; l=1, \nonumber \\ \frac{1}{2} F_{2}(1), \quad \text{ for } \; l=2, \nonumber \end{array}\right. } \end{aligned}

and, with $$j=2, 3, \dots , M$$,

\begin{aligned}&(-\alpha ) (- \alpha +1) (-\alpha +2) (2j)^{-\alpha } \alpha _{l, 2j} \nonumber \\&\quad = {\left\{ \begin{array}{ll} 2^{-\alpha } ( \alpha +2), \; \qquad \qquad \qquad \text{ for } \; l=0, \nonumber \\ (-\alpha ) 2^{2-\alpha }, \quad \qquad \qquad \qquad \text{ for } \; l=1, \nonumber \\ (-\alpha )(-2^{-\alpha } \alpha ) + \frac{1}{2} F_{0}(2), \quad \text{ for } \; l=2, \nonumber \\ -F_{1}(k), \qquad \qquad \qquad \text{ for } \; l=2k-1, \quad k=2,3, \dots , j, \nonumber \\ \frac{1}{2}(F_{2}(k) +F_{0}(k+1)), \text{ for } \; l=2k, \quad k=2,3, \dots , j-1, \nonumber \\ \frac{1}{2} F_{2}(j), \qquad \qquad \qquad \qquad \text{ for } \; l=2j. \nonumber \end{array}\right. } \nonumber \end{aligned}

Here

\begin{aligned} F_{0}(k) =&(2k-1)(2k) \Big ( (2k)^{-\alpha } - (2(k-1))^{-\alpha } \Big )(-\alpha +1)(-\alpha +2) \nonumber \\&-\,\Big ((2k-1) + 2k \Big ) \Big ((2k)^{-\alpha +1} -(2(k-1))^{-\alpha +1} \Big ) (-\alpha )(-\alpha +2) \nonumber \\&+\, \Big ( (2k)^{-\alpha +2} - (2(k-1))^{-\alpha +2} \Big ) (-\alpha ) (-\alpha +1), \end{aligned}
(13)
\begin{aligned} F_{1}(k) =&(2k-2)(2k) \Big ( (2k)^{-\alpha } - (2k-2)^{-\alpha } \Big )(-\alpha +1)(-\alpha +2) \nonumber \\&-\,\Big ((2k-2) + 2k \Big ) \Big ((2k)^{-\alpha +1} -(2k-2)^{-\alpha +1} \Big ) (-\alpha )(-\alpha +2) \nonumber \\&+\, \Big ( (2k)^{-\alpha +2} - (2k-2)^{-\alpha +2} \Big ) (-\alpha ) (-\alpha +1), \end{aligned}
(14)

and

\begin{aligned} F_{2}(k) =&(2k-2)(2k-1) \Big ( (2k)^{-\alpha } - (2k-2)^{-\alpha } \Big )(-\alpha +1)(-\alpha +2) \nonumber \\&-\,\Big ((2k-2) + (2k-1) \Big ) \Big ((2k)^{-\alpha +1} -(2k-2)^{-\alpha +1} \Big ) (-\alpha )(-\alpha +2) \nonumber \\&+\, \Big ( (2k)^{-\alpha +2} - (2k-2)^{-\alpha +2} \Big ) (-\alpha ) (-\alpha +1). \end{aligned}
(15)

Next we consider the discretization of (9). At the point $$t_{2j+1}= \frac{2j+1}{2M}, \, j=1,2, \dots , M-1$$ we have

\begin{aligned} _0^R D_{t}^{\alpha } u(x,t_{2j+1})&= \frac{1}{\varGamma (-\alpha )} \oint _{0}^{t_{2j+1}} (t_{2j+1}- \tau )^{-1-\alpha } u(x,\tau ) \, d \tau \nonumber \\&= \frac{1}{\varGamma (-\alpha )} \int _{0}^{t_{1}} (t_{2j+1}- \tau )^{-1-\alpha } u(x,\tau ) \, d \tau \nonumber \\&\quad + \frac{t_{2j+1}^{-\alpha }}{\varGamma (-\alpha )} \oint _{0}^{\frac{2j}{2j+1}} w^{-1-\alpha } u(x,t_{2j+1} - t_{2j+1} w) \, dw. \nonumber \end{aligned}

We approximate this Hadamard finite-part integral by a piecewise quadratic interpolation polynomial with the equispaced nodes $$0, \frac{1}{2j+1}, \frac{2}{2j+1}, \dots , \frac{2j}{2j+1}, \, j=1, 2, \dots , M-1$$. More precisely, we have, for the sufficiently smooth function g(w),

\begin{aligned} \oint _{0}^{\frac{2j}{2j+1}} w^{-1-\alpha } g(w) \, dw = \oint _{0}^{\frac{2j}{2j+1}} w^{-1-\alpha } g_{2}(w) \, d w + E_{2j+1}(g), \end{aligned}
(16)

where $$g_{2}(w)$$ is the piecewise quadratic interpolation polynomial of g(w) defined on the nodes $$0, \frac{1}{2j+1}, \frac{2}{2j+1}, \dots , \frac{2j}{2j+1}, \, j=1, 2, \dots , M-1$$ and $$E_{2j+1} (g)$$ is the remainder term.

### Lemma 2

[19, Lemma 3.2] Let $$0< \alpha <1$$. Assume that $$g \in C^{3}[0, 1]$$. Then

\begin{aligned} \oint _{0}^{\frac{2j}{2j+1}} w^{-1-\alpha } g(w) \, dw = \sum _{k=0}^{2j} \alpha _{k,2j+1} g \Big (\frac{k}{2j} \Big ) + R_{2j+1}(g), \end{aligned}
(17)

where $$\alpha _{k, 2j+1} = \alpha _{k, 2j}, \, k=0, 1, 2, \dots , 2j, \, j=1, 2, \dots , M-1$$ and $$\alpha _{k, 2j}$$ are given in Lemma 1.

By using (10)–(12), we obtain the following approximation of the Riemann–Liouville fractional derivative $$_0^R D_{t}^{\alpha } u(x,t)$$ at $$t = t_{2j}, \, j=1, 2, \dots , M$$

\begin{aligned} \, _0^R D_{t}^{\alpha } u(x,t_{2j}) = \tau ^{-\alpha } \sum _{k=0}^{2j} w_{k, 2j} u(x,t_{2j-k}) + R_{2}^{2j}, \end{aligned}
(18)

where $$| R_{2}^{2j} | \le C \tau ^{3- \alpha } \big ( \max _{0 \le s \le 1} | u^{\prime \prime \prime } (x,s) | \big )$$, which we write as $$R_{2}^{2j} = O(\tau ^{3- \alpha })$$ [33], and the weights $$w_{k, 2j}, k=0, 1, 2, \dots , 2j, j=1, 2, \dots , M$$ satisfy

\begin{aligned} \varGamma (3- \alpha ) w_{k, 2j} =(-\alpha ) (- \alpha +1) (-\alpha +2) (2j)^{-\alpha } \alpha _{k, 2j}, \quad k=0, 1, 2, \dots , 2j. \end{aligned}
(19)

Similarly, we have at $$t = t_{2j+1}, \, j=1, 2, \dots , M - 1$$,

\begin{aligned} \, _0^R D_{t}^{\alpha } u(x,t_{2j+1}) =&\frac{1}{\varGamma (-\alpha )} \int _{0}^{t_{1}} (t_{2j+1}-s)^{-\alpha -1} \, u(x,s) \, ds \nonumber \\&+\, \tau ^{-\alpha } \sum _{k=0}^{2j} w_{k, 2j+1} u(x,t_{2j+1-k}) + R_{2}^{2j+1}, \nonumber \end{aligned}

where

\begin{aligned} w_{k,2j+1 } = w_{k, 2j}, k=0, 1, 2, \dots , 2j, \quad \text{ and } \quad R_{2}^{2j+1} = O(\tau ^{3- \alpha }). \end{aligned}
(20)
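The weights $$w_{k, 2j}$$ can be assembled directly from Lemma 1 and (19). The following Python sketch is one possible implementation, not the authors' code. It takes $$\tau =1$$ so that $$t_{m}=m$$, and it evaluates the entry $$\frac{1}{2} F_{2}(1)$$ in the closed form $$\alpha ^{2} 2^{-\alpha }$$ (our simplification, which avoids the term $$0 \cdot 0^{-\alpha }$$). The helper names `_F`, `F0`, `F1`, `F2` and `rl_weights` are hypothetical.

```python
import math

def _F(a, b, k, alpha):
    """Common form of F_0, F_1, F_2 in (13)-(15) on the interval [2k-2, 2k]."""
    lo, hi = 2.0 * k - 2.0, 2.0 * k
    d0 = hi ** (-alpha) - lo ** (-alpha)
    d1 = hi ** (1 - alpha) - lo ** (1 - alpha)
    d2 = hi ** (2 - alpha) - lo ** (2 - alpha)
    return (a * b * d0 * (1 - alpha) * (2 - alpha)
            - (a + b) * d1 * (-alpha) * (2 - alpha)
            + d2 * (-alpha) * (1 - alpha))

def F0(k, alpha):
    return _F(2 * k - 1, 2 * k, k, alpha)

def F1(k, alpha):
    return _F(2 * k - 2, 2 * k, k, alpha)

def F2(k, alpha):
    return _F(2 * k - 2, 2 * k - 1, k, alpha)

def rl_weights(j, alpha):
    """Weights w_{k,2j}, k = 0..2j, following Lemma 1 and (19)."""
    c = [0.0] * (2 * j + 1)
    c[0] = 2.0 ** (-alpha) * (alpha + 2)
    c[1] = -alpha * 2.0 ** (2 - alpha)
    c[2] = alpha ** 2 * 2.0 ** (-alpha)  # closed form of F_2(1) / 2
    if j >= 2:
        c[2] += 0.5 * F0(2, alpha)
        for k in range(2, j + 1):
            c[2 * k - 1] = -F1(k, alpha)
        for k in range(2, j):
            c[2 * k] = 0.5 * (F2(k, alpha) + F0(k + 1, alpha))
        c[2 * j] = 0.5 * F2(j, alpha)
    return [ck / math.gamma(3 - alpha) for ck in c]

# Consistency check with u(t) = t**2, for which piecewise quadratic
# interpolation is exact and the remainder in (18) should vanish.
alpha, j = 0.5, 3
w = rl_weights(j, alpha)
lhs = sum(wk * (2 * j - k) ** 2 for k, wk in enumerate(w))
rhs = 2 * (2 * j) ** (2 - alpha) / math.gamma(3 - alpha)
print(abs(lhs - rhs))
```

Since the quadrature is built on quadratic interpolation, the same check can be repeated with $$u \equiv 1$$, for which the weights should sum to $$(2j)^{-\alpha }/\varGamma (1-\alpha )$$.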

For the Caputo fractional derivative $$_0^C D_{t}^{\alpha } u(x,t)$$ at $$t = t_{2j}, \, j=1, 2, \dots , M$$, we have, noting that $$_0^R D_{t}^{\alpha } u(x,0) = u(x,0) \, _0^R D_{t}^{\alpha } (1) = \frac{u(x,0)}{\varGamma (1- \alpha )} t^{-\alpha }$$,

\begin{aligned} \, _0^C D_{t}^{\alpha } u(x,t_{2j}) = \, _0^R D_{t}^{\alpha } \big ( u(x,t_{2j}) - u(x,0) \big ) = \tau ^{-\alpha } \sum _{k=0}^{2j} \bar{w}_{k, 2j} u(x,t_{2j-k}) + R_{2}^{2j}, \end{aligned}

where the weights, with $$k=0, 1, 2, \dots , 2j-1, j=1, 2, \dots , M$$,

\begin{aligned} \bar{w}_{k, 2j} = w_{k, 2j}, \quad \bar{w}_{2j, 2j} = w_{2j, 2j} - \frac{(2j)^{-\alpha }}{\varGamma (1- \alpha )}. \end{aligned}
(21)

Similarly, we have at $$t = t_{2j+1}, \, j=1, 2, \dots , M-1$$,

\begin{aligned} \, _0^C D_{t}^{\alpha } u(x,t_{2j+1}) =&\frac{1}{\varGamma (-\alpha )} \int _{0}^{t_{1}} (t_{2j+1}-s)^{-\alpha -1} \, u(x,s) \, ds \nonumber \\&+\, \tau ^{-\alpha } \sum _{k=0}^{2j+1} \bar{w}_{k, 2j+1} u(x,t_{2j+1-k}) + R_{2}^{2j+1}, \nonumber \end{aligned}

where, with $$k=0, 1, 2, \dots , 2j, \, j=1, 2, \dots , M-1$$,

\begin{aligned} \bar{w}_{k, 2j+1} = w_{k, 2j}, \quad \bar{w}_{2j+1, 2j+1} = {w_{2j+1, 2j+1}} -\frac{(2j+1)^{-\alpha }}{\varGamma (1- \alpha )}. \end{aligned}
(22)
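The correction in (21) only changes the last weight. A minimal Python sketch, assuming the Riemann–Liouville weights are already available as a list (here the case $$j=1$$, taken from the first three entries of Lemma 1 together with (19)), is given below. The function name `caputo_weights` is illustrative.

```python
import math

def caputo_weights(w, alpha):
    """Apply (21): keep w_{k,2j} for k < 2j and correct only the entry k = 2j."""
    n = len(w) - 1  # n = 2j
    w_bar = list(w)
    w_bar[n] = w[n] - n ** (-alpha) / math.gamma(1 - alpha)
    return w_bar

alpha = 0.3
g = math.gamma(3 - alpha)
# Riemann-Liouville weights for j = 1 (Lemma 1 with (19)).
w = [2 ** (-alpha) * (alpha + 2) / g,
     -alpha * 2 ** (2 - alpha) / g,
     alpha ** 2 * 2 ** (-alpha) / g]
w_bar = caputo_weights(w, alpha)
print(sum(w_bar))
```

Because the Caputo derivative of a constant is zero and the quadrature is exact for constants, the corrected weights should sum to zero up to rounding.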

The exact solution u in (8)–(9) then satisfies, with $$l=2, 3, \dots , 2M$$,

\begin{aligned} \bar{w}_{0, l} u(x,t_{l}) - \tau ^{\alpha } \varDelta u(x,t_{l}) = I_{l} - \sum _{k=1}^{l} \bar{w}_{k, l} u(x,t_{l-k}) + \tau ^{\alpha } f(x,t_{l}) - \tau ^{\alpha } R_{2}^{l}, \end{aligned}

or

\begin{aligned}&u(x,t_{l}) - (\bar{w}_{0, l})^{-1} \tau ^{\alpha } \varDelta u(x,t_{l}) \nonumber \\&= (\bar{w}_{0, l})^{-1} I_{l} + \sum _{k=1}^{l} d_{k, l} u(x,t_{l-k}) + (\bar{w}_{0, l})^{-1} \tau ^{\alpha } f(x,t_{l}) - (\bar{w}_{0, l})^{-1} \tau ^{\alpha } R_{2}^{l}, \end{aligned}
(23)

where $$d_{k, l} = -\bar{w}_{k, l}/\bar{w}_{0, l}, k=1, 2, \dots , l, \; l=2, 3, \dots , 2M$$, and $$I_{l}$$ is defined by

\begin{aligned} I_{l} = {\left\{ \begin{array}{ll} 0, \quad l=2j, \; j=1, 2, \dots , M, \nonumber \\ - {\frac{\tau ^{\alpha }}{\varGamma (-\alpha )}} \int _{0}^{t_{1}} (t_{2j+1}-s)^{-\alpha -1} \, u(x,s) \, ds, \, l=2j+1, \; j=1, 2, \dots , M-1, \nonumber \end{array}\right. } \end{aligned}

To simplify notation, we omit the dependence of the exact solution u(x, t) on x below. Let $$u^{l} \approx u(x,t_{l}), \, l=0, 1, 2, \dots , 2 M$$ denote the approximate solution of $$u(x,t_{l})$$. We define the following numerical method to approximate the exact solution in (23), with $$l=2,3, \dots , 2M$$,

\begin{aligned} u^{l} - (\bar{w}_{0, l})^{-1} \tau ^{\alpha } \varDelta u^{l} =(\bar{w}_{0, l})^{-1} \tilde{I}_{l} + \sum _{k=1}^{l} d_{k, l} u^{l-k} + (\bar{w}_{0, l})^{-1} \tau ^{\alpha } f(x,t_{l}), \end{aligned}
(24)

where $$\tilde{I}_{l}$$ is some approximation of $$I_{l}$$ discussed below in (26). Here we assume that $$u^{0} = u_{0}$$ and $$u^{1}$$ will be approximated below in (25).

To approximate $$u (x,t_{1})$$ with the required accuracy $$O(\tau ^{3- \alpha })$$ which will be the convergence order of our numerical method (24), we divide the interval $$[0, t_{1}]$$ by the equispaced nodes $$0= t_{1}^{(0)}< t_{1}^{(1)}< \dots < t_{1}^{(n_{1})}= t_{1}$$ with step size $$\widetilde{\tau }$$ such that $$\widetilde{\tau }^{2- \alpha } \approx \tau ^{3 - \alpha }$$, where $$n_{1}$$ is some positive integer. We then apply the numerical method with the convergence order $$O ( \tilde{\tau }^{2- \alpha })$$ in [4] to get the approximate value $$u^{1} \approx u(x,t_{1})$$ such that, with $$e^{1} = u^{1} - u(x, t_{1})$$,

\begin{aligned} \Vert e^{1} \Vert = \Vert u^{1} - u(x,t_{1}) \Vert = O ( \widetilde{\tau }^{2- \alpha }) = O (\tau ^{3- \alpha }), \end{aligned}
(25)

where $$\Vert \cdot \Vert$$ denotes the $$L^{2}$$ norm.

We also need to approximate the integral $$I_{l}$$ in (23) with the required accuracy $$O(\tau ^{3})$$, which we shall use in (27). Let $$n_{2}$$ be some positive integer. We divide the interval $$[0, t_{1}]$$ by the equispaced nodes $$0= t_{1}^{(0)}< t_{1}^{(1)}< \dots < t_{1}^{(n_{2})}= t_{1}$$ with step size $$\overline{\tau }$$ such that $$\overline{\tau }^{2} \approx \tau ^{3-\alpha }$$. We then apply the composite trapezoidal quadrature rule on $$[0, t_{1}]$$, which has the convergence order $$O ( \overline{\tau }^2)$$. More precisely, we have

\begin{aligned} \Vert I_{l} - \tilde{I}_{l} \Vert&= \Big \Vert \frac{{\tau ^{\alpha }}}{\varGamma (-\alpha )} \int _{0}^{t_{1}} (t_{2j+1}-s)^{-\alpha -1} \, u(x,s) \, ds \nonumber \\&\qquad - \frac{ {\tau ^{\alpha }} }{\varGamma (-\alpha )} \int _{0}^{t_{1}} (t_{2j+1}-s)^{-\alpha -1} \, \tilde{u}(x,s) \, ds \Big \Vert \nonumber \\&= {\tau ^{\alpha }} O ( \overline{\tau }^2) = O ( \tau ^{3}), \end{aligned}
(26)

where $$\tilde{u} (x,s)$$ is the piecewise linear interpolation polynomial of u(x, s) on $$[0, t_{1}]$$, which implies that $$I_{l} - \tilde{I}_{l} = O(\tau ^{3})$$. We need this approximation below in (27).
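The two refinement conditions $$\widetilde{\tau }^{2- \alpha } \approx \tau ^{3 - \alpha }$$ and $$\overline{\tau }^{2} \approx \tau ^{3-\alpha }$$ fix the numbers of subintervals $$n_{1}$$ and $$n_{2}$$ of $$[0, t_{1}]$$. The following Python sketch shows one way to choose them, assuming $$t_{1} = \tau$$ and rounding up so that the sub-step is never larger than required. The function name `starting_substeps` is hypothetical.

```python
import math

def starting_substeps(tau, alpha):
    """Return (n1, n2), the numbers of subintervals of [0, t_1] with t_1 = tau."""
    tau_tilde = tau ** ((3 - alpha) / (2 - alpha))  # tau_tilde**(2-alpha) = tau**(3-alpha)
    tau_bar = tau ** ((3 - alpha) / 2)              # tau_bar**2 = tau**(3-alpha)
    n1 = math.ceil(tau / tau_tilde)
    n2 = math.ceil(tau / tau_bar)
    return n1, n2

tau, alpha = 0.01, 0.5
n1, n2 = starting_substeps(tau, alpha)
print(n1, n2)
```

For $$\tau = 0.01$$ and $$\alpha = 0.5$$ this gives $$n_{1} = 22$$ and $$n_{2} = 4$$, so the extra work on the first interval is modest compared with the $$2M$$ main time steps.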

Let $$e^{l}= u^{l}- u(x,t_{l}), l=0, 1, \dots , 2M$$. Subtracting (23) from (24), we have, by (26),

\begin{aligned} e^{l} - (\bar{w}_{0, l})^{-1} \tau ^{\alpha } \varDelta e^{l} = \sum _{k=1}^{l} d_{k, l} e^{l-k} + (\bar{w}_{0, l})^{-1} \tau ^{\alpha } R_{2}^{l}, \, l=2, 3, \dots , 2M, \end{aligned}
(27)

where $$e^{0} =0$$, $$e^{1}$$ is estimated in (25), and $$d_{k, l} = -\bar{w}_{k, l}/\bar{w}_{0, l}, k=1, 2, \dots , l, \, l =2, 3, \dots , 2M$$ are defined as in (21) and (22).

Note that, by (21) and (22), $$d_{1, l}= -\bar{w}_{1, l}/\bar{w}_{0, l}, l=2, 3, \dots , 2M$$ is a constant which is independent of l. Further we define

\begin{aligned} d_{1,1} := d_{1, l}, \; l=2, 3, \dots , 2M. \end{aligned}

We now denote

\begin{aligned} \bar{e}^{l}= e^{l}- \eta e^{l-1}, \; \eta = \frac{d_{1, l}}{2}, \; l=1, 2, 3, \dots , 2M, \end{aligned}
(28)

where $$\eta$$ is a constant independent of $$l=1, 2, \dots , 2M$$. We have, with $$l=2, 3, \dots , 2M$$,

\begin{aligned}&\qquad \bar{e}^{l} - (\bar{w}_{0, l})^{-1} \tau ^{\alpha } \varDelta e^{l} = e^{l}- \eta e^{l-1} - (\bar{w}_{0, l})^{-1} \tau ^{\alpha } \varDelta e^{l} \nonumber \\&= \sum _{k=1}^{l} d_{k, l} e^{l-k} + (\bar{w}_{0, l})^{-1} \tau ^{\alpha } R_{2}^{l} -\eta e^{l-1} \nonumber \\&= \eta ( e^{l-1} - \eta e^{l-2}) + (\eta ^2 + d_{2, l}) e^{l-2} + d_{3, l} e^{l-3} + \dots + d_{l-1, l} e^{1} + d_{l, l} e^{0} + (\bar{w}_{0, l})^{-1} \tau ^{\alpha } R_{2}^{l} \nonumber \\&= \eta ( e^{l-1} - \eta e^{l-2}) + (\eta ^2 + d_{2, l}) ( e^{l-2} -\eta e^{l-3}) \nonumber \\&\quad + (\eta ^3 + d_{2, l} \eta + d_{3, l})e^{l-3} + d_{4, l} e^{l-4} + \dots + d_{l-1, l} e^{1} + d_{l, l} e^{0} + (\bar{w}_{0, l})^{-1} \tau ^{\alpha } R_{2}^{l} \nonumber \\&= \dots \nonumber \\&= \eta ( e^{l-1} - \eta e^{l-2}) + (\eta ^2 + d_{2, l}) ( e^{l-2} -\eta e^{l-3}) \nonumber \\&\quad +\, (\eta ^3 + d_{2, l} \eta + d_{3, l}) ( e^{l-3}- \eta e^{l-4}) \nonumber \\&\quad +\, \dots \nonumber \\&\quad +\,(\eta ^{l-2} + d_{2, l} \eta ^{l-4} + \dots + d_{l-3, l} \eta + d_{l-2, l}) ( e^{2}- \eta e^{1}) \nonumber \\&\quad +\, (\eta ^{l-1} + d_{2, l} \eta ^{l-3} + \dots + d_{l-2, l} \eta + d_{l-1, l}) ( e^{1}- \eta e^{0}) \nonumber \\&\quad +\, (\eta ^{l} + d_{2, l} \eta ^{l-2} + \dots + d_{l-1, l} \eta + d_{l, l}) e^{0} + (\bar{w}_{0, l})^{-1} \tau ^{\alpha } R_{2}^{l}. \nonumber \end{aligned}

Denote

\begin{aligned} \bar{d}_{i, l} := \eta ^{i} + \sum _{j=2}^{i} \eta ^{i-j} d_{j, l}, \; i=2,3, \dots , l, \, l =2, 3, \dots , 2M, \end{aligned}
(29)

we have, with $$\bar{d}_{1, l } = \eta$$,

\begin{aligned} \bar{e}^{l} - (\bar{w}_{0, l})^{-1} \tau ^{\alpha } \varDelta e^{l} = \sum _{k=1}^{l-1} \bar{d}_{k, l } \bar{e}^{l-k} + (\bar{w}_{0, l})^{-1} \tau ^{\alpha } R_{2}^{l}, \quad l=2, 3, \dots , 2M. \end{aligned}
(30)
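The coefficients in (29) satisfy the one-term recurrence $$\bar{d}_{i, l} = \eta \, \bar{d}_{i-1, l} + d_{i, l}$$ with $$\bar{d}_{1, l} = \eta$$, which follows by factoring $$\eta$$ out of the first $$i-1$$ terms of (29). The Python sketch below uses this recurrence and compares it with the direct sum. The input values are illustrative numbers only, not weights of the scheme, and the function names are hypothetical.

```python
def modified_coefficients(d):
    """Given d = [d_{1,l}, ..., d_{l,l}], return (eta, [dbar_{1,l}, ..., dbar_{l,l}])."""
    eta = d[0] / 2.0
    dbar = [eta]  # dbar_{1,l} = eta
    for i in range(1, len(d)):
        dbar.append(eta * dbar[-1] + d[i])  # dbar_i = eta * dbar_{i-1} + d_i
    return eta, dbar

def dbar_direct(d, i):
    """Direct evaluation of (29) for 2 <= i <= len(d); d[j-1] stores d_{j,l}."""
    eta = d[0] / 2.0
    return eta ** i + sum(eta ** (i - j) * d[j - 1] for j in range(2, i + 1))

d = [0.8, 0.1, 0.05]  # illustrative values only
eta, dbar = modified_coefficients(d)
print(eta, dbar, dbar_direct(d, 3))
```

Storing the $$\bar{d}_{i, l}$$ through the recurrence costs $$O(l)$$ operations per time level instead of $$O(l^{2})$$ for the direct sums.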

### Lemma 3

[19, Lemma 3.3] For $$0< \alpha <1$$, the coefficients in (30) satisfy, with $$l =2, 3, \dots , 2M$$,

\begin{aligned}&0< \eta = \frac{d_{1, l}}{2} <\frac{2}{3}, \end{aligned}
(31)
\begin{aligned}&\bar{d}_{k, l} >0, \; k = 1, 2, \dots , l, \end{aligned}
(32)
\begin{aligned}&\eta +\sum _{k=2}^{l} \bar{d}_{k, l} \le 1, \end{aligned}
(33)
\begin{aligned}&(\bar{d}_{l, l})^{-1} \le c_{0} \tau ^{-\alpha }, \quad \text{ for } \text{ some } \text{ constant } \, c_{0}. \end{aligned}
(34)
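The bound (31) can be checked by hand from the first two weights. By Lemma 1, (19), (21) and (22), $$\varGamma (3-\alpha ) \bar{w}_{0, l} = 2^{-\alpha }(\alpha +2)$$ and $$\varGamma (3-\alpha ) \bar{w}_{1, l} = (-\alpha ) 2^{2-\alpha }$$, so that $$d_{1, l} = 4 \alpha /(\alpha +2)$$ and $$\eta = 2\alpha /(\alpha +2)$$. The following small Python sketch, which is our own illustration and not part of [19], evaluates this for a few values of $$\alpha$$.

```python
def eta(alpha):
    """eta = d_{1,l} / 2 with d_{1,l} = -w_bar_{1,l} / w_bar_{0,l}."""
    w0 = 2.0 ** (-alpha) * (alpha + 2)  # Gamma(3 - alpha) * w_bar_{0,l}
    w1 = -alpha * 2.0 ** (2 - alpha)    # Gamma(3 - alpha) * w_bar_{1,l}
    return -w1 / w0 / 2.0

for a in (0.1, 0.5, 0.9):
    print(a, eta(a), 2 * a / (a + 2))
```

Since $$2\alpha /(\alpha +2)$$ is increasing in $$\alpha$$ and equals $$2/3$$ at $$\alpha =1$$, the values stay strictly inside $$(0, 2/3)$$ for $$0< \alpha <1$$.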

To prove the error estimates in Theorem 1 below, we need a further assumption on the initial approximation $$u^{1}$$. Let $$\bar{e}^{1} = e^{1} -\eta e^{0}=e^{1}$$ be defined as in (28), where $$e^{1} = u^{1} - u(x, t_{1})$$. By (32)–(34), we see that $$1 \le (\bar{d}_{l, l})^{-1} \le c_{0} \tau ^{-\alpha }$$. Thus, for some fixed $$l=2, 3, \dots , 2M$$, choosing $$\tilde{\tau }$$ sufficiently small as in (25), we may find

\begin{aligned} R_{2}^{1} = O(\tau ^{3- \alpha }), \end{aligned}
(35)

and a constant $$c_{1}>0$$ such that,

\begin{aligned} \Vert \bar{e}^{1} \Vert _{1}^{2} \le c_{1} \bar{d}_{l,l} \Vert (\bar{d}_{l, l})^{-1} (\bar{w}_{0, l})^{-1} \tau ^{\alpha } R_{2}^{1} \Vert ^{2}, \end{aligned}
(36)

which we need in (40) below. By (34), we have, noting that $$\bar{d}_{l, l} <1$$ by (32) and (33),

\begin{aligned} \Vert \bar{e}^{1} \Vert _{1}^{2} \le c_{1} \bar{d}_{l,l} \Vert (\bar{d}_{l, l})^{-1} (\bar{w}_{0, l})^{-1} \tau ^{\alpha } R_{2}^{1} \Vert ^{2} = (O(\tau ^{3- \alpha }))^{2}. \end{aligned}
(37)

We are now ready to state the following error estimates.

### Theorem 1

Let $$u(x,t_{l})$$ and $$u^{l}, l=0, 1, \dots , 2M$$ be the exact and approximate solutions of (23) and (24), respectively. Assume that $$u(x,t) \in C^{3}[0, T]$$. Further assume that $$u^{0}= u_{0}$$ and $$u^{1}$$ satisfies (36). Then there exists a constant $$C= C(\alpha , f, T)$$ such that

\begin{aligned} \Vert u^{l} - u(t_{l}) \Vert ^2+ (\bar{w}_{0, l})^{-1}\tau ^{\alpha } \big \Vert \nabla \big (u^{l} - u(t_{l}) \big ) \big \Vert ^2 \le C \big ( \tau ^{3- \alpha } \big )^2, \quad l=2, 3, \dots , 2M. \end{aligned}

### Proof

Taking the inner product of both sides of (30) with $$2 \bar{e}^{l}$$ and integrating by parts, we have, with $$l =2, 3, \dots , 2M$$,

\begin{aligned}&( \bar{e}^{l}, 2 \bar{e}^{l}) + (\bar{w}_{0, l})^{-1} \tau ^{\alpha } (\nabla e^{l}, 2\nabla \bar{e}^{l}) = \sum _{k=1}^{l-1} \bar{d}_{k, l } (\bar{e}^{l-k}, 2 \bar{e}^{l}) + \big ( (\bar{w}_{0, l})^{-1} \tau ^{\alpha } R_{2}^{l}, 2 \bar{e}^{l} \big ). \end{aligned}
(38)

Note that

\begin{aligned} 2 (\nabla e^{l}, \nabla \bar{e}^{l}) = (\nabla e^{l},\nabla e^{l}) + (\nabla \bar{e}^{l}, \nabla \bar{e}^{l}) - \eta ^2 (\nabla e^{l-1}, \nabla e^{l-1}), \; \text{ for } \; l =1, 2, \dots , 2M. \end{aligned}

We have, with $$l =2, 3, \dots , 2M$$,

\begin{aligned}&2 \Vert \bar{e}^{l} \Vert ^2 + (\bar{w}_{0, l})^{-1} \tau ^{\alpha } ( \nabla e^{l}, \nabla e^{l}) + (\bar{w}_{0, l})^{-1} \tau ^{\alpha } ( \nabla \bar{e}^{l}, \nabla \bar{e}^{l}) - (\bar{w}_{0, l})^{-1} \tau ^{\alpha } \eta ^2 ( \nabla e^{l-1}, \nabla e^{l-1}) \nonumber \\&\quad = \sum _{k=1}^{l-1} \bar{d}_{k, l } (\bar{e}^{l-k}, 2 \bar{e}^{l}) + \big ( ( \bar{w}_{0, l})^{-1} \tau ^{\alpha } R_{2}^{l}, 2 \bar{e}^{l} \big ). \nonumber \end{aligned}

We write, with $$l =2, 3, \dots , 2M$$,

\begin{aligned} \big ( ( \bar{w}_{0, l})^{-1} \tau ^{\alpha } R_{2}^{l}, 2 \bar{e}^{l} \big ) = \bar{d}_{l, l} \big ( (\bar{d}_{l, l})^{-1} ( \bar{w}_{0, l})^{-1} \tau ^{\alpha } R_{2}^{l}, 2 \bar{e}^{l} \big ). \end{aligned}

By Cauchy-Schwarz inequality, we have, with $$l=2, 3, \dots , 2M$$, noting that $$\bar{d}_{l, l} >0$$ from (32),

\begin{aligned}&2 \Vert \bar{e}^{l} \Vert ^2 + (\bar{w}_{0, l})^{-1} \tau ^{\alpha } \Vert \nabla e^{l} \Vert ^2 + (\bar{w}_{0, l})^{-1} \tau ^{\alpha } \Vert \nabla \bar{e}^{l} \Vert ^2 - (\bar{w}_{0, l})^{-1} \tau ^{\alpha } \eta ^2 \Vert \nabla e^{l-1} \Vert ^2 \nonumber \\&\quad \le \sum _{k=1}^{l-1} \bar{d}_{k, l} \big ( \Vert \bar{e}^{l-k}\Vert ^2 + \Vert \bar{e}^{l} \Vert ^2 \big ) + \bar{d}_{l, l} \big ( \Vert (\bar{d}_{l, l })^{-1} (\bar{w}_{0, l})^{-1} \tau ^{\alpha } R_{2}^{l} \Vert ^2 + \Vert \bar{e}^{l} \Vert ^2 \big ) \nonumber \\&\quad =\sum _{k=1}^{l-1} \bar{d}_{k, l} \Vert \bar{e}^{l-k}\Vert ^2 + \bar{d}_{l, l} \big \Vert (\bar{d}_{l, l})^{-1} (\bar{w}_{0, l})^{-1} \tau ^{\alpha } R_{2}^{l} \big \Vert ^2 + \sum _{k=1}^{l} \bar{d}_{k, l} \Vert \bar{e}^{l} \Vert ^2. \nonumber \end{aligned}

By (33) and noting that $$\bar{d}_{1, l} = \eta$$ , we have

\begin{aligned} \Vert \bar{e}^{l} \Vert ^2 + (\bar{w}_{0, l})^{-1} \tau ^{\alpha } \Vert \nabla e^{l} \Vert ^2 \le&\bar{d}_{1, l} \Vert \bar{e}^{l-1} \Vert ^2 + (\bar{w}_{0, l})^{-1} \tau ^{\alpha } \eta ^2 \Vert \nabla e^{l-1} \Vert ^2 + \bar{d}_{2, l} \Vert \bar{e}^{l-2} \Vert ^2 \nonumber \\&+ \dots + \bar{d}_{l-1, l} \Vert \bar{e}^{1} \Vert ^2 + \bar{d}_{l, l} \Vert (\bar{d}_{l, l})^{-1} ( \bar{w}_{0, l})^{-1} \tau ^{\alpha } R_{2}^{l} \Vert ^2. \nonumber \end{aligned}

By (31), we have, noting that $$\bar{d}_{1, l} = \eta$$,

\begin{aligned}&\Vert \bar{e}^{l} \Vert ^2 + (\bar{w}_{0, l})^{-1} \tau ^{\alpha } \Vert \nabla e^{l} \Vert ^2 \le \bar{d}_{1, l } \Big ( \Vert \bar{e}^{l-1}\Vert ^2 + (\bar{w}_{0, l})^{-1} \tau ^{\alpha } \eta \Vert \nabla e^{l-1} \Vert ^2 \Big ) \nonumber \\&\qquad + \bar{d}_{2, l} \Vert \bar{e}^{l-2}\Vert ^2 + \dots + \bar{d}_{l-1, l} \Vert \bar{e}^{1}\Vert ^2 + \bar{d}_{l, l} \Vert (\bar{d}_{l, l})^{-1} ( \bar{w}_{0, l})^{-1} \tau ^{\alpha } R_{2}^{l} \Vert ^2 \nonumber \\&\le \bar{d}_{1, l} \big ( \Vert \bar{e}^{l-1}\Vert ^2 + (\bar{w}_{0, l})^{-1} \tau ^{\alpha } \Vert \nabla e^{l-1} \Vert ^2 \big ) + \bar{d}_{2, l} \big ( \Vert \bar{e}^{l-2}\Vert ^2 + (\bar{w}_{0, l})^{-1} \tau ^{\alpha } \Vert \nabla e^{l-2} \Vert ^2 \big ) \nonumber \\&\qquad + \dots + \bar{d}_{l-1, l} \big ( \Vert \bar{e}^{1}\Vert ^2 + (\bar{w}_{0, l})^{-1} \tau ^{\alpha } \Vert \nabla e^{1} \Vert ^2 \big ) + \bar{d}_{l, l} \Vert (\bar{d}_{l, l})^{-1} ( \bar{w}_{0, l})^{-1} \tau ^{\alpha } R_{2}^{l} \Vert ^2. \nonumber \end{aligned}

Note that, by (21) and (22), $$\bar{w}_{0, l} = \frac{2^{-\alpha } (\alpha +2)}{\varGamma (3- \alpha )}, \, l=2, 3, \dots , 2M$$, which is independent of l. Further we define

\begin{aligned} \bar{w}_{0, 1} := \bar{w}_{0, l}, \; l=2, 3, \dots , 2M. \end{aligned}

We now denote the norm, with $$l=1, 2, \dots , 2M$$,

\begin{aligned} \Vert \bar{e}^{l} \Vert _{1}^{2} = \Vert \bar{e}^{l}\Vert ^2 + (\bar{w}_{0, l})^{-1} \tau ^{\alpha } \Vert \nabla e^{l} \Vert ^2. \end{aligned}
(39)

We then have, with $$l =2, 3, \dots , 2M$$,

\begin{aligned} \Vert \bar{e}^{l} \Vert _{1}^{2} \le \bar{d}_{1, l} \Vert \bar{e}^{l-1} \Vert _{1}^{2} + \bar{d}_{2, l} \Vert \bar{e}^{l-2} \Vert _{1}^{2} + \dots + \bar{d}_{l-1, l} \Vert \bar{e}^{1} \Vert _{1}^{2} + \bar{d}_{l, l} \Vert (\bar{d}_{l, l})^{-1} ( \bar{w}_{0, l})^{-1} \tau ^{\alpha } R_{2}^{l} \Vert ^2. \nonumber \end{aligned}

By the initial approximation estimate (36), we have, with $$l=2, 3, \dots , 2M$$,

\begin{aligned} \Vert \bar{e}^{1} \Vert _{1}^2 \le c_{1} \bar{d}_{l, l} \big \Vert (\bar{d}_{l, l})^{-1} ( \bar{w}_{0, l})^{-1} \tau ^{\alpha } R_{2}^{1} \big \Vert ^2 \le c_{1} \Big ( \bar{d}_{l, l} \max _{1 \le s \le l} \big \Vert (\bar{d}_{l, l})^{-1} ( \bar{w}_{0, l})^{-1} \tau ^{\alpha } R_{2}^{s} \big \Vert ^2 \Big ). \end{aligned}
(40)

We next prove the following by mathematical induction, with $$k=2,3, \dots , l, \; l =2, 3, \dots , 2M$$,

\begin{aligned} \Vert \bar{e}^{k} \Vert _{1}^2 \le \Big ( 1- \bar{d}_{1, l} - \dots - \bar{d}_{k-1, l} \Big )^{-1} c_{1} \Big ( \bar{d}_{l, l} \max _{1 \le s \le l} \big \Vert (\bar{d}_{l, l})^{-1} ( \bar{w}_{0, l})^{-1} \tau ^{\alpha } R_{2}^{s} \big \Vert ^2 \Big ). \end{aligned}
(41)

Assume that (40) and (41) hold true for $$k=1, 2, \dots , l-1, \; l =2, 3, \dots , 2M$$, we have, for $$k=l$$, by (32),

which is (41).

By (33), we have, with $$l =2, 3, \dots , 2M$$,

\begin{aligned} \Vert \bar{e}^{l}\Vert _{1}^2&\le \frac{c_{1} \bar{d}_{l, l}}{ 1- \bar{d}_{1, l} - \dots - \bar{d}_{l-1, l}} \max _{1 \le s \le l} \big \Vert (\bar{d}_{l, l})^{-1} ( \bar{w}_{0, l})^{-1} \tau ^{\alpha } R_{2}^{s} \big \Vert ^2 \nonumber \\&\le c_{1} \max _{1 \le s \le l} \big \Vert (\bar{d}_{l, l})^{-1} ( \bar{w}_{0, l})^{-1} \tau ^{\alpha } R_{2}^{s} \big \Vert ^2 \le C \max _{1 \le s \le l} \big \Vert R_{2}^{s} \big \Vert ^2 \le C \big ( \tau ^{3-\alpha } \big )^2, \nonumber \end{aligned}

where the last inequality is due to the fact $$R_{2}^{s} = O(\tau ^{3- \alpha }), 1 \le s \le l$$ by (35), (18), (20). Hence we have $$\Vert \bar{e}^{l}\Vert _{1} \le C \tau ^{3-\alpha }, \; l=2, 3, \dots , 2M.$$

Further we have, by (31), with $$l=2,3, \dots , 2M$$,

\begin{aligned} \Vert e^{l}\Vert _{1}&=\Vert \bar{e}^{l} + \eta e^{l-1}\Vert _{1} \le \Vert \bar{e}^{l}\Vert _{1} +\Vert \eta e^{l-1}\Vert _{1} \le C \tau ^{3- \alpha } +\Vert \eta e^{l-1}\Vert _{1} \nonumber \\&\le C \tau ^{3- \alpha } + \eta \big ( C \tau ^{3- \alpha } + \eta \Vert e^{l-2}\Vert _{1} \big ) \le (1+ \eta ) C \tau ^{3- \alpha } + \eta ^2 \Vert e^{l-2}\Vert _{1} \nonumber \\&\le \dots \nonumber \\&\le (1+ \eta + \eta ^2 + \dots + \eta ^{l} ) C \tau ^{3- \alpha } \le \frac{1}{1- \eta } C \tau ^{3- \alpha } \le C \tau ^{3- \alpha }. \end{aligned}
(42)

Together these estimates complete the proof of Theorem 1.

## 3 The Fully Discrete Scheme

In this section, we consider the fully discrete scheme for solving (1)–(3). Here we only consider the error estimates for the homogeneous Dirichlet boundary condition, i.e., $$q=0$$ in (3), although the error estimates in Theorem 2 also hold for the nonhomogeneous Dirichlet boundary condition. Let $$S_{h}\subseteq H_{0}^{1} (\varOmega )$$ denote the standard linear finite element space and h the space step size. Let $$R_{h}:H_{0}^{1}(\varOmega )\rightarrow S_{h}$$ denote the Ritz projection defined by, for all $$\varphi \in H_{0}^{1}(\varOmega )$$,

\begin{aligned} (\nabla R_{h}\varphi ,\nabla \chi )=(\nabla \varphi ,\nabla \chi ),\ \ \forall \chi \in S_{h}. \end{aligned}

It is well known that, see [32],

\begin{aligned} \Vert R_{h}\varphi -\varphi \Vert +h\Vert \nabla (R_{h}\varphi -\varphi )\Vert \le Ch^{2}\Vert \varphi \Vert _{H^{2}({\varOmega })},\ \ \forall \varphi \in H^{2}(\varOmega )\cap H_{0}^{1}(\varOmega ). \end{aligned}
(43)
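The $$L^{2}$$ part of (43) can be observed numerically in a simple setting. The following minimal sketch works in one space dimension on $$(0, 1)$$, where the Ritz projection onto continuous piecewise linear functions coincides with the nodal interpolant; this restriction, the test function $$\sin (\pi x)$$ and the helper name `l2_interp_error` are illustrative assumptions and are not taken from the paper.

```python
import math

def l2_interp_error(phi, n):
    """L2(0,1) error between phi and its piecewise linear nodal interpolant
    on a uniform mesh with n elements (3-point Gauss rule on each element)."""
    h = 1.0 / n
    gauss_pts = (-math.sqrt(0.6), 0.0, math.sqrt(0.6))
    gauss_wts = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)
    total = 0.0
    for i in range(n):
        a, b = i * h, (i + 1) * h
        fa, fb = phi(a), phi(b)
        for xi, w in zip(gauss_pts, gauss_wts):
            x = 0.5 * (a + b) + 0.5 * h * xi           # map Gauss point to [a, b]
            interp = fa + (fb - fa) * (x - a) / h      # linear interpolant at x
            total += 0.5 * h * w * (phi(x) - interp) ** 2
    return math.sqrt(total)

phi = lambda x: math.sin(math.pi * x)                  # lies in H^2 and vanishes at 0 and 1
errors = [l2_interp_error(phi, n) for n in (8, 16, 32)]
orders = [math.log2(errors[i] / errors[i + 1]) for i in range(len(errors) - 1)]
print(errors, orders)
```

Halving $$h$$ should reduce the error by a factor close to 4, i.e., the printed orders should be close to 2.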

We now consider the fully discrete scheme for solving (23). Let $$U_{h}^{l} \in S_{h}, l=0, 1, 2, \dots , {2M}$$ denote the approximation of $$u(x, t_{l})$$. We choose $$U_{h}^{0} = R_{h} u_{0}$$ and we assume that $$U_{h}^{1} \in S_{h}$$ is a suitable approximation of $$u(t_{1})$$ which can be obtained by using some special numerical methods and satisfies the condition (45) below. For $$l=2, 3, \dots , 2M$$, we define the following finite element method: find $$U_{h}^{l} \in S_{h}$$, such that

\begin{aligned} (U_{h}^{l},\chi )+(\bar{w}_{0, l})^{-1} \tau ^{\alpha }(\nabla U_{h}^{l},\nabla \chi )&= {\big ((\bar{w}_{0, l})^{-1} \tilde{I}_{l}, \chi \big )} \nonumber \\&\quad + \sum _{k=1}^{l} d_{k, l} (U_{h}^{l-k},\chi )+(\bar{w}_{0, l})^{-1} \tau ^{\alpha }(f(t_{l}),\chi ) ,\ \ \forall \chi \in S_{h}. \end{aligned}
(44)

### Theorem 2

Let $$u(x,t_{l})$$ and $$U_{h}^{l}, l=0, 1, \dots , 2M$$ be the exact and approximate solutions of (23) and (44), respectively. Assume that $$u \in C^{3} ([0, T];H^{2}(\varOmega ))$$ and that the initial value is chosen as $$U_{h}^{0}=R_{h}u_{0}$$. Assume that there exists a constant $$c_{2}$$ such that

\begin{aligned} \Vert U_{h}^{1} - R_{h} u ( t_{1}) \Vert ^2 + \big ( \bar{w}_{0, l} \big )^{-1} \tau ^{\alpha } \big \Vert \nabla \big ( U_{h}^{1} - R_{h} u (t_{1}) \big ) \big \Vert ^2 \le c_{2} \Big ( \bar{d}_{l,l} \big \Vert (\bar{d}_{l, l})^{-1} (\bar{w}_{0, l})^{-1} \tau ^{\alpha } \delta _{2}^{1} \big \Vert ^{2}\Big ), \end{aligned}
(45)

where $$\delta _{2}^{1}=( R_{h} - I) \big ( \, _0^C D_{t}^{\alpha } u (t_{1})-R_{2}^{1} \big )+R_{2}^{1}$$ and $$R_{2}^{1}$$ is defined in (36). Then there exists a constant $$C= C(\alpha , f,T)$$ such that, with $$l=2, 3, \dots , 2M$$,

\begin{aligned} \Vert U_{h}^{l} - R_{h} u(t_{l}) \Vert ^2+ (\bar{w}_{0, l})^{-1}\tau ^{\alpha } \big \Vert \nabla \big (U_{h}^{l} - R_{h} u(t_{l}) \big ) \big \Vert ^2 \le C (h^{2}+\tau ^{3- \alpha } )^{2}. \end{aligned}

Further we have

\begin{aligned} \Vert U_{h}^{l} - u(t_{l}) \Vert \le C (h^{2}+\tau ^{3- \alpha } ), \, l=2, 3, \dots , 2M. \nonumber \end{aligned}

### Proof

Denote, with $$l =1, 2, \dots , N$$,

\begin{aligned} U_{h}^{l} - u(t_{l})=U_{h}^{l}-R_{h}u(t_{l})+R_{h}u(t_{l})-u(t_{l})=\theta ^{l}+\rho ^{l}. \end{aligned}

By (43), we have

\begin{aligned} \Vert \rho ^{l} \Vert \le C h^2 \Vert u \Vert _{H^{2}}. \end{aligned}
(46)

We now estimate $$\theta ^{l}, l=2, 3, \dots , N$$. We have, for $$\forall \, \chi \in S_{h}$$,

\begin{aligned}&(\theta ^{l},\chi )+(\bar{w}_{0, l})^{-1} \tau ^{\alpha }(\nabla \theta ^{l},\nabla \chi )\nonumber \\&=(U_{h}^{l},\chi )-(R_{h}u(t_{l}),\chi )+(\bar{w}_{0, l})^{-1} \tau ^{\alpha }(\nabla U_{h}^{l},\nabla \chi ) -(\bar{w}_{0, l})^{-1} \tau ^{\alpha }(\nabla R_{h}u(t_{l}),\nabla \chi )\nonumber \\&=\sum _{k=1}^{l} d_{k, l} (U_{h}^{l-k},\chi )-(R_{h}u(t_{l}),\chi )-(\bar{w}_{0, l})^{-1} \tau ^{\alpha }(\nabla R_{h}u(t_{l}),\nabla \chi )+(\bar{w}_{0, l})^{-1} \tau ^{\alpha }(f(t_{l}),\chi )\nonumber \\&=\sum _{k=1}^{l} d_{k, l} (\theta ^{l-k},\chi )+ \big ( (I-R_{h}) \big ( u(t_{l})-\sum _{k=1}^{l} d_{k, l} u(t_{l-k}) \big ),\chi \big ) - (\bar{w}_{0, l})^{-1} \tau ^{\alpha }(R_{2}^{l},\chi )\nonumber \\&=\sum _{k=1}^{l} d_{k, l} (\theta ^{l-k},\chi )+(\bar{w}_{0, l})^{-1} \tau ^{\alpha } \big ( (I-R_{h})(_0^C D_{t}^{\alpha } u (t_{l})-R_{2}^{l}),\chi \big )- (\bar{w}_{0, l})^{-1} \tau ^{\alpha }(R_{2}^{l},\chi )\nonumber \end{aligned}

Hence we get

\begin{aligned} (\theta ^{l},\chi )+(\bar{w}_{0, l})^{-1} \tau ^{\alpha }(\nabla \theta ^{l},\nabla \chi ) = \sum _{k=1}^{l} d_{k, l} (\theta ^{l-k},\chi ) - (\bar{w}_{0, l})^{-1} \tau ^{\alpha }(\delta _{2}^{l},\chi ), \, \forall \chi \in S_{h}, \end{aligned}
(47)

where $$\delta _{2}^{l}=( R_{h} - I) \big ( \, _0^C D_{t}^{\alpha } u (t_{l})-R_{2}^{l} \big )+R_{2}^{l}.$$

From the triangle inequality, we have

\begin{aligned} \Vert \delta _{2}^{l}\Vert \le \Vert (R_{h} -I )_0^C D_{t}^{\alpha } u (t_{l})\Vert +\Vert (R_{h} - I)R_{2}^{l}\Vert +\Vert R_{2}^{l}\Vert . \end{aligned}

Note that

\begin{aligned}&\Vert R_{2}^{l}\Vert \le C\tau ^{3-\alpha },\\&\Vert (R_{h} - I )R_{2}^{l}\Vert \le Ch^{2}\tau ^{3-\alpha }, \Vert (R_{h} - I)_0^C D_{t}^{\alpha } u (x, t_{l})\Vert \le Ch^{2}. \end{aligned}

We have

\begin{aligned} \Vert \delta _{2}^{l}\Vert \le C \big ( h^{2} +h^{2}\tau ^{3-\alpha } +\tau ^{3-\alpha } \big ) \le C (h^{2}+\tau ^{3- \alpha }). \end{aligned}

Let

\begin{aligned} \bar{\theta }^{l}= \theta ^{l}- \eta \theta ^{l-1}, \; \eta = \frac{d_{1, l}}{2}, \; l= 1, 2, \dots , 2M. \end{aligned}

Applying (45), we have, noting that $$R_{2}^{1} = O(\tau ^{3- \alpha })$$ by (35),

\begin{aligned} \Vert \theta ^{1} \Vert _{1}^{2} \le c_{2} \Big ( \bar{d}_{l,l} \big \Vert (\bar{d}_{l, l})^{-1} (\bar{w}_{0, l})^{-1} \tau ^{\alpha } \delta _{2}^{1} \big \Vert ^{2}\Big ) \le C (h^2 + \tau ^{3- \alpha })^{2}. \end{aligned}

We next consider the case $$l=2, 3, \dots , N$$. We may write equation (47) as

\begin{aligned} (\bar{\theta }^{l},\chi ) + (\bar{w}_{0, l})^{-1} \tau ^{\alpha } (\nabla \theta ^{l},\nabla \chi ) = \sum _{k=1}^{l-1} \bar{d}_{k, l } (\bar{\theta }^{l-k},\chi ) - (\bar{w}_{0, l})^{-1} \tau ^{\alpha } (\delta _{2}^{l},\chi ), \, \forall \chi \in S_{h}. \end{aligned}
(48)

Following the same argument as in the proof of Theorem 1, we may get, with $$l=2, 3, \dots , 2M$$,

\begin{aligned} \Vert \theta ^{l}\Vert ^2+ (\bar{w}_{0, l})^{-1}\tau ^{\alpha } \, \Vert \nabla \theta ^{l}\Vert ^2 \le C(h^{2}+\tau ^{3-\alpha })^2. \end{aligned}

These estimates, together with (46), complete the proof of Theorem 2. $$\square$$

## 4 Numerical Simulations

In this section, we will consider four examples. Examples 1 and 2 are one-dimensional problems and Examples 3 and 4 are two-dimensional problems.

### Example 1

Consider

\begin{aligned}&_0^R D_{t}^{\alpha } u (x, t) - \frac{\partial ^2 u(x, t) }{\partial x^2} = f(x, t), \quad t \in [0, T], \; 0< x<1, \end{aligned}
(49)
\begin{aligned}&u(x, 0) = 0, \quad 0< x <1, \end{aligned}
(50)
\begin{aligned}&u(0, t) = u(1, t)= 0, \quad t \in [0, T], \end{aligned}
(51)

where $$f(x, t) =\frac{\varGamma (m+1)}{\varGamma (m+1 -\alpha )} t^{m- \alpha } \sin (2 \pi x) + 4 \pi ^2 t^{m} \sin (2 \pi x)$$. The exact solution is $$u(x, t)= t^{m} \sin (2 \pi x)$$. In Table 1, we choose $$m=3.5$$. In this case the exact solution $$u(x, \cdot ) \in C^{3}[0, T]$$.
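The time-dependent factor of $$f$$ can be checked numerically. The sketch below assumes that, because $$u(x, 0)=0$$, the fractional derivative in (49) may be evaluated through the Caputo integral $$\frac{1}{\varGamma (1-\alpha )} \int _{0}^{t} (t-s)^{-\alpha } u'(s) \, ds$$, and compares a quadrature value for $$u(t) = t^{m}$$ with the closed form $$\frac{\varGamma (m+1)}{\varGamma (m+1-\alpha )} t^{m-\alpha }$$. The function name `caputo_power`, the change of variable and the number of quadrature points are illustrative choices.

```python
import math

def caputo_power(m, alpha, t, n=20000):
    """Approximate the Caputo derivative of order alpha of t**m at time t.

    The substitution r = (t - s)**(1 - alpha) removes the weak singularity
    of the kernel; the remaining integral uses the composite midpoint rule.
    """
    beta = 1.0 / (1.0 - alpha)
    r_max = t ** (1.0 - alpha)
    dr = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        s = t - r ** beta                  # original integration variable
        total += m * s ** (m - 1.0)        # derivative of s**m
    return total * dr / ((1.0 - alpha) * math.gamma(1.0 - alpha))

m, alpha, t = 3.5, 0.5, 1.0
numeric = caputo_power(m, alpha, t)
closed_form = math.gamma(m + 1.0) / math.gamma(m + 1.0 - alpha) * t ** (m - alpha)
print(numeric, closed_form)
```

The two printed values should agree to several digits, which supports the first term of $$f$$; the second term is simply $$-\partial ^2 u / \partial x^2 = 4 \pi ^2 t^{m} \sin (2 \pi x)$$.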

The main purpose is to check the order of convergence of the numerical method with respect to the time step size $$\tau$$ for different fractional orders $$\alpha$$. For various choices of $$\alpha \in (0,1)$$, we compute the errors in the $$L^{2}$$ norm at $$T=1$$. We use the linear finite element space $$S_{h}$$ with the space step size $$h = 1/2^6$$, which is sufficiently small that the error is dominated by the time discretization. We choose the time step size $$\tau = 1/2^{l}, l=3, 4, 5, 6,7$$, i.e., we divide the interval [0, T] into $$N=1/\tau$$ subintervals with nodes $$0=t_{0}< t_{1}< \dots < t_{N}=1$$. Then we compute the error $$\Vert e(t_{N})\Vert =\Vert u(x, t_{N}) - U_{h}^{N} \Vert$$. By Theorem 2, neglecting the spatial error, we have

\begin{aligned} \Vert e(t_{N})\Vert = \Vert u(x, t_{N}) - U_{h}^{N} \Vert \le C \tau ^{3- \alpha }. \end{aligned}
(52)

To observe the order of convergence, we compute the error $$\Vert e(t_{N})\Vert$$ at $$t_{N}=1$$ for different values of $$\tau$$. Denote by $$\Vert e_{\tau _{l}}(t_{N})\Vert$$ the error at $$t_{N}=1$$ with respect to the time step size $$\tau _{l}$$. Let $$\tau _{l}= \tau = 1/ 2^{l}$$ for a fixed $$l =3, 4, 5, 6, 7$$. We then have

\begin{aligned} \frac{\Vert e_{\tau _{l}}(t_{N}) \Vert }{\Vert e_{\tau _{l+1}}(t_{N})\Vert } \approx \frac{C \tau _{l}^{3-\alpha }}{C \tau _{l+1}^{3-\alpha }} = 2^{3- \alpha }, \end{aligned}

which implies that the order of convergence satisfies $$3- \alpha \approx \log _{2} \Big ( \frac{\Vert e_{\tau _{l}}(t_{N})\Vert }{\Vert e_{\tau _{l+1}}(t_{N})\Vert } \Big )$$. In Table 1, we report the computed orders of convergence for different values of $$\alpha$$. The numerical results are consistent with the theoretical results in Theorem 2. We also compare the numerical results of our scheme with those obtained by the scheme in [8]. We observe that the convergence rate of our method is slightly higher than that of the method in [8] for small $$\alpha \in (0, 1)$$. We will investigate this interesting issue in future work. We make similar observations in the other examples below.
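The observed order can be computed from a list of errors as in the following minimal sketch. The helper name `observed_orders` is hypothetical, and the errors are synthetic values that follow $$C \tau ^{3-\alpha }$$ exactly; they are not data from Table 1.

```python
import math

def observed_orders(errors):
    """Return log2(e_l / e_{l+1}) for consecutive errors obtained by halving tau."""
    return [math.log2(errors[i] / errors[i + 1]) for i in range(len(errors) - 1)]

# Synthetic errors of the form C * tau**(3 - alpha); illustrative values only
alpha, C = 0.5, 0.1
taus = [2.0 ** (-l) for l in range(3, 8)]
errors = [C * tau ** (3.0 - alpha) for tau in taus]
print(observed_orders(errors))
```

For these synthetic errors each entry is approximately $$3 - \alpha = 2.5$$; with computed errors the entries only approach $$3 - \alpha$$ as $$\tau$$ decreases.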

### Example 2

Consider

\begin{aligned}&_0^R D_{t}^{\alpha } u (x, t) - \frac{\partial ^2 u(x, t) }{\partial x^2} = f(x, t), \quad t \in [0, T], \; 0< x<1, \end{aligned}
(53)
\begin{aligned}&u(x, 0) = 0, \quad 0< x <1, \end{aligned}
(54)
\begin{aligned}&u(0, t) = u(1, t)= 0, \quad t \in [0, T], \end{aligned}
(55)

where $$u(x, t) = e^{x} t^{4+\alpha }$$ and $$f(x, t) = e^{x} \frac{\varGamma (5+ \alpha )}{24} t^{4} - t^{4+ \alpha } e^{x}$$.

We use the same notations as in the experiments in Example 1. In Table 2, we observe that the numerical results are consistent with the theoretical results.

Next we consider two examples in the two-dimensional case. Let us first introduce the finite element algorithm for solving the following time fractional partial differential equation in two dimensions:

\begin{aligned}&_0^C D_{t}^{\alpha } u (x, t) - \varDelta u (x, t) = f(x, t), \quad t \in [0, T], \; x \in \varOmega , \end{aligned}
(56)
\begin{aligned}&u(x, 0) = u_{0}(x), \quad x \in \varOmega , \end{aligned}
(57)
\begin{aligned}&\frac{\partial u(x, t)}{\partial n} = \kappa ( u(x,t) - q (x, t)), \quad t \in [0, T], x \in \partial \varOmega , \end{aligned}
(58)

where $$\varOmega = (0, 1) \times (0, 1)$$, $$\frac{\partial u(x, t)}{\partial n}$$ denotes the normal derivative on the boundary $$\partial \varOmega$$, $$q(x, t)$$ is a given function defined for $$x \in \partial \varOmega$$ and $$t \in (0, T)$$, and $$\kappa$$ is a constant. When $$\kappa$$ is sufficiently large, (58) approximates the Dirichlet boundary condition $$u = q$$ on $$\partial \varOmega$$. In our numerical examples below, we only consider Dirichlet boundary conditions.

The variational form of (56)–(58) is to find $$u(t) \in H^{1}(\varOmega )$$ such that

\begin{aligned} \left( _0^C D_{t}^{\alpha } u(t), v\right) _{L^{2}(\varOmega )}&+ ( \nabla u (t), \nabla v )_{L^{2}(\varOmega )} + ( \kappa u(t), v)_{L^{2}(\partial \varOmega )} \\&= ( \kappa q, v)_{L^{2}(\partial \varOmega )} + (f (t), v)_{L^{2}(\varOmega )}, \quad \forall \; v \in H^{1}(\varOmega ). \nonumber \end{aligned}
(59)

Let $$x= (x_{1}, x_{2})$$. Let $$0= x_{1}^{0}< x_{1}^{1}< \dots < x_{1}^{M1}=1$$ be a partition of [0, 1] on the $$x_{1}$$ axis with step size h. Similarly, let $$0= x_{2}^{0}< x_{2}^{1}< \dots < x_{2}^{M1}=1$$ be a partition of [0, 1] on the $$x_{2}$$ axis; for simplicity of notation, we use the same step size h on both axes. We divide the domain $$\varOmega$$ into small triangles of equal size.

Let $$S_{h}$$ denote the linear finite element space defined on $$\varOmega$$. Let $$P_{0}, P_{1}, \dots , P_{N_{h}}$$ denote the nodes of the triangulation of $$\varOmega$$, and let $$\varphi _{j}$$ be the linear basis function corresponding to the node $$P_{j}, j=0, 1, 2, \dots , N_{h}$$. The finite element method is to find $$u_{h} \in S_{h}$$ such that

\begin{aligned} \left( _0^C D_{t}^{\alpha } u_{h}(t), \chi \right) _{L^{2}(\varOmega )}&+ ( \nabla u_{h} (t), \nabla \chi )_{L^{2}(\varOmega )} + (\kappa u_{h}(t) , \chi )_{L^{2}(\partial \varOmega )} \\ =&(\kappa q(t), \chi )_{L^{2}(\partial \varOmega )} + (f (t), \chi )_{L^{2}(\varOmega )}, \quad \forall \; \chi \in S_{h}. \nonumber \end{aligned}
(60)
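The $$L^{2}(\varOmega )$$ inner products of basis functions and of their gradients that arise from (60) are usually assembled triangle by triangle. The following minimal sketch computes the local mass and stiffness matrices of the linear element on a single triangle, using the standard closed-form expressions; the function name `p1_triangle_matrices` and the reference triangle in the usage lines are illustrative assumptions, and the global assembly and the boundary terms are not shown.

```python
def p1_triangle_matrices(p0, p1, p2):
    """Local mass and stiffness matrices of the linear (P1) element on a triangle."""
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    det = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)   # twice the signed area
    area = 0.5 * abs(det)
    # Gradients of the three nodal basis functions (constant on the triangle)
    grads = [((y1 - y2) / det, (x2 - x1) / det),
             ((y2 - y0) / det, (x0 - x2) / det),
             ((y0 - y1) / det, (x1 - x0) / det)]
    stiff = [[area * (gi[0] * gj[0] + gi[1] * gj[1]) for gj in grads]
             for gi in grads]
    # Exact mass matrix of linear basis functions: area/12 * (1 + delta_ij)
    mass = [[area / 12.0 * (2.0 if i == j else 1.0) for j in range(3)]
            for i in range(3)]
    return mass, stiff

mass, stiff = p1_triangle_matrices((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
print(mass)
print(stiff)
```

Two quick consistency checks are that the entries of the local mass matrix sum to the area of the triangle and that each row of the local stiffness matrix sums to zero.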

Let $$0= t_{0}< t_{1}< \dots < t_{2M}=T$$ be the time partition of [0, T] and $$\tau$$ be the step size. Below we only consider the time discretization at the even nodes $$t_{n}, n=2, 4, \dots , 2M$$, to illustrate the algorithm. (The discretization at the odd nodes $$t_{n}, n=1, 3, \dots , 2M-1$$, can be treated similarly.) By (18), we have

\begin{aligned} _0^R D_{t}^{\alpha } u(t_{n}) = \tau ^{-\alpha } \sum _{k=0}^{n} w_{k, n} u(t_{n-k}) + O(\tau ^{3-\alpha }), \end{aligned}

with some suitable weights $$w_{k, n}, k=0, 1, \dots , n$$.

Define the following time discretization scheme: find $$U^{n} \approx u_{h}(t_{n}), n=2, 4, \dots , 2M$$, such that, noting that $$\, _0^C D_{t}^{\alpha } u(t_{n}) =\, _0^R D_{t}^{\alpha } ( u(t_{n}) - u_{0})$$,

\begin{aligned}&\tau ^{-\alpha } \big ( \sum _{k=0}^{n} w_{k, n} U^{n-k} , \chi \big )_{L^{2}(\varOmega )} + ( \nabla U^{n}, \nabla \chi )_{L^{2}(\varOmega )} + ( \kappa U^{n}, \chi )_{L^{2}(\partial \varOmega )} \\&\quad = ( \kappa q(t_{n}), \chi )_{L^{2}(\partial \varOmega )} + (f (t_{n}), \chi )_{L^{2}( \varOmega )} + \frac{t_{n}^{-\alpha }}{\varGamma (1-\alpha )} (u_{0}, \chi )_{L^{2}( \varOmega )}, \quad \forall \; \chi \in S_{h}, \nonumber \end{aligned}
(61)

or

\begin{aligned}&(U^{n}, \chi )_{L^{2}( \varOmega )} + \frac{1}{w_{0,n}} \tau ^{\alpha } (\nabla U^{n}, \nabla \chi )_{L^{2}( \varOmega )} + \frac{1}{w_{0,n}} \tau ^{\alpha } \kappa ( U^{n}, \chi )_{L^{2}(\partial \varOmega )} \nonumber \\&\quad = \frac{1}{w_{0,n}} \tau ^{\alpha } (f(t_{n}), \chi )_{L^{2}( \varOmega )} + \frac{1}{w_{0,n}} \tau ^{\alpha } \kappa (q(t_{n}), \chi )_{L^{2}(\partial \varOmega )} \nonumber \\&\quad \quad - \frac{1}{w_{0,n}} \sum _{k=1}^{n} w_{k, n} (U^{n-k}, \chi )_{L^{2}( \varOmega )} + \frac{1}{w_{0,n}} \tau ^{\alpha } \frac{t_{n}^{-\alpha }}{\varGamma (1-\alpha )} (u_{0}, \chi )_{L^{2}( \varOmega )}, \quad \forall \; \chi \in S_{h}. \nonumber \end{aligned}

Let $$U^{n} = \sum _{j=0}^{N_{h}} \alpha _{j}^{n} \varphi _{j}$$ be the approximate solution of $$u_{h}(t_{n}), n=1, 2, \dots , 2M$$. Choosing $$\chi = \varphi _{l}, l=0, 1, \dots , N_{h}$$, we have

\begin{aligned}&(U^{n}, \varphi _{l} )_{L^{2}( \varOmega )} + \frac{1}{w_{0,n}} \tau ^{\alpha } (\nabla U^{n}, \nabla \varphi _{l})_{L^{2}( \varOmega )} + \frac{1}{w_{0,n}} \tau ^{\alpha } \kappa ( U^{n}, \varphi _{l} )_{L^{2}(\partial \varOmega )} \nonumber \\&\quad = \frac{1}{w_{0,n}} \tau ^{\alpha } (f(t_{n}), \varphi _{l})_{L^{2}( \varOmega )} + \frac{1}{w_{0,n}} \tau ^{\alpha } \kappa (q(t_{n}), \varphi _{l})_{L^{2}(\partial \varOmega )} \nonumber \\&\quad \quad - \frac{1}{w_{0,n}} \sum _{k=1}^{n} w_{k, n} (U^{n-k}, \varphi _{l} )_{L^{2}( \varOmega )} + \frac{1}{w_{0,n}} \tau ^{\alpha } \frac{t_{n}^{-\alpha }}{\varGamma (1-\alpha )} (u_{0}, \varphi _{l})_{L^{2}( \varOmega )}, \quad l=0, 1, \dots , N_{h}. \nonumber \end{aligned}

Denote

\begin{aligned} \mathbf M= & {} (\varphi _{j}, \varphi _{l})_{L^{2}(\varOmega )}, \quad \mathbf S = ( \nabla \varphi _{j}, \nabla \varphi _{l})_{L^{2}(\varOmega )}, \quad \mathbf K = ( \kappa \varphi _{j}, \varphi _{l})_{L^{2}(\partial \varOmega )},\\ \mathbf G= & {} ( \kappa q(t_{n}), \varphi _{l})_{L^{2}(\partial \varOmega )}, \quad \mathbf F = ( f(t_{n}), \varphi _{l})_{L^{2}(\varOmega )}, \quad \alpha ^n = ( \alpha _{j}^{n}). \end{aligned}

We have the matrix form

\begin{aligned} \big ( \mathbf M + \frac{\tau ^{\alpha }}{w_{0,n}} \mathbf S + \frac{\tau ^{\alpha }}{w_{0,n}} \mathbf K \big ) \alpha ^n&=\frac{\tau ^{\alpha }}{w_{0,n}} \ \Big ( \mathbf F + \mathbf G + \frac{t_{n}^{-\alpha }}{\varGamma (1- \alpha )} (\mathbf M *\alpha ^{0}) \Big ) \nonumber \\&- \frac{1}{w_{0,n}} \sum _{k=1}^{n} w_{k, n} (\mathbf M *\alpha ^{n-k}). \nonumber \end{aligned}

Solving this system, we get the finite element solution $$\alpha ^{n}, n=2, 4, \dots , 2M$$.
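One step of the matrix form above can be sketched as follows. This is a minimal illustration under stated assumptions: the matrices are small and dense (nested lists), the weights $$w_{k, n}$$ are passed in as a list `w` because their formulas from (18) are not reproduced here, and the helper names `solve_dense`, `matvec` and `time_step` as well as the sample numbers are hypothetical.

```python
import math

def solve_dense(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def matvec(A, v):
    """Dense matrix-vector product."""
    return [sum(a * vj for a, vj in zip(row, v)) for row in A]

def time_step(M, S, K, F, G, history, w, tau, alpha, t_n):
    """Return alpha^n; history = [alpha^0, ..., alpha^{n-1}], w = [w_{0,n}, ..., w_{n,n}]."""
    n, size = len(history), len(F)
    scale = tau ** alpha / w[0]
    lhs = [[M[i][j] + scale * (S[i][j] + K[i][j]) for j in range(size)]
           for i in range(size)]
    m_a0 = matvec(M, history[0])
    init = t_n ** (-alpha) / math.gamma(1.0 - alpha)    # factor of the u_0 term
    rhs = [scale * (F[i] + G[i] + init * m_a0[i]) for i in range(size)]
    for k in range(1, n + 1):                           # history (memory) term
        m_ak = matvec(M, history[n - k])
        for i in range(size):
            rhs[i] -= w[k] / w[0] * m_ak[i]
    return solve_dense(lhs, rhs)

# Toy 1x1 system with made-up weights (not the weights from (18))
a2 = time_step(M=[[1.0]], S=[[1.0]], K=[[0.0]], F=[2.0], G=[0.0],
               history=[[0.0], [0.0]], w=[1.0, -1.0, 0.0],
               tau=1.0, alpha=0.5, t_n=2.0)
print(a2)
```

In a realistic computation the matrices are sparse, so a sparse direct or iterative solver would replace `solve_dense`; note also that the sum over $$k$$ makes the cost of step $$n$$ grow linearly with $$n$$.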

### Example 3

Consider

\begin{aligned}&_0^C D_{t}^{\alpha } u (x, t) - \varDelta u (x, t) = f(x, t), \quad t \in [0, T], \; x \in \varOmega , \end{aligned}
(62)
\begin{aligned}&u(x, 0) = u_{0}(x), \quad x \in \varOmega , \end{aligned}
(63)
\begin{aligned}&\frac{\partial u(x, t)}{\partial n} = \kappa ( u(x,t) - q (x, t)), \quad t \in [0, T], x \in \partial \varOmega , \end{aligned}
(64)

where $$\varOmega = (0, 1) \times (0, 1)$$. The exact solution is $$u(x, t) = t^{m} \sin (2 \pi x_{1}) \sin (2 \pi x_{2})$$ for some $$m >0$$ and

\begin{aligned} f(x, t) = \frac{\varGamma (m+1)}{\varGamma (m+1-\alpha )} t^{m-\alpha } \sin (2 \pi x_{1}) \sin (2 \pi x_{2}) + t^{m} (8 \pi ^2) \sin (2 \pi x_{1}) \sin (2 \pi x_{2}). \end{aligned}

Here $$q(x, t) = t^{m} \sin (2 \pi x_{1}) \sin (2 \pi x_{2})$$ and $$u_{0} (x) =0$$. In Table 3, we choose $$m=3.5$$.

We consider the Dirichlet boundary condition and therefore choose $$\kappa =10000$$ in our numerical simulation. For various choices of $$\alpha \in (0,1)$$, we compute the errors at $$T=1$$. We use the linear finite element space $$S_{h}$$ with the space step size $$h = 1/2^6$$, which is sufficiently small that the error is dominated by the time discretization. We choose the time step size $$\tau = 1/2^{l}, l=3, 4, 5, 6,7$$, i.e., we divide the interval [0, T] into $$2M=1/\tau$$ subintervals with nodes $$0=t_{0}< t_{1}< \dots < t_{2M}=1$$. In Table 3, we report the computed orders of convergence for different values of $$\alpha$$. The numerical results are consistent with the theoretical results in Theorem 2.

### Example 4

Consider

\begin{aligned}&_0^C D_{t}^{\alpha } u (x, t) - \varDelta u (x, t) = f(x, t), \quad t \in [0, T], \; x \in \varOmega , \end{aligned}
(65)
\begin{aligned}&u(x, 0) = u_{0}(x), \quad x \in \varOmega , \end{aligned}
(66)
\begin{aligned}&\frac{\partial u(x, t)}{\partial n} = \kappa ( u(x,t) - q (x, t)), \quad t \in [0, T], x \in \partial \varOmega , \end{aligned}
(67)

where $$\varOmega = (0, 1) \times (0, 1)$$. The exact solution is $$u(x, t) = e^{x_{1} + x_{2}} t^{4+\alpha }$$ and $$f(x, t) = e^{x_{1} +x_{2}} \big ( \frac{\varGamma (5+ \alpha )}{24} t^{4} \big ) - t^{4+ \alpha } \big (2 e^{x_{1} + x_{2}}\big )$$. Here $$q(x, t) =e^{x_{1} +x_{2}} t^{4+\alpha }$$ and $$u_{0} (x) =0$$.

We use the same notations as in the experiments in Example 3. In Table 4, we observe that the numerical results are consistent with the theoretical results.