An Analysis of the Rayleigh-Stokes problem for a Generalized Second-Grade Fluid

We study the Rayleigh-Stokes problem for a generalized second-grade fluid which involves a Riemann-Liouville fractional derivative in time, and present an analysis of the problem in the continuous, space semidiscrete and fully discrete formulations. We establish the Sobolev regularity of the homogeneous problem for both smooth and nonsmooth initial data $v$, including $v\in L^2(\Omega)$. A space semidiscrete Galerkin scheme using continuous piecewise linear finite elements is developed, and optimal with respect to initial data regularity error estimates for the finite element approximations are derived. Further, two fully discrete schemes based on the backward Euler method and second-order backward difference method and the related convolution quadrature are developed, and optimal error estimates are derived for the fully discrete approximations for both smooth and nonsmooth initial data. Numerical results for one- and two-dimensional examples with smooth and nonsmooth initial data are presented to illustrate the efficiency of the method, and to verify the convergence theory.


Introduction
In this paper, we study the homogeneous Rayleigh-Stokes problem for a generalized second-grade fluid with a fractional derivative model. Let ⊂ R d (d = 1, 2, 3) be a convex polyhedral domain with its boundary being ∂ , and T > 0 be a fixed time. Then the mathematical model is given by where γ > 0 is a fixed constant, v is the initial data, ∂ t = ∂/∂t, and ∂ α t is the Riemann-Liouville fractional derivative of order α ∈ (0, 1) defined by [11,24]: The Rayleigh-Stokes problem (1.1) has received considerable attention in recent years. The fractional derivative ∂ α t in the model is used to capture the viscoelastic behavior of the flow; see e.g. [5,28] for derivation details. The model (1.1) plays an important role in describing the behavior of some non-Newtonian fluids.
In order to gain insights into the behavior of the solution of this model, there has been substantial interest in deriving a closed form solution for special cases; see, e.g. [5,28,32]. For example, Shen et al. [28] obtained the exact solution of the problem using the Fourier sine transform and fractional Laplace transform. Zhao and Yang [32] derived exact solutions using the eigenfunction expansion on a rectangular domain for the case of homogeneous initial and boundary conditions. The solutions obtained in these studies are formal in nature, and especially the regularity of the solution has not been studied. In Sect. 2 below, we fill this gap and establish the Sobolev regularity of the solution for both smooth and nonsmooth initial data. We would like to mention that Girault and Saadouni [7] analyzed the existence and uniqueness of a weak solution of a closely related time-dependent grade-two fluid model.
The exact solutions obtained in these studies involve infinite series and special functions, e.g., generalized Mittag-Leffler functions, and thus are inconvenient for numerical evaluation. Further, closed-form solutions are available only for a restricted class of problem settings. Hence, it is imperative to develop efficient and optimally accurate numerical algorithms for problem (1.1). This was considered earlier in [1,2,12,21,31]. Chen et al. [1] developed implicit and explicit schemes based on the finite difference method in space and the Grünwald-Letnikov discretization of the time fractional derivative, and analyzed their stability and convergence rates using the Fourier method. Of the same flavor is the work [2], where a scheme based on Fourier series expansion was considered. Wu [31] developed an implicit numerical approximation scheme by transforming problem (1.1) into an integral equation, and showed its stability and convergence by an energy argument. Lin and Jiang [12] described a method based on the reproducing kernel Hilbert space. Recently, Mohebbi et al. [21] compared a compact finite difference method with the radial basis function method. In all these studies, however, the error estimates were obtained under the assumption that the solution to (1.1) is sufficiently smooth and the domain is a rectangle. Hence the interesting cases of nonsmooth data (the initial data or the right hand side) and general domains are not covered.
Theoretical studies on numerical methods for differential equations involving fractional derivatives have received considerable attention in the last decade. McLean and Mustapha [18,22] analyzed piecewise constant and piecewise linear discontinuous Galerkin method in time, and derived error estimates for smooth initial data; see also [23] for related superconvergence results. In [8,10], a space semidiscrete Galerkin finite element method (FEM) and lumped mass method for problem C ∂ α t u + Au = 0 with u(0) = v (with A being an elliptic operator, and C ∂ α t being the Caputo derivative) has been analyzed. Almost optimal error estimates were established for initial data v ∈Ḣ q ( ), −1 ≤ q ≤ 2, (see Sect. 2 below for the definition) by exploiting the properties of the two-parameter Mittag-Leffler function. Note that this includes weak (nonsmooth), v ∈ L 2 ( ), and very weak data, v ∈Ḣ −1 ( ). In [19,Section 4], McLean and Thomée studied the following equation ∂ t u + ∂ −α t Au = f (with ∂ −α t being Riemann-Liouville integral and derivative operator for α ∈ (0, 1) and α ∈ (−1, 0), respectively), and derived L 2 ( )-error estimates for the space semidiscrete scheme for both v ∈ L 2 ( ) and v ∈Ḣ 2 ( ) (and suitably smooth f ) and some fully discrete schemes based on Laplace transform were discussed. The corresponding L ∞ ( ) estimates for data v ∈ L ∞ ( ) and Av ∈ L ∞ ( ) were derived in [20]. Lubich et al. [15] developed two fully discrete schemes for the problem ∂ t u + ∂ −α t Au = f with u(0) = v and 0 < α < 1 based on the convolution quadrature of the fractional derivative term, and derived optimal error estimates for nonsmooth initial data and right hand side. Cuesta et al. [4] considered the semi-linear counterpart of the model with convolution quadrature, which covers also the fractional diffusion case, i.e., −1 < α < 0, and provided a unified framework for the error analysis with optimal error estimates in an abstract Banach space setting.
In this paper we develop a Galerkin FEM for problem (1.1) and derive optimal with respect to data regularity error estimates for both smooth and nonsmooth initial data. The approximation is based on the finite element space X h of continuous piecewise linear functions over a family of shape regular quasi-uniform partitions {T h } 0<h<1 of the domain into d-simplexes, where h is the maximum diameter. The semidiscrete Galerkin FEM for problem (1.1) is: find u h (t) ∈ X h such that where a(u, w) = (∇u, ∇w) for u, w ∈ H 1 0 ( ), and v h ∈ X h is an approximation of the initial data v. Our default choices are the L 2 ( ) projection v h = P h v, assuming v ∈ L 2 ( ), and the Ritz projection v h = R h v, assuming v ∈Ḣ 2 ( ). Further, we develop two fully discrete schemes based on the backward Euler method and the second-order backward difference method and the related convolution quadrature for the fractional derivative term, which achieves respectively first and second-order accuracy in time. Error estimates optimal with respect to data regularity are provided for both semidiscrete and fully discrete schemes.
Our main contributions are as follows. First, in Theorem 2.1, using an operator approach from [25], we develop the theoretical foundations for our study by establishing the smoothing property and decay behavior of the solution to problem (1.1). Second, for both smooth initial data v ∈Ḣ 2 ( ) and nonsmooth initial data v ∈ L 2 ( ), we derive error estimates for the space semidiscrete scheme, cf. Theorems 3.1 and 3.2: The estimate for v ∈ L 2 ( ) deteriorates as t approaches 0. The error estimates are derived following an approach due to Fujita and Suzuki [6]. Next, in Theorems 4.1 and 4.2 we establish optimal L 2 ( ) error estimates for the two fully discrete schemes. The proof is inspired by the fundamental work of Cuesta et al. [4], which relies on known error estimates for convolution quadrature and bounds on the convolution kernel. We show for example, that the discrete solution U n h by the backward Euler method (on a uniform grid in time with a time step size τ ) satisfies the following a priori error bound A similar estimate holds for the second-order backward difference method. The rest of the paper is organized as follows. In Sect. 2 we establish the Sobolev regularity of the solution. In Sect. 3, we analyze the space semidiscrete scheme, and derive optimal error estimates for both smooth and nonsmooth initial data. Then in Sect. 4, we develop two fully discrete schemes based on convolution quadrature approximation of the fractional derivative. Optimal error estimates are provided for both schemes. Finally in Sect. 5, numerical results for one-and two-dimensional examples are provided to illustrate the convergence theory. Throughout, the notation c denotes a constant which may differ at different occurrences, but it is always independent of the solution u, mesh size h and time step-size τ .

Regularity of the solution
In this section, we establish the Sobolev regularity of the solution to (1.1) in the homogeneous case f ≡ 0. We first recall preliminaries on the elliptic operator and function spaces. Then we derive the proper solution representation, show the existence of a weak solution, and establish the Sobolev regularity of the solution to the homogeneous problem. The main tool is the operator theoretic approach developed in [25]. Further, we give an alternative solution representation via eigenfunction expansion, and derive qualitative properties of the time-dependent components.

Preliminaries
First we introduce some notation. For q ≥ −1, we denote byḢ q ( ) ⊂ H −1 ( ) the Hilbert space induced by the norm with (·,·) denoting the inner product in L 2 ( ) and {λ j } ∞ j=1 and {ϕ j } ∞ j=1 being respectively the Dirichlet eigenvalues and eigenfunctions of − on the domain . As usual, we identify a function f in For δ > 0 and θ ∈ (0, π) we introduce the contour δ,θ defined by where the circular arc is oriented counterclockwise, and the two rays are oriented with an increasing imaginary part. Further, we denote by θ the sector θ = {z ∈ C; z = 0, | arg z| < θ}.
We recast problem (1.1) with f ≡ 0 into a Volterra integral equation by integrating both sides of the governing equation in (1.1) where the kernel k(t) is given by and the operator A is defined by A = − with a domain D(A) = H 1 0 ( ) ∩ H 2 ( ). The H 2 ( ) regularity of the elliptic problem is essential for our discussion, and it follows from the convexity assumption on the domain . It is well known that the operator −A generates a bounded analytic semigroup of angle π/2, i.e., for any θ ∈ (0, π/2) Meanwhile, applying the Laplace transform to (2.1) yields i.e., u(z) = H (z)v, with the kernel H (z) given by where k is the Laplace transform of the function k(t). Hence, by means of the inverse Laplace transform, we deduce that the solution operator S(t) is given by where δ > 0, θ ∈ (0, π/2). First we state one basic estimate about the kernel g(z) = z/(1 + γ z α ).

A priori estimates of the solution
Now we can state the regularity to problem (1.1) with f ≡ 0.
Moreover, the following stability estimates hold for t ∈ (0, T ] and ν = 0, 1: where c, c T > 0 are constants depending on d, , α, γ, M and m, and the constant c T also depends on T . Proof By Lemma 2.1 and (2.2) we obtain Then by [25, Theorem 2.1 and Corollary 2.4], for any v ∈ L 2 ( ) there exists a unique solution u of (2.1) and it is given by It remains to show the estimates. Let t > 0, θ ∈ (0, π/2), δ > 0. We choose δ = 1/t and denote for short Next we prove estimate (2.9) for ν = 1 and m ≥ 0. By applying the operator A to both sides of (2.4) and differentiating we arrive at (2.14) Using the identity it follows from (2.12) and Lemma 2.1 that By taking AH(z) ≤ c|z| −α , we obtain from (2.14) This shows estimate (2.9). To prove estimate (2.10) with ν = 0 we observe that

Now by noting the identity
and the fact that z m−1 e zt dz = 0 for m ≥ 1, we have By (2.11) we obtain and thus using this estimate, we get Lastly, note that (2.10) with ν = 1 is equivalent to (2.9) with ν = 0 and v replaced by Av.
Remark 2. 1 We note that this argument is applicable to any sectorial operator A, including the Riemann-Liouville fractional derivative operator in space [9].
Further, the estimates in Theorem 2.1 imply the following result by interpolation.

Further discussions on the behavior of the solution
The estimate (2.9) holds for any t > 0. However, in the case ν = 1 and m = 0 we can improve this estimate for large t > 0. Namely, if we apply the bound AH(z) ≤ M from (2.15) in the estimate of (2.14), we get the following sharper bound for large t: which is sharper than (2.9) for large t. This bound together with (2.9) with ν = 0, m = 1, imply the following a priori estimate for the solution of problem (1.1): Further, by applying eigenfunction expansion, the solution of the Rayleigh-Stokes problem (1.1) can be written in the form where f j (t) = ( f (., t), ϕ j ) and u j (t) satisfies the following equation: To solve (2.17) we apply Laplace transform and use the identities which hold for functions u(t), continuous for t > 0, and such that u(0) is finite [16, equation (1.15)]. In this way, for the Laplace transform of u j (t), one arrives at Based on this representation, in the next theorem we summarize some properties of the time-dependent components u j (t), which are useful in the study of the solution behavior, including the inhomogeneous problem.
Recall that a function u(t) is said to be completely monotone if and only if

Theorem 2.2
The functions u j (t), j = 1, 2, . . . , have the following properties: where the constant c does not depend on j and t.
Proof We introduce the auxiliary functions v j (t) defined by their Laplace transforms By the property of the Laplace transform u(0) = lim z→+∞ z u(z) we obtain u j (0) = 1 and v j (0) = 1. Further, taking the inverse Laplace transform of (2.17), we get where Br = {z; z = σ, σ > 0} is the Bromwich path [30]. The function under the integral has a branch point 0, so we cut off the negative part of the real axis. Note that the function z + γ λ j z α + λ j has no zero in the main sheet of the Riemann surface including its boundaries on the cut. Indeed, if z = e iθ , with > 0, θ ∈ (−π, π), then since sin θ and sin αθ have the same sign and λ j , γ > 0. Hence, u j (t) can be found by bending the Bromwich path into the Hankel path Ha(ε), which starts from −∞ along the lower side of the negative real axis, encircles the disc |z| = ε counterclockwise and ends at −∞ along the upper side of the negative real axis. By taking ε → 0 we obtain Since α ∈ (0, 1), and λ j , γ > 0, there holds K j (r ) > 0 for all r > 0. Hence, by Bernstein's theorem, u j (t) are completely monotone functions. In particular, they are positive and monotonically decreasing. This shows the first two assertions.
In the same way we prove that the functions v j (t) are completely monotone and hence 0 < v j (t) ≤ 1. By (2.18), and (2.20), Last, using the representation where the function g(z) is defined as in (2.3), the last assertion follows by applying the argument from the proof of Theorem 2.1 with A replaced by λ j > 0 and using the following estimate analogous to (2.15): This completes the proof of the proposition.
By Theorem 2.1, for any α ∈ (0, 1), the solution operator S has a smoothing property in space of order two. In the limiting case α = 1, however, it does not have any smoothing property. To see this, we consider the eigenfunction expansion: (2.21) In the case α = 1 we deduce from (2.17) and (2.18) This shows that the problem does not have smoothing property.

Remark 2.4
We observe that if v ∈ L 2 ( ), then u(t) Ḣ 2 ( ) behaves like t α−1 as t → 0. This behavior is the identical with that of the solution to the subdiffusion equation; see [17,Theorem 4.1] and [26, Theorem 2.1]. However, as t → ∞, u(t) Ḣ 2 ( ) decays like t −1 , as in the case of standard diffusion equation. The solution u(t) of (1.1) decays like t −1 for t → ∞. This is faster than t α−1 , the decay of the solution to subdiffusion equation [26, Corollary 2.6], but much slower than the exponential decay for the diffusion equation.
We may extend Theorem 2.1 to the case of very weak initial data, i.e., v ∈Ḣ q ( ) with −1 < q < 0. Obviously, for any t > 0 the function u(t) = S(t)v satisfies Eq. (1.1) in the sense ofḢ q ( ). Then we appeal to the expansion (2.21). Repeating the argument of Theorem 2.1 yields Hence, the function u(t) = S(t)v satisfies (1.1) and for t → 0 converges to v iṅ H q ( ), i.e., u(t) = S(t)v does represent a solution. Further, the argument of Theorem 2.1 yields u(t) = S(t)v ∈Ḣ 2+q ( ) for any t > 0.

Semidiscrete Galerkin finite element method
In this section we consider the space semidiscrete finite element approximation and derive optimal error estimates for the homogeneous problem.

Semidiscrete Galerkin scheme
First we recall the L 2 ( )-orthogonal projection P h : L 2 ( ) → X h and the Ritz projection R h : H 1 0 ( ) → X h , respectively, defined by For ϕ ∈Ḣ −s ( ) for 0 < s ≤ 1, the L 2 ( )-projection P h is not well-defined. Nonetheless, one may view (ϕ, χ ) for χ ∈ X h ⊂Ḣ s as the duality pairing between the spacesḢ s ( ) andḢ −s ( ) and define P h in the same manner.
The Ritz projection R h and the L 2 -projection P h have the following properties.

Lemma 3.1 Let the mesh X h be quasi-uniform. Then the operators R h and P h satisfy:
In addition, P h is stable onḢ q ( ) for −1 ≤ q ≤ 1.

Upon introducing the discrete Laplacian
and f h = P h f , we may write the spatially discrete problem (1.2) as to find u h ∈ X h such that where v h ∈ X h is a suitable approximation to the initial condition v. Accordingly, the solution operator S h (t) for the semidiscrete problem (1.2) is given by where is the contour defined in (2.13) and A h = − h . Further, with the eigenpairs The stability of the operator S h (t) is given below. The proof is similar to that of Theorem 2.1, and hence omitted.

Lemma 3.2 Let S h (t) be defined by (3.2) and v h ∈ X h . Then
where for m = 0 and 0 ≤ q ≤ p ≤ 2 or m > 0 and 0 ≤ p, q ≤ 2. Now we derive error estimates for the semidiscrete Galerkin scheme (3.2) using an operator trick, following the interesting work of Fujita and Suzuki [6]. We note that similar estimates follow also from the technique in [10], but at the expense of an additional logarithmic factor | ln h| in the case of nonsmooth initial data.
The following lemma plays a key role in deriving error estimates.
The next lemma shows an error estimate between (g(z)I + A) −1 v and its discrete analogue (3.5) Proof By the definition, w and w h respectively satisfy Subtracting these two identities yields the following orthogonality relation for the error e = w − w h : g(z)(e, χ) + (∇e, ∇χ) = 0, ∀χ ∈ X h . (3.6) This and Lemma 3.3 imply that for any χ ∈ X h By taking χ = π h w, the Lagrange interpolant of w, and using the Cauchy-Schwarz inequality, we arrive at Appealing again to Lemma 3.3 with the choice ϕ = w, we obtain Consequently In view of (3.8), a bound on w Ḣ 2 ( ) can be derived It follows from this and (3.7) that and this yields This gives the desired bound on ∇e L 2 ( ) . Next, we derive the estimate on e L 2 ( ) by a duality argument. For ϕ ∈ L 2 ( ), by setting Then the desired estimate follows from (3.6) and (3.9) by This completes proof of the lemma.

Error estimates for the semidiscrete scheme
Now we can state the error estimate for the nonsmooth initial data v ∈ L 2 ( ).

Theorem 3.1 Let u and u h be the solutions of problem (1.1) and (3.2)
with v ∈ L 2 ( ) and v h = P h v, respectively. Then for t > 0, there holds:

Proof The error e(t) := u(t) − u h (t) can be represented as
A similar argument also yields the L 2 ( )-estimate.
Next we turn to the case of smooth initial data, i.e., v ∈Ḣ 2 ( ) and v h ∈ R h v. We take again contour = 1/t,π −θ . Then the error e(t) = u(t) − u h (t) can be represented as By the equality we can obtain Then we derive the following error estimate.

Theorem 3.2 Let u and u h be the solutions of problem (1.1) and (3.2) with v ∈Ḣ 2 ( )
and v h = R h v, respectively. Then for t > 0, there holds: Now it follows from this and the representation (3.10) that Hence we obtain the L 2 ( )-error estimate. The H 1 ( )-error estimate follows analogously.
Remark 3.1 For smooth initial data v ∈Ḣ 2 ( ), we may also take the approximation v h = P h v. Then the error can be split into Theorem 3.2 gives an estimate of the first term. A bound for the second term follows from Lemmas 3.1 and 3.2 Thus the error estimate (3.11) holds for the initial approximation v h = P h v. It follows from this, Theorem 3.1, and interpolation that for all q ∈ [0, 2] and v h = P h v, there holds Remark 3.2 If the initial data is very weak, i.e., v ∈Ḣ q ( ) with −1 < q < 0, Then the argument of [8, Theorem 2] yields the following optimal error estimate for the semidiscrete finite element approximation (1.2) (3.12)

Fully discrete schemes
Now we develop two fully discrete schemes for problem (1.1) based on convolution quadrature (see [4,[13][14][15] for detailed discussions), and derive optimal error estimates for both smooth and nonsmooth initial data.

Convolution quadrature
First we briefly describe the abstract framework in [4, Sections 2 and 3], which is instrumental in the development and analysis of fully discrete schemes. Let K be a complex valued or operator valued function that is analytic in a sector π −θ , θ ∈ (0, π/2) and is bounded by for some real numbers μ and M. Then K (z) is the Laplace transform of a distribution k on the real line, which vanishes for t < 0, has its singular support empty or concentrated at t = 0, and which is an analytic function for t > 0. For t > 0, the analytic function k(t) is given by the inversion formula where is a contour lying in the sector of analyticity, parallel to its boundary and oriented with increasing imaginary part. With ∂ t being time differentiation, we define K (∂ t ) as the operator of (distributional) convolution with the kernel k : K (∂ t )g = k * g for a function g(t) with suitable smoothness. A convolution quadrature approximates K (∂ t )g(t) by a discrete convolution K (∂ τ )g(t). Specifically, we divide the time interval [0, T ] into N equal subintervals with a time step size τ = T /N , and define the approximation: where the quadrature weights {ω j } ∞ j=0 are determined by the generating function Here δ is the quotient of the generating polynomials of a stable and consistent linear multistep method. In this work, we consider the backward Euler (BE) method and second-order backward difference (SBD) method, for which Now we specialize the construction to the semidiscrete problem (3.2). By integrating (3.2) from 0 to t, we arrive at a representation of the semidiscrete solution u h The left-hand side is a convolution, which we approximate at t n = nτ with U n h by where the symbols∂ α−1 τ and∂ −1 τ refer to relevant convolution quadrature generated by the respective linear multistep method. For the convenience of numerical implementation, we rewrite them in a time stepping form.

The backward Euler (BE) method
The BE method is given by: Find U n h for n = 1, 2, . . . , N such that with the convolution quadratures∂ α−1 τ and∂ −1 τ generated by the BE method. By applying∂ τ to the scheme (4.2) and the associativity of convolution, we deduce that it can be rewritten as: Remark 4.1 In the scheme (4.3), the term at n = 0 in∂ α τ A h U n h can be omitted without affecting its convergence rate [15,27].

The second-order backward difference (SBD) method
Now we turn to the SBD scheme. It is known that it is only first-order accurate if g(0) = 0, e.g., for g ≡ 1 [13, Theorem 5.1] [4, Section 3]. The first-order convergence is numerically also observed on problem (1.1). Hence, one needs to correct the scheme, and we follow the approach proposed in [4,15]. Using the identity we can rewrite the semidiscrete solution u h into . This leads to the convolution quadrature The purpose of keeping the operator ∂ −1 t intact in (4.4) is to achieve a second-order accuracy, cf. Lemma 4.4 below. Letting 1 τ = (0, 3/2, 1, . . .), and noting the identity 1 τ =∂ τ ∂ −1 1 at grid points t n , and associativity of convolution, (4.4) can be rewritten as Next by applying the operator∂ τ , we obtain Thus we arrive at a time stepping scheme: with U 0 h = v h , find U n h such that where the convolution quadrature∂ α τ ϕ n is given bỹ with the weights {ω α j } generated by the SBD method. The error analysis of the fully discrete schemes (4.3) and (4.5) for the case f ≡ 0 will be carried out below, following the general strategy in [4, Section 4].

Error analysis of the backward Euler method
Upon recalling the function g(z) from (2.3) and denoting we can write the difference between u h (t n ) and U n h as For the error analysis, we need the following estimate [13, Theorem 5.2].
Lemma 4.1 Let K (z) be analytic in π −θ and (4.1) hold. Then for g(t) = ct β−1 , the convolution quadrature based on the BE satisfies Now we can state the error estimate for nonsmooth initial data v ∈ L 2 ( ).

Lemma 4.2 Let u h and U n h be the solutions of problem (3.2) and (4.3)
with v ∈ L 2 ( ), U 0 h = v h = P h v and f ≡ 0, respectively. Then there holds Proof By (2.2) and the identity G(z) = g(z)(g(z)I + A h ) −1 for z ∈ π −θ , there holds Then (4.7) and Lemma 4.1 (with μ = 0 and β = 1) give and the desired result follows directly from the L 2 ( ) stability of P h .
Next we turn to smooth initial data, i.e., v ∈Ḣ 2 ( ). Proof With the identity and denoting G s (z) = −(g(z)I + A h ) −1 , the error U n h − u h (t n ) can be represented by From (2.2) and Lemma 2.1 we deduce and the desired estimate follows directly from the identity A h R h = P h A.
Remark 4.2 By Lemma 4.3, the error estimate exhibits a singular behavior of order t −α as t → 0 + , even for smooth initial data v ∈Ḣ 2 ( ). Nonetheless, as α → 0 + , problem (1.1) reduces to the standard parabolic equation, and accordingly the singular behavior disappears for smooth data, which coincides with the parabolic counterpart [29].
Now we can state error estimates for the fully discrete scheme (4.3) with smooth and nonsmooth initial data, by the triangle inequality, Theorems 3.1 and 3.2, Lemmas 4.2 and 4.3, respectively for the nonsmooth and smooth initial data.

Remark 4.3
For v ∈Ḣ 2 ( ), we can also choose v h = P h v. Let U n h be the corresponding solution of the fully discrete scheme with v h = P h v. By the stability of the scheme, a direct consequence of Lemma 4.3, we have Thus the estimate in Theorem 4.1(a) still holds for v h = P h v. Then by interpolation with the estimate for v ∈ L 2 ( ), we deduce

Remark 4.4
In case of very weak initial data, i.e., v ∈Ḣ q ( ) with −1 < q < 0, by Lemma 4.2, the inverse inequality [3, pp. 140] and Lemma 3.1 we have This and Remark 3.2 yield the following error estimate

Error analysis of the second-order backward difference method
With Now we can state the error estimate for nonsmooth initial data v ∈ L 2 ( ).

Lemma 4.5 Let u h and U n h be the solutions of problem (3.2) and (4.5) with
Proof By (2.2) and the identity there holds Then (4.8) and Lemma 4.4 (with μ = −1 and β = 2) give and the desired result follows directly from the L 2 ( ) stability of P h .
Next we turn to smooth initial data v ∈Ḣ 2 ( ).

Lemma 4.6
Let u h and U n h be the solutions of problem (3.2) and (4. From (2.2) and Lemma 2.1 we deduce and the desired estimate follows from the identity A h R h = P h A.
Then we have the following error estimates for the fully discrete scheme (4.5).

Numerical results
In this part, we present numerical results to verify the convergence theory in Sects. 3 and 4. We shall consider one-and two-dimensional examples with smooth, nonsmooth and very weak initial data. In the one-dimensional case, we take = (0, 1), and in the two-dimensional case = (0, 1) 2 . Here we use the notation χ S for the characteristic function of the set S. The following four cases are considered. 1/2] ; the jump at x = 1/2 and v(0) = 0 lead to v / ∈Ḣ 1 ( ); but for any In our experiments, we fix the parameter γ = 1 in (1.1) for all cases. We examine separately the spatial and temporal convergence rates at t = 0.1. For the case of nonsmooth initial data, we are especially interested in the errors for t close to zero. The exact solutions to these examples can be expressed in terms of generalized Mittag-Leffler functions, which however is difficult to compute, and hence we compute the reference solution on a very refined mesh. We report the normalized errors e n L 2 ( ) / v L 2 ( ) and e n Ḣ 1 ( ) / v L 2 ( ) , e n = u(t n ) − U n h , for both smooth and nonsmooth data.
In our computation, we divide the unit interval (0, 1) into K = 2 k equally spaced subintervals, with a mesh size h = 1/K . The finite element space X h consists of continuous piecewise linear functions. Similarly, we take the uniform temporal mesh with a time step size τ = t/N , with t being the time of interest.

Numerical results for example (a)
First, we fix the mesh size h at h = 2 −11 so that the error incurred by spatial discretization is negligible, which enable us to examine the temporal convergence rate. In Table 1, we show the L 2 ( )-norm of the error at t = 0.1 for different α values. In the table, BE and SBD denote the backward Euler method and the second-order backward difference method, respectively, rate refers to the empirical convergence rate when the time step size τ (or the mesh size h) halves, and the numbers in the bracket denote theoretical convergence rates. In Fig. 1 we plot the results for α = 0.5 in a log-log scale. A convergence rate of order O(τ ) and O(τ 2 ) is observed for the BE method and the SBD method, respectively, which agrees well with our convergence theory. Further, we observe that the error decreases as the fractional order α increases.   In Table 2 and Fig. 2, we show the L 2 ( )and H 1 ( )-norms of the error at t = 0.1 for the BE scheme. We set τ = 2 × 10 −5 and check the spatial convergence rate. The numerical results show O(h 2 ) and O(h) convergence rates respectively for the L 2 ( )and H 1 ( )-norms of the error, which fully confirm Theorem 3.2. Further, the empirical convergence rate is almost independent of the fractional order α.

Numerical results for example (b)
In Tables 3 and 4 we present the results for example (b). The temporal convergence rate is O(τ ) and O(τ 2 ) for the BE and the SBD method, respectively, cf. Table 3, and   Table 4. For nonsmooth initial data, we are especially interested in errors for t close to zero. Thus we also present the error at t = 0.01 and t = 0.001 in Table 4. The numerical results fully confirm the predicted rates. Further, in Table 5 and Fig. 3 we show the L 2 ( )-norm of the error for examples (a) and (b), for fixed h = 2 −6 and t → 0. To check the spatial discretization error, we fix time step τ at τ = t/1,000 and use the SBD method so that the temporal discretization error is negligible. We observe that in the smooth case, i.e., example (a), the spatial error essentially stays unchanged, whereas in the nonsmooth case, i.e., example (b), it deteriorates as t → 0. In example (b) the initial data v ∈Ḣ 1/2− ( ) for any > 0, and by Remark 4.5, the error grows like O(t −3α/4 ) as t → 0. The empirical rate in Table 5 and Fig. 3 agrees well with the theoretical prediction, i.e., −3α/4 = −0.375 for α = 0.5.

Numerical results for example (c)
In the case of very weak data, according to Remarks 4.4 and 4.6, we can only expect spatial convergence for a small time step size τ . The results in Table 6 indicate a superconvergence phenomenon with a rate O(h 2 ) in the L 2 ( )-norm and O(h) in the H 1 ( )-norm. This is attributed to the fact that in one dimension the solution with the Dirac δ-function as the initial data is smooth from both sides of the support point and the finite element spaces X h have good approximation property. When the singularity point x = 1/2 is not aligned with the grid, Table 7 shows an O(h 3/2 ) and O(h 1/2 ) rate for the L 2 ( )and H 1 ( )-norm of the error, respectively.

Numerical results for example (d)
Here we consider a two-dimensional example on the unit square = (0, 1) 2 for the nonsmooth initial data. To discretize the problem, we divide the unit interval (0, 1)   into K = 2 k equally spaced subintervals with a mesh size h = 1/K so that the domain is divided into K 2 small squares. We get a symmetric triangulation of the domain by connecting the diagonal of each small square. Table 8 shows a temporal convergence rate of first order and second order for the BE and SBD method, respectively. Spatial  Tables 8 and 9, respectively. All numerical results confirm our convergence theory.

Concluding remarks
In this work, we have studied the homogeneous problem for the Rayleigh-Stokes equation in a second grade generalized flow. The Sobolev regularity of the solution was established using an operator theoretic approach. A space semidiscrete scheme based on the Galerkin finite element method and two fully discrete schemes based on the backward Euler method and second-order backward difference method and related convolution quadrature were developed and optimal with respect to the data regularity error estimates were provided for both semidiscrete and fully discrete schemes. Extensive numerical experiments fully confirm the sharpness of our convergence analysis.