1 Introduction

Anisotropic adaptive meshes with large aspect ratio have proved to be extremely efficient for partial differential equations with free boundaries or boundary layers, see for instance [1,2,3] for applications in computational fluid dynamics. In most cases, the adaptive criteria are based on heuristics or interpolation error estimates rather than rigorous a posteriori error estimates. This is particularly the case when the time dependent transport equation is involved, since few a posteriori error estimates are available [4].

In [5], anisotropic a posteriori error estimates were derived for the time dependent transport equation discretized in space only. An adaptive algorithm with meshes having large aspect ratio was proposed. Since the error due to time discretization was not considered, small time steps were used. In this paper, the error due to time discretization is taken into account. More precisely, the order two Crank–Nicolson scheme is used and an appropriate piecewise quadratic time reconstruction is advocated, as in [6]. The quality of our error estimator is first validated on non-adapted meshes and constant time steps. An adaptive algorithm is then proposed, with the goal of building a sequence of anisotropic meshes and time steps such that the final error is close to a preset tolerance. Numerical results on adapted, anisotropic meshes and time steps show the efficiency of the method.

2 Statement of the Problem and Numerical Schemes

2.1 Problem Setting

Given an open set \(\varOmega \subset \mathbb {R}^{2}\), \(T>0\), \(\beta \in C^{1}\left( \bar{\varOmega }\right) \), \(\text {div }\beta =0\), \(f\in C\left( \left[ 0,T\right] ;L^{2}\left( \varOmega \right) \right) \) and \(u_{0}\in C(\bar{\varOmega }) \), we are looking for \(u:\varOmega \times \left[ 0,T\right] \longrightarrow \mathbb {R}\) satisfying the transport problem

$$\begin{aligned} \left\{ \begin{array}{l} \frac{{\displaystyle \partial u}}{{\displaystyle \partial t}}+\beta \cdot \nabla u=f \text{ in } \varOmega \times \left( 0,T\right) ,\\ u=0 \text{ on } \varGamma ^{-}\times \left( 0,T\right) ,\\ u(\cdot ,0)=u_{0}, \end{array}\right. \end{aligned}$$
(1)

where \(\varGamma ^{-}=\left\{ x\in \partial \varOmega :\beta \cdot n<0\right\} \), with n being the unit outer normal of \(\varOmega .\) With the above assumptions, the problem (1) has a unique solution \(u\in C^0([0,T]; L^2(\varOmega ))\), see for instance [7]. Throughout this paper, it will be assumed that the data T, \(\varOmega \), f, \(\beta \) and \(u_0\) are such that u is smooth enough to justify all required computations.

It is well known that the classical Galerkin formulation is unsuitable for the transport equation and that some stabilization techniques are necessary. Assume that \(\varOmega \) is a polygon and \(\varGamma ^{-}\) is the union of edges lying on \(\partial \varOmega \). For any \(h>0,\) let \(\mathcal {T}_{h}\) be a conformal triangulation of \(\bar{\varOmega }\) into triangles K of diameter \(h_K\) less than h. Let \(V_{h}\) be the set of continuous piecewise linear functions on each triangle of \(\mathcal {T}_{h}\), with zero value on \(\varGamma ^-\). A possible finite element discretization in space is to search for \(u_{h}:\varOmega \times \left( 0,T\right) \longrightarrow \mathbb {R}\) such that \(u_{h}(\cdot ,0)=r_{h}u_{0}\) (\(r_{h}\) is the Lagrange interpolant) and, for all \(0\le t \le T\)

$$\begin{aligned} \int _{\varOmega }\left( \frac{\partial u_{h}}{\partial t}+\beta \cdot \nabla u_{h}-f\right) \left( v_{h}+\delta _{h}\beta \cdot \nabla v_{h}\right) dx=0 \quad \forall v_{h}\in V_{h}, \end{aligned}$$
(2)

where \(\delta _{h}>0\) is a stabilization parameter that will be specified later on. A numerical study of (2) with anisotropic finite elements has already been proposed in [5]; our goal here is to take into account an order two time discretization, namely the Crank–Nicolson scheme. Let N be a positive integer and consider a partition \(0=t^{0}<t^{1}<\cdots <t^{N}=T.\) We denote by \(\tau ^{n+1}=t^{n+1}-t^{n}\) the time step, \(n=0,1,\ldots ,N-1\). Starting from \(u_{h}^{0}=r_{h}u_{0}\), for \(n=0,1,\ldots ,N-1\), we are looking for \(u_{h}^{n+1}\in V_{h}\) such that

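$$\begin{aligned} \int _{\varOmega }\left( \frac{u_{h}^{n+1}-u_{h}^{n}}{\tau ^{n+1}}+\beta \cdot \nabla \left( \frac{u_{h}^{n+1}+u_{h}^{n}}{2}\right) -\frac{f^{n+1}+f^{n}}{2}\right) \left( v_{h}+\delta _{h}\beta \cdot \nabla v_{h}\right) dx=0 \quad \forall v_{h}\in V_{h}, \end{aligned}$$
(3)

where \(f^{n}=f(\cdot ,t^{n})\), \(n=0,1,\ldots ,N\).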

We will only consider one mesh \(\mathcal {T}_h\) for the theoretical analysis of the scheme (3). Comments are added in the case of dynamic meshes in Sect. 4.2.

2.2 Anisotropic Finite Elements

In this paper, anisotropic finite elements will be used, that is to say meshes with possibly large aspect ratio. We will use the notations and results of [8,9,10], see also [11] for similar results. Let \(K\in \mathcal {T}_{h}\) and \(T_{K}:\hat{K}\longrightarrow K\) be the affine transformation mapping the reference triangle \(\hat{K}\) into K defined by

$$\begin{aligned} x=T_{K}(\hat{x})=M_{K}\hat{x}+t_{K}, \end{aligned}$$

with \(M_{K}\in \mathbb {R}^{2\times 2},t_{K}\in \mathbb {R}^{2}.\) Observe that \(M_{K}\) is invertible, so it admits a singular value decomposition \(M_{K}=R_{K}^{T}\varLambda _{K}P_{K},\) where \(R_{K}\) and \(P_{K}\) are orthogonal matrices and

$$\begin{aligned} \varLambda _{K}=\begin{pmatrix}\lambda _{1,K} &{} 0\\ 0 &{} \lambda _{2,K} \end{pmatrix},\quad \lambda _{1,K}\ge \lambda _{2,K}>0. \end{aligned}$$

We note

$$\begin{aligned} R_{K}=\begin{pmatrix}r_{1,K}^{T}\\ r_{2,K}^{T} \end{pmatrix}, \end{aligned}$$

where \(r_{1,K},r_{2,K}\) are the unit vectors corresponding to directions of maximum and minimum stretching, respectively, so that \(\lambda _{1,K},\lambda _{2,K}\) are the values of maximum and minimum stretching. With these notations, the following interpolation result holds for the Lagrange interpolant \(r_h\) [8, 9]:

$$\begin{aligned} \left\| v-r_{h}v\right\| _{L^{2}(K)}^{2}+\lambda _{2,K}^{2}\left\| \nabla (v-r_{h}v)\right\| _{L^{2}(K)}^{2}\le C L_{K}^{2}(v) \quad v\in H^{2}(\varOmega ), \end{aligned}$$
(4)

where \(C>0\) is a constant depending only on the reference triangle \(\hat{K}\), and

$$\begin{aligned} L_K^2(v)= & {} \lambda _{1,K}^4 \int _K (r_{1,K}^{T}H(v)r_{1,K})^2dx + \lambda _{1,K}^2 \lambda _{2,K}^2 \int _K (r_{1,K}^{T}H(v)r_{2,K})^2dx\nonumber \\&\quad +\,\lambda _{2,K}^4 \int _K (r_{2,K}^{T}H(v)r_{2,K})^2dx, \end{aligned}$$
(5)

where H(v) is the Hessian matrix defined by

$$\begin{aligned} H(v)=\begin{pmatrix}{\displaystyle \frac{\partial ^2 v}{\partial x_1^2}} &{} {\displaystyle \frac{\partial ^2 v}{\partial x_1 \partial x_2}} \\ {\displaystyle \frac{\partial ^2 v}{\partial x_1 \partial x_2}} &{}{\displaystyle \frac{\partial ^2 v}{\partial x_2^2}} \end{pmatrix}. \end{aligned}$$
(6)
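As an illustration of these definitions, the following minimal sketch computes \(\lambda _{1,K},\lambda _{2,K},r_{1,K},r_{2,K}\) for a single triangle and evaluates (5) for a function with constant Hessian. It assumes that the reference triangle has vertices (0,0), (1,0), (0,1), and it uses the fact that \(M_{K}M_{K}^{T}=R_{K}^{T}\varLambda _{K}^{2}R_{K}\), so that \(r_{i,K}\) are eigenvectors of \(M_{K}M_{K}^{T}\). The function names `anisotropy` and `L2_K_constant_hessian` are ours and purely illustrative.

```python
import math

def anisotropy(v0, v1, v2):
    """Return (lam1, lam2, r1, r2, area) for the triangle with vertices v0, v1, v2.

    Assumption: the reference triangle has vertices (0,0), (1,0), (0,1),
    so the columns of M_K are v1 - v0 and v2 - v0.
    """
    m11, m21 = v1[0] - v0[0], v1[1] - v0[1]
    m12, m22 = v2[0] - v0[0], v2[1] - v0[1]
    # Entries of the symmetric matrix M_K M_K^T = R_K^T Lambda_K^2 R_K.
    a = m11 * m11 + m12 * m12
    b = m11 * m21 + m12 * m22
    c = m21 * m21 + m22 * m22
    mean, dev = 0.5 * (a + c), math.hypot(0.5 * (a - c), b)
    lam1 = math.sqrt(mean + dev)
    lam2 = math.sqrt(max(mean - dev, 0.0))
    # Angle of the eigenvector associated with the largest eigenvalue.
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    r1 = (math.cos(theta), math.sin(theta))
    r2 = (-math.sin(theta), math.cos(theta))
    area = 0.5 * abs(m11 * m22 - m12 * m21)
    return lam1, lam2, r1, r2, area

def quad_form(r, H, s):
    """Evaluate r^T H s for 2-vectors r, s and a 2x2 matrix H."""
    return sum(r[i] * H[i][j] * s[j] for i in range(2) for j in range(2))

def L2_K_constant_hessian(H, lam1, lam2, r1, r2, area):
    """L_K^2(v) from (5) when the Hessian H of v is constant on K."""
    return area * (lam1 ** 4 * quad_form(r1, H, r1) ** 2
                   + lam1 ** 2 * lam2 ** 2 * quad_form(r1, H, r2) ** 2
                   + lam2 ** 4 * quad_form(r2, H, r2) ** 2)

# A triangle stretched along x1 with aspect ratio 8.
lam1, lam2, r1, r2, area = anisotropy((0.0, 0.0), (4.0, 0.0), (0.0, 0.5))
# v = x2^2 / 2 varies only across the short direction (aligned mesh).
aligned = L2_K_constant_hessian([[0, 0], [0, 1]], lam1, lam2, r1, r2, area)
# v = x1^2 / 2 varies along the long direction (misaligned mesh).
misaligned = L2_K_constant_hessian([[1, 0], [0, 0]], lam1, lam2, r1, r2, area)
print(lam1, lam2, aligned, misaligned)
```

In the aligned case only the term \(\lambda _{2,K}^{4}\) survives in (5), whereas in the misaligned case the term \(\lambda _{1,K}^{4}\) dominates.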

The considerations that follow also require the use of Clément’s interpolant. Since anisotropic meshes are considered, we assume that each vertex has a number of neighbours bounded from above, uniformly with respect to h. Moreover, we suppose that for each K, the diameter of \({\Delta }\hat{K}=T_{K}^{-1}({\Delta } K)\), where \({\Delta } K\) is the union of triangles sharing a vertex with K, is uniformly bounded, independently of the mesh geometry. For more details, we refer again to [8,9,10]. In this framework, the following estimate holds

$$\begin{aligned} \parallel v-R_{h}v\parallel _{L^{2}(K)}^{2}+\,\lambda _{2,K}^{2}\parallel \nabla (v-R_{h}v)\parallel _{L^{2}(K)}^{2}\le C\omega _{K}^{2}(v)\quad v\in H^{1}(\varOmega ), \end{aligned}$$
(7)

where \(R_{h}\) is Clément’s interpolant, \(C>0\) is a constant depending only on the reference triangle \(\hat{K},\) and

$$\begin{aligned} \omega _{K}^{2}(v)=\lambda _{1,K}^{2}(r_{1,K}^{T}G_{K}(v)r_{1,K})+\lambda _{2,K}^{2}(r_{2,K}^{T}G_{K}(v)r_{2,K}), \end{aligned}$$
(8)

with

$$\begin{aligned} G_{K}(v)=\begin{pmatrix}{\displaystyle \int _{{\Delta } K}}\left( \frac{{\displaystyle \partial v}}{{\displaystyle \partial x_{1}}}\right) ^{2}dx &{} {\displaystyle \int _{{\Delta } K}}{\displaystyle \frac{\partial v}{\partial x_{1}}\frac{\partial v}{\partial x_{2}}}dx\\ {\displaystyle \int _{{\Delta } K}}{\displaystyle \frac{\partial v}{\partial x_{1}}\frac{\partial v}{\partial x_{2}}}dx &{} {\displaystyle \int _{{\Delta } K}}\left( {\displaystyle \frac{\partial v}{\partial x_{2}}}\right) ^{2}dx \end{pmatrix}. \end{aligned}$$
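To make (8) concrete, the sketch below evaluates \(\omega _{K}^{2}(v)\) from a given matrix \(G_{K}(v)\) and given anisotropy data. It is only an illustration: the helper names are ours, and the example assumes a function with constant gradient g on a patch \({\Delta } K\) of area \(\vert {\Delta } K\vert \), for which \(G_{K}(v)=\vert {\Delta } K\vert \, g g^{T}\).

```python
def omega_K_squared(G, lam1, lam2, r1, r2):
    """omega_K^2(v) from (8), given the 2x2 matrix G = G_K(v)."""
    def q(r):
        # Quadratic form r^T G r.
        return sum(r[i] * G[i][j] * r[j] for i in range(2) for j in range(2))
    return lam1 ** 2 * q(r1) + lam2 ** 2 * q(r2)

def G_constant_gradient(g, patch_area):
    """G_K(v) for a function with constant gradient g on the patch (assumption)."""
    return [[patch_area * g[i] * g[j] for j in range(2)] for i in range(2)]

# Anisotropy data of a triangle stretched along x1 (illustrative values).
lam1, lam2, r1, r2 = 4.0, 0.5, (1.0, 0.0), (0.0, 1.0)
# Gradient across the short direction versus along the long direction.
across = omega_K_squared(G_constant_gradient((0.0, 1.0), 1.0), lam1, lam2, r1, r2)
along = omega_K_squared(G_constant_gradient((1.0, 0.0), 1.0), lam1, lam2, r1, r2)
print(across, along)
```

The two values differ by the square of the aspect ratio, which is the mechanism an adaptive algorithm can exploit when aligning elements with the solution.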

3 Error Estimates

3.1 A Priori Error Estimates

We now prove that the solution of the numerical method (3) converges to that of problem (1) for anisotropic meshes. This has already been proved for isotropic meshes in [12]. The stabilization parameter is kept constant in time and space. The key ingredient of the proof consists in taking test functions of the form \(\displaystyle v + \delta _h \frac{\partial v}{\partial t}\) [12] and observing that, since \(\beta \) is divergence free, we have for all \(v\in H^{1}(\varOmega ) \text{ vanishing } \text{ on } \varGamma ^{-}\)

$$\begin{aligned} \int _{\varOmega }(\beta \cdot \nabla v )vdx =\int _{\varOmega } \frac{1}{2} \text { div}(\beta v^{2}) dx = \frac{1}{2}\int _{\partial \varOmega }(\beta \cdot n) v^{2} dx \ge 0. \end{aligned}$$
(9)

Since \(\mathbb {P}_1\) finite elements and the Crank–Nicolson method are used, it is expected that the error at final time \(\Vert u(T) - u_{h}^{N} \Vert _{L^{2}(\varOmega )}\) is \(O(h^{3/2}+\tau ^{2})\) in the isotropic setting.

Theorem 1

Assume that \(\beta \) is not identically zero on \(\varOmega \). Let u be the solution of (1) and let \(u_h^N\) be the solution of (3) with a constant \(\delta _h\) defined by

$$\begin{aligned} \delta _h = \frac{\displaystyle \max _{K\in \mathcal {T}_h}\lambda _{2,K}}{2\Vert \beta \Vert _{(L^\infty (\varOmega ))^2}}. \end{aligned}$$
(10)

Let

$$\begin{aligned} \displaystyle \tau = \max _{n=0,\ldots ,N-1}\tau ^{n+1}. \end{aligned}$$

Assume that the data T, \(\varOmega \), f, \(\beta \), \(u_0\) are such that \(u \in H^1(0,T;H^2(\varOmega ))\) and \(\displaystyle \frac{\partial ^3 u_h }{\partial t^3} \in L^2(0,T; L^2(\varOmega )).\) Let \(e(t^n)=u(t^n)-u_h^n, n=0,\ldots ,N\). Then, there exists \(C>0\) independent of the data T, \(\varOmega \), f, \(\beta \), \(u_0\), the mesh size, aspect ratio and the time step such that

$$\begin{aligned} \Vert e(T) \Vert _{L^2(\varOmega )}^2 +\delta _h^2 \Vert \beta \cdot \nabla e(T)\Vert _{L^2(\varOmega )} ^2\le & {} C \Biggl ( \Vert e(0) \Vert _{L^2(\varOmega )}^2 +\delta _h^2 \Vert \beta \cdot \nabla e(0)\Vert _{L^2(\varOmega )} ^2 \nonumber \\&\quad + \int _0^T \sum _{K\in \mathcal {T}_h} \left( \left( \frac{1}{\delta _h} + \frac{\delta _h\Vert \beta \Vert _{(L^\infty (K))^2}^{2}}{\lambda _{2,K}^2}\right) L_K^2(u)\right. \nonumber \\&\quad \left. +\left( \delta _h + \frac{\delta _h^3\Vert \beta \Vert _{(L^\infty (K))^2}^{2}}{\lambda _{2,K}^2}\right) L_K^2\left( \frac{\partial u }{\partial t}\right) \right) dt\Biggl ) \nonumber \\&\quad +\, e \left( 4T\tau ^4 + 2\delta _h \tau ^4 + 16T \tau ^2 \delta _h^2 \right) \int _0^{T} \left\| \frac{\partial ^{3} u_h}{\partial t^{3}}\right\| _{L^{2}(\varOmega )}^{2} dt, \nonumber \\ \end{aligned}$$
(11)

where \(L_K\) is defined by (5).

Remark 1

In the case of isotropic meshes, \(\lambda _{1,K} \simeq \lambda _{2,K} \simeq h_K\) and \(L^2_K(u) \le C h_K^4 \vert u \vert ^2_{H^2(K)}\), where C is independent of the mesh size but can depend on the mesh aspect ratio. Thus, in these settings, (11) reduces to

$$\begin{aligned} \Vert e(T) \Vert ^2_{L^2(\varOmega )} \le C (h^3 + \tau ^4) + h.o.t., \end{aligned}$$

where h.o.t. stands for higher order terms.

Remark 2

As already explained in [5], estimate (11) is optimal with respect to the space discretization parameter for anisotropic meshes. Indeed, assume that the solution u depends only on one variable and that the mesh is aligned with the solution, then the estimate (11) reduces to

$$\begin{aligned} \Vert e(T) \Vert ^2_{L^2(\varOmega )} \le C \left( \left( \max _{K\in \mathcal {T}_{h}}\lambda _{2,K}\right) ^3 + \tau ^4\right) + h.o.t. \end{aligned}$$

and \(\max _{K\in \mathcal {T}_{h}}\lambda _{2,K} \rightarrow 0\) is sufficient to ensure the convergence of the numerical method.

Remark 3

We have not been able to prove that

$$\begin{aligned} \int _0^{T} \left\| \frac{\partial ^{3} u_h}{\partial t^{3}}\right\| _{L^{2}(\varOmega )}^{2} dt \end{aligned}$$

is bounded independently of h and \(\tau \). The proof is not obvious, even for parabolic problems [13], and is beyond the scope of the present paper. Note that an a priori error estimate can also be proved by introducing the anisotropic equivalent of the hyperbolic projector used in [12]. In this case, only derivatives of the exact solution

$$\begin{aligned} \int _0^{T} \left\| \frac{\partial ^{3} u}{\partial t^{3}} \right\| ^2_{L^2(\varOmega )} dt, \quad \int _0^{T} \left\| \frac{\partial ^{4} u}{\partial t^{4}} \right\| ^2_{L^2(\varOmega )} dt, \quad \sup _{t\in [0,T]}\left\| \frac{\partial ^{3} u}{\partial t^{3}} \right\| ^2_{L^2(\varOmega )}, \end{aligned}$$

appear in the error bound instead of \( \displaystyle \int _0^{T} \left\| \frac{\partial ^{3} u_h}{\partial t^{3}}\right\| _{L^{2}(\varOmega )}^{2} dt\). We have completed the proof, which is not presented here since it is significantly longer than the one below.

Remark 4

As in [12], a similar analysis can be performed if \(\text {div } \beta \ne 0\) under restrictions on h and \(\tau \), at the price that all the constants involved depend exponentially on the final time and the divergence of \(\beta \).

Proof

Observe that

$$\begin{aligned}&\int _{\varOmega } ( u(T)-u_h^N)^2 dx + \delta _h^2 \int _{\varOmega } (\beta \cdot \nabla (u(T) -u_h^N))^2 dx \\&\quad \le \underbrace{2 \int _{\varOmega } ( u(T)-u_h(T))^2 dx + 2\delta _h^2 \int _{\varOmega } (\beta \cdot \nabla (u(T) -u_h(T)))^2 dx}_{I_1}\\&\qquad +\,\underbrace{2 \int _{\varOmega } ( u_h(T)-u_h^N)^2 dx + 2\delta _h^2 \int _{\varOmega } (\beta \cdot \nabla (u_h(T) -u_h^N))^2 dx}_{I_2} , \end{aligned}$$

where \(u_h(t)\) is the solution of (2). Applying Theorem 3.1 of [5] to \(I_1\), we obtain

$$\begin{aligned}&\int _{\varOmega } e^2(T) dx +\delta _h^2 \int _{\varOmega } (\beta \cdot \nabla e(T))^2 dx \nonumber \\&\quad \le C \left( \int _{\varOmega } e^2(0) dx + \int _{\varOmega } (\beta \cdot \nabla e(0))^2 dx + \int _0^T \sum _{K\in \mathcal {T}_h} \left( \left( \frac{1}{\delta _h} + \frac{\delta _h\Vert \beta \Vert _{(L^\infty (K))^2}^{2}}{\lambda _{2,K}^2}\right) L_K^2(u)\right. \right. \nonumber \\&\qquad +\left. \left. \left( \delta _h + \frac{\delta _h^3\Vert \beta \Vert _{(L^\infty (K))^2}^{2}}{\lambda _{2,K}^2}\right) L_K^2\left( \frac{\partial u }{\partial t}\right) \right) dt\right) + I_2, \end{aligned}$$
(12)

where \(C>0\) depends on the reference triangle \(\hat{K}\) only. In particular, C is independent of \(\varOmega ,f,\beta ,u_0,u,T,N\), the mesh size and aspect ratio and the time step.

We now have to estimate \(I_2\). By using several times the Fundamental Theorem of Calculus, one can derive that

$$\begin{aligned} \frac{u_h(t^{n+1})-u_h(t^{n})}{\tau ^{n+1}}=\frac{\partial _t u_h(t^{n+1})+\partial _t u_h(t^{n})}{2}+r^{n+1}, \end{aligned}$$
(13)

where

$$\begin{aligned} r^{n+1}=\frac{1}{2\tau ^{n+1}}\int _{t^{n}}^{t^{n+1}}\left( \int _{t^{n}}^{s}\int _{t^{n}}^{t} \dfrac{\partial ^3 u_h}{\partial t^3}(\zeta )d \zeta dt + \int _{t^{n+1}}^{s}\int _{t^{n}}^{t} \dfrac{\partial ^3 u_h}{\partial t^3}(\zeta )d \zeta dt \right) ds. \end{aligned}$$

In particular, we observe that

$$\begin{aligned} \vert r^{n+1} \vert ^{2} \le (\tau ^{n+1})^{3} \int _{t^{n}}^{t^{n+1}} \left( \dfrac{\partial ^3 u_h}{\partial t^3}(t)\right) ^{2}dt. \end{aligned}$$
(14)
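Identity (13) and bound (14) can be checked numerically in the scalar case. The sketch below is only a sanity check under the assumption \(u(t)=\sin t\) (so that \(\partial ^3 u/\partial t^3=-\cos t\)); the names `cn_defect` and `int_d3u_squared` are ours. It evaluates the defect \(r^{n+1}\) directly from (13) and compares its square with the right-hand side of (14).

```python
import math

def cn_defect(u, du, t, tau):
    """Defect r^{n+1} in (13): difference quotient minus averaged derivative."""
    return (u(t + tau) - u(t)) / tau - 0.5 * (du(t + tau) + du(t))

def int_d3u_squared(t, tau):
    """Exact integral of cos(s)^2 over [t, t + tau], i.e. of (u''')^2 for u = sin."""
    a, b = t, t + tau
    return 0.5 * (b - a) + 0.25 * (math.sin(2.0 * b) - math.sin(2.0 * a))

t = 0.3
for tau in (0.1, 0.05, 0.025):
    r = cn_defect(math.sin, math.cos, t, tau)
    bound = tau ** 3 * int_d3u_squared(t, tau)
    # Print tau, r^2 and the right-hand side of (14).
    print(tau, r * r, bound)
```

Halving \(\tau \) should divide the defect by roughly four, which reflects the order two consistency of the Crank–Nicolson scheme.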

In the sequel, we write \(e_h^n = u_h(t^n) -u_h^n\). By using (2), (3) and (13), the following relation holds for the numerical error

$$\begin{aligned}&\int _{\varOmega }\left( \frac{e_h^{n+1}-e_h^{n}}{\tau ^{n+1}}+\beta \cdot \nabla \left( \frac{e_h^{n+1}+e_h^{n}}{2}\right) \right) \left( v_{h}+\delta _{h}\beta \cdot \nabla v_{h}\right) dx \nonumber \\&\quad = \int _{\varOmega }r^{n+1}\left( v_{h}+\delta _{h}\beta \cdot \nabla v_{h}\right) dx,\quad \forall v_{h}\in V_{h}. \end{aligned}$$
(15)

Choosing

$$\begin{aligned} v_h= \frac{e_h^{n+1}+e_h^{n}}{2}+\delta _h\frac{e_h^{n+1}-e_h^{n}}{\tau ^{n+1}} \end{aligned}$$

and using (9), we therefore obtain

$$\begin{aligned}&\frac{1}{2\tau ^{n+1}}\left( \Vert e_h^{n+1} \Vert _{L^{2}(\varOmega )}^{2} -\Vert e_h^{n} \Vert _{L^{2}(\varOmega )}^{2} \right) + \frac{\delta _h^{2}}{2\tau ^{n+1}}\left( \Vert \beta \cdot \nabla e_h^{n+1} \Vert _{L^{2}(\varOmega )}^{2} -\Vert \beta \cdot \nabla e_h^{n} \Vert _{L^{2}(\varOmega )}^{2} \right) \\&\quad +\, \delta _h \int _{\varOmega } \left( \frac{e_h^{n+1}-e_h^{n}}{\tau ^{n+1}}+\beta \cdot \nabla \left( \frac{e_h^{n+1}+e_h^{n}}{2}\right) \right) ^{2}dx\\&\qquad \le \int _{\varOmega } r^{n+1} \left( \frac{e_h^{n+1}+e_h^{n}}{2} + \delta _h^2 \beta \cdot \nabla \left( \frac{e_h^{n+1}-e_h^{n}}{\tau ^{n+1}}\right) \right) dx \\&\qquad \quad +\, \delta _h \int _\varOmega r^{n+1} \left( \frac{e_h^{n+1}-e_h^{n}}{\tau ^{n+1}}+\beta \cdot \nabla \left( \frac{e_h^{n+1}+e_h^{n}}{2}\right) \right) dx. \end{aligned}$$

Using the Cauchy–Schwarz and Young inequalities yields

$$\begin{aligned}&\frac{1}{2\tau ^{n+1}}\left( \Vert e_h^{n+1} \Vert _{L^{2}(\varOmega )}^{2} -\Vert e_h^{n} \Vert _{L^{2}(\varOmega )}^{2} \right) + \frac{\delta _h^{2}}{2\tau ^{n+1}}\left( \Vert \beta \cdot \nabla e_h^{n+1} \Vert _{L^{2}(\varOmega )}^{2} -\Vert \beta \cdot \nabla e_h^{n} \Vert _{L^{2}(\varOmega )}^{2} \right) \\&\quad + \frac{\delta _h}{2} \int _{\varOmega } \left( \frac{e_h^{n+1}-e_h^{n}}{\tau ^{n+1}}+\beta \cdot \nabla \left( \frac{e_h^{n+1}+e_h^{n}}{2}\right) \right) ^{2}dx\\&\qquad \le \left\| r^{n+1}\right\| _{L^2(\varOmega )} \left\| \frac{e_h^{n+1}+e_h^{n}}{2} + \delta _h^2 \beta \cdot \nabla \left( \frac{e_h^{n+1}-e_h^{n}}{\tau ^{n+1}}\right) \right\| _{L^2(\varOmega )} + \frac{\delta _h}{2} \left\| r^{n+1} \right\| _{L^2(\varOmega )}^{2}. \end{aligned}$$

Multiplication by \(2\tau ^{n+1}\) and use of Cauchy–Schwarz, triangle and Young’s inequalities yield after summing from 0 to \(N-1\)

$$\begin{aligned}&\Vert e_h^{N} \Vert _{L^{2}(\varOmega )}^{2} + \delta _h^{2}\ \Vert \beta \cdot \nabla e_h^{N} \Vert _{L^{2}(\varOmega )}^{2}\\&\qquad \quad +\, \delta _h \sum _{n=0}^{N-1}\tau ^{n+1}\int _{\varOmega } \left( \frac{e_h^{n+1}-e_h^{n}}{\tau ^{n+1}}+\beta \cdot \nabla \left( \frac{e_h^{n+1}+e_h^{n}}{2}\right) \right) ^{2}dx\\&\qquad \le \sum _{n=0}^{N-1}\left( 2 T \tau ^{n+1} + 8T \frac{\delta _h^2}{\tau ^{n+1}} +\delta _h\tau ^{n+1} \right) \left\| r^{n+1}\right\| _{L^{2}(\varOmega )}^{2}\\&\qquad \quad +\sum _{n=0}^{N-1}\frac{\tau ^{n+1}}{4T}\left( \Vert e_h^{n+1} \Vert _{L^{2}(\varOmega )}^{2}+\delta _h^{2}\Vert \beta \cdot \nabla e_h^{n+1} \Vert _{L^{2}(\varOmega )}^{2}\right) \\&\qquad \quad +\sum _{n=0}^{N-1}\frac{\tau ^{n+1}}{4T}\left( \Vert e_h^{n} \Vert _{L^{2}(\varOmega )}^{2}+\delta _h^{2}\Vert \beta \cdot \nabla e_h^{n} \Vert _{L^{2}(\varOmega )}^{2}\right) \\&\qquad \le \sum _{n=0}^{N-1}\left( 2 T \tau ^{n+1} + 8T \frac{\delta _h^2}{\tau ^{n+1}} +\delta _h\tau ^{n+1} \right) \left\| r^{n+1}\right\| _{L^{2}(\varOmega )}^{2}\\&\qquad \quad +\sum _{n=0}^{N}\frac{\tau ^{n}+\tau ^{n+1}}{4T}\left( \Vert e_h^{n} \Vert _{L^{2}(\varOmega )}^{2}+\delta _h^{2}\Vert \beta \cdot \nabla e_h^{n} \Vert _{L^{2}(\varOmega )}^{2}\right) . \end{aligned}$$

Here we use the fact that \(e_h^{0}=0\) and we have set \(\tau ^0=\tau ^{N+1}=0.\) Finally, we use the discrete Gronwall’s Lemma (see [14], Lemma 5.1) and we get

$$\begin{aligned} \Vert e_h^{N} \Vert _{L^{2}(\varOmega )}^{2} + \delta _h^{2} \Vert \beta \cdot \nabla e_h^{N} \Vert _{L^{2}(\varOmega )}^{2} + \delta _h\sum _{n=0}^{N-1}\tau ^{n+1} \int _\varOmega \left( \frac{e_h^{n+1}-e_h^{n}}{\tau ^{n+1}} + \beta \cdot \nabla \left( \frac{e_h^{n+1}+e_h^{n}}{2} \right) \right) ^{2}\\ \displaystyle \le \exp \left( {\displaystyle \sum _{n=0}^N \frac{\mu _n}{1-\mu _n}}\right) \sum _{n=0}^{N-1}\left( 2 T \tau ^{n+1} + 8T \frac{\delta _h^2}{\tau ^{n+1}} +\delta _h\tau ^{n+1} \right) \left\| r^{n+1}\right\| _{L^{2}(\varOmega )}^{2}, \end{aligned}$$

where \(\displaystyle \mu _n=\frac{\tau ^n+\tau ^{n+1}}{4T}<1.\) Since \( \sum _{n=0}^N \frac{\mu _n}{1-\mu _n} \le 1\) and using (14), we obtain

$$\begin{aligned} I_2 \le e \left( 4 T \tau ^4 + 16T \delta _h^2 \tau ^2 + 2\delta _h\tau ^4 \right) \int _{0}^{T}\left\| \frac{\partial ^{3} u_h}{\partial t^{3}}\right\| _{L^{2}(\varOmega )}^{2} dt. \end{aligned}$$
(16)
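For the reader's convenience, the two elementary facts used in this last step can be verified as follows (a short check added for clarity, using only the conventions \(\tau ^0=\tau ^{N+1}=0\) and \(\sum _{n=1}^{N}\tau ^{n}=T\) stated above). Since \(\tau ^{n}+\tau ^{n+1}\le T\) for every n, we have \(\mu _n\le 1/4\), and

$$\begin{aligned} \sum _{n=0}^{N}\mu _n=\frac{1}{4T}\sum _{n=0}^{N}\left( \tau ^{n}+\tau ^{n+1}\right) =\frac{2T}{4T}=\frac{1}{2}, \qquad \sum _{n=0}^{N}\frac{\mu _n}{1-\mu _n}\le \frac{4}{3}\sum _{n=0}^{N}\mu _n=\frac{2}{3}\le 1. \end{aligned}$$

Moreover, integrating (14) over \(\varOmega \) and using \(\tau ^{n+1}\le \tau \) gives

$$\begin{aligned} \left( 2 T \tau ^{n+1} + 8T \frac{\delta _h^2}{\tau ^{n+1}} +\delta _h\tau ^{n+1} \right) \left\| r^{n+1}\right\| _{L^{2}(\varOmega )}^{2} \le \left( 2 T \tau ^{4} + 8T \delta _h^2\tau ^{2} +\delta _h\tau ^{4} \right) \int _{t^{n}}^{t^{n+1}}\left\| \frac{\partial ^{3} u_h}{\partial t^{3}}\right\| _{L^{2}(\varOmega )}^{2} dt, \end{aligned}$$

and the factor 2 in the definition of \(I_2\) then accounts for the constants appearing in (16).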

Estimates (16) and (12) together yield the result. \(\square \)

3.2 A Posteriori Error Estimate

We now prove an a posteriori error estimate involving time and space discretization for problem (3). As in [5], the following choice for the stabilization parameter \(\delta _h\) is advocated. For all \(K\in \mathcal {T}_{h}\), if \(\beta \) is not identically zero on K, then

$$\begin{aligned} \delta _{h|K}=\frac{\lambda _{2,K}}{2\left\| \beta \right\| _{(L^{\infty }(K))^2}}\quad \forall K\in \mathcal {T}_{h}, \end{aligned}$$
(17)

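else \(\delta _{h|K}\) is set to zero. As proposed in [6], we introduce a piecewise quadratic reconstruction of the computed solution in order to recover an \(O(\tau ^{2})\) error estimator. We shall use the notation \(\partial u_{h}^{n+1}=(u_{h}^{n+1}-u_{h}^{n})/\tau ^{n+1}\), \(n\ge 0\), for the first-order difference quotient, together with the second-order difference quotient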

$$\begin{aligned} \partial ^{2}u_{h}^{n+1}(x)=\dfrac{\dfrac{u_{h}^{n+1}(x)-u_{h}^{n}(x)}{\tau ^{n+1}}-\dfrac{u_{h}^{n}(x)-u_{h}^{n-1}(x)}{\tau ^{n}}}{\dfrac{\tau ^{n+1}+\tau ^{n}}{2}}\quad x\in \bar{\varOmega },\quad n\ge 1. \end{aligned}$$

Then, for \(n=1,2,3,\ldots ,N-1,\) we define

$$\begin{aligned} u_{h\tau }(x,t)= & {} u_{h}^{n}(x)+(t-t^{n})\partial u_{h}^{n+1}(x)\nonumber \\&\quad +\,\dfrac{1}{2}(t-t^{n})(t-t^{n+1})\partial ^{2}u_{h}^{n+1}(x)\quad (x,t)\in \bar{\varOmega }\times \left[ t^{n},t^{n+1}\right] , \end{aligned}$$
(18)

and for \(n=0\),

$$\begin{aligned} u_{h\tau }(x,t)=u_{h}^{0}(x)+(t-t^{0})\partial u_{h}^{1}(x)\quad (x,t)\in \bar{\varOmega }\times \left[ t^{0},t^{1}\right] . \end{aligned}$$
(19)

Observe that (18) is a Newton polynomial; for every \(n\ge 1\), \(u_{h\tau }\) is the unique quadratic polynomial in time that equals \(u_{h}^{n-1}\), \(u_{h}^{n}\), \(u_{h}^{n+1}\), at time \(t^{n-1}\), \(t^{n}\), \(t^{n+1}\), respectively.
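The interpolation property of (18) can be checked with a few lines of code. The sketch below evaluates the reconstruction at a single point x, so that the nodal values are scalars; the function name `quadratic_reconstruction` and its argument names are ours, and the first-order quotient is taken to be \((u_{h}^{n+1}-u_{h}^{n})/\tau ^{n+1}\).

```python
def quadratic_reconstruction(t, t_prev, t_n, t_next, u_prev, u_n, u_next):
    """Evaluate (18) at time t from the values at t^{n-1}, t^n, t^{n+1}."""
    tau_n = t_n - t_prev        # tau^n
    tau_np1 = t_next - t_n      # tau^{n+1}
    d1_next = (u_next - u_n) / tau_np1   # first-order quotient on [t^n, t^{n+1}]
    d1_n = (u_n - u_prev) / tau_n        # first-order quotient on [t^{n-1}, t^n]
    d2 = (d1_next - d1_n) / (0.5 * (tau_np1 + tau_n))  # second-order quotient
    return u_n + (t - t_n) * d1_next + 0.5 * (t - t_n) * (t - t_next) * d2

# Non-uniform time steps and nodal values sampled from a quadratic in time.
q = lambda s: 1.0 + 2.0 * s - 3.0 * s * s
nodes = (0.0, 0.1, 0.25)
vals = tuple(q(s) for s in nodes)
for s in nodes + (0.2,):
    # The reconstruction should reproduce q at the nodes and in between.
    print(s, quadratic_reconstruction(s, *nodes, *vals), q(s))
```

Because the quadratic interpolant through three points is unique, the reconstruction reproduces any quadratic polynomial in time exactly, including on non-uniform time grids.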

We first prove the following lemma:

Lemma 1

We have, for all \(v_h\in V_h\):

$$\begin{aligned}&\int _{\varOmega } \left( \frac{\partial u_{h\tau }}{\partial t} + \beta \cdot \nabla u_{h\tau } - f\right) (v_h+\delta _h \beta \cdot \nabla v_h) dx \nonumber \\&\quad = \int _{\varOmega } \theta (v_h + \delta _h \beta \cdot \nabla v_h) dx, \end{aligned}$$
(20)

where \(\theta \) is defined, for \((x,t)\in \bar{\varOmega }\times \left[ t^{n},t^{n+1}\right] \), by

(21)

and

(22)

Proof

We start with \(n=0\). Then

$$\begin{aligned}&\int _\varOmega \left( \frac{ \partial u_{h\tau }}{\partial t} + \beta \cdot \nabla u_{h\tau }\right) (v_h + \delta _h \beta \cdot \nabla v_h) dx \nonumber \\&\quad = \int _\varOmega \left( \frac{u_h^{1}-u_h^{0}}{\tau ^{1}} +\beta \cdot \nabla u_{h\tau }\right) (v_h + \delta _h \beta \cdot \nabla v_h) dx, \end{aligned}$$

thus using (3), we get

The result is then obtained noticing that

For \(n\ge 1\), we have

so that using (3), we have

(23)

We then take the difference of (3) with superscript n and (3) with superscript \(n-1\) to obtain

$$\begin{aligned} \int _{\varOmega } \left( \partial ^2 u_h^{n+1}+\beta \cdot \nabla \dfrac{u^{n+1}-u^{n-1}}{\tau ^n+\tau ^{n+1}}-\dfrac{f^{n+1}-f^{n-1}}{\tau ^n+\tau ^{n+1}}\right) (v_h + \delta _h \beta \cdot \nabla v_h) dx=0. \end{aligned}$$

Inserting into (23) yields the result. \(\square \)
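To illustrate the mechanism on the first time interval, the following sketch carries out the computation for \(n=0\). It rests on an assumption: that scheme (3) with \(n=0\) states that \(\partial u_h^{1}+\beta \cdot \nabla \frac{u_h^{1}+u_h^{0}}{2}-\frac{f^{1}+f^{0}}{2}\) vanishes when tested against \(v_h+\delta _h\beta \cdot \nabla v_h\), where \(f^{n}=f(\cdot ,t^{n})\) and \(\partial u_h^{1}=(u_h^{1}-u_h^{0})/\tau ^{1}\). The notation \(t^{1/2}=(t^{0}+t^{1})/2\) is ours. Using (19) and \(\frac{u_h^{1}+u_h^{0}}{2}=u_h^{0}+\frac{\tau ^{1}}{2}\partial u_h^{1}\), we have for \(t\in [t^{0},t^{1}]\)

$$\begin{aligned} \frac{\partial u_{h\tau }}{\partial t}+\beta \cdot \nabla u_{h\tau }-f&=\partial u_h^{1}+\beta \cdot \nabla \left( u_h^{0}+(t-t^{0})\partial u_h^{1}\right) -f\\&=\left( \partial u_h^{1}+\beta \cdot \nabla \frac{u_h^{1}+u_h^{0}}{2}-\frac{f^{1}+f^{0}}{2}\right) +(t-t^{1/2})\,\beta \cdot \nabla \partial u_h^{1}-\left( f-\frac{f^{1}+f^{0}}{2}\right) . \end{aligned}$$

Under this assumption, the first bracket drops out after testing, so that a relation of the form (20) holds on \([t^{0},t^{1}]\) with the residual \((t-t^{1/2})\,\beta \cdot \nabla \partial u_h^{1}-\left( f-\frac{f^{1}+f^{0}}{2}\right) \). This is one admissible way of writing the remainder; the expression of \(\theta \) in the lemma may be arranged differently.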

We are now ready to prove our a posteriori error estimate.

Theorem 2

Assume that the data T, \(\varOmega \), f, \(\beta \), \(u_0\) are such that \(u \in L^2(0,T;H^1(\varOmega ))\cap H^1(0,T;L^2(\varOmega )).\) Let \(\delta _{h|K}\) be defined by (17). Let \(u_{h\tau }\) be defined by (18), (19) and set \(e=u-u_{h\tau }\). Then there exists \(C>0\), independent of T, \(\varOmega \), f, \(\beta \), \(u_0\), the mesh size, aspect ratio and the time step such that

$$\begin{aligned} \left\| e(T)\right\| _{L^{2}(\varOmega )}^{2}\le & {} C\Biggl (\left\| e(0)\right\| _{L^{2}(\varOmega )}^{2} +\sum _{n=0}^{N-1}\sum _{K\in \mathcal {T}_{h}}\int _{t^{n}}^{t^{n+1}}\Bigg (\Bigg ( \left\| f-\frac{\partial u_{h\tau }}{\partial t}-\beta \cdot \nabla u_{h\tau }\right\| _{L^{2}(K)}\nonumber \\&\quad +\left\| \theta \right\| _{L^{2}(K)}\Bigg ) \omega _{K}(e) +c_n\left\| \theta \right\| _{L^{2}(K)}^{2}\Bigg )\Biggl )dt, \end{aligned}$$
(24)

where \(\omega _K\) is defined by (8), \(\theta \) by (21) and \(c_n=\left\{ \begin{array}{c} \tau ^1 \quad n=0, \\ T \quad n\ge 1 \end{array} \right. \).

Proof

Let \(t\in \left( t^{n},t^{n+1}\right) \), \(n\ge 1\). Using (9), (1), (3) and Lemma 1, we have

$$\begin{aligned} \frac{1}{2}\frac{d}{dt}\int _{\varOmega }e^{2}dx\le & {} \int _{\varOmega }\left( \frac{\partial e}{\partial t}e+(\beta \cdot \nabla e)e\right) dx\nonumber \\= & {} \int _{\varOmega }\left( f-\frac{\partial u_{h\tau }}{\partial t}-\beta \cdot \nabla u_{h\tau }\right) edx \nonumber \\= & {} \int _{\varOmega }\left( f-\frac{\partial u_{h\tau }}{\partial t}-\beta \cdot \nabla u_{h\tau }\right) \left( e-v_{h}-\delta _{h}\beta \cdot \nabla v_{h}\right) dx\nonumber \\&-\int _{\varOmega }\theta \left( v_{h}+\delta _{h}\beta \cdot \nabla v_{h}\right) dx. \end{aligned}$$
(25)

The triangle and Cauchy–Schwarz inequalities imply

$$\begin{aligned} \frac{1}{2}\frac{d}{dt}\int _{\varOmega }e^{2}dx\le & {} \sum _{K\in \mathcal {T}_{h}}\Biggl (\left( \left\| f-\frac{\partial u_{h\tau }}{\partial t}-\beta \cdot \nabla u_{h\tau }\right\| _{L^{2}(K)}\right. \\&\left. +\left\| \theta \right\| _{L^{2}(K)}\right) \left( \left\| e-v_{h}\right\| _{L^{2}(K)}+\left\| \delta _{h|K}\beta \cdot \nabla v_{h}\right\| _{L^{2}(K)}\right) \\&+\left\| \theta \right\| _{L^{2}(K)} \left\| e\right\| _{L^{2}(K)}\Biggr ). \end{aligned}$$

Choosing \(v_{h}=R_{h}e\), using estimate (7) and the definition of \(\delta _{h|K}\), we have

$$\begin{aligned} \left\| e-R_{h}e\right\| _{L^{2}(K)}+\left\| \delta _{h|K}\beta \cdot \nabla R_{h}e\right\| _{L^{2}(K)}\le C\omega _{K}(e), \end{aligned}$$
(26)

see [5] for details. Therefore we have

$$\begin{aligned} \frac{1}{2}\frac{d}{dt}\left\| e\right\| _{L^{2}(\varOmega )}^{2}\le C\sum _{K\in \mathcal {T}_{h}}\left( \alpha _K+\theta _K\right) \omega _{K}(e)+\sum _{K\in \mathcal {T}_{h}}\theta _K\left\| e\right\| _{L^{2}(K)}, \end{aligned}$$

where we have set

$$\begin{aligned} \alpha _K=\left\| f-\frac{\partial u_{h\tau }}{\partial t}-\beta \cdot \nabla u_{h\tau }\right\| _{L^{2}(K)} \qquad \text {and}\qquad \theta _K=\left\| \theta \right\| _{L^{2}(K)}, \end{aligned}$$

and where C denotes a positive constant, independent of T, \(\varOmega \), f, \(\beta \), \(u_0\), the mesh size, aspect ratio and the time step, which may change from line to line. Using the discrete Cauchy–Schwarz and Young’s inequalities we therefore obtain

$$\begin{aligned} \frac{1}{2}\frac{d}{dt}\left\| e\right\| _{L^{2}(\varOmega )}^{2} \le C\sum _{K\in \mathcal {T}_{h}}\left( \alpha _K+\theta _K\right) \omega _{K}(e)+\frac{1}{2\varepsilon }\sum _{K\in \mathcal {T}_{h}}\theta _K^{2}+\frac{\varepsilon }{2}\left\| e\right\| _{L^{2}\left( \varOmega \right) }^{2}, \end{aligned}$$

where \(\varepsilon \) is any positive number. Multiplying by \(2e^{-\varepsilon t}\) and integrating between time \(t^1\) and T, we get

$$\begin{aligned}&\left\| e(T)\right\| _{L^{2}(\varOmega )}^{2}e^{-\varepsilon T}-\parallel e(t^{1})\parallel _{L^{2}(\varOmega )}^{2}e^{-\varepsilon t^{1}}\\&\quad \le C\sum _{n=1}^{N-1}\sum _{K\in \mathcal {T}_{h}}\int _{t^{n}}^{t^{n+1}}e^{-\varepsilon t}\left( \left( \alpha _K+\theta _K\right) \omega _{K}(e)+\frac{1}{\varepsilon }\theta _K^{2}\right) dt. \end{aligned}$$

Finally, we choose \(\varepsilon =\frac{1}{T}\) so that the exponential growth in time is eliminated:

$$\begin{aligned}&\left\| e(T)\right\| _{L^{2}(\varOmega )}^{2}\nonumber \\&\quad \le C\left( \left\| e(t^{1})\right\| _{L^{2}(\varOmega )}^{2}+\sum _{n=1}^{N-1}\sum _{K\in \mathcal {T}_{h}}\int _{t^{n}}^{t^{n+1}}\left( \left( \alpha _K+\theta _K\right) \omega _{K}(e)+T\theta _K^{2}\right) dt\right) . \end{aligned}$$
(27)

In order to estimate \(\left\| e(t^{1})\right\| _{L^{2}(\varOmega )}^{2}\), we proceed in the same manner to obtain

$$\begin{aligned} \left\| e(t^1)\right\| _{L^{2}(\varOmega )}^{2}\le C\Biggl ( \left\| e(0)\right\| _{L^{2}(\varOmega )}^{2}+\sum _{K\in \mathcal {T}_{h}}\int _{0}^{t^{1}}\left( (\alpha _K+\theta _K)\omega _{K}(e)+\tau ^1 \theta _K^2\right) dt\Biggr ). \end{aligned}$$
(28)

The desired estimate is obtained plugging (28) into (27). \(\square \)

Remark 5

Estimate (24) is not a standard a posteriori estimate since the exact solution u is contained in \(\omega _K(e)\). However, post-processing techniques can be applied in order to approximate \(G_K(e)\), for instance Zienkiewicz–Zhu (ZZ) post-processing. More precisely, we will replace the first-order partial derivatives with respect to \(x_i\)

$$\begin{aligned} \frac{\partial (u-u_{h\tau })}{\partial x_i} \; \text {by} \; \varPi _h \frac{\partial u_{h\tau }}{\partial x_i}-\frac{\partial u_{h\tau }}{\partial x_i},\quad i=1,2, \end{aligned}$$

where, for any \(v_h\in V_h\), for any vertex P of the mesh

$$\begin{aligned} \varPi _h \frac{\partial v_h}{\partial x_i} (P) = \dfrac{{\mathop {\mathop {\sum }\nolimits _{K\in \mathcal {T}_h}}\nolimits _{P \in K}}\vert K \vert \; \frac{\partial v_h}{\partial x_i}_{\vert K}}{{\mathop {\mathop {\sum }\nolimits _{K\in \mathcal {T}_h}}\nolimits _{P \in K}} \vert K \vert } \end{aligned}$$

is an approximate \(L^2(\varOmega )\) projection of \(\partial v_h/\partial x_i\) onto \(V_h\). Numerical results already presented in [5, 6, 15,16,17] showed the efficiency of ZZ post-processing for anisotropic meshes for elliptic, parabolic, and hyperbolic problems.
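The vertex averaging above is straightforward to implement. The following minimal sketch assumes a mesh given as a list of vertex coordinates and a list of vertex-index triples, with one nodal value per vertex; the data layout and the function names `p1_gradient` and `zz_recovery` are ours and are not taken from any particular finite element library.

```python
def p1_gradient(p0, p1, p2, v0, v1, v2):
    """Gradient and area of the affine function with values v0, v1, v2 at p0, p1, p2."""
    dx1, dy1 = p1[0] - p0[0], p1[1] - p0[1]
    dx2, dy2 = p2[0] - p0[0], p2[1] - p0[1]
    det = dx1 * dy2 - dx2 * dy1
    gx = ((v1 - v0) * dy2 - (v2 - v0) * dy1) / det
    gy = ((v2 - v0) * dx1 - (v1 - v0) * dx2) / det
    return (gx, gy), 0.5 * abs(det)

def zz_recovery(points, triangles, values):
    """Area-weighted average of the element gradients at each vertex."""
    acc = [[0.0, 0.0] for _ in points]
    weight = [0.0] * len(points)
    for tri in triangles:
        (gx, gy), area = p1_gradient(*(points[i] for i in tri),
                                     *(values[i] for i in tri))
        for i in tri:
            acc[i][0] += area * gx
            acc[i][1] += area * gy
            weight[i] += area
    return [(a[0] / w, a[1] / w) for a, w in zip(acc, weight)]

# Unit square split into two triangles, nodal values of v(x) = x1 * x2.
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
triangles = [(0, 1, 2), (0, 2, 3)]
values = [x * y for x, y in points]
print(zz_recovery(points, triangles, values))
```

For a globally affine function the recovered gradient coincides with the exact one at every vertex, so the ZZ difference \(\varPi _h \partial v_h/\partial x_i-\partial v_h/\partial x_i\) vanishes, as expected.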

Remark 6

The following a posteriori error estimate can also be proved. Starting from (25), we have

$$\begin{aligned} \frac{1}{2} \frac{d}{dt} \Vert e \Vert _{L^{2}(\varOmega )}^{2} \le \int _{\varOmega }\left( \frac{\partial e}{\partial t}e+(\beta \cdot \nabla e)e\right) dx=\int _{\varOmega }\left( f-\frac{\partial u_{h\tau }}{\partial t}-\beta \cdot \nabla u_{h\tau }\right) edx. \end{aligned}$$

The Cauchy–Schwarz inequality implies that

$$\begin{aligned} \frac{1}{2} \frac{d}{dt} \Vert e \Vert _{L^{2}(\varOmega )}^{2} \le \left\| f - \frac{\partial u_{h\tau }}{\partial t} - \beta \cdot \nabla u_{h\tau } \right\| _{L^{2}(\varOmega )} \Vert e \Vert _{L^{2}(\varOmega )}, \end{aligned}$$

which yields

$$\begin{aligned} \Vert e(T) \Vert _{L^{2}(\varOmega )} \le \Vert e(0) \Vert _{L^{2}(\varOmega )} + \int _0^{T} \left\| f - \frac{\partial u_{h\tau }}{\partial t} - \beta \cdot \nabla u_{h\tau } \right\| _{L^{2}(\varOmega )} dt. \end{aligned}$$
(29)

Estimate (29) was already pointed out in [4] and is valid for non-smooth solutions. Numerical experiments (not reported here) have shown that (29) is suboptimal for smooth solutions, thus estimate (24) should be preferred for smooth solutions.

Remark 7

We have not been able to prove a lower bound corresponding to estimate (24), this being also the case for parabolic problems with anisotropic finite elements [6]. However, for elliptic problems [16], we have been able to prove a lower bound provided that

$$\begin{aligned} \lambda _{1,K}^2(r_{1,K}^T G_K(e) r_{1,K}) = \lambda _{2,K}^2(r_{2,K}^T G_K(e) r_{2,K}), \end{aligned}$$

that is to say that the error is equidistributed in both directions \(r_{1,K}, r_{2,K}\).

Remark 8

Estimate (24) can be generalized to the case \(\text {div } \beta \ne 0\) under the assumptions of Theorem 2. In this case, the constant involved in (24) depends exponentially on the final time T and \(\displaystyle \Vert \text {div } \beta \Vert _{L^{\infty }(\varOmega )}.\) Indeed, (25) becomes

$$\begin{aligned} \frac{1}{2}\frac{d}{dt}\int _{\varOmega }e^{2}dx\le & {} \frac{1}{2}\frac{d}{dt}\int _{\varOmega }e^{2}dx + \frac{1}{2}\int _{\partial \varOmega } (\beta \cdot n) e^2 dx \\= & {} \frac{1}{2}\frac{d}{dt}\int _{\varOmega }e^{2}dx + \frac{1}{2}\int _{\varOmega } \text { div} \left( \beta e^2 \right) dx \\= & {} \int _{\varOmega }\left( \frac{\partial e}{\partial t}e+(\beta \cdot \nabla e)e\right) dx - \frac{1}{2} \int _\varOmega (\text {div } \beta ) e^2 dx \\= & {} \int _{\varOmega }\left( f-\frac{\partial u_{h\tau }}{\partial t}-\beta \cdot \nabla u_{h\tau }\right) edx - \frac{1}{2} \int _\varOmega (\text {div } \beta ) e^2 dx. \end{aligned}$$

We conclude as in the proof of Theorem 2, using Gronwall’s Lemma to control \(\displaystyle \frac{1}{2} \int _\varOmega (\text {div } \beta ) e^2 dx\). Therefore, (24) becomes

$$\begin{aligned} \left\| e(T)\right\| _{L^{2}(\varOmega )}^{2}\le & {} e^{\Vert \text {div } \beta \Vert _{L^{\infty }(\varOmega )}T}C{\Biggl (\left\| e(0)\right\| _{L^{2}(\varOmega )}^{2}}\\&+\sum _{n=0}^{N-1}\sum _{K\in \mathcal {T}_{h}}\int _{t^{n}}^{t^{n+1}}\left( \left( \left\| f-\frac{\partial u_{h\tau }}{\partial t}-\beta \cdot \nabla u_{h\tau }\right\| _{L^{2}(K)}\right. \right. \\&\left. \left. +\left\| \theta \right\| _{L^{2}(K)}\right) \omega _{K}(e) +c_n\left\| \theta \right\| _{L^{2}(K)}^{2}\right) \Biggl )dt, \end{aligned}$$

where C and \(c_n\) are as in Theorem 2.
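
The Gronwall step can be sketched as follows. The notations \(y(t)=\Vert e(t)\Vert _{L^{2}(\varOmega )}^{2}\), \(a=\Vert \text {div } \beta \Vert _{L^{\infty }(\varOmega )}\) and \(\rho (t)=\int _{\varOmega }\left( f-\frac{\partial u_{h\tau }}{\partial t}-\beta \cdot \nabla u_{h\tau }\right) e\,dx\) are shorthand introduced here for illustration only. Since \(-\frac{1}{2}\int _\varOmega (\text {div } \beta ) e^2 dx\le \frac{a}{2}\,y\), the chain of inequalities of this remark gives

$$\begin{aligned} y'(t)\le a\,y(t)+2\rho (t), \qquad \frac{d}{dt}\left( e^{-at}y(t)\right) \le 2e^{-at}\rho (t)\le 2\left| \rho (t)\right| , \end{aligned}$$

so that, integrating between 0 and T,

$$\begin{aligned} y(T)\le e^{aT}\left( y(0)+2\int _0^{T}\left| \rho (t)\right| dt\right) . \end{aligned}$$

The term \(\rho \) is then bounded as in the proof of Theorem 2, which is where the factor \(e^{\Vert \text {div } \beta \Vert _{L^{\infty }(\varOmega )}T}\) in the estimate above comes from.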

3.3 A Posteriori Error Indicators

We now define our error indicator as

$$\begin{aligned} \left( (\eta ^{A})^{2}+(\eta ^{T})^{2}\right) ^{1/2}. \end{aligned}$$

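Here the anisotropic error indicator in space \(\eta ^{A}\) is defined by

$$\begin{aligned} (\eta ^{A})^{2}=\sum _{n=0}^{N-1}\sum _{K\in \mathcal {T}_{h}}(\eta _{K,n}^{A})^{2}, \end{aligned}$$

with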

$$\begin{aligned} (\eta _{K,n}^{A})^{2}=\int _{t^n}^{t^{n+1}} \left\| f - \frac{\partial u_{h\tau }}{\partial t} - \beta \cdot \nabla u_{h\tau } \right\| _{L^2(K)}\omega _K(e)dt. \end{aligned}$$

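The error indicator in time is defined by

$$\begin{aligned} (\eta ^{T})^{2}=\sum _{n=0}^{N-1}\sum _{K\in \mathcal {T}_{h}}(\eta _{K,n}^{T})^{2}, \end{aligned}$$

with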

$$\begin{aligned} (\eta _{K,n}^{T})^{2} =c_n\int _{t^n}^{t^{n+1}}\left\| \theta \right\| _{L^{2}(K)}^{2}dt \end{aligned}$$
(30)

with \(\theta \) given in Lemma 1 and \(c_n\) as in Theorem 2. The reader should note that the other terms in (24) have not been considered since they are of higher order. In order to check the sharpness of these error indicators, we will compare them to the true errors. To this end, we introduce the effectivity indices ei and \(ei^{ZZ}\) defined by

Here ei measures the sharpness of our space-time error indicator, whereas \(ei^{ZZ}\) measures the quality of our Zienkiewicz–Zhu post-processing.

4 Numerical Experiments

4.1 Numerical Experiments on Non-adapted Meshes with Constant Time Steps

We now investigate the sharpness of our indicators by performing numerical experiments on non-adapted meshes with constant time steps. Problem (1) is considered in the unit square \(\varOmega = (0,1)^{2}\), with \(T=0.5\), \(f=0\) and \(\beta =(1,0)^{T}\). The initial condition is given by the smooth function

$$\begin{aligned} u_{0}(x_1,x_2)=\tanh (-C((x_1-0.25)^{2}-0.01)), \end{aligned}$$
(31)

The inflow boundary \(\varGamma ^{-}\) is the left boundary of \(\varOmega \), thus the exact solution \(u(x_1,x_2,t)\) is given by

$$\begin{aligned} u(x_1,x_2,t)=u_0(x_1-t,x_2). \end{aligned}$$

The solution is smooth with small variations, except in a thin layer whose width is controlled by C: the larger C, the thinner the layer and the larger the error for a given mesh size. Several experiments have been performed on anisotropic meshes with aspect ratio varying from 50 to 500, keeping the time step constant. In what follows, \(h_1\) and \(h_2\) denote the mesh sizes in the directions \(x_1\) and \(x_2\), and \(\tau \) is the time step.
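
The test case above is easy to reproduce. The following Python sketch evaluates the initial condition (31) and the exact solution \(u_0(x_1-t,x_2)\). The function names and the `layer_width` heuristic (one over the slope of the tanh argument at its zero level set \(|x_1-0.25|=0.1\)) are assumptions made for illustration and are not part of the original experiments.

```python
import math


def u0(x1, x2, C=60.0):
    """Initial condition (31); it does not depend on x2."""
    return math.tanh(-C * ((x1 - 0.25) ** 2 - 0.01))


def u_exact(x1, x2, t, C=60.0):
    """Exact solution for beta = (1, 0): the profile is shifted by t along x1."""
    return u0(x1 - t, x2, C)


def layer_width(C):
    """Rough layer width (heuristic, an assumption of this sketch).

    Near |x1 - 0.25| = 0.1 the tanh argument has slope 0.2 * C,
    so the transition takes place over a length of order 1 / (0.2 * C).
    """
    return 1.0 / (0.2 * C)


if __name__ == "__main__":
    for C in (60.0, 240.0):
        print(C, layer_width(C), u_exact(0.75, 0.5, 0.5, C))
```

With this heuristic, going from \(C=60\) to \(C=240\) divides the layer width by four, which is consistent with the larger errors expected for \(C=240\) on a given mesh.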

Table 1 Convergence results when \(\tau =O(h^{2})\) with \(C=60\) and aspect ratio 50 (rows 1–4) and 500 (rows 5–6)
Table 2 Convergence results when \(\tau =O(h^{2})\) with \(C=240\) and aspect ratio 50 (rows 1–4) and 500 (rows 5–6)
Table 3 Convergence results when \(h=O(\tau ^{2})\) with \(C=60\) and aspect ratio 50 (rows 1–4) and 500 (rows 5–7)

We first investigate the sharpness of the anisotropic error indicator in space \(\eta ^{A}\), choosing \(\tau = O(h^{2})\) so that the error due to time discretization is negligible, see Tables 1 and 2. It is observed that the \(L^{2}(\varOmega )\) error at final time is \(\simeq O(h^{1.8})\) while the \(L^{2}(0,T,H^{1}(\varOmega ))\) error is \(\simeq O(h)\). The post-processed ZZ gradient is asymptotically exact, while the effectivity index ei converges to a value close to 20. These results agree with those of [5].
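
Convergence orders such as \(O(h^{1.8})\) are typically read off from two successive rows of a table. A minimal Python sketch of this computation is given below; the helper name `observed_order` is hypothetical and the numbers in the usage lines are synthetic, they are not taken from Tables 1 and 2.

```python
import math


def observed_order(err_coarse, err_fine, h_coarse, h_fine):
    """Observed order p such that err ~ h**p between two discretizations."""
    return math.log(err_coarse / err_fine) / math.log(h_coarse / h_fine)


if __name__ == "__main__":
    # Synthetic errors: dividing h by 2 divides the error by 4 (order 2).
    print(observed_order(4e-3, 1e-3, 0.02, 0.01))
    # Synthetic errors: dividing h by 2 divides the error by 2 (order 1).
    print(observed_order(2e-2, 1e-2, 0.02, 0.01))
```

The same helper applies to the time step by passing \(\tau \) instead of h.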

Then, we check that the quadratic reconstruction in (18) and (19) yields an error indicator of optimal second order in time. We choose \(h\simeq O(\tau ^{2})\) so that the error due to the space discretization is negligible. The numerical results presented in Tables 3 and 4 show that both the \(L^{2}(\varOmega )\) error at final time and the time indicator \(\eta ^{T}\) are \(\simeq O(\tau ^{2})\). The effectivity index tends to a value close to 2. Note that in this case, \(ei^{ZZ}\) is far from 1, which implies that the post-processing included in our error indicator in space \(\eta ^A\) is not accurate; this is unimportant, however, since \(\eta ^A\) is much smaller than the error indicator in time \(\eta ^T\).

In order to check that the effectivity index does not depend on \(\varOmega \) and T, we reproduce the same experiment on a domain \(\varOmega = (0,10) \times (0,1)\) for several values of the final time T. The corresponding results are presented in Tables 5 and 6 for \(C=60\) and meshes with aspect ratio 50. The effectivity index remains close to the values obtained previously.

Table 4 Convergence results when \(h=O(\tau ^{2})\) with \(C=240\) and aspect ratio 50 (rows 1–5) and 500 (rows 6–8)
Table 5 Convergence results when \(\varOmega = [0,10] \times [0,1]\) and T varies; \(h_1 = 0.000625\), \(h_2 =0.03125\), \(\tau =0.000125\)
Table 6 Convergence results when \(\varOmega = [0,10] \times [0,1]\) and T varies; \(h_1 = 0.000625\), \(h_2 =0.03125\), \(\tau =0.00625\)

In order to obtain an effectivity index close to one, we divide the space indicator \(\eta ^A\) by 20 and the time indicator \(\eta ^T\) by 2. We report the results obtained in Tables 7 and 8, where we consider the normalized error indicator

$$\begin{aligned} \sqrt{\frac{(\eta ^{A})^{2}}{400}+\frac{(\eta ^{T})^{2}}{4}}. \end{aligned}$$

The corresponding effectivity index is shown to be near a value of 1 when . In the sequel, we will always consider the normalized indicators without introducing new notations.
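
As a small illustration, the normalization can be written as a function of the two global indicators. This is a sketch only; the function and parameter names are hypothetical, and the default constants 20 and 2 are the effectivity indices observed in the experiments above.

```python
import math


def normalized_indicator(eta_A, eta_T, c_space=20.0, c_time=2.0):
    """sqrt(eta_A^2 / c_space^2 + eta_T^2 / c_time^2), i.e. sqrt(eta_A^2/400 + eta_T^2/4)."""
    return math.sqrt((eta_A / c_space) ** 2 + (eta_T / c_time) ** 2)


if __name__ == "__main__":
    # Space-dominated, time-dominated and mixed situations (synthetic values).
    print(normalized_indicator(20.0, 0.0))
    print(normalized_indicator(0.0, 2.0))
    print(normalized_indicator(60.0, 8.0))
```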

Table 7 Convergence results with the normalized error indicator when with \(C=60\) and aspect ratio 50 (rows 1–4) and 500 (rows 5–7)
Table 8 Convergence results with the normalized error indicator when with \(C=240\) and aspect ratio 50 (rows 1–4) and 500 (rows 5–7)

4.2 An Adaptive Algorithm in Space and Time

Although the analysis in Sect. 3 is restricted to a single mesh \(\mathcal {T}_h\), we now present an adaptive space-time algorithm which involves several meshes. Then the question of interpolation between meshes is discussed.

The goal of the adaptive space-time algorithm is to control \( \Vert e(T) \Vert _{L^2(\varOmega )}\). Given a prescribed tolerance TOL, we want to ensure that

A sufficient condition is to ensure that, for \(n=0,1,2,\ldots ,N-1\)

$$\begin{aligned} \frac{0.75^2\,{ TOL}^2\,\tau ^{n+1}}{2} \le \sum _{K\in \mathcal {T}_{h}}(\eta _{K,n}^{A})^{2} \le \frac{1.25^2\,{ TOL}^2\,\tau ^{n+1}}{2}, \end{aligned}$$
(32)

and

$$\begin{aligned} \frac{0.75^2\,{ TOL}^2\,\tau ^{n+1}}{2} \le \sum _{K\in \mathcal {T}_{h}}(\eta _{K,n}^{T})^{2} \le \frac{1.25^2\,{ TOL}^2\,\tau ^{n+1}}{2}. \end{aligned}$$
(33)
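
Conditions (32) and (33) have the same form: the sum over K of the squared local indicators must lie in a band proportional to \(\tau ^{n+1}\). The Python sketch below checks such a band. The function names and the three return labels are hypothetical, and the action taken in each case (remeshing, changing the time step) is left to the algorithm of Fig. 1, which is not reproduced here.

```python
def band(tol, tau):
    """Lower and upper bounds of (32)/(33) for a time step of length tau."""
    lower = 0.75 ** 2 * tol ** 2 * tau / 2.0
    upper = 1.25 ** 2 * tol ** 2 * tau / 2.0
    return lower, upper


def classify(local_sq_indicators, tol, tau):
    """Compare the sum over K of the squared local indicators with the band."""
    total = sum(local_sq_indicators)
    lower, upper = band(tol, tau)
    if total < lower:
        return "too_small"
    if total > upper:
        return "too_large"
    return "ok"


if __name__ == "__main__":
    print(band(0.1, 0.01))
    print(classify([2e-5, 3e-5], 0.1, 0.01))
```

Since the time steps add up to T, summing either bound over n gives \(0.75^2\,{ TOL}^2\,T/2\) and \(1.25^2\,{ TOL}^2\,T/2\) for the corresponding sum over all time steps.
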
Fig. 1 Adaptive algorithm. The index i denotes the number of remeshings required to build an acceptable mesh at the current time, starting from the mesh accepted at the previous time

The main steps of the adaptive algorithm are summarized in Fig. 1. At each time step, a new mesh is built whenever needed. Then, the previous finite element approximation, \(u_h^n\), has to be interpolated in order to compute the current one, \(u_h^{n+1}\). More precisely, if we denote by \(\mathcal {T}_{h,i}^{n}\) and \(\mathcal {T}_{h,i+1}^{n}\) two successive meshes generated at time \(t^{n+1}\), and by \(V_{h,i}^{n}\), \(V_{h,i+1}^{n}\) the associated finite element spaces, we consider the interpolation operator

$$\begin{aligned} \pi _{h,i+1}^{n} : V_{h,i}^{n} \longrightarrow V_{h,i+1}^{n}. \end{aligned}$$

If a new mesh has to be built, then we interpolate the values of \(u_h^{n}\) from \(V_{h,i}^{n}\) to \(V_{h,i+1}^{n}\) and compute \(u_{h}^{n+1}\in V_{h,i+1}^{n}\) such that

(34)

for all \(v_{h}\in V_{h,i+1}^{n}\). Five interpolation operators have been considered:

  • the linear Lagrange interpolation,

  • the exact \(L^2\) projection [5, 18],

  • the conservative algorithm of [19],

  • the Ritz hyperbolic projection [12],

  • the modified hyperbolic projection defined below.

We give more details on the last choice. For \(g\in H^1(\varOmega )\), we define \(\pi _{h,i+1}^n : H^1(\varOmega ) \rightarrow V_{h,i+1}^{n}\) by

$$\begin{aligned}&\int _\varOmega \pi _{h,i+1}^n g v_h dx + \int _\varOmega (\delta _h \beta \cdot \nabla \pi _{h,i+1}^n g) (\delta _h \beta \cdot \nabla v_h) dx\nonumber \\&\quad = \int _\varOmega g v_h dx + \int _\varOmega (\delta _h \beta \cdot \nabla g) (\delta _h \beta \cdot \nabla v_h) dx, \forall v_h \in V_{h,i+1}^{n}. \end{aligned}$$
(35)

The projection \(\pi _{h,i+1}^n\) clearly satisfies the following property

$$\begin{aligned} \Vert \pi _{h,i+1}^n g \Vert _{L^2(\varOmega )}^2 + \Vert \delta _h \beta \cdot \nabla \pi _{h,i+1}^n g \Vert _{L^2(\varOmega )}^2 \le \Vert g \Vert _{L^2(\varOmega )}^2 + \Vert \delta _h \beta \cdot \nabla g \Vert _{L^2(\varOmega )}^2. \end{aligned}$$
(36)
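
Property (36) can be checked in a few lines. In the following sketch, \(a(\cdot ,\cdot )\) is a shorthand introduced here for the symmetric, positive semi-definite bilinear form appearing in (35),

$$\begin{aligned} a(w,v)=\int _\varOmega w\,v\,dx+\int _\varOmega (\delta _h \beta \cdot \nabla w)(\delta _h \beta \cdot \nabla v)\,dx, \end{aligned}$$

so that (35) reads \(a(\pi _{h,i+1}^n g,v_h)=a(g,v_h)\) for all \(v_h \in V_{h,i+1}^{n}\). Choosing \(v_h=\pi _{h,i+1}^n g\) and using the Cauchy–Schwarz inequality for \(a\) yields

$$\begin{aligned} a(\pi _{h,i+1}^n g,\pi _{h,i+1}^n g)=a(g,\pi _{h,i+1}^n g)\le a(g,g)^{1/2}\,a(\pi _{h,i+1}^n g,\pi _{h,i+1}^n g)^{1/2}, \end{aligned}$$

hence \(a(\pi _{h,i+1}^n g,\pi _{h,i+1}^n g)\le a(g,g)\), which is (36).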

Stability of the scheme (34) is not guaranteed with the first four interpolation operators. Stability can be proven when \(\pi _{h,i+1}^n\) is defined by (35) and \(\delta _h\) is constant.

Lemma 2

Assume that \(\delta _h\) is constant and \(f=0\). Let \(u_h^{n+1}\) be the solution of (34) with \(\pi _{h,i+1}^n\) being defined by (35). Then, we have

$$\begin{aligned} \int _\varOmega (u_h^{n+1})^2 dx + \delta _h^2 \int _\varOmega (\beta \cdot \nabla u_h^{n+1})^2 dx \le \int _\varOmega (u_h^n)^2 dx +\, \delta _h^2 \int _\varOmega (\beta \cdot \nabla u_h^n)^2 dx,\quad \forall n=0,\ldots ,N-1. \end{aligned}$$

Proof

Choose

$$\begin{aligned} v_h = \frac{u_h^{n+1}+\pi _{h,i+1}^n u_h^n}{2} + \delta _h \frac{u_h^{n+1}-\pi _{h,i+1}^n u_h^n}{\tau ^{n+1}} \end{aligned}$$

in (34). Using (9) yields

$$\begin{aligned}&\frac{1}{2\tau ^{n+1}}\int _\varOmega (u_h^{n+1})^2 dx + \delta _h \int _\varOmega \left( \frac{u_h^{n+1}-\pi _{h,i+1}^n u_h^n}{\tau ^{n+1}} + \beta \cdot \nabla \left( \frac{u_h^{n+1}+\pi _{h,i+1}^n u_h^n}{2}\right) \right) ^2 dx \\&\quad + \frac{1}{2\tau ^{n+1}}\delta _h^2 \int _\varOmega (\beta \cdot \nabla u_h^{n+1})^2 dx \\&\qquad \le \frac{1}{2\tau ^{n+1}}\int _\varOmega (\pi _{h,i+1}^nu_h^{n})^2 dx+ \frac{1}{2\tau ^{n+1}}\delta _h^2 \int _\varOmega (\beta \cdot \nabla \pi _{h,i+1}^nu_h^{n})^2 dx. \end{aligned}$$

We conclude by multiplying both sides by \(2\tau ^{n+1}\) and using (36). \(\square \)

This stability result is of little practical interest since \(\delta _h\) is not constant for adapted meshes. In the numerical experiments, the best results have been obtained using the conservative algorithm of [19]; the other four interpolation operators are shown to be less accurate.

The BL2D software [20] is used to build anisotropic meshes, the indicator \((\eta _K^{A})^{2}\) being equidistributed in the directions of maximum and minimum stretching \(r_{1,K}\), \(r_{2,K}\). Each triangle K is aligned with the eigenvectors of the error gradient matrix \(G_K(e)\), where ZZ post-processing is used to approximate \(\partial e/\partial x_i\). We briefly describe this remeshing procedure. Since the BL2D software uses information attached to vertices, we need to translate the error indicator \(\eta ^A\) from triangles to vertices. We define, for all \(K \in \mathcal {T}_h\), the anisotropic error indicator in direction \(r_{i,K}\) by

and, for every vertex P,

Then, a sufficient condition to ensure (32) is the following. For every vertex P of the mesh, \(\eta _{P,n}^{A,i}\) should satisfy

$$\begin{aligned} \frac{3}{2}\frac{0.75^4\,{ TOL}^4\,(\tau ^{n+1})^2}{4 N_v^2} \le (\eta _{P,n}^{A,i})^4 \le \frac{3}{2}\frac{1.25^4\,{ TOL}^4\,(\tau ^{n+1})^2}{4 N_v^2}, \quad i=1,2. \end{aligned}$$
(37)

Above, the factor 3 is due to the fact that summing over all vertices is equivalent to summing 3 times over all triangles, and the factor 1/2 to the fact that the error is equidistributed in both directions \(r_{1,K}\), \(r_{2,K}\). Also, \(N_v\) denotes the number of vertices of the current mesh. The remeshing procedure is then the following: for every vertex P, we set

$$\begin{aligned} \lambda _{i,P} = \dfrac{\displaystyle {\sum _{\begin{array}{c} K\in \mathcal {T}_h \\ P\in K \end{array}}} \lambda _{i,K}}{\displaystyle {\sum _{\begin{array}{c} K\in \mathcal {T}_h \\ P\in K \end{array}} 1 }}, \quad i=1,2. \end{aligned}$$

If (37) is not satisfied, we modify \(\lambda _{i,P}\) by a factor \(\beta \) (a scalar, not to be confused with the velocity field); otherwise we keep it unchanged. Based on these stretching values, a new mesh is generated by the BL2D software. The results for example (31) hereafter have been obtained setting \(\beta =\frac{2}{3}\), while \(\beta =\frac{1}{2}\) was set for example (38).
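
The vertex-based update can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the BL2D interface: triangles are given as triples of vertex indices, the function names are hypothetical, and the direction of the correction (multiplying by the factor when the indicator exceeds the upper bound of (37), dividing when it is below the lower bound) is our reading of "modify by a factor".

```python
from collections import defaultdict


def vertex_average(triangles, lam_K):
    """lambda_{i,P}: arithmetic mean of lambda_{i,K} over the triangles K containing P."""
    total = defaultdict(float)
    count = defaultdict(int)
    for tri, lam in zip(triangles, lam_K):
        for p in tri:
            total[p] += lam
            count[p] += 1
    return {p: total[p] / count[p] for p in total}


def bounds_37(tol, tau, n_vertices):
    """Lower and upper bounds of (37) for the fourth power of the vertex indicator."""
    base = 1.5 * tol ** 4 * tau ** 2 / (4.0 * n_vertices ** 2)
    return 0.75 ** 4 * base, 1.25 ** 4 * base


def update_stretching(lam_P, eta4_P, tol, tau, n_vertices, factor):
    """Return the new stretching value at vertex P (assumed update rule, see above)."""
    lower, upper = bounds_37(tol, tau, n_vertices)
    if eta4_P > upper:
        return lam_P * factor  # indicator too large: assumed to shrink the mesh size
    if eta4_P < lower:
        return lam_P / factor  # indicator too small: assumed to enlarge the mesh size
    return lam_P               # (37) holds: keep the value unchanged


if __name__ == "__main__":
    # Two triangles sharing the edge (1, 2), with synthetic stretching values.
    print(vertex_average([(0, 1, 2), (1, 2, 3)], [1.0, 3.0]))
```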

4.3 Numerical Results with Adapted Meshes and Adapted Time Steps

We now analyse the efficiency of the adaptive algorithm of Fig. 1. We first consider example (31) with \(C=60\). The initial mesh is an isotropic mesh with mesh size \(h=0.01\), while the initial time step is taken as \(\tau ^1 = 0.002\). The mesh and solution at final time are shown in Figs. 2 and 3 when using conservative interpolation [19], \(C=240\) and \(\textit{TOL}=0.001\).

Fig. 2 Example (31). Mesh and solution with \(C=240\) and \(\textit{TOL}=0.001\). Conservative interpolation between meshes is used. a \(t=0\), b \(t=0.5\)

Fig. 3 Example (31). Zoom of Fig. 2 at final time

We investigate the number of vertices, aspect ratio, number of time steps and remeshings, for various values of the prescribed tolerance TOL. The notations are summarized in Table 9, the results in Table 10. The observations are the following when using conservative interpolation.

  • The error at final time is approximately divided by 2 when TOL is divided by 2.

  • Both effectivity indices ei and \(ei^{ZZ}\) are close to one.

  • The number of remeshings depends on the exact solution u (the larger C, the larger \(N_m\)).

  • Since the solution depends only on the \(x_1\) variable, the total number of vertices at final time is only doubled as the tolerance is divided by two (it should be multiplied by four with isotropic meshes).

  • The total number of time steps is multiplied by \(\sqrt{2}\) as the tolerance is divided by 2, which confirms the second order convergence of the error indicator in time \(\eta ^T\).
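
The last two observations are consistent with a simple scaling argument. The sketch below assumes (this is not proved in this section) that, for a constant time step \(\tau \), the time indicator behaves like \(\eta ^T\simeq c\,\tau ^2\), and that the space indicator behaves like \(\eta ^A\simeq c\,h_1\) when only the mesh size \(h_1\) in the direction \(x_1\) matters. With \(N\) the number of time steps and \(N_v\) the number of vertices,

$$\begin{aligned} \eta ^T\simeq c\,\tau ^2\propto { TOL} \quad \Longrightarrow \quad \tau \propto { TOL}^{1/2}, \qquad N=\frac{T}{\tau }\propto { TOL}^{-1/2}, \end{aligned}$$

so that dividing TOL by 2 multiplies \(N\) by \(\sqrt{2}\). In the same way,

$$\begin{aligned} \eta ^A\simeq c\,h_1\propto { TOL} \quad \Longrightarrow \quad N_v\propto h_1^{-1}\propto { TOL}^{-1} \end{aligned}$$

when \(h_2\) is kept fixed, whereas an isotropic mesh of size \(h\) would require \(N_v\propto h^{-2}\propto { TOL}^{-2}\).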

Linear interpolation, the \(L^2\) projection, the Ritz hyperbolic projection and the modified hyperbolic projection (35) yield worse results; for instance, the ZZ effectivity index is far from one. This has already been observed in [5, 17] for hyperbolic problems, whereas interpolation between meshes does not seem to be an issue for parabolic problems [6].

Table 9 Additional notations for the analysis of the adaptive algorithm
Table 10 Example (31)
Fig. 4 Example (38). Mesh and solution at time \(t=0,1,2,3,4,\) with \(C=240\) and \({ TOL}=0.025\). Conservative interpolation between meshes is used

Fig. 5 Example (38). Exact and numerical solutions at time \(T=4\) with \(C=60\). Plot of \(u_{h\tau }\) with respect to \(x_1\) along the line \(x_2=0.75\). Conservative interpolation between meshes is used

Fig. 6 Example (38). Zoom of Fig. 5

Fig. 7 Example (38). Exact and numerical solutions at time \(T=4\) with \(C=240\). Plot of \(u_{h\tau }\) with respect to \(x_1\) along the line \(x_2=0.75\). Conservative interpolation between meshes is used

The last test case is the stretching of a circle in a vortex flow. We again set \(\varOmega = (0,1)^{2}\) and \(T=4\). The initial condition is given by

$$\begin{aligned} u_0(x_1,x_2)=\tanh \left( -C(\sqrt{(x_1-0.5)^{2}+(x_2-0.75)^{2}}-0.15)\right) , \end{aligned}$$
(38)

where \(C=60\) or \(C=240\). No boundary conditions along \(\partial \varOmega \) are prescribed. The velocity field is defined by

$$\begin{aligned} \beta = \begin{pmatrix} -2\sin (\pi x_2)\cos (\pi x_2)\sin ^{2}(\pi x_1)\cos (0.25\pi t) \\ 2\sin (\pi x_1)\cos (\pi x_1)\sin ^{2}(\pi x_2)\cos (0.25\pi t) \end{pmatrix}. \end{aligned}$$

The exact solution is not known. However, since the flow is reversed at \(t=2\), we must have \(u(x_1,x_2,4)=u_0(x_1,x_2)\).
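
The reversal property can be checked numerically by following a characteristic. The Python sketch below integrates \(dx/dt=\beta (x,t)\) with the classical fourth-order Runge–Kutta method and verifies that a point of the initial circle returns to its starting position at \(T=4\). It assumes the usual divergence-free single-vortex field, with first component \(-2\sin (\pi x_2)\cos (\pi x_2)\sin ^{2}(\pi x_1)\cos (0.25\pi t)\) and second component \(2\sin (\pi x_1)\cos (\pi x_1)\sin ^{2}(\pi x_2)\cos (0.25\pi t)\). The function names, the step count and the finite-difference divergence check are choices made for this illustration.

```python
import math


def beta(x1, x2, t):
    """Single-vortex velocity field (assumed form, see the text above)."""
    g = math.cos(0.25 * math.pi * t)
    b1 = -2.0 * math.sin(math.pi * x2) * math.cos(math.pi * x2) * math.sin(math.pi * x1) ** 2 * g
    b2 = 2.0 * math.sin(math.pi * x1) * math.cos(math.pi * x1) * math.sin(math.pi * x2) ** 2 * g
    return b1, b2


def divergence(x1, x2, t, h=1e-5):
    """Central finite-difference approximation of div(beta)."""
    d1 = (beta(x1 + h, x2, t)[0] - beta(x1 - h, x2, t)[0]) / (2.0 * h)
    d2 = (beta(x1, x2 + h, t)[1] - beta(x1, x2 - h, t)[1]) / (2.0 * h)
    return d1 + d2


def trace(x1, x2, t_end=4.0, steps=4000):
    """Follow the characteristic starting at (x1, x2) from t = 0 to t_end (RK4)."""
    dt = t_end / steps
    t = 0.0
    for _ in range(steps):
        k1 = beta(x1, x2, t)
        k2 = beta(x1 + 0.5 * dt * k1[0], x2 + 0.5 * dt * k1[1], t + 0.5 * dt)
        k3 = beta(x1 + 0.5 * dt * k2[0], x2 + 0.5 * dt * k2[1], t + 0.5 * dt)
        k4 = beta(x1 + dt * k3[0], x2 + dt * k3[1], t + dt)
        x1 += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        x2 += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        t += dt
    return x1, x2


if __name__ == "__main__":
    start = (0.5, 0.9)  # a point on the circle of centre (0.5, 0.75) and radius 0.15
    print("position at t=2:", trace(*start, t_end=2.0, steps=2000))
    print("position at t=4:", trace(*start))
    print("div(beta) at a sample point:", divergence(0.3, 0.7, 1.0))
```

Since \(u\) is constant along characteristics when \(f=0\), the return of each point to its initial position is what gives \(u(x_1,x_2,4)=u_0(x_1,x_2)\).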

This example is not covered by our theory, since the velocity field \(\beta \) depends on time. Although the anisotropic error indicator in space \(\eta ^A\) remains valid even for a time dependent velocity field \(\beta \), the error indicator in time \(\eta ^T\) should be modified. However, this is beyond the scope of the present study. Nevertheless, the use of the time indicator \(\eta ^T\) defined by (30) yields good results. Several meshes and numerical solutions are presented in Fig. 4 when \({ TOL}=0.025\) and conservative interpolation is used. In Figs. 5, 6, 7 and 8 and Table 11 we have checked convergence of the computed solution at final time for several values of TOL. For comparison, we present in Table 12 results with non-adapted uniform meshes and constant time steps. In Fig. 9, we compare the solution computed on a non-adapted mesh with the one obtained with the largest value \({ TOL}=0.1\) of the adaptive algorithm. Clearly, the coarsest adapted solution is more accurate than the finest non-adapted one. Note that the number of vertices of the non-adapted mesh is 200 times larger than that of the adapted meshes.

Table 11 Example (38)
Table 12 Example (38)
Fig. 8 Example (38). Zoom of Fig. 7

Fig. 9 Example (38). Comparison between numerical solutions at time \(T=4\) with \(C=60\). Plot of \(u_{h\tau }\) with respect to \(x_1\) along the line \(x_2=0.75\). The adapted solution is computed with the algorithm of Fig. 1 with \(\textit{TOL}=0.1\). The non-adapted solution is computed on a fixed uniform mesh with constant time steps (\(h=0.0025, \tau = 0.00016\))