Error analysis for discretizations of parabolic problems using continuous finite elements in time and mixed finite elements in space

Variational time discretization schemes are getting of increasing importance for the accurate numerical approximation of transient phenomena. The applicability and value of mixed finite element methods in space for simulating transport processes have been demonstrated in a wide class of works. We consider a family of continuous Galerkin–Petrov time discretization schemes that is combined with a mixed finite element approximation of the spatial variables. The existence and uniqueness of the semidiscrete approximation and of the fully discrete solution are established. For this, the Banach–Nečas–Babuška theorem is applied in a non-standard way. Error estimates with explicit rates of convergence are proved for the scalar and vector-valued variable. An optimal order estimate in space and time is proved by duality techniques for the scalar variable. The convergence rates are analyzed and illustrated by numerical experiments, also on stochastically perturbed meshes.


Introduction
Numerical simulations of time dependent single and multiphase phase flow and multicomponent transport processes in complex and porous media with strong heterogeneities and anisotropies are desirable in several fields of natural sciences and civil engineering as well as in a large number of branches of technology; cf. e.g., [22,29]. Typically, the discretization in space involves a significant set of complexities and challenges. MFEM (cf. [17,21]) have proved their potential and capability to approximate solutions with high accuracy and physical consistency; cf. e.g., [13,19]. So far, the temporal approximation of flows and transport phenomena in porous media have received relatively little interest (cf. e.g., [5,18,27,[42][43][44]49] and the references therein) and have been limited to traditional non-adaptive first and second order methods, even if strong chemical reactions with high temporal variations in profiles are present. Rigorous studies of higher order time discretizations are still missing. The low-order implicit time discretization is of particular concern with respect to numerical diffusion for smooth solutions of transport problems (cf. [45] for a study on numerical diffusion for different temporal and spatial discretizations of a transport equation).
The Galerkin method is a well-recognised approach to solve time dependent problems; cf. e.g., [6,48]. However, until now it has rarely been used in practice for discretizing the time variable in approximations of initial-boundary value problems. Since recently, variational time discretization schemes based on continuous or discontinuous finite element techniques have been developed to the point that they can be put into use (cf. [30,31]) and demonstrate their significant advantages. Higher order methods are naturally embedded in these schemes and the uniform variational approach simplifies stability and error analyses. Further, goal-oriented error control [9] based on the dual weighted residual approach relies on variational space-time formulations and the concepts of adaptive finite element techniques for changing the polynomial degree as well as the length of the time intervals become applicable. Variational time discretization schemes that are combined with continuous or discontinuous finite element methods for the spatial variables are studied for flow and parabolic problems in, for instance [1][2][3][4]10,15,[30][31][32]38,47] and for wave problems in, for instance [7,36,37]. In these works algebraic formulations of the variational time discretizations are developed [4,30,31,36,37,47], preconditioning techniques for the arising block matrix systems are addressed [4,10,32,37] and, finally, computational studies are performed.
Numerical analyses of semidiscretizations in time by variational methods and of variational space-time approaches can be found in, for instance [20,34,35,46,48]. In [48] discontinuous variational approximations of the time variable are studied for abstract parabolic problems whereas in [46] their continuous counterparts are analyzed. In [20,47] discontinuous variational approximations in time and space are studied and error estimates are proved. In [47] time-dependent domains are considered in an arbitrary Lagrangian Eulerian (ALE) framework and the advection-diffusion equation is written in mixed form as a system of first order equations in space. In [25] a discontinuous Galerkin method in time combined with a stabilized finite element approach in space for first order partial differential equations is investigated for static and dynamically changing meshes. Error estimates in the L ∞ (L 2 ) and L 2 (L 2 ) norm are derived. In [34,35] continuous space-time approximations for nonlinear wave equations with mesh modifications and for the Schrödinger equation are considered. Existence and uniqueness of the discrete solutions are discussed and error estimates are proved for the schemes.
As far as the MFE approximation of parabolic problems is concerned, in [48] an error estimate for the semidiscretization in space is given. However, for the flux variable an error estimate is proved only for the L 2 norm. No estimate is provided for the error in divergence of the flux, that is part of the natural norm of the underlying function space H(div; ). In [23,33] similar error estimates, also in negative norms, are presented. In particular, estimates similar to the error estimates for conventional finite element approximations are established. The singular behavior of the error estimates as t → 0 for initial data in L 2 ( ) is further included.
In this work a continuous Galerkin-Petrov (cGP) method is used for the discretization in time, whereas the MFEM [17,21] is applied for the spatial discretization. Appreciable advantages of the MFEM are its local mass conservation property and the inherent approximation of the flux field as part of the formulation itself. In simulating coupled flow and transport processes in porous media the flux approximation of the flow problem is usually of higher practical interest than the approximation of the scalar variable itself. To the best of our knowledge, rigorous error estimates for fully discrete variational space-time discretization schemes that are based on MFE approximations are still missing. In our numerical analysis we split the temporal discretization error from the spatial one by introducing an auxiliary problem based on the semidiscretization in time. We firstly estimate the temporal discretization error and secondly the error between the semidiscrete and the fully discrete solution. The order of convergence estimates are derived in the natural norms of the variational space-time approach. They are summarized in Theorem 4.6. For the scalar variable of the MFE approach one of the given error estimates, measured in the norm of L 2 (0, T ; L 2 ( )), is optimal in space and time if a certain regularity assumption is supposed to be satisfied. For constant scalar-valued diffusion coefficients an error estimate for the flux variable in the norm of L 2 (0, T ; L 2 ( )) is further provided. It is optimal in space and suboptimal in time. In the Gaussian quadrature points of the temporal discretization optimal order error estimates for the flux variable in L 2 ( ) are even obtained for heterogeneous diffusion matrices. The existence and uniqueness of the semidiscrete and fully discrete solution is further established. Even though a prototype model problem is studied here only, we believe that the techniques for analyzing mixed variational space-time approximation schemes can be applied similarly to more complex flow and transport problems in porous media.
This work is organized as follows. In Sect. 2 our fully discrete variational spacetime method is developed. In Sect. 3 we address the semidiscrete problem by proving existence and uniqueness of its solution and error estimates for the semidiscretization in time. In Sect. 4 we study the fully discrete problem and show the existence and uniqueness of its solution. The error between the semidiscrete and fully discrete problem is estimated. In Theorem 4.6 an error estimate for the simultaneous space-time discretization is provided by combining the before-given estimates of the temporal and spatial discretization. In Sect. 5 we illustrate and validate our derived error estimates by numerical experiments. We end our work with some conclusions in Sect. 6.

Notation and preliminaries
Throughout this paper, standard notations are used. A summary of the notations used in this work is presented in Appendix B. Let ⊂ R d , with d = 2 or d = 3, be a polygonal or polyhedral bounded domain. We denote by H p ( ) the Sobolev space of L 2 functions with derivatives up to order p in L 2 ( ) and by ·, · the inner product in L 2 ( ). Sobolev spaces of vector-valued functions are written in bold letters. Further, let H 1 0 ( ) = {u ∈ H 1 ( ) | u = 0 on ∂ } and H −1 ( ) denote its dual space. For the norms of the Sobolev spaces the notation is For the mixed problem formulation we use the abbreviations Let X 0 ⊆ X ⊆ X 1 be three reflexive Banach spaces with continuous embeddings. Then we consider the following set of Banach space valued function spaces, that are equipped with their naturals norms (cf. [24]) and where the time derivative ∂ t is understood in the sense of distributions on (0, T ). In particular, every function in [24]. For X 0 = X = X 1 we simply write H 1 (I ; X ). Moreover, we put where the matrix D = D(x) = (d i j (x)) d i, j=1 satisfies d i j ∈ L ∞ ( ) and is elliptic with As usual, by c > 0 we denote a generic constant throughout the paper.

Problem formulation
As a prototype model for more sophisticated multiphase flow and multicomponent reactive transport systems in porous media (cf. e.g. [22,29]) we study in this work In order to derive our family of discretization schemes, we first define the auxiliary flux variable q := − D∇u for the weak solution u of (2.6)-(2.8) that is given by (2.9). Since ∂ t u ∈ L 2 (I ; W ) is satisfied by (2.9) and f ∈ L 2 (I ; W ) holds by assumption, it directly follows that q ∈ L 2 (I ; V ). The pair {u, q} ∈ H 1 (I ; W )∩C(I ; W )×L 2 (I ; V ) is then also the unique solution to the set of variational equations for all w ∈ L 2 (I ; W ) and v ∈ L 2 (I ; V ) and satisfies the initial condition u(0) = u 0 . To find (2.11), integration by parts was used. The global problem formulation (2.10), (2.11) motivates our semidiscretization in time. . For (elliptic) regularity results in domains with non-smooth boundaries we refer to, e.g., [28,39]. Below, we tacitly assume that the required assumptions about the data and ∂ are satisfied such that the existence of a sufficiently regular solution can be assumed. Without such an assumption the application of higher order methods is not meaningful.

Variational discretization in time by a continuous Galerkin method
For the discretization in time we decompose the time interval (0, T ] into N subintervals Further τ denotes the discretization parameter in time and is defined as the maximum time step size τ = max 1≤n≤N τ n , where τ n = t n − t n−1 . We introduce the function spaces of piecewise polynomials of order r in time, where P r (J ; X ) = p: J → X p(t) = r j=0 ξ j n t j , ξ j n ∈ X, j = 0, . . . , r and X r (X ) ⊂ H 1 (0, T ; W ). We let Further, we put We equip the function spaces W and V with their natural norms being defined by With respect to these norms the space W is a Banach space and the space V is a reflexive Banach space. Further, we define the space-time bilinear form a τ ∈ L(W × V; R) by means of Obviously, the mapping a τ : W × V → R is linear and continuous, i.e.
with some constant c > 0 independent of τ and T . For the family of continuous variational time discretization schemes the spaces X r (X ) of continuous functions act as spaces for the solution whereas the spaces Y r −1 (X ) consisting of piecewise polynomials that are discontinuous at the end points of the time intervals are used as test spaces. Since the spaces of the trial and test functions differ here, a discretization of Galerkin-Petrov type is thus obtained.
A semidiscrete variational approximation of the mixed form of problem (2.6)-(2.8), referred as the exact form of cGP(r ), is then defined by solving the variational equations (2.10), (2.11) in discrete subspaces:
We refer to the solution of Eqs. (2.14), (2.15) as the continuous Galerkin-Petrov method with piecewise polynomials of order r and use the notation cGP(r ). To ensure the existence and uniqueness of solutions to (2.14), (2.15), it is sufficient to use the test spaces Y r −1 (W ) and Y r −1 (V ) with piecewise polynomials of order r − 1, since the continuity constraint at the discrete time points t n , n = 0, . . . , N − 1, that is implied by the definition of the solution spaces X r (W ) and X r (V ), yields a further condition. By using discontinuous test basis functions w τ (t) = wψ n,i (t) and v τ = vψ n,i (t), for i = 1, . . . , r , with arbitrary time independent functions w ∈ W and v ∈ V , respectively, and piecewise polynomial functions ψ n,i : I → R that are of order r − 1 on I n and vanish on I \I n , we can recast the variational equations (2.14), (2.15) as a time marching scheme: For n = 1, . . . , N find u τ |I n ∈ P r (I n ; W ) and q τ |I n ∈ P r (I n ; V ) such that for all w ∈ W and v ∈ V and i = 1, . . . , r with the continuity constraints u τ |I n (t n−1 ) = u τ |I n−1 (t n−1 ) and q τ |I n (t n−1 ) = q τ |I n−1 (t n−1 ) for n ≥ 2 and the initial conditions u τ |I n (t n−1 ) := u 0 , q τ |I n (t n−1 ) := − D∇u 0 for n = 1.
To determine u τ |I n and q τ |I n , we represent them in terms of basis functions, with respect to the time variable, of the spaces X r (W ) and X r (V ) such that u τ |I n (t) = r j=0 U j n ϕ n, j (t) and q τ |I n (t) = r j=0 Q j n ϕ n, j (t), for t ∈ I n , (2.18) with coefficient functions U j n ∈ W and Q j n ∈ V for j = 0, . . . , r and polynomial basis functions ϕ n, j ∈ P r (I n ; R) that are Lagrange functions with respect to r + 1 nodal points t n, j ∈ I n satisfying the conditions ϕ n, j (t n,i ) = δ i, j for i, j = 0, . . . , r . For the treatment of the continuity constraint in time we put t n,0 = t n−1 . The other points t n,1 , . . . , t n,r are chosen as the quadrature points of the r -point Gaussian quadrature formula on I n which is exact if the function to be integrated is a polynomial of degree less or equal to 2r − 1. The basis functions ϕ n, j ∈ P r (I n ; R) of (2.18), for j = 0, . . . , r , are defined, as usual in the finite element framework, via the affine reference transformation ontoÎ = [0, 1]. The test basis functions ψ n,i ∈ P r −1 (I n ; R) with ψ n,i (t n,l ) = δ i,l for i, l = 1, . . . , r are defined similarly; cf. [15,37] for details. Now we transform all the time integrals in (2.16), (2.17) to the reference intervalÎ . By a subsequent application of the r -point Gaussian quadrature formula with weightsω i and quadrature nodest i onÎ as well as the further notation for i = 1, . . . , r , j = 0, . . . , r (cf. [15,36,46]), we obtain the following system of variational problems for the coefficient functions U j n ∈ W and Q j n ∈ V of the representation (2.18): For n = 1, . . . , N and j = 1, . . . , r find coefficient functions

Remark 2.2
In the numerical scheme (2.19), (2.20), the flux coefficient functions Q j n , for j = 1, . . . , r , arise only in the r Gaussian quadrature points t n,1 , . . . , t n,r ∈ (t n−1 , t n ) of the subinterval I n . Nevertheless, the coefficient functions Q 0 n for n ≥ 1, are needed for the unique determination of the semidiscrete flux function q τ ∈ X r (V ) and an explicit evaluation of q τ |I n by the representation (2.18)  For the derivation of (2.19), (2.20) from (2.16), (2.17) we tacitly replaced the integrand f on the right-hand side by its Lagrange interpolate r f ∈ P r (I n ; L 2 ( )) defined by r f (t) |I n = r j=0 f (t n, j )ϕ n, j (t) for t ∈ I n . (2.21) We note that the constantsβ ii are satisfying the following property.

Lemma 2.3 [Coefficient property (C)]
There exist constants β m , β M ∈ R such that is satisfied. The constants do not depend on the time step size, but only on the number r of involved Gaussian quadrature points.
Proof Indeed, the coefficients β ii =ω i are the Gauss-Legendre quadrature weights scaled to the interval [0, 1], i.e.ω i =ω i /2. In (2.23), P r denotes the Legendre polynomial of degree r and x i , for i = 1, . . . , r , are its roots, cf. e.g., [41, p. 436]. Since the sum of the weightsω i equals to two and the weights are all strictly positive, we immediately conclude that an upper bound forω i is given by one. On the other hand, we know that |P r (x)| ≤ r (r + 1)/2 for any x ∈ [−1, 1]; cf. [16, p. 73]. This gives us the lower boundŵ i ≥ 2/(r (r + 1)) 2 .
Below, we will also need the following auxiliary results.
for some c > 0 independent of τ n . An analogous results holds for coefficients F i n ∈ V .
Proof Using the properties of the basis functions ϕ i and that the r -point Gaussian quadrature formula is exact for polynomials of maximum degree 2r − 1 there holds that The second of the equalities in (2.24) follows immediately from the first one. It remains to prove (2.25). It holds that with c independent of τ n . Here we used that

Discretization in space by the mixed finite element method
Now, we present the fully discrete approximation scheme that is obtained by discretizing (2.19), (2.20) with respect to their spatial variables. For this we choose a pair of finite element spaces W h ⊂ W and V h ⊂ V satisfying the inf-sup stability condition; cf. [17,21]. Here, we denote by T h = {K } a finite element decomposition of mesh size h of the polyhedral domain into closed subsets K , quadrilaterals in two space dimensions and hexahedrals in three space dimensions. Since the software library deal.ii [8] that we use for our implementation of the schemes allows only quadrilateral and hexahedral elements, we restrict ourselves to these types of elements in the following. Triangular and tetrahedral elements can be treated in an analogous way. In our calculations (cf. Sect. 5) we use the Raviart-Thomas element on quadrilateral meshes for two space dimensions. For an application in three dimensions based on the Raviart-Thomas-Nédélec element we refer to [15,37]. The construction of the discrete function spaces W h and V h on quadrilateral and hexahedral finite elements is done by a transformation We sketch this briefly for d = 2; cf. [21,37] for d = 3. For this, let We then define the discrete subspaces W The fully discrete continuous Galerkin-Petrov and MFE approximation scheme, referred to as cGP(r )-MFEM( p), then defines fully discrete solutions u τ,h ∈ X r (W h ) and q τ,h ∈ X r (V h ) that are represented in terms of basis functions in time by For the derivation of the algebraic formulation of the fully discrete variational problem (2.28), (2.29) we also refer to [15,36]. In [15,36], the iterative solution of the arising linear systems and the construction of an efficient preconditioner is further addressed. For solving the algebraic counterpart of Eqs. (2.28), (2.29) we do not apply an additional hybridization technique as it was done, for instance, in [11,12,14] and the references therein. We solve the algebraic system by using a Schur complement technique. In [36] the efficiency of the proposed iterative solver along with an adapted preconditioning technique is analyzed numerically. In [15,36], the approximation properties of some families of space-time discretization schemes, including the cGP(r )-MFEM( p) approach, in terms of convergence rates and their robustness are studied by numerous numerical experiments. Test cases in three space dimensions and with heterogeneous and strongly anisotropic material properties are also included.

Existence and uniqueness of the semidiscrete approximation and error estimates
In this subsection we prove the existence and uniqueness of solutions to the semidiscrete approximation scheme that is defined by (2.14), (2.15) and its numerically integrated counterpart (2.19), (2.20), respectively. The time discretization error is also studied in this section. The spatial discretization error is analyzed in Sect. 4.
Proof To prove existence of solutions to problem (2.14), (2.15), we will use an equivalent conformal formulation, see [43] for a similar approach.
for all w τ ∈ Y r −1 (H 1 0 ( )). The existence and uniqueness of the semidiscrete approximation satisfying (3.9) can be established. This is shown in the "Appendix" of this work. Then we define u τ := u τ and q τ := − D∇ u τ . (3.10) Obviously, it holds that u τ ∈ X r (W ) since H 1 0 ( ) ⊂ W . Further, we have that ∂ t u τ ∈ L 2 (I ; H 1 0 ( )) since on each of the subintervals I n , n = 1, . . . , N the function u τ ∈ X r (H 1 0 ( )) admits the representation with coefficients U j n ∈ H 1 0 ( ) and polynomial basis functions ϕ n, j ∈ P r (I n ; R). Next, we prove that q τ ∈ X r (V ). Under the assumption of Sect.
Consequently, it holds that (cf. [17, p. 18 Finally, from the expansion in terms of polynomial basis functions we conclude that q τ ∈ C([0, T ]; V ). Equation (3.9) then directly implies that the functions u τ and q τ defined in (3.10) satisfy the first equation of the variational problem (2.14), (2.15). The second equation of the system (2.14), (2.15) then follows from the representation (3.11) of the variable q τ by testing the identity (3.11) with some function v τ ∈ Y r −1 (V ) and applying the divergence theorem of Gauss. Hence, the assertion of the theorem is proved.
As a corollary of the previous two theorems proving the existence of a unique solution to the semidiscrete problem (2.14), (2.15) we obtain an inf-sup stability condition within our space-time framework. This result will play a fundamental role in our error analyses. For this we need some further notation. Let {u τ , q τ } ∈ X r (W )× X r (V ) denote the solution of the semidiscrete problem (2.14), (2.15). We split u τ as In terms of the tuple {u 0 τ , q τ } of unknowns we recast the existence and uniqueness result of Theorems 3.1 and 3.2 in the following form.
is the unique solution of the following variational problem: As a corollary we get the following inf-sup stability condition.

Estimates for the error between the continuous and the semidiscrete solution
Now we shall show error estimates for the exact form (2.14), (2.15) of the cGP(r ) approach applied to the mixed formulation (2.10), (2.11) of our parabolic model problem.
For this we assume that the following approximation property are satisfied. There exist interpolation operators I τ : such that for sufficiently smooth functions u ∈ H 1 (I ; W ) and q ∈ L(I ; V ) and all time intervals I n , for n = 1, . . . , N , it holds that with some constant c independent of τ n and τ . The existence of such approximations is obviously ensured, for instance, by using Lagrange interpolation [48]. We get the following error estimates in the natural norm of the time discretization.
where the constant c is independent of τ n , τ and T .
Proof By splitting and recalling the semidiscrete counterpart (3.12), we get that for almost every t ∈ (0, T ), such that it is sufficient to derive the asserted error bounds of the theorem for u 0 − u 0 τ instead of estimating u − u τ . This will be done in the following.
By (3.16)-(3.18) it holds that For the discrete functions w τ : (3.21) where the Galerkin orthogonalities have been used. From (3.21) along with (3.20), we find that From inequality (3.22) along with the interpolation error estimate (3.20) we conclude the assertion of the theorem by means of the triangle inequality.
Theorem 3.5 yields an error estimate with respect to the natural space-time norm of the discretization scheme. The estimate is sharp with respect to the contribution of ∂ t (u − u τ ) L 2 (I ;W ) to the overall norm (2.12). However, the estimate is suboptimal with respect to u − u τ L 2 (I ;W ) . In the following theorem, we sharpen our analysis by providing an optimal order error estimate also for u − u τ L 2 (I ;W ) . This is done by a duality argument. For this, the following additional regularity assumption is needed. Regularity condition (R mix ) Suppose that g ∈ L 2 (I ; W ). The variational problem, Formally, the corresponding strong form of (3.23), (3.24) is given by with z(T ) = 0 and homogeneous Dirichlet boundary conditions, that is obtained by rewriting the dual problem associated with (2.6)-(2.8),  (3.28) For this we note that p ∈ L 2 (I ; V ) can be shown by using the arguments of the proof of Theorem 3.2. The a priori estimate of the vector variable p in (3.28) is then a direct consequence of the variational equation (3.23).
Remark 3.6 A regularity condition similar to (R mix ) is also used in [46,p. 48,Eq. (6.16)] to prove the optimal order convergence of a variational time discretization of second order parabolic problems in the non-mixed formulation. Currently, it remains an open problem how this limiting condition can be avoided in the theoretical analysis. The techniques that were developed recently in [25] might be helpful. However, in our numerical convergence studies of Sect. 5 the optimal convergence rate that is proved in Theorem 3.8 under the condition (R mix ) is nicely observed.
Below we also need the following auxiliary lemma. Then it holds that z − I 0 z L 2 (I n ,W ) ≤ τ n ∂ t z L 2 (I n ;W ) , (3.29) p − J 0 p L 2 (I n ;V ) ≤ τ n ∂ t p L 2 (I n ;V ) .  Applying the a priori estimate (3.28) and the additional regularity assumption (3.25) with g = e u as well as using the error estimate of Theorem 3.5, we then find that This proves the assertion of the theorem.
Next we derive an error estimate for the non-exact form (2.19), (2.20) of the cGP(r ) method. The difference of the non-exact form of cGP(r ) to (2.14), (2.15) comes through the numerically integrated right-hand side term in (2.19). Firstly, we ensure the existence and uniqueness of the solution to the non-exact form of cGP(r ).
for all w τ ∈ Y r −1 (W ) and v ∈ Y r −1 (V ) with the initial condition u τ (0) = u 0 . Existence and uniqueness of the solution {u τ , q τ } ∈ X r (W )×X r (V ) to the system (3.39), (3.40) then follows as in Theorems 3.1 and 3.2 with r f replacing f in the arguments of the proofs.
Next, we present the corresponding a priori error estimate.  (2.11) that is supposed to be sufficiently regular. Then the solution {u τ , q τ } ∈ X r (W ) × X r (V ) of the non-exact semidiscrete problem (2.14), (2.15) satisfies the error estimate where the constants c is independent of τ n , τ and T .
Since the proof of Theorem 3.10 follows from the proof of Theorem 3.5 by a standard estimate of the interpolation error, we skip it here. For the sake of completeness we summarize the proof in the "Appendix" of this work.

Existence and uniqueness of the fully discrete approximation and error estimates
In the first subsection of Sect. 4 we prove the existence and uniqueness of solutions to the fully discrete approximation scheme (2.28), (2.29

Existence and uniqueness of the fully discrete approximation
Firstly we prove the existence and uniqueness of solutions to the fully discrete cGP(r )-MFEM( p) scheme (2.28), (2.29). For this we need the following lemma (cf. [ The symmetric matrix D −1 is positive definite by assumption (2.2) andβ ii > 0 under the coefficient property (C); cf. Lemma 2.3. Therefore, Eq. (4.4) immediately implies that Q i,1 n,h = Q i,2 n,h for i = 1, . . . , r . By Lemma 4.1 there exists some v h ∈ V h such that ∇ · v h = U i,1 n,h − U i,2 n,h . Using this v h as test function in (4.2) and noting that the first term in (4.2) now vanishes, we obtain that U i,1 n,h = U i,2 n,h , for i = 1, . . . , r . This implies the uniqueness of the solution to the fully discrete problem (2.28), (2.29) and proves the assertion of the theorem.

Estimates for the error between the semidiscrete and the fully discrete solution
In this subsection we derive estimates for the error between the semidiscrete approximation defined by Eqs. (2.14), (2.15) and the fully discrete solution given by Eqs. (2.28), (2.29). For this we use the following projection operators (cf. [5,17] and [40, p. 237]) defined in W and V , respectively, by for all w h ∈ W h and v h ∈ V h , respectively. We point out that h is firstly defined on H 1 ( ) and then extended to V by following [40, p. 237]. For these operators and the family of Raviart-Thomas elements on quadrilateral elements for the two-dimensional case and the class of Raviart-Thomas-Nédélec elements in three space dimensions there holds that for any w ∈ H p+1 ( ) and v ∈ H p+1 ( ), ∇ · v ∈ H p+1 ( ), respectively. For the error between the semidiscrete solution and fully discrete we use the notation Next, we prove two preliminary lemmas.

Lemma 4.3 Let the assumptions of
with some constant c > 0 not depending on the discretization parameters h and τ .
Proof By subtracting (2.28), (2.29) from (2.19), (2.20), respectively, it follows that (4.12) and (4.13), respectively. By adding the thus obtained equations, using the properties of the projection projectors P h and h defined in (4.5) and (4.6), respectively, and summing up from i = 1 to r we get that (4.14) We note that due to Lemma 2.4, the first term in (4.14) can be rewritten as Along with some further algebraic manipulations we then conclude from (4.14) that Recalling assumption (2.2) about D and property (C) in (2.22) of the coefficientŝ β ii and we obtain from Eq. (4.15) by applying Cauchy-Young's inequality that Summing up inequality (4.16) from n = 1 to K and noting that P h E u (t 0 ) = 0 then shows that for any K ∈ N with K ≤ N . By using now Lemma 4.1, there exists for any By E i q,n = Q i n − Q i n,h and the triangle inequality relation (4.19) implies that (4.20) Observing that E i u,n − P h E i u,n = U i n − P h U i n , inequality (4.20) proves (4.11). In the second lemma we restrict ourselves to the case that D = d I with some d > 0 is satisfied. An extension of the provided estimates to more general matrices D(x) still remains an open problem.  defined by (2.18)-(2.20). Further, let {u τ,h , q be the unique solution of the fully discrete problem (2.28), (2.29). Then, for any K = 1, . . . , N it holds that (4.21) Proof Introducing the projectors into the error equations (4.12)-(4.13) yields that is also satisfied for i = 0 and any n ≥ 1. Using this, we obtain by multiplying (4.23) withα ji and summing up the resulting identity from i = 0 to r that for any v h ∈ V h . We note that we changed the notation for the indices. By testing now (4.22) with w h = r j=0α i j P h E j u,n ∈ W h and (4.24) with v h = τ nβii P h E i q,n ∈ V h , we get by summing the resulting equations and using the inequalities of Cauchy-Schwarz and Cauchy-Young that for n = 1, . . . , N and i = 1, . . . , r . The inequality above further simplifies to (4.27) We now estimate the divergence of the flux. By testing (4.22) with w h = ∇ · h E i q,n ∈ W h , and using the inequalities of Cauchy-Schwarz and Cauchy-Young (β ii > 0 for all i = 1, . . . , r ) we get that which proves the assertions of the lemma. Now we combine the inequalities of the previous lemmas to estimate the error between the semidiscrete and the fully discrete solutions in the norms of L 2 (I ; W ) and L 2 (I ; V ).
be the solution of the fully discrete problem (2.28), (2.29). For the scalar variable u τ it holds that (4.28) For the vectorial variable q τ it holds that Further, for D(x) = d I, for some d > 0, it holds that q τ − q τ,h L 2 (I ;L 2 ( )) ≤ ch p+1 (4.30) and The constant c does not depend on the discretization parameters h and τ .
Proof By using (2.25) and recalling that E 0 u,n = E u (t n−1 ) we find that (4.32) By inequality (4.11) we get for the first term on the right-hand side of (4.32) that (4.33) Using (4.11) again, we find for the second term on the right-hand side of (4.32) that for K = 1, . . . , N . Combining now (4.32) with (4.33) and (4.34) and using the approximation properties (4.9)-(4.10) of the projection operators we then get that This proves (4.29). By using (2.25), recalling that E 0 q,n = E q (t n−1 ) and applying the boundedness of the L 2 projection operator P h we find that (4.37) For D(x) = d I the second term on the right-hand side of (4.37) can be bounded from above by means of the inequality (4.21) along with the observation that ( for K = 1, . . . , N . Finally, combining (4.37) with (4.33) and (4.38) and using the approximation properties (4.9)-(4.10) of the projection operators we then get that (4.40) The assertion (4.31) then follows from (4.40) combined with (4.21) and the approximation properties (4.9)-(4.10).
We remark that the inequalities (4.30) and (4.31) provide an error control for the spatial discretization in the Gaussian quadrature points or temporal degrees of freedom of the subintervals I n with respect to the norm of L 2 ( ) and V , respectively. For an error control with respect to the norm of L 2 (I ; V ) or L 2 (I ; V ) a further estimate of E 0 q,n is required which remains an open problem.

Error estimates for the error between the continuous and the fully discrete solution
In this section we combine the results of Theorems 3.8 and 3.10 with the estimates of Theorem 4.5 to prove the convergence of the fully discrete scheme.
The constant c in (4.41)-(4.43), respectively, does not depend on the discretization parameters h and τ .
Proof By using the triangle inequality, Theorems 3.10 and 4.5 it follows that where sufficient regularity of the continuous and semidiscrete solution with appropriate upper bounds for the solutions (cf. Theorems 3.10 and 4.5) is assumed. The inequality (4.42) is obtained similarly. The estimate (4.43) can be concluded in the same way by using now the result of Theorem 3.8 instead of Theorem 3.10.
Remark 4.7 • The error estimate (4.43) is optimal in time and space. The assumption of an interpolated right-hand side function (2.21) can still be dropped even though this is not explicitly done in this work. It requires to estimate the error between the exact form of cGP(r ) defined in (2.14), (2.15) and the fully discrete solution.
In this case the arguments used to prove Theorem 4.5 have to be augmented by an estimate of the interpolation error for the right-hand side function, similarly to the proof of Theorem 3.10. analyze if the estimates can still be sharpened to order r +1. In our numerical study presented in Sect. 5 convergence of order r + 1 will be observed for the temporal discretization of the scalar and the flux variable. Moreover, this is even observed in the (spatially) stronger norm of L 2 (0, T ; V ) instead of L 2 (0, T ; L 2 ( )) for the flux variable.

Numerical studies
In this section we present numerical studies in order to illustrate the error estimate given in Theorem 4.6 for the fully discrete scheme (2.28), (2.29) combining a variational time discretization with the MFEM. Moreover, we analyze the robustness of the convergence behaviour with respect to random perturbations of the meshes. Thereby we mimic mesh distributions of applications that are of practical interest. Additional convergence studies for variational space-time discretizations of the proposed type as well as for discontinuous time discretizations can be found in [15,37] for parabolic problems and in [36,37] for variational space-time discretizations of wave equations. In [15,37] the efficient iterative solution of the resulting algebraic system of Eqs. (2.28), (2.29) along with the construction of appropriate preconditioning techniques is carefully addressed. In the literature, further computational studies of variational time discretization schemes are presented also for different kind of flow and transport problems in, e.g., [1][2][3][30][31][32]38,46]. In order to determine the space-time convergence behavior we consider in our numerical study the cGP(2)-MFEM(2) approach. That is (2.14)-(2.15) with r = 2 combined with the mixed finite element method MFEM (2)

Uniform meshes
To determine the experimental orders of convergence the space-time mesh is refined uniformly by a factor of two in each of the space dimensions and in the time dimension.
The characteristic mesh numbers are provided in Table 1. We summarize the calculated errors and their experimental order of convergence (EOC) for the proposed space-time discretization in Table 2 and further illustrate them in Fig. 1. The numerical results confirm the expected third order rate of convergence established in Theorem 4.6 (cf. also Remark 4.7) for the discretization in the space-time domain with polynomial order r = 2 and p = 2, respectively, in the definition of the underlying finite element spaces. We note that the theoretically optimal order of convergence in time and space is obtained for the primal and the flux variable. Thus, the estimate (4.42) might be suboptimal with respect to the time discretization; cf. Remark 4.7. The estimate (4.43) is nicely confirmed by the presented numerical results. Further, we note that the optimal rate of convergence is obtained for the spatial discretization of the flux field in the norm of V . In this point the family of Raviart-Thomas pairs of mixed finite elements is superior to the family of Brezzi-Douglas-Marini pairs of mixed finite elements (cf. [17]) for that the optimal order of convergence of the flux variable can be obtained only in the norm of L 2 ( ).

Distorted meshes
In the second part of the numerical convergence studies we approximate the same analytical solution as before but we use spatial meshes with randomly distorted interior  vertices. Precisely, in each of the computations we start on a coarse mesh and do uniform refinement steps by halvening the spatial mesh width. On the thus obtained finest mesh each of the interior vertices is distorted by a randomly chosen vector. The magnitude of the distortion vector is chosen randomly up to a given factor of relative length to the corresponding edge length. The characteristic numbers of the refinement levels are summarized in Table 3. The resulting distorted meshes are illustrated in Fig.  2 for the refinement level 3. The temporal mesh is chosen in the same way as in the first numerical experiment; cf. Table 1.
We summarize the calculated errors and the corresponding experimental order of convergence (EOC) for the proposed space-time discretization on the distorted spatial meshes in Table 4 for the scalar-valued primal variable and in Tables 5 and 6 for the vector-valued flux variable and further illustrate them in Fig. 3. Tables 5 and 6 differ by the norms in that the errors of the flux approximation are measured. In Tables 4 and 5 the expected order of convergence in space and time, for the primal variable measured in the norm of L 2 (I ; L 2 ( )) and for the flux variable measured in the norm of L 2 (I ; L 2 ( )), is largely confirmed even for the strongly perturbed meshes with a distortion factor of 25%. This nicely demonstrates the robustness of the numerical scheme. Solely in Table 6 a slight reduction of the experimental order of convergence is observed depending on the degree of mesh perturbation. On the randomly distorted meshes the quasi uniformity condition, that is typically assumed about the finite element meshes in the numerical analyses, deteriorates successively. We conjec- Table 2 Norm values and corresponding experimental orders of convergence in space-time for cGP(2)-MFEM(2) on the refinement levels as given in Table 1 Level e    on distorted meshes given in Table 3 Level on distorted meshes given in Table 3 Level ture that this impacts the convergence behavior in the stronger L 2 (I ; V ) norm more severely than in the L 2 (I ; L 2 ( )) norm. The higher sensitivity of the derivatives in the L 2 (I ; V ) norm with respect to the mesh perturbations seems to be quite natural. Nevertheless, we note that even though a strong random mesh perturbation is applied, a robust convergence behavior is still ensured and optimal order of convergence in the L 2 (I ; L 2 ( )) norm is obtained. Finally, we note that the space-time convergence studies on the distorted spatial meshes were done with exactly the same numerical solver settings as for the above-given studies on uniform meshes.   Table 3 6

Conclusions
In this work a numerical analysis of a family of variational space approximation schemes that combine continuous finite elements in time with the MFEM in space was presented for a parabolic prototype model of flow in porous media. The existence and uniqueness of the temporally semidiscrete and the fully discrete approximations were proved. Error estimates with explicit rates of convergence, including an optimal order error estimate, in natural norms of the scheme were established. The error estimates were illustrated and confirmed by numerical convergence studies. We believe that our analyses and techniques can be extended and applied to more sophisticated flow and transport processes in porous media or to incompressible viscous free flow. This will be our work for the future.
Firstly, we show the uniqueness of solutions to (A.1). In the sequel we denote by ϕ n, j = ϕ n, j (t) for j = 0, . . . , r the Lagrange basis functions in I n = (t n−1 , t n ] with respect to r + 1 quadrature points t n,l , l = 0, . . . , r . Here, we choose the Gauss-Lobatto quadrature rule that is exact for polynomials of maximum degree 2r − 1. In particular, for the quadrature nodes in I n it holds that t n,0 = t n−1 and t n,r = t n . Then, any function u 0 τ ∈ X r 0 (H 1 0 ( )) and its time derivative admit the representation for all t ∈ I n with coefficient functions U j n ∈ H 1 0 ( ) for j = 0, . . . , r . Proof Let u 0 τ,1 , u 0 τ,2 ∈ X r 0 (H 1 0 ( )) denote two solutions of the semidiscrete variational problem (A.1). We put u 0 τ (t) := u 0 τ,1 − u 0 τ,2 . We choose the test function w τ := A −1 ∂ t u 0 τ + μ∂ t u 0 τ for some fixed parameter μ ≥ 0. By means of (A.2), it holds that w τ ∈ Y r −1 (H 1 0 ( )). For this choice of w τ it follows that This implies that u 0 τ = 0 and, consequently, that u 0 τ,1 = u 0 τ,2 . The uniqueness of solutions to (A.1) is thus established.
We remark that testing Eq. (A.1) with v τ = A −1 ∂ t u 0 τ or v τ = ∂ τ u 0 τ would already be sufficient for proving the uniqueness result. Further, the symmetry of a(·, ·) is essential in the previous proof. A generalization of the arguments to problems with nonsymmetric bilinear forms, for instance to convection-diffusion equations, still remains an open problem.
The existence of a solution to the semidiscrete problem (A.1) follows from the uniqueness of the solutions. Using the eigenspaces of A, problem (A.1) can be reduced to a set of finite dimensional problems, for each of which obviously uniqueness implies existence. For this we recall the following result from [ , u 0 and f as the be satisfied. Then the semidiscrete problem (A.1) admits a solution u 0 τ ∈ X r 0 (H 1 0 ( )).
Proof The operator S := A −1 : L 2 ( ) → L 2 ( ) with A being defined in (2.1) is a bounded, linear compact operator mapping L 2 ( ) into itself. By means of Lemma A.2 there exists a set of appropriately scaled eigenfunctions {w k } ∞ k=1 ⊂ L 2 ( ) with w k ∈ H 1 0 ( ) such that {w k } ∞ k=1 is an orthogonal basis of H 1 0 ( ) and an orthonormal basis of L 2 ( ).
In terms of these eigenfunctions {w k } ∞ k=1 ⊂ H 1 0 ( ) the solution u 0 τ of problem (A.1) can be represented as By means of the approximation properties (3.16) to (3.18) we then get that Combining this estimate with the triangle inequality yields the assertion of Theorem 3.10.