A Posteriori Error Analysis for Implicit–Explicit hp-Discontinuous Galerkin Timestepping Methods for Semilinear Parabolic Problems

A posteriori error estimates in the L∞(H)\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$L_\infty (\mathcal {H})$$\end{document}- and L2(V)\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$L_2(\mathcal {V})$$\end{document}-norms are derived for fully-discrete space–time methods discretising semilinear parabolic problems; here V↪H↪V∗\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ \mathcal {V}\hookrightarrow \mathcal {H}\hookrightarrow \mathcal {V}^*$$\end{document} denotes a Gelfand triple for an evolution partial differential equation problem. In particular, an implicit–explicit variable order (hp-version) discontinuous Galerkin timestepping scheme is employed, in conjunction with conforming finite element discretisation in space. The nonlinear reaction is treated explicitly, while the linear spatial operator is treated implicitly, allowing for time-marching without the need to solve a nonlinear system per timestep. The main tool in obtaining these error estimates is a recent space–time reconstruction proposed in Georgoulis et al. (A posteriori error bounds for fully-discrete hp-discontinuous Galerkin timestepping methods for parabolic problems, Submitted for publication) for linear parabolic problems, which is now extended to semilinear problems via a non-standard continuation argument. Some numerical investigations are also included highlighting the optimality of the proposed a posteriori bounds.


Introduction
Galerkin finite element methods are popular for the numerical solution of various classes of partial differential equations (PDEs) and related initial/boundary value problems due to their flexibility and accuracy in dealing with various challenging solution properties and computational domain geometries. Owing to their variational structure, Galerkin methods are known to be suitable for mesh adaptivity in an effort to provide sufficient resolution for localised solution features without the introduction of excessive overhead of numerical degrees of freedom. Indeed, adaptive finite element methods for complex nonlinear problems is an active area of research. In their more mathematically justified settings, adaptive algorithms are driven by rigorous, computable bounds on the error residual, the so-called a posteriori error bounds.
Nevertheless, the study of nonlinear time-dependent PDE problems continues to necessitate further research, as a number of important challenges are yet still to be addressed. A central such challenge, in our view, is the development of a posteriori bounds for arbitrary order explicit and implicit-explicit time-stepping methods for nonlinear evolution PDEs, especially treating fully-discrete numerical schemes, in an effort to drive adaptive algorithms. Adaptivity is crucial for computational complexity reduction in this context due to the wealth of potential non-linear features that may appear in the solutions of such PDE problems (interfaces, pattern formation, travelling waves, etc); we refer to [12] for a recent directly related application. Indeed, we are aware of an extremely limited number of works discussing rigorous a posteriori error bounds for explicit time-stepping methods for linear evolution problems [25,26]. The challenge posed by explicit (or implicit-explicit) timestepping methods in the context of rigorous a posteriori error control is the careful construction of an "implicit perturbation" of the explicit scheme for which we can construct suitable, optimal order, reconstructions that, in turn, can be naturally inserted into the original PDE to construct residuals.
A posteriori error analysis for stationary and evolution problems has been widely investigated over the last 30 years or so, and significant developments have been achieved, see, e.g., the treatises [1,51] and the references therein for elliptic problems, the works [11,[17][18][19]22,27,29,35,36,38,40,42,43,50] for space-discrete or fully-discrete parabolic problems, or [2,[4][5][6]26,34,41,46] for a posteriori error bounds for time semi-discretisations of evolution PDEs; the above lists contain the results most relevant to this work from the growing literature in the area and are, of course, far from constituting a complete bibliographical account.
Discontinuous Galerkin (dG) timestepping for parabolic problems is classical [15,16,23,31,37,39,45,48]. These methods have received considerable interest in the context of spacetime adaptivity throughout the years, as they offer a variational, arbitrary order timestepping framework and, crucially, allow for locally variable timestep sizes in different spatial regions of the computational domain. This is an important attribute towards the aim of full spacetime adaptive numerical methods, which was already recognised in [18,19]. During the last 10 years or so, there has been a revived interest in the derivation of rigorous a posteriori error bounds for dG timestepping schemes [20,21,24,28,41,46].
Nonlinear reaction-diffusion parabolic problems are abundant in the literature, especially in the modelling of biological processes and of population dynamics. In such applications, a variety of norms is often required, especially in the context of model parameter calibration from real data. Therefore, the availability of a posteriori bounds of the error in various norms, with different rates of convergence, is desirable to drive space-time adaptive algorithms. To the best of our knowledge, however, there are no previous results on a posteriori error bounds for implicit-explicit high order fully-discrete methods involving dG-timestepping for nonlinear evolution PDEs. This is in contrast with the increasing number of interesting works on a posteriori error analyses for low order time-stepping schemes for nonlinear evolution problems; see, e.g., [8,10,12,17,36,52] and the references therein. At the same time, there exist only few works on the a posteriori error analysis of high order time-stepping schemes for time-discrete nonlinear parabolic problems [30,33,34,41,46]. This work is concerned with the derivation of conditional a posteriori error bounds in the L ∞ (H)and L 2 (V)-norms for fully discrete implicit-explicit (IMEX) methods of variable order for semilinear parabolic problems; here V → H → V * denotes a Gelfand triple setting for an evolution PDE. Typical examples for which the results presented hold are H = L 2 ( ) and V = H 1 0 ( ) or V = H 2 0 ( ). The nonlinear reaction term is assumed to be locally Lipschitz and satisfying a growth condition in the spirit of [49]. Such growth conditions allow for the unified treated also of PDE systems of reaction-diffusion-convection type [9,12]. The time discretisation consists of an hp-version discontinuous Galerkin method treating implicitly the linear spatial operator, and of an explicit multistep method for the nonlinear reaction term. This is combined with the standard conforming finite element method used for the spatial discretisation. The dG-multistep IMEX time discretisation we consider in this work was introduced in [23], whereby a priori error bounds were proven for the case of globally Lipschitz nonlinear reactions. To reduce the computational overhead, the nonlinear reactions are treated explicitly via sufficiently high-order interpolation of solution values from previous timesteps [23]. Therefore, the solution of one linear system per timestep is required. The proof combines the recent space-time reconstruction proposed in [28] for the implicit dG discretisation, along with a suitable implicit perturbation of the explicitly discretised nonlinear reaction part in the spirit of [25,26]. The treatment of the non-Lipschitz nonlinearity involves a continuation argument in the spirit of [8][9][10] along with suitable Sobolev imbeddings. The resulting a posteriori error bound is of conditional type, i.e., it is valid subject to an a posteriori smallness condition being satisfied. Such conditional estimates are typical for strongly nonlinear problems [8][9][10]12,33], i.e., problems whose nonlinearities do not satisfy strong monotonicity and/or global Lipschitz conditions. In this work, we use energy arguments for the proof a posteriori error bounds, as Sobolev imbeddings are sufficient for control of the nonlinear terms. For the case of blow-up problems an alternative technique based on Duhamel's principle combined with L ∞ -control bootstrapping arguments in the time variable is also available; we refer to [30,33,34] for time-discrete results in this vein.
Crucially, no a priori Courant-Friedrichs-Lewy (CFL) type conditions (with the respective often obscure constants involved) will be required for the validity of our a posteriori error bounds for explicit timestepping methods (cf., also [25,26]). Indeed, for unstable combinations of local spatial and temporal meshsizes, the a posteriori estimator remains reliable. In fact, this remarkable property motivates the study of a posteriori estimation of CFL constants as a non-standard potential use of rigorous a posteriori error upper bounds for (implicit-)explicit methods; this will be discussed elsewhere.
The remainder of this work is organised as follows. In Sect. 2 we introduce some notation and define the space-time scheme. In Sect. 3 we introduce the space-time reconstruction operators and state the corresponding error bounds. The a posteriori error analysis for fullydiscrete semilinear parabolic equations in L ∞ (H) and L 2 (V) norms is presented in Sect. 4. In Sect. 5 we present a set of numerical examples for both linear and semilinear test problems investigating the performance of the a posteriori error bounds, while, in the last section, we draw some conclusions.

Abstract Setting
For H a real Hilbert space and I = [a, b] ⊂ R, the Bochner space L p (I; H) is defined by L p (I; H) := {v : I → H such that v L p (I;H) < ∞}, with the respective norm given by Upon denoting by v the (weak) derivative of v with respect to the "time"-variable t ∈ I, we can also define the Sobolev-Bochner spaces another Hilbert space with norm · V and let V * denote its the dual space defined by the functions z for which the norm is finite; the spaces V, H and V * form a, so-called, Gelfand triple with the duality pairing (·, ·) V * ×V extending the inner product (·, ·) H , in the sense that, for all u ∈ H and v ∈ V holds (u, v) V * ×V = (u, v) H . The subscript V * × V in the duality pairing will be omitted whenever no confusion is likely to occur. Although we shall work within the above abstract setting, typical cases include H = L 2 ( ), V = H 1 0 ( ), giving V * = H −1 ( ), or V = H 2 0 ( ), giving V * = H −2 ( ). We consider the semilinear parabolic initial value problem: for some known function u 0 ∈ H, where A : V −→ V * is a linear elliptic operator, which is continuous and coercive with respect to the norm of V. We also define the bilinear form which inherits the continuity and coercivity properties of A, viz., with C cont , C coer positive constants independent of w, v.
The function f : I × R d × R → R is smooth and locally Lipschitz-continuous, bounded in the first two arguments and satisfying the growth condition for the third argument: for all z 1 , z 2 ∈ R with | · | denoting the Euclidean distance, for a positive constant C, uniform with respect to the first two arguments. In what follows, we shall often suppress for brevity the dependence of f on its first two arguments writing, therefore, f (t, x, w) = f (w). Generalisations of the above assumptions in the first two arguments are possible in the context of certain Caratheodory-type conditions, but we refrain from discussing these in the interest of simplicity of the presentation. Such growth conditions allow for the unified treatment of PDE systems of reaction-diffusion-convection type [9,12], as in such cases it is either very difficult or, even impossible to deduce positivity/monotonicity properties from complicated reaction patterns. We stress, however, that the results proven below hold subject to restrictions on the range of the exponent r ≥ 0, depending on the particular choices of the triple (1) and on the dimension of the spatial computational domain ⊂ R d .

Space-Time Galerkin Spaces
Let I = [0, T ] be the time interval with final time T > 0 and, for 0 = t 0 < t 1 < · · · < t N = T , consider the partition {I n , n = 0, . . . , N } of I into subintervals I n := (t n−1 , t n ] for n = 1, . . . , N , and I 0 := {0}, with corresponding timesteps k n := t n −t n−1 , n = 1, 2, . . . , N . We also consider a finite sequence {V n } N n=0 of conforming finite element subspaces of V, each associated with the time subintervals I n . To account for mesh-change effects, we also define the largest common subspace V n := V n−1 ∩ V n , for all n = 1, . . . , N . Let H be a Hilbert space. We define We also consider the space-time finite element subspace X n := P r n (I n ; V n ), for all n = 0, 1 . . . , N with r n denoting the local temporal polynomial degree, which may vary from one timestep to another. Collecting the latter for all time-steps, we define the space-time Galerkin space often suppressing the dependence on the polynomial degree vector r := (r 1 , r 2 , . . . , r N ) for brevity. Moreover, for a piecewise continuous function v : I ⊂ R → H, with the time nodes t n as possible points of discontinuity, we define the time-jump: where v ± n := lim δ→0 + v(t n ± δ), the respective one-sided (right and left) limits for n = 0, 1, . . . , N .
For every n = 0, 1, . . . , N , we introduce the V * -projection operator P n : V * → V n defined by and the corresponding projection operator P n defined by (P n v, χ) H = (v, χ) V * ×V for all χ ∈ V n , respectively. Also, we define the elliptic projection operatorP n : withP n the respective elliptic projection onto V n . For w ∈ H, we define the time lifting operator L n : H → P r n (I n ; H), by If W ⊂ H is a linear subspace of H, we have the property for more details, we refer to [46].

Space-Time Finite Element Methods
Set U − 0 :=P 0 u 0 . Then, the fully-discrete implicit time discontinuous and spatially conforming Galerkin approximation of the exact solution u of (2) reads: find U ∈ X such that for all V ∈ X n and for n = 1, . . . , N , where we recall that [U ] n = U + n − U − n . The space-time method (10) is fully implicit in the sense that a nonlinear system of equations for the numerical degrees of freedom has to be solved to advance one time interval.
Aiming for a linearly implicit method, we follow [23] and we replace f (U ) in (10) by its linear interpolant in time f (U ), defined so that f (U )| I n ∈ P 2r n (I n , V n ), for all n = 1, . . . , N , using values of U from previous time intervals I m , m < n only and extrapolating the resulting interpolant into I n . In this case, the solution process will result in a linear system for U per time-step, giving rise to an implicit-explicit (IMEX) method. Of course, one can also interpolate on the previous and the current time intervals I m , m ≤ n. This case will lead to a nonlinear system of equations for U , although it can be potentially more easy to implement for certain nonlinearities f . In both cases, the time interpolant f (U ) can be represented on each I n as where λ n− j , j = 0, 1, is the interpolation operator for polynomials of degree λ at the nodes t n− j−λ , . . . , t n− j and χ l the respective Lagrange basis functions. The corresponding IMEX space-time scheme reads: set U − 0 :=P 0 u 0 and find U ∈ X such that for all V ∈ X n , for n = 2r 1 + j, . . . , N . Of course, as this is a multistep method, we can only use it after a certain number of time-steps, depending on the order of the method. Without potential loss of convergence rate, however, we can consider a few (very small in size) timesteps with the zeroth order method, i.e., the implicit Euler method with explicit treatment of the nonlinear reaction, before using (12) with higher order than zero. The interpolant degree λ = 2r n is required to represent exactly the integrand is a product of two polynomials of degree r n each with respect to the time variable. Finally, for j = 1, we arrive at the IMEX method, while, for j = 0, we retrieve the fully implicit scheme; for further details we refer to [23]. Note that the values U − l are known to be points of superconvergence for the respective time-discrete problem [3,32].
Despite the specific choices discussed above, in what follows, we shall endeavour to be general with respect to the particular approximation of the nonlinear term. To that end, we shall refrain from using specific properties of any particular interpolant/extrapolant used in the proof of the a posteriori error bounds below, in an effort to be versatile in the choice of linearisation. Indeed, the a posteriori error bounds given below will involve the computable

Reconstructions
We now discuss the space-time reconstruction technique proposed in [28] for the respective linear problem, which is a modification of the concepts of elliptic reconstruction for the spatial discretisation [35,40] and of the dG-timestepping reconstruction presented first in [41], and further analysed in the hp-setting in [46].

Time Reconstruction
The time reconstructionŴ ∈ H 1 (0, T ; H) of a time-discrete function W ∈ P r (I; H) is defined for each I n , n = 1, . . . , N , by the conditionŝ andŴ The time reconstructionŴ is well-defined: we have r n +2 unknowns per time interval I n and r n + 1 conditions from (14) and one more condition from (15). It is also unique; we refer to [41, Lemma 2.1] for a proof of the uniqueness, which also shows that the time reconstruction is also globally continuous with respect to the time variable. Equivalently, using the lifting operator (8), we can defineŴ | I n ∈ P r n+1 (I n ; H) on each time interval I n , n = 1, . . . , N , bŷ where we recall that W − 0 := u 0 .
Proof The proof of (17) first appeared in [41, Lemma 2.2]; the formula for K n was further refined to be explicit in the dependence on r n in [46, Theorem 2].

Elliptic Reconstruction
For each conforming finite element space V n ⊂ V, we define the respective discrete elliptic operator A n : V n → V n to be the unique linear operator such that withÛ denoting the time reconstruction of the numerical solution U . The relation (19) can be written in point-wise form as AŨ ( From the definition of A n and from (19), we have (20) and, hence, we have at each t ∈ I n . That is, U is the elliptic projection of the elliptic reconstructionŨ . In other words, U is the approximate solution of the elliptic problem whose exact solution is the elliptic reconstruction functionŨ . Therefore, a crucial consequence of this construction is the ability to estimate the differenceŨ − U by a posteriori error estimators for elliptic problems in various norms available in the literature. As we prefer to keep the exposition independent of specific choices of a posteriori error bounds for elliptic problems, we opt for merely postulating their existence.

Assumption 3.2 (Elliptic a posteriori error bounds)
Let w ∈ V be the exact solution of the elliptic problem Aw = g, with respective boundary conditions, and let W ∈ V h ⊂ V be the finite element solution of this problem in a finite element space V h . We assume that there exist a posteriori error bounds The literature for such elliptic a posteriori error bounds is vast; see, e.g., [1,51] and the references therein. In particular, Assumption 3.2 will imply the validity of the estimates among other things; see Proposition 4.4 below for details.
Using (14) and (20), the IMEX method (12) can be re-written on I n as for all V ∈ X n , for n = 1, . . . , N or, equivalently, in strong form aŝ noting carefully the cancellation of the terms P nÛ in the course of the calculation.

Remark 3.3 (mesh change error via elliptic reconstruction)
The elliptic reconstruction (19) includes the mesh-change type term P nÛ −Û , in contrast to the (standard) elliptic reconstruction proposed in [35]. In fact, it is the high order counterpart of the elliptic reconstruction presented in [11, Definition 6.1] for backward Euler timestepping. Indeed, on each I n we have, respectively, By construction [41,46] there exists a polynomial κ n of degree r n on I n such that

A Posteriori Error Bounds
Upon defining the space-time reconstruction w :=Û , i.e., the time-reconstructed elliptic reconstruction, we begin by decomposing the error as Note that σ is the time reconstruction error which can be estimated using Proposition 3.1. Similarly, is the elliptic reconstruction error and, therefore, can be estimated using Assumption 3.2. Thus, it remains to estimate ρ by quantities involving the problem data and/or σ and . To do so, we shall work with energy estimates, in conjunction with a continuation argument to treat the non-Lipschitzian nonlinear reactions.

Error Equation
Subtracting (25) from (2), we obtain after elementary manipulations on I n , for n = 1, . . . , N . For brevity, we set P : [0, T ] → V, defined as P| I n = P n , n = 0, . . . , N . Testing (27) against ρ, integrating in space and in time between 0 to t ∈ I m , for some m = 1, . . . , N , we deduce noticing that ρ(0) = 0 by construction, and upon defining Dz to be the time-wise broken derivative of a piecewise smooth function z subordinate to the temporal subdivision. Employing the coercivity (5) and continuity (4) of a, along with standard inequalities, the last estimate implies for any γ > 0. Selecting now γ = 1/2 in (29), we arrive at We shall now estimate each term on the right-hand side of (30) separately.

Estimating the Nonlinear Term
We decompose the integrand in the nonlinear term in (30) as As we shall make use of the Sobolev Imbedding Theorem, we first discuss the specific choice H = L 2 ( ) and V = H 1 0 ( ); the case of non-essential boundary conditions also follows without any technical challenge, although it is omitted here for brevity.
Proof Using the growth condition (6), we have, respectively, For the first term on the right-hand side of (33) we use the inequality thereby, deducing Recalling the assumption 0 ≤ r < 2, Hölder's inequality with exponent p = 2/r , (and, thus, using the Sobolev Imbedding Theorem ρ L 4/(2−r) ( ) ≤ C S ∇ρ L 2 ( ) , with 0 ≤ r < 2 for d = 2 and 0 ≤ r ≤ 4/3 for d = 3. Similarly, we have the same estimate (36), with ρ replaced by σ and . Now, the second term of (33) can be dealt with as follows Combining the above estimates, we arrive at the required bound.
To retain the abstract and more compact notation from the previous section, we write (32) as follows for some known positive scalar function G and and constant C nl and we assume its validity henceforth for any suitable H and V for the given range of exponents r ∈ [0, r max ].

Continuation Argument
The bound of the nonlinear term (38) still contains norms of ρ on the right-hand side. To eliminate these, we shall employ a continuation argument in the spirit of [8][9][10].
To this end, assuming (38), or using Lemma 4.1, to bound the respective term on the right-hand side of (30), we arrive at where Upon observing that we deduce for C 1 := C nl max{1, (2C −1 coer ) 1+r /2 }. For each n = 1, . . . , N , consider the interval where we set F(t n , U ) := exp C nl t n 0 G(U ) dt , for brevity. We observe that J n = ∅ as ρ 2 L ∞ (0,t;H) + C coer 2 t 0 ρ 2 V dτ is continuous with respect to t and that it is equal to zero for t = 0, owing to the property ρ(0) = 0; also, J n is closed.
Assuming, without loss of generality, that r > 0, (for, otherwise, f in (2) is globally Lipschitz continuous and, thus, the a posteriori bounds follow by combining the results from [28] along with a standard Grönwall inequality; see Corollary 4.11 below for details) we set t := max J n > 0.
Suppose that t n > t , i.e., t n / ∈ J n . Hence, E 1 (t n ) ≥ E 1 (t ). Therefore, (42) with t = t yields and Grönwall inequality, thus, implies since F(t n , U ) ≥ F(t , U ). Upon assuming that E 1 (t n ) is such that the estimate (44) becomes this is a contradiction, as t was assumed to be the maximum element of J n . Hence, t n = t and, thus, we have already proven the following result.
with E 1 (t n ) as in (40), we have the bound We observe that the condition (46) in the estimate above is computable, provided that E 1 (t n ) is computable. With this in mind, we shall bound the norms of σ and in E 1 by computable quantities below. Assuming that σ and are available, we note that all the remaining constants involved in E 1 (t n ) are either computable or estimable from above. For instance, the Poincaré/Sobolev imbedding constants for which upper bounds are available for very general spatial domains ; we refer to [47] and the references therein for explicit formulas and a detailed discussion. If E 1 (t n ) is computable, then (47) becomes an a posteriori bound for ρ. Triangle inequality, then, would already yield an a posteriori bound for the error e. Of course, we expect that E 1 (t n ) decreases arbitrarily as the maximum timestep and spatial meshsize decay and/or the order of the dG-timestepping increases. We note, finally, that such conditional estimates are the "a posteriori equivalents" to the standard smallness assumptions on timestep and meshsize appearing in a priori error bounds for finite element methods for nonlinear evolution problems.

Remark 4.3
Crucially, there is no explicit CLF-type restriction in the statement of Lemma 4.2, despite this being concerned with an IMEX discretisation. Indeed, for unstable combinations of timesteps and spatial meshsizes, the bound (47) remains valid, provided the condition (60) is satisfied. It is, therefore, conceivable that (60) holds for CFL-unstable scenarios also; in such cases, (47) will remain valid, resulting to arbitrarily large right-hand sides, c.f., also [26].

Estimating the Norms of and of
and with Proof The first estimate is obvious from Assumption 3.2.
For the second, we work as follows. The definition of time reconstruction (16) implieŝ Now, observing the identity, which is valid for all v ∈ V, we have the Galerkin orthogonality property Thus, we can estimateŨ − U using Assumption (3.2) to deduce Next, working as in Remark 3.3, we have The result already follows by resorting once more to Assumption 3.2.
which, together with Assumption 3.2 give rise to the estimate (18) in Proposition 3.1, we also have which, working as above, gives the second estimate.
For an alternative bound, we refer to [28,Lemma 4.4].

Remark 4.6
If no mesh modification takes place, i.e., when V n−1 = V n , the above estimates simplify considerably, since we then have It is possible to avoid invoking to known a posteriori error bounds for elliptic problems as per Assumption 3.2 for the estimation of norms of σ . Instead, we can prove directly alternative bounds upon assuming the existence of standard (possibly rough) approximation (e.g., Clément, Scott-Zhang or standard interpolation estimates, depending on the choice of H, V,) or smoothness estimates (necessary for a duality argument) for lower order norms. Crucially, the estimates below do not require the computation ofP n . Proposition 4.7 (Direct bounds on norms of σ = w −Ũ ) Let t ∈ I n , n = 0, 1 . . . , N , and assume that for every v ∈ V, there exists an approximant V ∈ V n , such that h −s for some s > 0 with h V n a (smooth enough) positive scalar function representing the local spatial mesh-size of the approximation space V n . Then, we have the estimates Let further a subspace ofṼ of V so that Az ∈ H for z ∈Ṽ and such that it contains V n . Assume the existence of z ∈Ṽ, so that z is the solution to the problem a(w, z) = ([Ũ −U ] n−1 , w) for all w ∈ V and that z Ṽ ≤ C sm [Ũ −U ] n−1 H . Moreover, assume that for somes > 0. Then, for p ∈ {2, ∞}, we have the estimate , and using the simultaneous orthogonality of the L 2 -projection errors against V n , respectively. Thus, we conclude the identity for any v ∈ V and V ∈ V n , by invoking to (55). Upon observing the (trivial) inequalities C coer w V ≤ Aw V * ≤ C cont w V , for all w ∈ V, and using Proposition 3.1, we deduce Selecting now v = [Ũ ] n−1 , in (56), and using the coercivity of a along with standard arguments, we deduce from which (53) already follows. We continue by proving an upper bound for σ L p (I n ;H) , for p = {2, ∞}. Proposition 3.1 implies Employing now the dual problem as per the statement, we have from (56) with A| I n := A n , n = 1, . . . , N . The result already follows upon invoking to the approximation and smoothness estimates postulated in the statement: and by the triangle inequality The constants appearing in the bounds in Propositions 4.4 and 4.5 (or 4.7) are standard in the a posteriori error analysis literature, and depend typically on the shape-regularity of each mesh and on the spatial geometry. It is possible to estimate these constants from above in typical settings, e.g., when V ≡ H 1 0 ( ) and H ≡ L 2 ( ) or V ≡ H 2 0 ( ) and H ≡ L 2 ( ). We refer, e.g., to [13,14,47] for some results in this direction.
Using Propositions 4.4 and 4.5 we can bound the term E 1 (t n ) given in (40) by E 1 (t n , U ) defined as with K | I n := K n , ζ S | I n := ζ S,n , and θ S | I n := θ S,n , for n = 1, . . . , N and S ∈ {H, V, V * }. We can also use instead the direct bounds on norm of σ from Proposition 4.7, by replacing θ S byθ S .

Remark 4.8
We note that an alternative and somewhat simpler a posteriori error analysis for the L 2 (V)-norm error only is possible by modifying the space-time reconstruction discussed above, by considering simply the time-reconstruction for the time-derivative and the elliptic reconstruction for the spatial terms. This would result to simpler bounds for the L 2 (V)-norm and avoids completely the need to introduce the -projections. The details can be found in [28]. Crucially, however, this approach would lead to suboptimal bounds in the L ∞ (H)-norm. Thus, we have opted for not presenting this additional error analysis here for L ∞ (H)-norm error is typically a quantity of practical interest in this context of semilinear PDEs with non-Lipschitz growth.
We are now in a position to finalise the a posteriori error analysis.

Completing the a Posteriori Error Bounds
We are now ready to complete the a posteriori error analysis.
for n = 1, . . . , N , with E 1 (t n , U ) given in (59), we have the a posteriori error bound Proof We begin by observing that the proof and the statement of Lemma 4.2 holds with E 1 (t n , U , σ, ) replaced by E 1 (t n , U ). Then, triangle inequality implies with E 1 (t n , U ) given in (59).

Proof
The proof follows, again, by triangle inequality, Lemma 4.2 and Propositions 4.4 and 4.5 (or 4.7).
We now briefly discuss the practical relevance and applicability of the derived conditionaltype a posteriori error estimates. As noted above the constants involved in the condition (60) are either available or estimable from above. This renders the numerical verification of (60) practical. A second question is the realisability of the condition (60) within a space-time adaptive algorithm. Assuming that the algorithm is able to modify the local time-stepping, (60) will be satisfied if the norms involved in E 1 (t n ) are shown to simply converge to zero upon spatial and/or temporal refinement. A simple inspection of the terms involved shows that this is, indeed, the case upon additional regularity assumptions on the exact solution and on f ; we refer to [23,44] for such a priori error bounds.
The algorithmic details on how one can implement such conditional estimates within adaptive algorithms is a rich subject in itself and will not be covered here. For instance, for nonlinearities with sufficiently strong growth, it may be necessary to restart the adaptive algorithm with smaller space and time discetisation resolutions for the condition to be satisfied. We refer to [10,12], where a number of such algorithms using conditional estimates and related challenges in their design are presented and discussed in detail.
Finally, we provide the respective result for the simpler case of a globally Lipschitz nonlinearity, i.e., when r = 0, for completeness. Note that in this case, no smallness condition is required. The proof follows by inspection of the arguments presented above for r = 0, upon noting that, in this case, we can simply take C 1 = 0 in (42).  (38), for r = 0, we have the a posteriori error bounds

Numerical Experiments
We present a series of numerical experiments aimed at testing the reliability and efficiency of the a posteriori error bounds derived above. The numerical implementation is based on the deal.II finite element library [7] and the tests run in the high performance computing facility ALICE at the University of Leicester.
In the examples below we consider both linear and semilinear parabolic problems. In all cases, A = , i.e., the Dirichlet Laplacian, yielding the heat equation with either linear or nonlinear source terms and H = L 2 ( ), V = H 1 0 ( ), giving H * = H −1 ( ). We study the asymptotic behaviour in the L ∞ (L 2 )and L 2 (H 1 )-norms of the error and of the respective estimators by monitoring the evolution of the experimental order of convergence (EOC) over time on a sequence of uniformly refined space meshes indexed by the mesh size h. In each instance, we fix a constant time step k as some power of h and we also use fixed polynomial degrees in both space and time. The resulting errors and estimators are plotted against the corresponding space mesh size h. The EOC of a given sequence of positive quantities a i defined on a sequence of meshes of step size h i is defined by We report the EOC relative to the last computed quantities in all figures as an indication of the asymptotic rate of convergence. We also report the respective effectivity indices, i.e., the ratio between estimator and error for each instance. The estimator is deemed reliable if the effectivity is greater than or equal to one and it is most efficient when the effectivity is close to one.

Example 1: A Linear Problem
We test the IMEX fully discrete scheme analysed in this work on (2) with I × := [0, 1] × [0, 1] 2 , f independent of the exact solution u and initial and boundary conditions such that the exact solution is given by u(t, x, y) = sin(πt) sin(π x) sin(π y).
The respective a posteriori error bounds when the PDE is linear are given in Corollary 4.11. We report the results of two tests using different combinations of polynomial orders q and p in time and space, respectively, denoted as dG(q)-cG( p) scheme.

Example 1A: dG(1)-cG(2) Scheme
Here, we employ quasiuniform biquadratic elements in space ( p = 2) and uniform linear elements in time (q = 1), i.e., the dG(1)-cG(2) scheme. Figure 1 shows the convergence history with k = h (left plot) and with k = h 3/2 (right plot) for both the L ∞ (L 2 )and L 2 (H 1 )norms. In the case k = h, we observe that the L 2 (H 1 ) estimator provide the required order of convergence as EOC ≈ 2, in close agreement with the corresponding error; the effectivity is in between 2.90 and 8.93. Also the L ∞ (L 2 ) estimator yields the correct rate as EOC ≈ 3, with effectivity between 47.41 and 63.41. For the case k = h 3/2 , we again observe the expected order of convergence of the L 2 (H 1 )norm error and estimator, while for the L ∞ (L 2 )-norm we have an EOC of 4.64 and 4.72, respectively, corresponding to the convergence rate expected in time, thus indicating that the time discretisation error dominates in this case. The effectivity is approximately 5.28 and 7.16 for the L 2 (H 1 )and L ∞ (L 2 )-norm estimators, respectively.
The numerical results corresponding to k = h are shown in the left plot of Fig. 2. We observe that our error estimators provide the expected order of convergence in both the L 2 (H 1 )and L ∞ (L 2 )-norms.
The results obtained with the choice k = h 4/3 are reported on the right plot of Fig. 2. Again we observe an optimal experimental order of convergence as EOC ≈ 2 for both the L 2 (H 1 )-norm estimator and error. The respective experimental order of convergence of the L ∞ (L 2 )-norm estimator and error are EOC ≈ 4, corresponding to the optimal convergence rate with respect to the timestep size. In both cases, the estimators' effectivities show little differences with the corresponding values obtained in Example 1A and are, therefore omitted for brevity.

Example 2: A Nonlinear Problem
On I × := [0, 1] × [0, 1] 2 we consider the semilinear problem (2) with f = −u 2 + f (x, y, t), withf such that the exact solution is given by u(t, x, y) = sin(πt) sin(π x) sin(π y); (64) note that we have r = 1 in this case. We test the respective a posteriori error bounds from Theorems 4.9 and 4.10. We test the dG(1)-cG(2) scheme, by considering the two choices k = h and k = h 3/2 with corresponding numerical results in the left and right plots of Fig. 3, respectively. This results are in line with those of the linear example above. In particular, for k = h we again observe good agreement between the estimators and the corresponding errors, with EOC ≈ 2 and EOC ≈ 3 for the L 2 (H 1 )and L ∞ (L 2 )quantities, respectively. The results corresponding to k = h 4/3 are also confirming the theoretical asymptotic rate of convergence. For the L 2 (H 1 )-norm estimator and error we have EOC ≈ 2 and, similarly to the linear problem considered earlier, for the L ∞ (L 2 )-norm estimator and error we have EOC ≈ 4.5. The effectivity index was found to be between 1.07 and 12.18 in all computations.

Conclusions
A posteriori error bounds in the L ∞ (H)and L 2 (V)-norms for the hp-version dG timestepping scheme coupled with conforming finite elements in space for semilinear evolution problems have been derived and tested numerically. The numerical experiments show that the a posteriori error estimators are optimal, reliable, and efficient. An interesting aspect of the a posteriori analysis concerning implicit-explicit time stepping methods, is that no a priori CFL type conditions are required for the validity of the conditional a posteriori error bounds. Hence, the a posteriori estimators remain reliable even for unstable combinations of local spatial and temporal mesh sizes. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.