Numerical modeling and open-source implementation of variational partition-of-unity localizations of space-time dual-weighted residual estimators for parabolic problems

In this work, we consider space-time goal-oriented a posteriori error estimation for parabolic problems. Temporal and spatial discretizations are based on Galerkin finite elements of continuous and discontinuous type. The main objectives are the development and analysis of space-time estimators, in which the localization is based on a weak form employing a partition-of-unity. The resulting error indicators are used for temporal and spatial adaptivity. Our developments are substantiated with several numerical examples.


Introduction
Space-time methods for solving differential equations with Galerkin-type finite element methods go back to [50] and a recent state-of-the-art summary was compiled in [40].Space-time methods can be divided into two categories: the numerical solution and error estimation.In this work, we are primarily interested in the latter.After the previously mentioned early work, various problems have been considered in space-time formulations such as incompressible flow [61], first-order hyperbolic systems [15], elastic wave equation [31,32,34], visco-acoustic/visco-elastic wave equations [16], financial mathematics [28], the Biot equations in poroelasticity [6], and fluid-structure interaction [30,63,64,62,60,26].Advancements for the numerical solution by means of space-time methods, such as for example multigrid methods, were undertaken in [49,27,55] and with space-time domain decomposition [58].
Employing the dual-weighted residual method [8,9] for space-time goal-oriented error estimation comes with the challenge that the adjoint problem is running backwards-in-time.For nonlinear problems, this means that the primal solution must be available at the respective time points.This can be done by simply storing all primal solutions in the RAM (random access memory) or hard disk, or by using checkpointing techniques [56,43].
The main objective in this work is to combine space-time concepts from [57] with an easy-toimplement partition-of-unity localization proposed in [53].The latter was established for stationary problems, which is extended in this study to space-time error estimation.Two error estimators are proposed: joint and split.Because of Galerkin orthogonality, we need higher-order information in the adjoint problem for calculation of the primal residual estimator.There are different ways to achieve this.For stationary problems a mixed order approach is often used.There, we discretize the primal problem with the low order cG(s)dG(r) elements and the adjoint problem with high order cG(s+1)dG(r+1) elements.The notation was proposed in [23,20] and means that spatial discretization is based on cG(s) or cG(s + 1) continuous Galerkin finite elements respectively, where s ∈ N indicates the polynomial degree, while temporal discretization is based on dG(r), discontinuous Galerkin finite elements, where r ∈ N 0 indicates the temporal polynomial degree.
This approach needs interpolation operators to calculate the low order adjoint solution.In the equal low order approach both problems are discretized with low order elements and the high order solutions are obtained by a suitable patch wise reconstruction operator.For the adjoint estimator higher-order information in the primal problem is needed as well.For the two earlier approaches this can again be calculated by patch wise reconstruction.Alternatively, in the equal high order approach both problems can be discretized with high order elements.Then, only interpolation operators are needed but the whole solution becomes more expensive.Additionally, these approaches can be mixed by using different approaches for temporal and spatial discretization.From the resulting a posteriori error estimates, local error indicators are extracted to establish adaptive algorithms for both temporal and spatial mesh refinement.For verification, error reductions and effectivity indices are observed.Some preliminary results were published in the conference proceedings papers [66] (heat equation) and [67] (low Mach number combustion).Moreover, the successfull application to incompressible flow is documented in [54] and a summary of all developments will appear in [65].However, technical derivations and the theory have not been worked out therein.Moreover, the current work provides (for the first time) detailed computational comparisons of the joint and split error estimators in terms of effectivity indices as well as thorough investigations of the PU-DWR method in a space-time context.
The outline of this paper is as follows: In Section 2, the primal problem statements are provided including their space-time discretizations.Next, in Section 3, the dual-weighed residual method is recapitulated.Afterward in Section 4, partition-of-unity DWR space-time goal-oriented a posteriori error estimators are proposed.In Section 5, three numerical examples are studied in order to substantiate our algorithmic developments.We summarize our work in Section 6.

Space-time notation and problem formulations
In this section, we introduce notation and space-time spaces.Then, abstract forms on the continuous level, semi-discrete in time level, and fully discrete level are introduced.These are subsequently realized with specific problem statements, namely the heat equation and a combustion problem, respectively.The road map of our developments is first based on abstract derivations as we believe that with this knowledge our results can be more easily applied to other problem statements such as incompressible flow, e.g., as already done in [54], and further nonstationary, nonlinear, coupled problems.

Notation and spaces
The space time domain is denoted by Σ ⊂ R d+1 with Σ = {Ω(t) ⊂ R d : t ∈ I}, with the temporal interval I = (0, T ) and spatial domain Ω.In this paper, we consider time-independent domains Ω, resulting in Σ = Ω × I.
Here, Γ D ⊆ ∂Ω denotes the Dirichlet boundary with condition v(t) = 0 on Γ D .For inhomogeneous conditions i.e. v(t) = g(t) on Γ D the ansatz space is X(I, V ) + g, we will however limit derivations to the homogeneous case for the sake of brevity.On X(I, V ) we can use the L 2 (I, L 2 (Ω)) scalar product Since we want to use discontinuous Galerkin discretizations in time we have to define an infinite dimensional space X(T k , V (Ω)) that allows for jumps at the grid points of the temporal mesh T k .To obtain T k we decompose the temporal interval hold.Then, T k is the collection of all intervals from I 1 to I M .Following the nomenclature of [14] we can define the broken Bochner space For each local space the continuous embedding W (I m , V (Ω)) ⊂ C( Īm , H(Ω)) holds such that the limits from above and below, i.e. v ± m := lim ε→0 v(t m ± ε) are well defined.
Using these we can define the jump across temporal intervals as for all inner temporal grid points t m ∈ T k .Additionally, we introduce [v] 0 as a shorthand for the weakly imposed initial conditions, i.e. [v] Finally, let A : X × X → R be a semi-linear form representing the PDE (partial differential equation) in weak form, being nonlinear in the first argument and linear in the second argument.Let F : X → R be a linear form representing given right hand side data.Then, the abstract problem reads: Problem 2.1 (Abstract form on the continuous level).Find u ∈ X such that We provide strong forms and the respective weak forms in terms of (7) for the heat equation and a combustion problem in Section 2.3 and Section 2.4, respectively.

Discretization
In principle, the discretization steps are the same for all parabolic problems.Since we want to be able to have different trial functions in time and space, we split the discretization, starting with the temporal basis functions.

Semi-discretization in time
Given the temporal triangulation T k as defined above we can obtain the semidiscrete space by discretization of the temporal functions into piecewise polynomials of degree r ∈ N 0 : With A(u k )(φ) and F (φ) depending on the actual problem, the general time-discrete weak dG(r) formulations reads:

Fully discrete abstract problem
For the spatial discretization we use triangulations T m h of Ω, where m = 1, . . ., M indicates each temporal subinterval I m .These are decomposed into quadrilateral/hexagonal elements K and we use continuous test and trial functions of degree s resulting in V s h (T m h ) ⊂ V (Ω) defined as We notice that we can have different triangulations on different subintervals, resulting in timedependent or dynamic meshes.Using this, we can define the fully discrete function space : With these definitions, we obtain the fully discrete cG(s)dG(r) formulation: Specific realizations are provided in the following two subsections, Section 2.3 and Section 2.4, respectively, by setting

Heat equation
Having the abstract formulations on the continuous, semi-discrete (in time) and fully discrete levels at hand, we now proceed and provide two specific realizations.First, the heat equation is considered in this subsection and then a combustion problem is introduced in the next subsection.Let u : Σ → R be the solution of the heat equation for a given initial condition u 0 ∈ H and right hand side function f ∈ L 2 (0, T ; V * ).Using the discretization steps as described above, we obtain the following equations

Combustion
The following coupled nonlinear PDE describes the temperature dependent reaction and diffusion of a combustible substance without the influence of an additional fluid flow (v ≡ 0); see [36].Therefore, the low Mach number hypothesis holds and the fluid flow is not influenced by the reaction and can be ignored.The resulting equations for the dimensionless temperature θ : Σ → R and the species concentration Y : Σ → R are with the combustion reaction described by Arrhenius law The Arrhenius law is parametrized by the Lewis number Le > 0, the gas expansion α > 0 and the dimensionless activation energy β > 0. We want to be able to allow all three common types of boundary conditions i.e. those of Dirichlet, Neumann and Robin type.For this we split the boundary ∂Ω into three non-overlapping parts Γ D , Γ N and Γ R .The Dirichlet boundary conditions are built into the function spaces, which is the usual approach.The Neumann and Robin boundary conditions are given by It remains to state the initial conditions: By following the typical steps for the derivation of a weak formulation, integration by parts in space, and subsequent summation, we obtain By introducing jump terms as described earlier, we obtain the following semi-linear and linear forms, respectively: with u kh = (θ kh , Y kh ) and φ kh = (φ θ kh , φ Y kh ).

General formulation of parabolic problems
A general parabolic weak formulation that includes the previous problem statements can be stated by with an elliptic operator a(u, φ) := T 0 ā(u(t), φ(t))dt.Then, (non-)linearity of A solely depends on the (non-)linearity of ā.
Remark 2.4.We notice that additional terms due to Neumann or Robin boundary conditions would appear inside ā(u(t), φ(t)) and/or F (•).

Numerical solution
In the algorithmic realization, we notice that the choice of trial and test spaces allow for a decoupling of the temporal discretization into slabs, i.e. slices of the space-time cylinder.In the simplest case a slice just encompasses a single temporal interval, resulting in a sequential time-stepping scheme due to the dG(r) test functions, and therefore effectively yielding classical time discretization schemes.For dG(0) an implicit Euler-type scheme is recovered.At each time slab, the spatial problems are solved as described in the following.
For the basic implementation of the (linear) heat equation with a classical DWR error estimator, we refer to the dwr-diffusion package [35] and the solvers implemented therein.There, the sparse direct solver UMFPACK [13] is used for the linear equation systems.
For the nonlinear combustion equation we employ a classical Newton-type solver as briefly described in the following.In the space-time setting we have to solve where u kh is the complete space-time solution over all intervals.Given an initial guess u 0 kh , find the update δu ∈ X r,s kh (T k , T 1,...,M h ) of the linear defect-correction problem for j = 0, 1, 2, . . .
Remark 2.5.As we have discontinuous test functions the nonlinear problem can be decoupled into one subproblem per time interval.Then, we obtain a time-stepping scheme with one Newton loop per interval.
The defect-correction problems are solved using the parallel sparse direct solver MUMPS [1].The Jacobian A ′ (•)(•, •) is derived by using analytical expressions, e.g., [70,Chapter 13], in order to maintain superlinear (or quadratic) convergence.However, for j > 0 reassembly of the matrix is omitted if the relative reduction of the residual is below a certain threshold.This simplified Newton approach saves a lot of computational time as the factorization only needs to be done after a reassembly step.This approach also benefits from preconditioned Krylow methods as the preconditioner is only recomputed after reassembly.The step size α is is determined by a damping line-search after solving the defectcorrection problem.

Error representation and estimation
Let J : X → R be some goal functional representing some quantity of interest (QoI).The general form reads where T 1 , T 2 ∈ I, e.g., T 1 = 0 and T 2 = T and T is the end time value.We are interested in the discretization error J(u) − J(u kh ) and more specifically to minimize this error for a reasonable computational cost: min J(u) − J(u kh ) This becomes a constrained optimization problem since the solutions u ∈ X and u kh ∈ Xr,s k,h are obtained as PDE solutions from ( 7) and (11), respectively.These PDE problem statements are seen as constraint in terms of the optimization problem.Consequently, for a given goal functional J(u) we want to solve the following optimization problem min This problem setting is the same in [9][Section 2.2, (2.13)].Clearly, after the discretization, we can then measure the discretization error J(u) − J(u kh ).We notice that from an optimization viewpoint J(u kh ) is a constant in (29) and therefore implicitly contained therein modulo the constant shift.We apply the method of Lagrange multipliers for this constrained optimization problem (see e.g., [9]) and introduce the dual variable z ∈ X(I, V ).To account for the discontinuities in the primal problem we define a discontinuous Lagrange functional L(u, z) : Note that A(u)(z) and F (z) contain the jump terms and weakly imposed initial conditions.For stationary points (u, z) ∈ X(I, V ) × X(I, V ) all jump terms vanish and the initial conditions are met exactly, such that L(u, z) is consistent with the continuous functional L(u, z).The first order optimality conditions yield the original primal problem A(u)(φ) = F (φ) as well as the adjoint problem: Find z ∈ X(T k , V ) such that Remark 3.1.Note that the test and trial functions are switched in (31).Accordingly, the temporal derivative is now applied to the test function ψ.To rectify this, we apply integration by parts to the corresponding scalar product, obtaining: The negative sign of the temporal derivative means the adjoint problem has to be solved backwards in time, with an initial value z M depending on the goal functional.
For functionals evaluated on the whole temporal domain z M = 0 holds and for functionals defined only at T J ′ u (u(T ), ψ(T )) can be reinterpreted as an initial value z M for details see e.g.[43].
For linear PDEs and linear goal functionals we obtain the exact error representations (see [9]): In the following, we focus on the primal error representation.As we can see we would need both the exact dual solution z and the discrete dual solution z kh .As this is infeasible for complicated problems we use a discrete solution of higher order for z.In the equal low order approach both primal and adjoint problem are discretized using the same low order elements.The higher order adjoint solution is then obtained by a patch wise reconstruction.This reconstruction is described in detail in [57].In the mixed order approach the adjoint problem is discretized by higher order elements and the solution is used as the representation of the exact solution.The fully discrete solution is then obtained by interpolation into the lower order space.It is also possible to use different approaches in time and space e.g.discretizing the primal problem with cG(1)dG(0) and the adjoint problem with cG(2)dG(0).Additionally, (33) can be split into a temporal and a spatial part by introducing the semidiscrete adjoint solution z k such that where temporal and spatial errors are given by, respectively, That way (36) can be used for temporal refinement and ( 37) for spatial refinement.

Adjoint problem statements
For A gen. the left hand side of the adjoint problem (31) in the space-time context explicitly reads as As an example, for the combustion problem described in the Section 2.4 we obtain F = F comb. as well as the operator of the semi-linear form and its directional derivative, respectively, Therein, ω ′ (u)(ψ) is the directional derivative of ω(u) into the direction ψ.

Goal-oriented error representations
As there are two ways to separate the full estimator into parts, we want to use a clear and precise terminology to distinguish between those two.Firstly, we can separate by the problem residuals we compute.This gives the primal and adjoint/dual estimators.Oftentimes, the average of those two parts is also called mixed estimator instead of full estimator.
Secondly, we can split the difference between the exact and the fully discrete solution by introducing the time-discrete solution.Not doing so we will call the resulting estimator the joint estimator.If we calculate both the temporal and spatial error estimator, we will call the sum of both the split error estimator.As both seperations can be done simultaneously, combined expressions like split primal estimator or joint dual estimator are possible.
For nonlinear problems we obtain the following error representation [9]: Theorem 3.2 (Joint error identity).Let the primal problem and adjoint problem be given.Let ).Then, we have the space-time joint error identity with the primal error estimator ρ and the adjoint error estimator , as well as a remainder term R kh of higher order.
) and J(u kh ) = L(u kh , z kh ) the assumptions of [57][Proposition 3.1] hold, proving the representation.Theorem 3.3 (Split error identity).With the previous assumptions, we have the split error identity Proof.The proof follows the same ideas as before but with

Error estimators
From the previous error identities, we obtain error estimators in four variants.First, we have the full error estimator However, the unknown solutions u and z still enter.This is already an error estimator, because for cases where u and z are known, we can already estimate (discretization) errors in goal functionals.Of course, for most problems in practice, this first version does not play a role.
To this end, higher-order approximations u ∈ X and z ∈ X are introduced [9].Examples of such approximations are u := u r+1,s+1 kh and z := z r+1,s+1 kh such that we obtain the computable error estimator If the remainder term is omitted (which is indeed usually done in practice) we obtain the practical error estimator Finally, we also introduce the primal-based error estimator: As we also need higher order information of the primal problem for ( 43) and ( 44) to calculate the adjoint estimator we now have three possible approaches.In addition to the two previous discretization approaches the equal high order approach uses a higher order element discretization for both the primal and adjoint problem.Then, interpolation into a lower order element space yields both u kh and z kh .However, inserting the interpolated u kh into the goal functional yields worse results compared to a native low order solution, so ideally the primal problem should also be solved in low order to calculate the functional values.Various algorithmic realizations with corresponding theoretical results, and performance analyses for stationary problems were recently established in [19].

Error localization
In this section, we address our key development, namely the construction of a space-time partitionof-unity (PU) localization of goal-oriented a posteriori error estimators.In the published literature as mentioned in the introduction, so far only stationary cases have been addressed with the PU localization.Here, we first extend the idea to time-dependent problems.Since we want to use the DWR error estimator for grid refinement, we need to split the estimator into element-or DoF-wise error contributions.Three known approaches are the classical integration by parts [9,5], a variational filtering operator over patches of elements [11] and a variational partition-of-unity localization [53].For stationary problems, the effectivity of these localizations was established and numerically substantiated in [53].First, in Section 4.1, we exemplarily derive the variational partition of unity approach for our spacetime estimator of the heat equation.Second, Section 4.3 focuses on details of the actual evaluation including the needed interpolation operations.Next, we list in Section 4.4 the resulting error indicators, finally followed by the adaptive algorithms designed in Section 4.5.

The partition-of-unity approach for the heat equation
In this key section, we extend the ideas from [53] and apply a partition-of-unity (PU) localization to a space-time error estimator.To this end, we first design the PU space.The simplest choice is k,h , i.e. a cG(1)dG(0) discretization.Effectively, this yields one spatial partition of unity (χ i,m ) As this is a Lagrangian finite element, we have immediately Proof.Follows immediately from the properties of the finite element functions.
Remark 4.2.Another common choice for the PU space is V P U = X 1,1 kh , i.e. a cG(1)cG(1) discretization, for example in native d + 1-dimensional discretizations [17].In general this ensures a coupling between neighboring temporal elements to address the problem shown in [12].However, for discontinuous Galerkin discretizations the dominating edge residuals, i.e. jump terms, are explicitly included in the estimator.Remark 4.3.Clearly, using a dG discretization in time yields a natural decoupling for which the space-time PU reduces effectively to a PU in space.Nonetheless, we formulated our concepts using a space-time PU as the methodology applies (see again [17]) to a larger class of problems.As our work is one of the first into this direction in the literature, our aim is to provide the full methodology in order to have a starting point for further future work.
In the following, ũ ∈ X and z ∈ X denote approximations of the exact solutions.In principle the joint and split estimators only differ in the interpolation difference i.e z − z kh or z − z k and z k − z kh respectively.
For the joint estimator the local contributions are summed over all DoFs of a fixed interval to obtain the corresponding temporal estimator.Subsequent summation over all time intervals yields the global estimator.For the split estimators the spatial estimator is summed over all space-time DoFs and the temporal estimator is summed over all time intervals.The total error estimator is then the sum of these two error parts.Proposition 4.4 (Primal joint error estimator for the heat equation).For the space-time formulation of the heat equation, we have the following a posteriori joint error estimator with partition-of-unity localization: with the error indicators Proof.We start from Theorem 3.2 with which yields Considering the primal part only (see (45)), gives us Inserting the PU (46) yields Then, employing the definition of the primal residual leads to Here, we employ the left hand side and right hand side of the heat equation, namely ( 13) and ( 14), respectively, by replacing the test function φ kh by the PU-weighted adjoint sensitivity measure (z − z kh )χ i,m .Finally, the (unknown) solution z is approximated by some higher order representation z from which we obtain the assertion.Definition 4.5 (Effectivity and indicator indices).We notice that the effecitivity index is defined as Applying the triangle inequality on η joint yields a more strict criterion, i.e., here from which the so-called indicator index can be defined.
Remark 4.6.Note that we only use the temporal part for marking time steps and calculating the global estimator.Since the indicators for each time step are obtained by summing over all elements in said time step the spatial PU χ i,m effectively cancels due to the PU property.As a consequence, the spatial PU can be omitted directly in the computation of the temporal indicators.
Proposition 4.7 (Primal split error estimator for the heat equation).For the space-time formulation of the heat equation, we have the following a posteriori split error estimator with partition-of-unity localization: with the temporal error indicators and the spatial error indicators Proof.For the primal split error estimator we start with Theorem (3.3).Then, we proceed for both parts with the primal estimator: Combining the left hand sides to the full error estimator and utilizing the PU in a similar way as the proof of Proposition (4.4) yields Here, the error indicators are obtained as in the proof of Proposition (4.4), which yields the assertion.
Remark 4.8.Note that the basis functions are globally defined, so the DoF-wise errors contain an implicit sum over all elements in practice.However, calculating element based estimators by constraining the spatial integrals to each element K yields the unlocalized estimator, as the sum over all χ i,m on a single element is 1 and effectively cancels out.To use well-known element based marking strategies the DoF-estimators have to be calculated globally.Afterwards element wise estimators can be calculated by summing all estimators belonging to the DoFs of the corresponding element: where • stands for h or kh.
Proposition 4.9 (Adjoint joint error estimator for the heat equation).For the space-time formulation of the heat equation, we have the following a posteriori joint error estimator with partition-of-unity localization: with the error indicators Proof.We start from Theorem 3.2 with which yields Considering the adjoint part only, gives us Utilizing the PU as in the proof of Proposition (4.4) and the respective definition of J in (28) and the adjoint A ′ u of the heat equation by again approximating u by some higher-order approximation ũ yields the assertion.Proposition 4.10 (Adjoint split error estimator for the heat equation).For the space-time formulation of the heat equation, we have the following a posteriori split error estimator with partition-of-unity localization: with the temporal error indicators and the spatial error indicators Proof.The proof starts as in Proposition 4.7, but now taking the adjoint residual.The rest is then a combination of the previous three propositions and follows conceptionally the same lines.
Remark 4.11.We emphasize that in this work, we estimate discretization errors only.The spacetime extension of stationary PU-DWR versions such as [44,52,18] to linear or nonlinear iteration errors in our current space-time setting is part of future work.

The partition-of-unity approach for the combustion problem
In this section, we state the error estimator of the combustion problem: Proposition 4.12 (Primal split error estimator for combustion).Let us assume homogeneous boundary conditions on Γ N as well as for the species concentration on Γ R .For the temperature we have the cooling conditions κθ + ∂ n θ = 0 on the Robin boundary, such that Then, we have the following a posteriori primal split error estimator with partition-of-unity localization for the space-time formulation of the time-dependent combustion problem: with the temporal error indicators and the spatial error indicators Proof.We start as in Proposition 4.7 and employ for the primal residual and the right hand side the weak form of the combustion problem, i.e., (24) and (25), respectively.
Proposition 4.13 (Adjoint split error estimator for combustion).Using the same boundary conditions, we have the following a posteriori adjoint split error estimator with partition-of-unity localization for the space-time formulation of the time-dependent combustion problem: with the temporal error indicators , and the spatial indicators Proof.We start as in Proposition 4.10 and the respective definition of J in (28).In contrast, the adjoint A ′ u is now derived from the combustion system by approximating u = (θ, Y ) by some higherorder approximation ũ = ( θ, Ỹ ) yields the assertion.

Evaluation of the space-time PU-DWR
For the practical evaluation we need to properly define the interpolation differences.Depending on the approach, we need interpolations from a high order space into a low order space and reconstructions the other way around.In space we denote them as : X r,s k,h → X r+1,s k,h .In the following, we take a closer look at the interpolation difference for a higher order solution as used in the mixed and equal high order approach, i.e. the interpolations i .After that we write down the resulting localized error estimators for each PU-DoF.For good visual representations of the high order reconstructions based on a low order solution see [53] and [57] for the spatial part i respectively.For our visualization the high order space X 1,2 k,h and the low order space X 0,1 k,h are used.Then, u k and z k are elements of X 0,2 k,h .Furthermore, we notice that in this specific case of piecewise constant discrete solutions the identities u − k (t m ) = u + k (t m−1 ) and u − kh (t m ) = u + kh (t m−1 ) hold.With this, we obtain the following operators: Definition 4.14 (Temporal interpolation operators for r = 0).The interpolation operator from piecewise linear elements to piecewise constant elements reads as The reconstruction of the high order solution, i. e. the other way around, is obtained by linear interpolation For the spatial interpolation operator i 2 h we use a linear finite element ansatz with the vertex DoFs of the spatial triangulation.Using i 2 h on the temporally interpolated solution i k z yields i kh z and vice versa.To illustrate this, Figure 1 shows the different interpolation levels for a single 1+1D finite element.
For z − kh (t m ) we have the same interpolations on ẑ as for u − kh (t m ) on û.
For these finite element spaces we can use the midpoint rule with t 0 = (t m+1 −t m )/2 for all temporal integrals when the resulting terms are linear in time.For temporal nonlinearities and higher-order right hand side functions f , higher-order quadrature rules, usually Gauss quadratures, have to be used.Remark 4.16 (Reconstructions for the adjoint estimator).For the adjoint estimator we additionally need u k and u.The semi-discrete u k can be obtained the same way as z k , but for u we need to change the interpolation direction, which results in Remark 4.17.Finally, we comment on the treatment of pointwise evaluations such as in Proposition 4.10 and Proposition 4.13 in the terms , respectively.As an example, we explain more details for the heat equation.Due to the reverse construction into a higher order space using i ) for which we deal with a jump at t m , which in general is not identically equal to zero.Moreover, we note that these jump terms include a spatial integration which is done by Gauss-Legendre quadrature, i.e. with quadrature points that do not lie on the boundaries of the spatial elements.Therefore, and corresponding terms will in general be nonzero as well, irrespective of the spatial interpolation.Conversely, if the difference is zero then the low order solution is an exact representation of the high order solution such that no refinement is needed.

Error indicators in space and time
With the previous evaluations, we can now define the respective indicators in space and time for the heat equation and the combustion problem.Note that the temporal derivative ∂ t u kh vanishes for u kh ∈ X 0,s k,h , i.e. piecewise constant elements in time.

Natural PU cG(1)dG(0)
Employing the previously introduced PU, namely cG(1)dG(0) yields to the following results for the error indicators.
Proposition 4.18 (Joint primal error indicator for the heat equation).We have the following joint error indicator for the heat equation and Proof.Starting from Proposition 4.7 and the representations of η m k and η i,m h , we evaluate the temporal integrals and employ the interpolation operators derived in Section 4.3 and obtain the error indicators η m k,heat and η i,m h,heat .
Using the same interpolations we obtain Proposition 4.20 (Split primal error indicators for combustion).For the combustion problem we have the following primal error indicators Proof.Starting from Proposition 4.12 and the representations of η m k and η i,m h , we evaluate the temporal integrals and employ the interpolation operators derived in Section 4.3 and obtain the error indicators η m k,combustion and η i,m h,combustion .

Remark 4.21 (Identity of the indicator variants). Since
such that the choice between the two indicator variants is only important for adaptive refinement.If one is only interested in estimating the error then both variants are identical.

Alternative PU cG(1)cG(1)
To investigate the impact of the choice of the PU space we derive the split primal indicators for the heat equation based on a cG(1)cG(1) PU.Now, the right hand side integration might need even higher order quadrature rules.Apart from that, the highest temporal order in the temporal indicators is quadratic (constant primal solution, linear adjoint solution and PU) so we apply Simpsons rule instead.Additionally, we obtain two sets of spatial and temporal indicators per interval I m , i.e.
The temporal indicators for refinement of I m are then obtained by For the spatial element indicators we first have to interpolate the indicator vectors (η Employing these derivations for the new choice of the PU, Simpson's rule for quadrature in time, and then proceeding as in Section 4.4.1, we obtain the following results. and as well as and For the general adjoint estimator we also need z kh and u k from I m+1 which can be obtained by the interpolations described in ( 63) and ( 64) respectively.Additionally we only look at goal functionals of the types which are essentially interchangeable in the following formulas.Therefore, we only write down the indicators for J 1 .
Proposition 4.23 (Joint adjoint error indicator for the heat equation).We have the following joint error indicator for the heat equation and All functionals we want to examine for the combustion equation are only dependent on u kh and the primal weight, which is at most linear in time.Therefore, we can simplify the estimator by also applying the midpoint rule to the functional.

Adaptive algorithm
There are multiple options for solving the primal and adjoint problems, i.e. time-stepping, timeslabbing and fully simultaneous space-time, which all need adjustments to marking and refinement.However, all simulations shown here are time-stepping based so we limit ourselves to this approach.Again, we note that in a space-time context the adjoint problem runs backward in time.Then, all information is collected to evaluate the error estimators.Since one of the error components might dominate, we employ an equilibration for time-stepping as proposed in [57].This will sometimes restrict refinement to space or time.The overall procedure follows the typical loop: solve, estimate, mark, and refine.
For error estimation in time-stepping we obtain Algorithm 1. There, the main choice is in whether to use the split or joint estimators.Note that in the case of the joint estimator, the indicators η m kh and η m, * kh are calculated by summation of the spatial indicators.Having calculated the estimators we mark and refine elements by following Algorithm 2. There, the equilibration is done by first calculating the global estimators.Note that these coincide in the joint case, such that no equilibration is performed.with η s/ s kh as the general estimator for ûkh ∈ X 0,s k,h and ẑkh ∈ X 0, s k,h as defined in Propositions 4.19 and 4.24 for the primal and adjoint estimators respectively.All estimators are computed using Algorithm 1.
Table 1 shows that all approaches yield effectivity indices very close to 1.We also notice, that the biggest difference between primal and adjoint estimator is obtained for the mixed order approach.However, this is not surprising as the approach is tailored to the primal estimator and calculating the adjoint estimator could be seen as questionable for the two following reasons.When interpolating the higher order solution for the adjoint problem for z kh we do not obtain the optimal z kh compared to solving with bilinear elements directly.Additionally, reconstructing the higher order primal solution yields a worse approximation compared to solving directly with biquadratic finite elements.Furthermore, we notice that we use a lot more elements in time than in space.In space-time the estimator is dependent on a good balance between space and time discretization.For the sake of brevity we investigate this further in the following configuration as the solution is much more interesting.

Problem statement
This test case was designed in [29].We again solve the heat equation.The manufactured solution is a rotating hill on a unit-square spatial domain Ω = (0, 1) 2 in the time interval (0, T ), T = 1.The manufactured solution is given as The right hand side of the problem is obtained as in Section 5.1 by inserting this solution into the heat equation.Additionally, all definitions for η s/ s kh and I s/ s eff are the same using again Algorithm 1 to compute the indicators.

Goal functional
Since we are interested in capturing the local behaviour of the solution, we choose the L 2 -error as functional of interest, i.e.

Comparison of the different spatial approaches
We start by looking at the exact error and the resulting estimators for different choices of M initial .We see that the estimators in Tables 2, 3, 4 and 5 are converging to relatively stable values with rising M initial .We can also see that since the temporal elements are only doubled while the spatial elements are quadrupled, the initial number of temporal elements has to be large enough for uniform refinement.In adaptive simulations we can control this better as we can choose different fractions of spatial and temporal elements to be marked for refinement.
The equal low order estimators (Table 3) are underestimating the error with an effectivity of roughly 0.34 − 0.40 for the coarsest mesh.This gets much better after the two refinements where the estimator is close to the exact L 2 -error.However, as the reconstruction effectively works on an even coarser mesh this is not surprising.
The mixed order estimators (Table 4) are less dependent on the spatial mesh size but overestimate the error with an effectivity of 1.20 − 1.48.We can also see that the overestimation gets smaller with rising M initial but this is of course an additional cost factor.
The equal high order estimators (Table 5) are also less dependent on the spatial mesh size, but they also use double the amount of degrees of freedom for the primal problem.They also overestimate the error but for large enough M initial the effectivity is less than 1.10.However, Table 6 shows that solving the primal problem with biquadratic elements and using a bilinear interpolation between the vertices does not recover the best approximation u kh .In practice this means that the primal problem should be solved natively with bilinear elements to obtain the solution for which the error is actually estimated, which leads to additional costs.This gets especially expensive for nonlinear problems.Note that in this particular case the resulting adjoint solution and consequently the estimators would also be different as the L 2 error itself factors into J ′ u , so the accuracy of the estimator could be better.In conclusion, both reconstructing a higher order solution and natively solving the primal or adjoint problem with higher order elements in space work well for linear problems, but the reconstruction is cheaper and leads to a better estimator for fine enough meshes.How the low and mixed order approach perform for higher order finite elements and corresponding errors could be subject of further studies.

Comparison of adaptive refinement to the original computations
For this configuration we want to compare our results to those described by Hartmann [29], where the manufactured solution was first formulated.To our knowledge this configuration was not reproduced and published so far, so the original thesis is the only point of comparison.There, the classical estimator is used, which is obtained by partial integration to obtain a strong form with jump terms in space.There, Q 1 elements in space and dG(0) elements in time are used as well, but it is unknown which quadrature formula was used in time for the nonlinear f .We used the right box rule as this is what corresponds to the implicit Euler scheme and got our error (1.92e − 02) closest to the error of the original results (1.75e − 02).Table 7 shows our results with the split and joint estimators.These perform very well in comparison to the original Hartmann results from [29,Table 3.4].
Even though the marking strategy used by Hartmann is unknown, we got close to the number of temporal elements and the maximum number of spatial elements with fixed rate marking in Algorithm 2. In comparison, our estimators better localize the error as we get comparable errors with one less loop that additionally has a smaller maximum number of spatial elements.This can be seen in Figure 2, where the original results are only performing roughly as well as our computation with uniform refinement, while both PU-DWR estimators yield better convergence.We plotted the L 2 error against M * N max which is an upper bound for the actual number of space-time elements as no further information was available from the original computations.However, Table 8 shows that at least for our simulations the actual number is not too far from the upper bound.Additionally, we can see that, as expected, the split estimator outperforms the joint estimator.Finally, Figure 3 shows that the local refinement nicely matches the corresponding solution of Figure 4 and that the meshes are indeed changing over time.Here, we want to examine whether the additional coupling with a cG(1) partition-of-unity in time is beneficial for adaptive refinement.For this, we performed multiple adaptive simulations with the corresponding low, mixed and high order estimators.Figures 5, 6 and 7 show the exemplary results with an initial temporal mesh of 1600 elements with fixed rate marking of 60% in time and 40% in space.In all three cases the dG(0) PU performs better than the cG(1) PU, with the lowest difference for the equal high order estimator.For the equal high order case the L 2 -error is calculated on the interpolated primal solution, which does not recover the actual best approximation solution of directly solving the primal solution in the low order space, such that the initial error is higher than for uniform refinement.Finally, Figure 8 shows the results for the mixed order estimator with a finer initial temporal mesh, which are qualitatively the same as for Figure 6.
Overall we conclude that for discontinuous Galerkin discretizations in time the dG(0) partition-of-unity is completely sufficient in addition to being cheaper to compute and easier to implement.The final test case is as described in [57] (originally based on [36]) and some preliminary results were published in our prior work [67].Here, we solve the nonlinear combustion equations described in Section 2.4.

Parameters
The reaction parameters are a Lewis number of Le = 1, a gas expension of α = 0.8 and a dimensionless energy of β = 10.

Goal Functionals
The first functional we investigate is the average reaction rate i.e.
This nonlinear functional is defined on the whole space-time domain.For the second functional we calculate the average species concentration on the cooled rods Γ R i.e.
This is a linear functional, but it is only defined on part of the boundary.

Discussion of findings for J 1
For both functionals the indicators for η s/ s kh are computed using Propositions 4.20 and 4.25 and Algorithm 1 with u, u k , u kh and z, z k , z kh following from ( 63) and ( 64) with ûkh ∈ X 0,s k,h and ẑkh ∈ X 0, s k,h .Then, the full estimators are computed by taking the averages of the primal and adjoint indicators at each space-time Dof and summing over all DoFs.
Tables 9, 10 and 11 show the behaviour of the primal, adjoint and full estimators for J 1 respectively.We can see that all estimators with the mixed order and equal high order approach behave similarly.For both, the adjoint estimator and the resulting full estimator are overestimating the error by about two orders of magnitude.However, the primal estimators are not too far off.The equal low order approach yields the best results on the finest level, and the primal and adjoint estimators are comparable.It can be inferred that there is no benefit from calculating the full estimator in this case.Therefore, we use Algorithm 2 with fixed rate marking refining 50% of all temporal elements and on each interval 30% of all spatial elements based only on the primal indicators.
Figure 9 shows the error convergence under adaptive refinement when using only the primal estimator.When only counting the primal unknowns for both estimators, our findings for equal low order and mixed order are comparable, and both perform better than global refinement.However, when both the number of Dofs of the adjoint and the PU are additionally taken into account the low order approach clearly outperforms the mixed order approach.We can also see that the starting disadvantage of having the same error on the coarsest mesh with more DoFs is rectified by the first adaptive refinement step.Figure 10 shows the reaction rate and the corresponding grids for two different time points.We can see that the grid evolves nicely and follows the combustion reaction.This shows that our localization works well in capturing the physics and refining accordingly.Tables 12, 13 and 14 show the behaviour of the primal, adjoint and full estimators for J 2 respectively.The equal low order approach shows a similar behaviour to the previous functional.However, the overestimation of the adjoint estimators for the mixed order and high order approach is not as bad as for the nonlinear functional.On the other hand their respective primal estimators are underestimating the error by quite a bit.In the full estimators these over-and underestimations are cancelling out quite nicely such that this estimator would be more useful here.For this reason the simulations for Figure 11 were done by using the full estimator for adaptivity in Algorithm 2 with the same marking strategy as before.As with the first functional we see that both approaches yield comparable results when only counting the memory cost for solving the primal problem.We also see that both approaches eventually outperform uniform refinement when taking the total cost into account.But the equal low order approach is again and still the best choice.Figure 12 shows the species concentration and the refined grids based on J 2 at different time steps.We see that along the cooled rods the grid again follows the combustion reaction.As that is the area where we have changes in the concentration this fits well.We also see that the mesh is refined around the cooled rod once the reaction moved past them.Together with the convergence behaviour we see again that the novel localization works well.

Conclusions
In this work, we proposed partition-of-unity (PU) dual-weighted residual a posteriori error estimators and space-time adaptivity for linear and nonlinear partial differential equations.From the algorithmic side, the main novelties are the extension of the PU localization to space-time Galerkin finite element discretizations and the realization of split and joint error estimators.From the implementation side, despite starting from pre-implementations in the DTM package dwr-diffusion [35] and deal.II [3], extensive code developments and debugging was necessary, which greatly exceed existing implementations, specifically for the nonlinear features such as the nonlinear combustion PDE as well as nonlinear goal functionals.In three numerical examples, we studied in the detail the computational performance for the linear heat equation and also for a nonlinear low Mach number combustion problem.We also found that the equal low order approach yielded the best estimation and adaptive performance across the board and that a cG(1)dG(0) PU is sufficient for cG(s)dG(r) discretizations of the primal problem.Furthermore, an example of an immediate practical application of our framework can be found within the excellence cluster PhoenixD1 in which space-time methods and goal-oriented error estimation are of interest for the efficient solution of multiphysics problems and where the heat equation and the Navier-Stokes equations are needed.

Figure 1 :
Figure 1: Different interpolation levels on a single 1+1D space-time element ) with the time step size k m = t m − t m−1 .Proof.Starting from Proposition 4.4 and the representations of η i,m kh , we evaluate the temporal integrals and employ the interpolation operators derived in Section 4.3 and obtain the joint error indicator η i,m kh,heat .Accordingly, we have Proposition 4.19 (Split primal error indicators for the heat equation).The split indicators η m k,heat and η i,m h,heat for the heat equation are given by

Proposition 4 . 22 (
Split primal error indicators for the heat equation with cG(1)cG(1) PU).The split indicators for the heat equation are given by

Proposition 4 . 24 (
Split adjoint error indicators for the heat equation).The split indicators η m, * k and η m, * h,i for the heat equation are given by

Proposition 4 . 25 (
Split adjoint error indicators for combustion).For the combustion problem we have the following adjoint error indicators

Algorithm 1
ESTIMATE on a single interval I m , i.e. time-stepping based Require: ûkh on I m−1 and I m and ẑkh on I m−1 , I m and I m+1 interpolate/reconstruct ũ, u k , u kh as well as z, z k , z kh at quadrature points.Calculate η m ⋆ and η m, * ⋆ , where ⋆ denotes k or kh Calculate η i,m • and η i,m, * • for each PU-DoF i on T m h , where, again, • denotes h or kh

Algorithm 2 )
MARK and REFINE for the time-stepping approach Require: indicators on each interval I m and equilibration factor c > 0 Calculate global temporal estimator η k = if |η k | * c ≥ |η h | then mark I m for temporal refinement based on chosen strategy end if if |η h | * c ≥ |η k | then for m = 1, . . ., M do mark and refine elements in T m h based on chosen strategy end for end if for m = 1, . . ., M do if I m is marked then Split/Refine I m into two intervals with (possibly new) mesh T m h end if end for

Figure 6 :Figure 7 :Figure 8 :
Figure 6: Section 5.2: Error convergence of the Hartmann testcase for η 1/2 and M init = 1600 1e−04 3e−04 1e−03 3e−03 reaction is simulated in a rectangular channel of length L = 60 and height H = 16 in which two cooled rods of length L/4 and height H/4 are inserted into both channel walls at L/4.The reaction is solved for a total of T = 60 with 256 time and 896 space DoFs initially.The cooling of Γ R is described by the Robin boundary condition ∂ n θ = −0.1θ,with homogeneous Neumann conditions for the species concentration.The left wall Γ D is kept at a constant temperature of θ D = 1 without any combustible species Y D = 0.All other walls Γ N are described by homogeneous Neumann boundary conditions.An initial flame front is described by

Figure 9 :
Figure 9: Section 5.3: Error convergence for the reaction rate functional.On the left only the number of unknowns for the primal problem and on the right all unknowns are taken into account.

Figure 11 :
Figure 11: Section 5.3: Error convergence for the species concentration functional.On the left only the number of unknowns for the primal problem and on the right all unknowns are taken into account.

Table 1 :
Section 5.1: Performance of the primal (left) and adjoint (right) error estimators under global refinement for temporal dG(0) discretization of the adjoint equation.Averaged solution functional.

Table 2 :
Section 5.2: The exact L 2 error under global refinement for different initial (uniform) temporal grids and bilinear finite elements in space.

Table 5 :
Section 5.2:The primal equal high order error estimator η

Table 6 :
Section 5.2: The approximated L 2 error under global refinement for different initial (uniform) temporal grids and biquadratic finite elements in space, where the solution is interpolated down to bilinear elements.

Table 7 :
Section 5.2: Our results with the equal low order primal split (left) and joint (right) PU-DWR estimator with fixed rate marking of 95% in time and 40% in space.

Table 8 :
Section 5.2: Comparison of actual number of space time elements and estimation by M * N max loop M * N max (split) #elements(split) M * N max (joint) #elements(joint)

Table 9 :
Section 5.3: Performance of the primal error estimators under global refinement for J 1