1 Introduction

Space-time methods for solving differential equations with Galerkin-type finite element methods go back to [50] and a recent state-of-the-art summary was compiled in [40]. Space-time methods can be divided into two categories: the numerical solution and error estimation. In this work, we are primarily interested in the latter. After the previously mentioned early work, various problems have been considered in space-time formulations such as incompressible flow [61], first-order hyperbolic systems [15], the elastic wave equation [31, 32, 34], visco-acoustic/visco-elastic wave equations [16], financial mathematics [28], the Biot equations in poroelasticity [6], and fluid–structure interaction [26, 30, 60, 62, 63, 64]. Advancements for the numerical solution by means of space-time methods, such as multigrid methods, were undertaken in [27, 49, 55] and with space-time domain decomposition in [58].

Classical norm-based a posteriori error estimation was done for parabolic problems in [21, 22, 37, 39, 59, 68]. Goal-oriented error estimation of space-time problems was performed in [10, 43, 56, 57]. Therein, space-time formulations may serve three purposes: spatial error estimation, temporal error estimation [26, 28, 45, 46] or both simultaneously [4, 10, 15, 16, 56, 57]. Decoupling space and time for rate-dependent problems in elasto-plasticity was considered in [51]. Moreover, we mention space-time developments in PDE-based optimization with and without a posteriori error control [7, 24, 25, 33, 38, 41, 42, 43, 47, 48]. A brief review of space-time concepts in fluid–structure interaction for deriving the adjoint in goal-oriented error estimation and optimization was given in [69].

Employing the dual-weighted residual method [8, 9] for space-time goal-oriented error estimation comes with the challenge that the adjoint problem runs backwards in time. For nonlinear problems, this means that the primal solution must be available at the respective time points. This can be done by simply storing all primal solutions in RAM (random access memory) or on hard disk, or by using checkpointing techniques [43, 56].

The main objective in this work is to combine space-time concepts from [57] with an easy-to-implement partition-of-unity localization proposed in [53]. The latter was established for stationary problems, which is extended in this study to space-time error estimation. Two error estimators are proposed: joint and split. Because of Galerkin orthogonality, we need higher-order information in the adjoint problem for calculation of the primal residual estimator. There are different ways to achieve this. For stationary problems a mixed order approach is often used. There, we discretize the primal problem with the low order cG(s)dG(r) elements and the adjoint problem with high order \(cG(s+1)dG(r+1)\) elements. The notation was proposed in [20, 23] and means that spatial discretization is based on cG(s) or \(cG(s+1)\) continuous Galerkin finite elements respectively, where \(s\in {\mathbb {N}}\) indicates the polynomial degree, while temporal discretization is based on dG(r), discontinuous Galerkin finite elements, where \(r\in {\mathbb {N}}_0\) indicates the temporal polynomial degree.

This approach needs interpolation operators to calculate the low order adjoint solution. In the equal low order approach both problems are discretized with low order elements and the high order solutions are obtained by a suitable patch-wise reconstruction operator. For the adjoint estimator higher-order information in the primal problem is needed as well. For the two earlier approaches this can again be calculated by patch-wise reconstruction. Alternatively, in the equal high order approach both problems can be discretized with high order elements. Then, only interpolation operators are needed, but the overall solution process becomes more expensive. Additionally, these approaches can be mixed by using different approaches for temporal and spatial discretization. From the resulting a posteriori error estimates, local error indicators are extracted to establish adaptive algorithms for both temporal and spatial mesh refinement. For verification, error reductions and effectivity indices are observed. Some preliminary results were published in the conference proceedings papers [66] (heat equation) and [67] (low Mach number combustion). Moreover, the successful application to incompressible flow is documented in [54] and a summary of all developments appeared in [65]. However, technical derivations and the theory have not been worked out therein. Moreover, the current work provides (for the first time) detailed computational comparisons of the joint and split error estimators in terms of effectivity indices as well as thorough investigations of the PU-DWR method in a space-time context.

The outline of this paper is as follows: In Sect. 2, the primal problem statements are provided including their space-time discretizations. Next, in Sect. 3, the dual-weighted residual method is recapitulated. Afterward, in Sect. 4, partition-of-unity DWR space-time goal-oriented a posteriori error estimators are proposed. In Sect. 5, three numerical examples are studied in order to substantiate our algorithmic developments. We summarize our work in Sect. 6.

2 Space-Time Notation and Problem Formulations

In this section, we introduce notation and space-time spaces. Then, abstract forms on the continuous level, semi-discrete in time level, and fully discrete level are introduced. These are subsequently realized with specific problem statements, namely the heat equation and a combustion problem, respectively. The road map of our developments is first based on abstract derivations as we believe that with this knowledge our results can be more easily applied to other problem statements such as incompressible flow, e.g., as already done in [54], and further nonstationary, nonlinear, coupled problems.

2.1 Notation and Spaces

The space-time domain is denoted by \(\Sigma \subset {\mathbb {R}}^{d+1}\) with \(\Sigma = \{\Omega (t)\subset {\mathbb {R}}^d:t \in I \}\), with the temporal interval \(I=(0,T)\) and spatial domain \(\Omega \). In this paper, we consider time-independent domains \(\Omega \), resulting in \(\Sigma = \Omega \times I\).

Using \(V=V(\Omega )= H^1_D(\Omega ) = \{v\in H^1(\Omega ); v|_{\Gamma _D}=0\} \) and \(H = H(\Omega )= L^2(\Omega )\) we can define our space-time Hilbert space as

$$\begin{aligned} X {:}{=}X(I,V){:}{=}W(I,V(\Omega )){:}{=}\{v:v\in L^2(I,V(\Omega ))\text { and }\partial _t v\in L^2(I,V^*(\Omega )) \} \end{aligned}$$
(1)

where \(L^2(I,V(\Omega ))\) is the Bochner space of \(L^2\) functions over I with values in \(V(\Omega )\).

Here, \(\Gamma _D\subseteq \partial \Omega \) denotes the Dirichlet boundary with condition \(v(t)=0\text { on }\Gamma _D\). For inhomogeneous conditions, i.e. \(v(t)=g(t)\text { on }\Gamma _D\), the ansatz space is \(X(I,V)+g\); we will, however, limit derivations to the homogeneous case for the sake of brevity. On \(X(I,V)\) we can use the \(L^2(I,L^2(\Omega ))\) scalar product

$$\begin{aligned} (u,v){:}{=}(u,v)_{L^2(I,H)} = \int \limits _0^T (u(t),v(t))_{H} \textrm{d}t. \end{aligned}$$
(2)

Since we want to use discontinuous Galerkin discretizations in time we have to define an infinite dimensional space \({\widetilde{X}}({\mathcal {T}}_k,V(\Omega ))\) that allows for jumps at the grid points of the temporal mesh \({\mathcal {T}}_k\).

To obtain \({\mathcal {T}}_k\) we decompose the temporal interval I into M open subintervals \(I_m{:}{=}(t_{m-1},t_m)\) of length \(k_m=t_m-t_{m-1}\), with the condition that

$$\begin{aligned} {\bar{I}} = {\bar{I}}_1\cup {\bar{I}}_2\cup \dots \cup {\bar{I}}_M\quad \text {and}\quad I_i\cap I_j = \{ \}\; \forall i\ne j \end{aligned}$$
(3)

hold. Then, \({\mathcal {T}}_k\) is the collection of all intervals from \(I_1\) to \(I_M\).

Following the nomenclature of [14] we can define the broken Bochner space

$$\begin{aligned} {\widetilde{X}}({\mathcal {T}}_k,V(\Omega )){:}{=}\{v\in L^2(I,L^2(\Omega ))\text { and }v|_{I_m}\in W(I_m,V(\Omega ))\;\forall I_m\in {\mathcal {T}}_k\}. \end{aligned}$$
(4)

For each local space the continuous embedding \(W(I_m,V(\Omega ))\subset C({\bar{I}}_m,H(\Omega ))\) holds such that the limits from above and below, i.e. \(v_m^\pm {:}{=}\lim \limits _{\varepsilon \searrow 0} v(t_m\pm \varepsilon )\), are well defined.

Using these we can define the jump across temporal intervals as

$$\begin{aligned}{}[v]_m{:}{=}v_m^+-v_m^- \end{aligned}$$
(5)

for all inner temporal grid points \(t_m\in {\mathcal {T}}_k\).

Additionally, we introduce \([v]_0\) as a shorthand for the weakly imposed initial conditions, i.e.

$$\begin{aligned}{}[v]_0{:}{=}v_0^+-v^0. \end{aligned}$$
(6)
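The one-sided limits and jumps in (5) and (6) can be illustrated with a small sketch for a scalar-valued, piecewise polynomial function in time. The data layout is an assumption made only for this illustration: `grid` holds \(t_0<\dots <t_M\) and `coeffs[m-1]` holds the polynomial coefficients on \(I_m\) in increasing powers of \(t-t_{m-1}\); the function names are hypothetical.

```python
import numpy as np
from numpy.polynomial.polynomial import polyval


def limit_from_below(grid, coeffs, m):
    """v_m^-: value at t_m of the polynomial living on I_m, for m = 1..M."""
    local_t = grid[m] - grid[m - 1]
    return polyval(local_t, np.asarray(coeffs[m - 1], dtype=float))


def limit_from_above(grid, coeffs, m):
    """v_m^+: value at t_m of the polynomial living on I_{m+1}, for m = 0..M-1."""
    return polyval(0.0, np.asarray(coeffs[m], dtype=float))


def jump(grid, coeffs, m):
    """[v]_m = v_m^+ - v_m^- at an inner grid point t_m, m = 1..M-1."""
    return limit_from_above(grid, coeffs, m) - limit_from_below(grid, coeffs, m)


def initial_jump(grid, coeffs, initial_value):
    """[v]_0 = v_0^+ - v^0, the weakly imposed initial condition."""
    return limit_from_above(grid, coeffs, 0) - initial_value
```

For example, with `grid = [0.0, 0.5, 1.0]` and `coeffs = [[1.0, 2.0], [3.0]]` the function equals \(1+2t\) on \(I_1\) and the constant 3 on \(I_2\), so the limits at \(t_1\) are 2 and 3.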

Finally, let \(A:X\times X\rightarrow {\mathbb {R}}\) be a semi-linear form representing the PDE (partial differential equation) in weak form, being nonlinear in the first argument and linear in the second argument. Let \(F:X\rightarrow {\mathbb {R}}\) be a linear form representing given right hand side data. Then, the abstract problem reads:

Problem 2.1

(Abstract form on the continuous level) Find \(u\in X\) such that

$$\begin{aligned} A(u)(\varphi ) = F(\varphi ) \quad \forall \varphi \in X. \end{aligned}$$
(7)

We provide strong forms and the respective weak forms in terms of (7) for the heat equation and a combustion problem in Sects. 2.3 and 2.4, respectively.

2.2 Discretization

In principle, the discretization steps are the same for all parabolic problems. Since we want to be able to have different trial functions in time and space, we split the discretization, starting with the temporal basis functions.

2.2.1 Semi-discretization in Time

Given the temporal triangulation \({\mathcal {T}}_k\) as defined above we can obtain the semidiscrete space by discretization of the temporal functions into piecewise polynomials of degree \(r\in {\mathbb {N}}_0\):

$$\begin{aligned} {\widetilde{X}}_k^r \left( {\mathcal {T}}_k,V \right)&{:}{=}\left\{ v_k\in L^2(I,H)\text { and }v_k|_{I_m}\in {\mathcal {P}}_r \left( I_m,V \right) \right\} \subset {\widetilde{X}} \left( {\mathcal {T}}_k,V \right) . \end{aligned}$$
(8)

With \(A(u_k)(\varphi )\) and \(F(\varphi )\) depending on the actual problem, the general time-discrete weak dG(r) formulation reads:

Problem 2.2

(Abstract form semi-discrete in time) Find \(u_k\in {\widetilde{X}}_k^r({\mathcal {T}}_k,V)\), where \(r\ge 0\), such that

$$\begin{aligned} A(u_k)(\varphi _k) = F(\varphi _k)\;\forall \varphi _k\in {\widetilde{X}}_k^r \left( {\mathcal {T}}_k,V \right) . \end{aligned}$$
(9)

2.2.2 Fully Discrete Abstract Problem

For the spatial discretization we use triangulations \({\mathcal {T}}_h^m\) of \(\Omega \), where \(m=1,\ldots , M\) indicates each temporal subinterval \(I_m\). These are decomposed into quadrilateral/hexahedral elements K and we use continuous test and trial functions of degree s resulting in \(V_h^s \left( {\mathcal {T}}_h^m \right) \subset V(\Omega )\) defined as

$$\begin{aligned} V_h^{s} \left( {\mathcal {T}}_h^m \right) {:}{=}\left\{ v_h\in V\text { and }v_h|_K\in Q_s(K) \forall K\in {\mathcal {T}}_h^m \right\} . \end{aligned}$$
(10)

We notice that we can have different triangulations on different subintervals, resulting in time-dependent or dynamic meshes. Using this, we can define the fully discrete function space:

$$\begin{aligned} {\widetilde{X}}_{k,h}^{r,s} \left( {\mathcal {T}}_k,{\mathcal {T}}_h^{1,\dots ,M} \right)&:= \left\{ v_{kh} \in L^2(I,H)\text { and }v_{kh}|_{I_m} \in {\mathcal {P}}_r \left( I_m,V_h^{s} \left( {\mathcal {T}}_h^m \right) \right) \;\forall I_m\in {\mathcal {T}}_k \right\} \\&\subset {\widetilde{X}}_k^r \left( {\mathcal {T}}_k,V \right) . \end{aligned}$$

With these definitions, we obtain the fully discrete cG(s)dG(r) formulation:

Problem 2.3

(Abstract form fully discrete level) Find \(u_{kh}\in {\widetilde{X}}_{k,h}^{r,s}({\mathcal {T}}_k,{\mathcal {T}}_h^{1,\dots ,M})\), such that

$$\begin{aligned} A(u_{kh})(\varphi _{kh}) = F(\varphi _{kh})\quad \forall \varphi _{kh}\in {\widetilde{X}}_{k,h}^{r,s} \left( {\mathcal {T}}_k,{\mathcal {T}}_h^{1,\dots ,M} \right) . \end{aligned}$$
(11)

Specific realizations are provided in the following two subsections, Sects. 2.3 and 2.4, respectively, by setting

$$\begin{aligned}&A(u_{kh})(\varphi _{kh}) := A_{\text {heat}}(u_{kh})(\varphi _{kh}), \quad F(\varphi _{kh}) := F_{\text {heat}}(\varphi _{kh}),\\&A(u_{kh})(\varphi _{kh}) :=A_{\text {comb.}}(u_{kh})(\varphi _{kh}), \quad F(\varphi _{kh}) := F_{\text {comb.}}(\varphi _{kh}). \end{aligned}$$

2.3 Heat Equation

Having the abstract formulations on the continuous, semi-discrete (in time) and fully discrete levels at hand, we now proceed and provide two specific realizations. First, the heat equation is considered in this subsection and then a combustion problem is introduced in the next subsection. Let \(u:\Sigma \rightarrow {\mathbb {R}}\) be the solution of the heat equation

$$\begin{aligned} \partial _t u - \Delta u&= f\text { in }\Sigma ,\nonumber \\ u&= 0\text { on }\partial \Omega \times I\nonumber ,\\ u&= u^0\text { on }\Omega \times \{t=0\}, \end{aligned}$$
(12)

for a given initial condition \(u^0\in H\) and right hand side function \(f\in L^2(0,T;V^*)\). Using the discretization steps as described above, we obtain the following equations

$$\begin{aligned} A_{\text {heat}}(u_{kh})(\varphi _{kh}) \,&{:}{=}\sum \limits _{m=1}^M\int \limits _{I_m} \left( \partial _tu_{kh},\varphi _{kh} \right) _H\textrm{d}t + \left( \nabla u_{kh},\nabla \varphi _{kh} \right) \end{aligned}$$
(13)
$$\begin{aligned}&\quad + \sum \limits _{m=1}^{M-1} \left( \left[ u_{kh} \right] _m,\varphi _{kh,m}^+ \right) _H + \left( u_{kh,0}^+,\varphi _{kh,0}^+ \right) _H\nonumber \\ F_{\text {heat}} \left( \varphi _{kh} \right)&\, {:}{=}\left( f,\varphi _{kh} \right) + \left( u^0,\varphi _{kh,0}^+ \right) _H . \end{aligned}$$
(14)

2.4 Combustion

The following coupled nonlinear PDE describes the temperature dependent reaction and diffusion of a combustible substance without the influence of an additional fluid flow (\(v\equiv 0\)); see [36]. Therefore, the low Mach number hypothesis holds and the fluid flow is not influenced by the reaction and can be ignored. The resulting equations for the dimensionless temperature \(\theta :\Sigma \rightarrow {\mathbb {R}}\) and the species concentration \(Y:\Sigma \rightarrow {\mathbb {R}}\) are

$$\begin{aligned} \partial _t \theta -\Delta \theta&= \omega (\theta ,Y) \text { in } \Sigma , \end{aligned}$$
(15)
$$\begin{aligned} \partial _t Y -\frac{1}{Le}\Delta Y&= -\omega (\theta ,Y) \text { in } \Sigma , \end{aligned}$$
(16)

with the combustion reaction described by the Arrhenius law

$$\begin{aligned} \omega (u) := \omega (\theta ,Y) {:}{=}\frac{\beta ^2}{2Le}Y\exp \left( \frac{\beta (\theta -1)}{1+\alpha (\theta -1)}\right) . \end{aligned}$$
(17)

The Arrhenius law is parametrized by the Lewis number \(Le>0\), the gas expansion \(\alpha >0\) and the dimensionless activation energy \(\beta >0\). We want to be able to allow all three common types of boundary conditions i.e. those of Dirichlet, Neumann and Robin type. For this we split the boundary \(\partial \Omega \) into three non-overlapping parts \(\Gamma _D\), \(\Gamma _N\) and \(\Gamma _R\). The Dirichlet boundary conditions are built into the function spaces, which is the usual approach. The Neumann and Robin boundary conditions are given by

$$\begin{aligned}&\partial _n \theta = g_N^\theta \text { on }\Gamma _N\times I \end{aligned}$$
(18)
$$\begin{aligned}&\partial _n Y = g_N^Y \text { on }\Gamma _N\times I \end{aligned}$$
(19)
$$\begin{aligned}&a_R^\theta \theta + b_R^\theta \partial _n \theta = g_R^\theta \text { on }\Gamma _R\times I \end{aligned}$$
(20)
$$\begin{aligned}&a_R^Y Y + b_R^Y \partial _n Y = g_R^Y \text { on }\Gamma _R\times I. \end{aligned}$$
(21)

It remains to state the initial conditions:

$$\begin{aligned} \theta = \theta ^0 \quad \text {on } \Omega \times \{t=0\},\\ Y = Y^0 \quad \text {on } \Omega \times \{t=0\}. \end{aligned}$$

By following the typical steps for the derivation of a weak formulation, integration by parts in space, and subsequent summation, we obtain

$$\begin{aligned}&\big (\partial _t \theta ,\varphi ^\theta \big ) + \big (\nabla \theta ,\nabla \varphi ^\theta \big ) - \int \limits _{I\times \partial \Omega } \partial _n \theta \varphi ^\theta \textrm{d}s\textrm{d}t - \big (\omega (\theta ,Y),\varphi ^\theta \big ) \\&\quad + \big (\partial _t Y,\varphi ^Y \big ) + \big (\nabla Y,\nabla \varphi ^Y \big ) - \int \limits _{I\times \partial \Omega } \partial _n Y \varphi ^Y \textrm{d}s\textrm{d}t + \big (\omega (\theta ,Y),\varphi ^Y \big ) =0. \end{aligned}$$

Splitting the boundary integrals and considering homogeneous Dirichlet conditions on some parts, i.e., \(\varphi ^\theta |_{\Gamma _D}=\varphi ^Y|_{\Gamma _D}=0\), we get

$$\begin{aligned} \int \limits _{I\times \partial \Omega } \partial _n \theta \varphi ^\theta \textrm{d}s\textrm{d}t&= \int \limits _{I\times \Gamma _N} g_N^\theta \varphi ^\theta \textrm{d}s\textrm{d}t + \int \limits _{I\times \Gamma _R} \frac{g_R^\theta }{b_R^\theta }\varphi ^\theta -\frac{a_R^\theta }{b_R^\theta }\theta \varphi ^\theta \textrm{d}s\textrm{d}t, \end{aligned}$$
(22)
$$\begin{aligned} \int \limits _{I\times \partial \Omega } \partial _n Y \varphi ^Y \textrm{d}s\textrm{d}t&= \int \limits _{I\times \Gamma _N} g_N^Y\varphi ^Y \textrm{d}s \textrm{d}t + \int \limits _{I\times \Gamma _R} \frac{g_R^Y}{b_R^Y}\varphi ^Y-\frac{a_R^Y}{b_R^Y}Y\varphi ^Y \textrm{d}s\textrm{d}t. \end{aligned}$$
(23)
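The Robin contributions in (22) and (23) follow from solving the Robin condition for the normal derivative. As a short sketch for the temperature, assuming \(b_R^\theta \ne 0\) (which the division in (22) requires), condition (20) gives

$$\begin{aligned} b_R^\theta \partial _n \theta = g_R^\theta - a_R^\theta \theta \quad \Longrightarrow \quad \partial _n \theta = \frac{g_R^\theta }{b_R^\theta } - \frac{a_R^\theta }{b_R^\theta }\theta \quad \text {on }\Gamma _R\times I. \end{aligned}$$

On \(\Gamma _D\) the boundary integral vanishes because the test functions are zero there, and on \(\Gamma _N\) the normal derivative is replaced by \(g_N^\theta \) from (18). The same steps with (19) and (21) apply to the species concentration.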

By introducing jump terms as described earlier, we obtain the following semi-linear and linear forms, respectively:

$$\begin{aligned} A_{\text {comb.}}(u_{kh}) \left( \varphi _{kh} \right) \,&{:}{=}\sum \limits _{m=1}^M\int \limits _{I_m} \left( \partial _t\theta _{kh},\varphi _{kh}^\theta \right) _H + \left( \nabla \theta _{kh},\nabla \varphi _{kh}^\theta \right) _H \; \textrm{d}t +\sum \limits _{m=1}^{M-1} \left( \left[ \theta _{kh} \right] _{m},\varphi _{kh,m}^{\theta ,+} \right) _H \nonumber \\&\quad + \sum \limits _{m=1}^M\int \limits _{I_m} \left( \partial _tY_{kh},\varphi _{kh}^Y \right) _H + \left( \nabla Y_{kh},\nabla \varphi _{kh}^Y \right) _H \;\textrm{d}t + \sum \limits _{m=1}^{M-1} \left( \left[ Y_{kh} \right] _{m},\varphi _{kh,m}^{Y,+} \right) _H \nonumber \\&\quad + \int \limits _{I\times \Gamma _R} \frac{a_R^\theta }{b_R^\theta }\theta \varphi ^\theta + \frac{a_R^Y}{b_R^Y}Y\varphi ^Y \textrm{d}s\textrm{d}t + \left( \omega \left( u_{kh} \right) ,\varphi _{kh}^Y-\varphi _{kh}^\theta \right) \nonumber \\&\quad + \left( \theta _{kh,0}^{+},\varphi _{kh,0}^{\theta ,+} \right) _H+ \left( Y_{kh,0}^{+},\varphi _{kh,0}^{Y,+} \right) _H, \end{aligned}$$
(24)
$$\begin{aligned} F_{\text {comb.}} \left( \varphi _{kh} \right) \,&{:}{=}\int \limits _{I\times \Gamma _N} g_N^\theta \varphi ^\theta + g_N^Y\varphi ^Y \textrm{d}s\textrm{d}t +\int \limits _{I\times \Gamma _R} \frac{g_R^\theta }{b_R^\theta }\varphi ^\theta +\frac{g_R^Y}{b_R^Y}\varphi ^Y \textrm{d}s\textrm{d}t \nonumber \\&\quad + \left( \theta ^{0},\varphi _{kh,0}^{\theta ,+} \right) _H+ \left( Y^{0},\varphi _{kh,0}^{Y,+} \right) _H , \end{aligned}$$
(25)

with \(u_{kh} = (\theta _{kh},Y_{kh})\) and \(\varphi _{kh} = (\varphi _{kh}^\theta ,\varphi _{kh}^Y)\).

2.5 General Formulation of Parabolic Problems

A general parabolic weak formulation that includes the previous problem statements can be stated by

$$\begin{aligned} \begin{aligned} A_{\text {gen.}}(u_{kh}) \left( \varphi _{kh} \right)&{:}{=}\sum \limits _{m=1}^M \int \limits _{I_m} \left( \partial _t u_{kh},\varphi _{kh} \right) _H \textrm{d}t+ a \left( u_{kh},\varphi _{kh} \right) \\&\quad +\sum \limits _{m=1}^{M-1} \left( [u_{kh}]_m,\varphi _{kh,m}^+ \right) _H + \left( u_{kh,0}^+,\varphi _{kh,0}^+ \right) ,\\ F_{\text {gen.}} \left( \varphi _{kh} \right)&{:}{=}\left( f,\varphi _{kh} \right) + \left( u^0,\varphi _{kh,0}^+ \right) _H \end{aligned} \end{aligned}$$
(26)

with an elliptic operator \(a(u,\varphi ){:}{=}\int \limits _0^T {\bar{a}}(u(t),\varphi (t))\textrm{d}t\). Then, (non-)linearity of A solely depends on the (non-)linearity of \({\bar{a}}\).

Remark 2.4

We notice that additional terms due to Neumann or Robin boundary conditions would appear inside \({\bar{a}}(u(t),\varphi (t))\) and/or \(F(\cdot )\).

2.6 Numerical Solution

In the algorithmic realization, we notice that the choice of trial and test spaces allows for a decoupling of the temporal discretization into slabs, i.e. slices of the space-time cylinder. In the simplest case a slab just encompasses a single temporal interval, resulting in a sequential time-stepping scheme due to the dG(r) test functions, and therefore effectively yielding classical time discretization schemes. For dG(0) an implicit Euler-type scheme is recovered. At each time slab, the spatial problems are solved as described in the following.

For the basic implementation of the (linear) heat equation with a classical DWR error estimator, we refer to the dwr-diffusion package [35] and the solvers implemented therein. There, the sparse direct solver UMFPACK [13] is used for the linear equation systems.
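To illustrate the dG(0) case, the following minimal sketch applies cG(1)dG(0) to the one-dimensional heat equation on \((0,1)\) with \(f=0\) and homogeneous Dirichlet conditions. With a piecewise constant ansatz \(u_{kh}|_{I_m}=U_m\), the time derivative vanishes and the jump terms in (13) leave \((U_m-U_{m-1},\varphi )_H + k_m(\nabla U_m,\nabla \varphi )_H = 0\), i.e. an implicit Euler step. This is not the dwr-diffusion implementation; the function name, the dense NumPy matrices, the uniform meshes and the nodal interpolation of \(u^0\) (instead of its \(L^2\) projection) are assumptions made for brevity.

```python
import numpy as np


def heat_cg1_dg0(n_cells=64, n_steps=200, end_time=0.1):
    """cG(1)dG(0) for u_t - u_xx = 0 on (0,1) with u = 0 on the boundary."""
    h = 1.0 / n_cells                      # uniform spatial mesh size
    k = end_time / n_steps                 # uniform time step length k_m
    x = np.linspace(0.0, 1.0, n_cells + 1)[1:-1]  # interior nodes only
    n = x.size
    off = np.ones(n - 1)
    # P1 mass and stiffness matrices on a uniform mesh (Dirichlet rows removed)
    mass = h / 6.0 * (4.0 * np.eye(n) + np.diag(off, 1) + np.diag(off, -1))
    stiff = 1.0 / h * (2.0 * np.eye(n) - np.diag(off, 1) - np.diag(off, -1))
    system = mass + k * stiff
    # Assumption: nodal interpolation of u^0 = sin(pi x) replaces the L2 projection
    u = np.sin(np.pi * x)
    for _ in range(n_steps):
        # One slab: (M + k K) U_m = M U_{m-1}
        u = np.linalg.solve(system, mass @ u)
    exact = np.exp(-np.pi**2 * end_time) * np.sin(np.pi * x)
    return x, u, exact
```

Comparing `u` with `exact` at the final time shows the expected first-order behaviour in the time step when `n_steps` is varied.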

For the nonlinear combustion equation we employ a classical Newton-type solver as briefly described in the following. In the space-time setting we have to solve

$$\begin{aligned} A (u_{kh}) \left( \varphi _{kh} \right) = 0 \quad \forall \varphi _{kh}\in {\widetilde{X}}^{r,s}_{kh} \left( {\mathcal {T}}_k,{\mathcal {T}}_h^{1,\dots ,M} \right) , \end{aligned}$$

where \(u_{kh}\) is the complete space-time solution over all intervals. Given an initial guess \(u_{kh}^{0}\), find the update \(\delta u\in {\widetilde{X}}_{kh}^{r,s}({\mathcal {T}}_k,{\mathcal {T}}_h^{1,\dots ,M})\) of the linear defect-correction problem for \(j=0,1,2, \ldots \)

$$\begin{aligned} A_u ' \left( u_{kh}^{j} \right) \left( \delta u , \varphi _{kh} \right)&= -A \left( u_{kh}^{j} \right) \left( \varphi _{kh} \right) , \nonumber \\ u_{kh}^{j+1}&= u_{kh}^{j} + \alpha \delta u, \quad \alpha \in (0,1]. \end{aligned}$$
(27)

Remark 2.5

As we have discontinuous test functions the nonlinear problem can be decoupled into one subproblem per time interval. Then, we obtain a time-stepping scheme with one Newton loop per interval.

The defect-correction problems are solved using the parallel sparse direct solver MUMPS [1]. The Jacobian \(A'(\cdot )(\cdot ,\cdot )\) is derived by using analytical expressions, e.g., [70, Chapter 13], in order to maintain superlinear (or quadratic) convergence. However, for \(j>0\) reassembly of the matrix is omitted if the relative reduction of the residual is below a certain threshold. This simplified Newton approach saves a lot of computational time as the factorization only needs to be done after a reassembly step. This approach also benefits from preconditioned Krylov methods as the preconditioner is only recomputed after reassembly. The step size \(\alpha \) is determined by a damping line-search after solving the defect-correction problem.
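The interplay of defect correction, damping and omitted reassembly can be sketched on a small algebraic system. The following is only an illustration under stated assumptions: it uses dense NumPy solves instead of MUMPS, a simple step-halving line search, and hypothetical names and default thresholds that are not taken from the implementation described above.

```python
import numpy as np


def damped_simplified_newton(residual, jacobian, u0, tol=1e-10, max_iter=50,
                             reassembly_threshold=0.1, min_step=1e-4):
    """Newton iteration with step halving and lazy Jacobian reassembly."""
    u = np.asarray(u0, dtype=float).copy()
    res = residual(u)
    res_norm = np.linalg.norm(res)
    jac = jacobian(u)              # assembled once for j = 0
    assemblies = 1
    for _ in range(max_iter):
        if res_norm <= tol:
            break
        # Defect correction; a real code would reuse the stored factorization
        delta = np.linalg.solve(jac, -res)
        alpha = 1.0
        while True:                # damping line search on the residual norm
            trial = u + alpha * delta
            trial_res = residual(trial)
            trial_norm = np.linalg.norm(trial_res)
            if trial_norm < res_norm or alpha <= min_step:
                break
            alpha *= 0.5
        reduction = trial_norm / res_norm
        u, res, res_norm = trial, trial_res, trial_norm
        # Keep the old matrix while the residual drops fast enough
        if reduction > reassembly_threshold:
            jac = jacobian(u)
            assemblies += 1
    return u, res_norm, assemblies


def example_residual(u):
    """Hypothetical test system: u0^2 + u1^2 = 4 and u0 = u1."""
    return np.array([u[0]**2 + u[1]**2 - 4.0, u[0] - u[1]])


def example_jacobian(u):
    return np.array([[2.0 * u[0], 2.0 * u[1]], [1.0, -1.0]])
```

Calling `damped_simplified_newton(example_residual, example_jacobian, [1.0, 0.5])` returns the approximate root, the final residual norm and the number of Jacobian assemblies, which is typically smaller than the number of iterations.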

3 The Dual Weighted Residual Method in a Space-Time Setting

In this section, we review the general ideas of the DWR method. We then derive joint and split error identities and corresponding error estimators.

3.1 Error Representation and Estimation

Let \(J:X\rightarrow {\mathbb {R}}\) be some goal functional representing some quantity of interest (QoI). The general form reads

$$\begin{aligned} J(u) = \int _{T_1}^{T_2} u(t)\, dt + u(T), \end{aligned}$$
(28)

where \(T_1,T_2\in I\), e.g., \(T_1 = 0\) and \(T_2 = T\) and T is the end time value. We are interested in the discretization error \(J(u) - J(u_{kh})\) and more specifically to minimize this error for a reasonable computational cost:

$$\begin{aligned} \min J(u) - J(u_{kh}) \end{aligned}$$

This becomes a constrained optimization problem since the solutions \(u\in X\) and \(u_{kh} \in {\tilde{X}}_{k,h}^{r,s}\) are obtained as PDE solutions from (7) and (11), respectively. These PDE problem statements are seen as constraints in terms of the optimization problem. Consequently, for a given goal functional J(u) we want to solve the following optimization problem

$$\begin{aligned} \min _{u\in X(I,V)}J(u) \quad s.\,t.\; A(u)(\varphi )=F(\varphi )\quad \forall \varphi \in X(I,V). \end{aligned}$$
(29)

This problem setting is the same as in [9, Sect. 2.2, (2.13)]. Clearly, after the discretization, we can then measure the discretization error \(J(u) - J(u_{kh})\). We notice that from an optimization viewpoint \(J(u_{kh})\) is a constant in (29) and therefore implicitly contained therein modulo the constant shift.

We apply the method of Lagrange multipliers for this constrained optimization problem (see e.g., [9]) and introduce the dual variable \(z\in X(I,V)\). To account for the discontinuities in the primal problem we define a discontinuous Lagrange functional \(\widetilde{{\mathcal {L}}}(u,z):{\widetilde{X}}({\mathcal {T}}_k,V)\times {\widetilde{X}}({\mathcal {T}}_k,V)\rightarrow {\mathbb {R}}\), such that

$$\begin{aligned} \widetilde{{\mathcal {L}}}(u,z){:}{=}J(u)-A(u)(z)+F(z). \end{aligned}$$
(30)

Note that A(u)(z) and F(z) contain the jump terms and weakly imposed initial conditions.

For stationary points \((u,z)\in X(I,V)\times X(I,V)\) all jump terms vanish and the initial conditions are met exactly, such that \(\widetilde{{\mathcal {L}}}(u,z)\) is consistent with the continuous functional \({\mathcal {L}}(u,z)\). The first order optimality conditions yield the original primal problem \(A(u)(\varphi )=F(\varphi )\) as well as the adjoint problem:

Find \(z\in {\widetilde{X}}({\mathcal {T}}_k,V)\) such that

$$\begin{aligned} A_u'(u)(\psi ,z) = J_u'(u)(\psi )\;\forall \psi \in {\widetilde{X}}({\mathcal {T}}_k,V). \end{aligned}$$
(31)

Remark 3.1

Note that the test and trial functions are switched in (31). Accordingly, the temporal derivative is now applied to the test function \(\psi \). To rectify this, we apply integration by parts to the corresponding scalar product, obtaining:

$$\begin{aligned} \begin{aligned}&\sum \limits _{m=1}^M\int _{I_m} \left( \psi ,-\partial _t z \right) + {\bar{a}}'_u(u)(\psi ,z)\;\textrm{d}t+ \sum \limits _{m=1}^{M-1} \left( \psi _m^-,-[z]_m \right) _H\\ {}&\quad + \left( \psi (T),z(T) \right) _H = \left( \psi (T),z^M \right) _H +J_u'(u)(\psi ) \quad \forall \psi \in {\widetilde{X}} \left( {\mathcal {T}}_k,V \right) . \end{aligned} \end{aligned}$$
(32)

The negative sign of the temporal derivative means the adjoint problem has to be solved backwards in time, with an initial value \(z^M\) depending on the goal functional.

For functionals evaluated on the whole temporal domain, \(z^M=0\) holds, and for functionals defined only at T, \(J_u'(u(T),\psi (T))\) can be reinterpreted as an initial value \(z^M\); for details see, e.g., [43].
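The sign pattern in (32) can be checked by integrating by parts on every subinterval. As a sketch for \(\psi \) and \(z\) in the broken space, on each \(I_m\) we have

$$\begin{aligned} \int _{I_m} \left( \partial _t\psi ,z \right) _H\textrm{d}t = -\int _{I_m} \left( \psi ,\partial _t z \right) _H\textrm{d}t + \left( \psi _m^-,z_m^- \right) _H - \left( \psi _{m-1}^+,z_{m-1}^+ \right) _H. \end{aligned}$$

Summing over \(m=1,\dots ,M\) and adding the jump terms \(\sum _{m=1}^{M-1}([\psi ]_m,z_m^+)_H\) and the initial term \((\psi _0^+,z_0^+)_H\) of the linearized form, all contributions containing \(\psi _m^+\) cancel, and the remaining boundary terms are

$$\begin{aligned} \sum \limits _{m=1}^{M} \left( \psi _m^-,z_m^- \right) _H - \sum \limits _{m=1}^{M-1} \left( \psi _m^-,z_m^+ \right) _H = \sum \limits _{m=1}^{M-1} \left( \psi _m^-,-[z]_m \right) _H + \left( \psi _M^-,z_M^- \right) _H, \end{aligned}$$

where the last term corresponds to \((\psi (T),z(T))_H\) in (32).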

For linear PDEs and linear goal functionals we obtain the exact error representations (see [9]):

$$\begin{aligned} J(u)-J \left( u_{kh} \right)&= F\left( z-z_{kh} \right) -A \left( u_{kh},z-z_{kh} \right) \quad \text {(primal error)} \end{aligned}$$
(33)
$$\begin{aligned}&= J \left( u-u_{kh} \right) -A \left( u-u_{kh},z_{kh} \right) \quad \text {(adjoint error)}. \end{aligned}$$
(34)
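The identities (33) and (34) can be checked numerically in a finite-dimensional analogue. In the sketch below the "continuous" problem is a linear system \(Au=f\) with goal functional \(J(u)=j^\top u\), the "discrete" problem is its Galerkin projection onto a random subspace, and \(A(u,\varphi )\) is identified with \(\varphi ^\top A u\). The matrix, the subspace and all names are hypothetical and only serve to illustrate the structure of the identities.

```python
import numpy as np


def linear_error_identities(n=40, n_coarse=8, seed=0):
    """Return (true error, primal representation, adjoint representation)."""
    rng = np.random.default_rng(seed)
    # Nonsymmetric matrix with dominant diagonal, so all solves are well posed
    A = n * np.eye(n) + rng.standard_normal((n, n))
    f = rng.standard_normal(n)             # right hand side F
    j = rng.standard_normal(n)             # goal functional J(u) = j . u
    P = rng.standard_normal((n, n_coarse))  # basis of the discrete subspace

    u = np.linalg.solve(A, f)              # primal solution
    z = np.linalg.solve(A.T, j)            # adjoint solution

    A_h = P.T @ A @ P                      # Galerkin projection of A
    u_h = P @ np.linalg.solve(A_h, P.T @ f)
    z_h = P @ np.linalg.solve(A_h.T, P.T @ j)

    true_error = j @ (u - u_h)
    primal = (z - z_h) @ (f - A @ u_h)     # F(z - z_h) - A(u_h, z - z_h)
    adjoint = (u - u_h) @ (j - A.T @ z_h)  # J(u - u_h) - A(u - u_h, z_h)
    return true_error, primal, adjoint
```

Up to round-off, all three returned values coincide, which reflects Galerkin orthogonality: the residual \(f-Au_h\) vanishes when tested with any element of the discrete subspace, in particular with \(z_h\).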

In the following, we focus on the primal error representation. As we can see, we would need both the exact dual solution z and the discrete dual solution \(z_{kh}\). As this is infeasible for complicated problems, we use a discrete solution of higher order for z. In the equal low order approach both primal and adjoint problem are discretized using the same low order elements. The higher order adjoint solution is then obtained by a patch-wise reconstruction. This reconstruction is described in detail in [57]. In the mixed order approach the adjoint problem is discretized by higher order elements and the solution is used as the representation of the exact solution. The fully discrete solution is then obtained by interpolation into the lower order space. It is also possible to use different approaches in time and space, e.g. discretizing the primal problem with cG(1)dG(0) and the adjoint problem with cG(2)dG(0).

Additionally, (33) can be split into a temporal and a spatial part by introducing the semidiscrete adjoint solution \(z_k\) such that

$$\begin{aligned} J(u)-J(u_{kh})&= J(u)-J(u_k)+J(u_k)-J(u_{kh}), \end{aligned}$$
(35)

where temporal and spatial errors are given by, respectively,

$$\begin{aligned} J(u)-J(u_k)&= F(z-z_k)-A(u_{kh},z-z_k), \end{aligned}$$
(36)
$$\begin{aligned} J(u_k)-J(u_{kh})&= F(z_k-z_{kh})-A(u_{kh},z_k-z_{kh}). \end{aligned}$$
(37)

That way (36) can be used for temporal refinement and (37) for spatial refinement.
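That the two parts add up to the joint representation can be checked directly. Since \(F\) is linear and, for linear problems, \(A\) is linear in its second argument, adding (36) and (37) gives

$$\begin{aligned} F(z-z_k)+F(z_k-z_{kh})-A(u_{kh},z-z_k)-A(u_{kh},z_k-z_{kh}) = F(z-z_{kh})-A(u_{kh},z-z_{kh}), \end{aligned}$$

which is the primal error representation (33).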

3.2 DWR for Nonlinear Time Dependent Problems

3.2.1 Adjoint Problem Statements

For \(A_{\text {gen.}}\) the left hand side of the adjoint problem (31) in the space-time context explicitly reads as

$$\begin{aligned} \begin{aligned} A'_{u,\text {gen.}}(u)(\psi ,z) \,&{:}{=}\sum \limits _{m=1}^M \int _{I_m} \left( \psi ,-\partial _t z \right) + {\bar{a}}'_u(u(t)) \left( \psi (t),z(t) \right) \;\textrm{d}t \\&\quad +\sum \limits _{m=1}^{M-1} \left( \psi _m^-,-[z]_m \right) _H+ \left( \psi (T),z(T) \right) _H. \end{aligned} \end{aligned}$$
(38)

As an example, for the combustion problem described in Sect. 2.4 we obtain \(F = F_{\text {comb.}}\) as well as the operator of the semi-linear form and its directional derivative, respectively,

$$\begin{aligned} {\bar{a}}(u(t),\varphi (t))&= \left( \nabla \theta ,\nabla \varphi ^\theta \right) _H + \left( \nabla Y,\nabla \varphi ^Y \right) _H + \left( \omega (u),\varphi ^Y-\varphi ^\theta \right) _H \nonumber \\&\quad +\int \limits _{\Gamma _R} \frac{a_R^\theta }{b_R^\theta }\theta \varphi ^\theta + \frac{a_R^Y}{b_R^Y}Y\varphi ^Y \textrm{d}s, \end{aligned}$$
(39)
$$\begin{aligned} {\bar{a}}_u'(u(t))(\psi ,z)&= \left( \nabla \psi ^\theta ,\nabla z^\theta \right) _H + \left( \nabla \psi ^Y,\nabla z^Y \right) _H + \left( \omega _\theta '(u) \left( \psi ^\theta \right) ,z^Y-z^\theta \right) _H \nonumber \\&\quad +\int \limits _{\Gamma _R} \frac{a_R^\theta }{b_R^\theta }\psi ^\theta z^\theta + \frac{a_R^Y}{b_R^Y}\psi ^Y z^Y \textrm{d}s+ \left( \omega _Y'(u) \left( \psi ^Y \right) ,z^Y-z^\theta \right) _H. \end{aligned}$$
(40)

Therein, \(\omega '(u)(\psi )\) is the directional derivative of \(\omega (u)\) in the direction \(\psi \).
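For the Arrhenius law (17) the partial derivatives can be written down explicitly: with \(g(\theta )=\beta (\theta -1)/(1+\alpha (\theta -1))\) one has \(g'(\theta )=\beta /(1+\alpha (\theta -1))^2\), hence \(\omega _\theta '=\omega \, g'\) and \(\omega _Y'=\frac{\beta ^2}{2Le}\exp (g)\). The sketch below evaluates these pointwise expressions; the parameter values and function names are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical parameter values, not taken from the numerical examples
LE, ALPHA, BETA = 1.0, 0.8, 10.0


def exponent(theta):
    """g(theta) = beta (theta - 1) / (1 + alpha (theta - 1))."""
    return BETA * (theta - 1.0) / (1.0 + ALPHA * (theta - 1.0))


def omega(theta, y):
    """Arrhenius reaction term omega(theta, Y) from (17)."""
    return BETA**2 / (2.0 * LE) * y * np.exp(exponent(theta))


def omega_dtheta(theta, y):
    """Partial derivative of omega with respect to theta."""
    return omega(theta, y) * BETA / (1.0 + ALPHA * (theta - 1.0))**2


def omega_dy(theta, y):
    """Partial derivative of omega with respect to Y (omega is linear in Y)."""
    return BETA**2 / (2.0 * LE) * np.exp(exponent(theta))


def omega_directional(theta, y, psi_theta, psi_y):
    """Directional derivative omega'(u)(psi) for u = (theta, Y)."""
    return omega_dtheta(theta, y) * psi_theta + omega_dy(theta, y) * psi_y
```

A central finite difference of `omega` along the direction `(psi_theta, psi_y)` provides a simple consistency check of the analytical Jacobian entries.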

3.2.2 Goal-Oriented Error Representations

As there are two ways to separate the full estimator into parts, we want to use a clear and precise terminology to distinguish between those two.

Firstly, we can separate by the problem residuals we compute. This gives the primal and adjoint/dual estimators. Oftentimes, the average of those two parts is also called mixed estimator instead of full estimator.

Secondly, we can split the difference between the exact and the fully discrete solution by introducing the time-discrete solution. If we do not split, we call the resulting estimator the joint estimator. If we calculate both the temporal and spatial error estimator, we call the sum of both the split error estimator. As both separations can be done simultaneously, combined expressions like split primal estimator or joint dual estimator are possible.

For nonlinear problems we obtain the following error representation [9]:

Theorem 3.2

(Joint error identity) Let the primal problem and adjoint problem be given. Let \((u,z)\in {\widetilde{X}}({\mathcal {T}}_k,V)\times {\widetilde{X}}({\mathcal {T}}_k,V)\), \((u_k,z_k)\in {\widetilde{X}}_k^r({\mathcal {T}}_k,V) \times {\widetilde{X}}_k^r({\mathcal {T}}_k,V)\) and \((u_{kh},z_{kh})\in {{\widetilde{X}}}_{k,h}^{r,s}({\mathcal {T}}_k,{\mathcal {T}}_h^{1,\dots ,M}) \times \widetilde{X}_{k,h}^{r,s}({\mathcal {T}}_k,{\mathcal {T}}_h^{1,\dots ,M})\). Then, we have the space-time joint error identity

$$\begin{aligned} J(u)-J(u_{kh})&= \frac{1}{2} \rho (u_{kh})(z-z_{kh}) + \frac{1}{2} \rho ^*(u_{kh},z_{kh})(u-u_{kh})+ {\mathcal {R}}_{kh}, \end{aligned}$$
(41)

with the primal error estimator \(\rho \) and the adjoint error estimator \(\rho ^*\)

$$\begin{aligned} \rho (u_{kh})(z-z_{kh})&:= F(z-z_{kh}) - A(u_{kh},z-z_{kh}),\\ \rho ^*(u_{kh},z_{kh})(u-u_{kh})&:= J'(u_{kh})(u-u_{kh}) - A'_u(u_{kh})(u-u_{kh},z_{kh}), \end{aligned}$$

as well as a remainder term \({\mathcal {R}}_{kh}\) of higher order.

Proof

With \(\widetilde{X}_{k,h}^{r,s}({\mathcal {T}}_k,{\mathcal {T}}_h^{1,\dots ,M}) \subset {\widetilde{X}}({\mathcal {T}}_k,V)\) and \(J(u_{kh})=\widetilde{{\mathcal {L}}}(u_{kh},z_{kh})\) the assumptions of [57, Proposition 3.1] hold, proving the representation. \(\square \)

Theorem 3.3

(Split error identity) With the previous assumptions, we have the split error identity

$$\begin{aligned} J(u)-J(u_{kh}) = (J(u) - J(u_k)) + (J(u_k) - J(u_{kh})), \end{aligned}$$

with

$$\begin{aligned} J(u) - J(u_k)&= \frac{1}{2} \rho (u_{k})(z-z_{k}) + \frac{1}{2} \rho ^*(u_{k},z_{k})(u-u_{k}) + R_k,\\ J(u_k) - J(u_{kh})&= \frac{1}{2} \rho (u_{kh})(z_k-z_{kh}) + \frac{1}{2} \rho ^*(u_{kh},z_{kh})(u_k-u_{kh}) + R_h. \end{aligned}$$

Proof

The proof follows the same ideas as before but with \(\widetilde{X}_{k,h}^{r,s}({\mathcal {T}}_k,{\mathcal {T}}_h^{1,\dots ,M}) \subset {\widetilde{X}}_k^r({\mathcal {T}}_k,V) \subset {\widetilde{X}}({\mathcal {T}}_k,V)\) as well as \(J(u_{kh})=\widetilde{{\mathcal {L}}}_k(u_{kh},z_{kh})\) and \(J(u_k) = \widetilde{{\mathcal {L}}}(u_k,z_k)\). \(\square \)

3.2.3 Error Estimators

From the previous error identities, we obtain error estimators in four variants. First, we have the full error estimator

$$\begin{aligned} \eta := \frac{1}{2} \rho (u_{kh})(z-z_{kh}) + \frac{1}{2} \rho ^*(u_{kh},z_{kh})(u-u_{kh})+ {\mathcal {R}}_{kh}. \end{aligned}$$
(42)

However, the unknown solutions u and z still enter. Nevertheless, (42) is already an error estimator: in cases where u and z are known, it allows us to estimate (discretization) errors in goal functionals. For most problems in practice, of course, this first version does not play a role.

To obtain a computable quantity, higher-order approximations \({{\widetilde{u}}}\in {\widetilde{X}}\) and \({{\widetilde{z}}}\in {\widetilde{X}}\) are introduced [9]. Examples of such approximations are \({{\widetilde{u}}}:= u_{kh}^{r+1,s+1}\) and \({{\widetilde{z}}}:= z_{kh}^{r+1,s+1}\) such that we obtain the computable error estimator

$$\begin{aligned} \eta ^{(r+1,s+1)}&:= \frac{1}{2} \rho (u_{kh}) \left( z_{kh}^{r+1,s+1}-z_{kh} \right) + \frac{1}{2} \rho ^* \left( u_{kh},z_{kh} \right) \left( u_{kh}^{r+1,s+1}-u_{kh} \right) + {\mathcal {R}}_{kh}. \end{aligned}$$
(43)

If the remainder term is omitted (which is indeed usually done in practice) we obtain the practical error estimator

$$\begin{aligned} \eta _h^{(r+1,s+1)}&:= \frac{1}{2} \rho (u_{kh})\left( z_{kh}^{r+1,s+1}-z_{kh} \right) + \frac{1}{2} \rho ^* \left( u_{kh},z_{kh} \right) \left( u_{kh}^{r+1,s+1}-u_{kh} \right) . \end{aligned}$$
(44)

Finally, we also introduce the primal-based error estimator:

$$\begin{aligned} \eta _{prim}^{(r+1,s+1)} := \rho (u_{kh}) \left( z_{kh}^{r+1,s+1} - z_{kh} \right) . \end{aligned}$$
(45)

As we also need higher order information of the primal problem for (43) and (44) to calculate the adjoint estimator, we now have three possible approaches. In addition to the two previous discretization approaches, the equal high order approach uses a higher order element discretization for both the primal and adjoint problem. Then, interpolation into a lower order element space yields both \(u_{kh}\) and \(z_{kh}\). However, inserting the interpolated \(u_{kh}\) into the goal functional yields worse results compared to a native low order solution, so ideally the primal problem should also be solved in low order to calculate the functional values. Various algorithmic realizations with corresponding theoretical results and performance analyses for stationary problems were recently established in [19].

4 Error Localization

In this section, we address our key development, namely the construction of a space-time partition-of-unity (PU) localization of goal-oriented a posteriori error estimators. In the published literature as mentioned in the introduction, so far only stationary cases have been addressed with the PU localization. Here, we first extend the idea to time-dependent problems. Since we want to use the DWR error estimator for grid refinement, we need to split the estimator into element- or DoF-wise error contributions. Three known approaches are the classical integration by parts [5, 9], a variational filtering operator over patches of elements [11] and a variational partition-of-unity localization [53]. For stationary problems, the effectivity of these localizations was established and numerically substantiated in [53].

First, in Sect. 4.1, we exemplarily derive the variational partition-of-unity approach for our space-time estimator of the heat equation, and in Sect. 4.2 we state the corresponding estimators for the combustion problem. Second, Sect. 4.3 focuses on details of the actual evaluation including the needed interpolation operations. Next, we list in Sect. 4.4 the resulting error indicators, finally followed by the adaptive algorithms designed in Sect. 4.5.

4.1 The Partition-of-Unity Approach for the Heat Equation

In this key section, we extend the ideas from [53] and apply a partition-of-unity (PU) localization to a space-time error estimator. To this end, we first design the PU space. The simplest choice is \(V_{PU} = {\widetilde{X}}_{k,h}^{0,1}\), i.e. a cG(1)dG(0) discretization. Effectively, this yields one spatial partition of unity \((\chi _{i,m})_{i=1}^{\#DoFs({\mathcal {T}}_h^m)}\in Q_1({\mathcal {T}}_h^m)\) per time interval \(I_m\) for \(m=1,\ldots , M\). As this is a Lagrangian finite element, we have

Proposition 4.1

For the basis functions \(\chi _{i,m}\) of \(V_{PU}\), it holds

$$\begin{aligned} \sum _{m=1}^M \sum _{i=1}^{\#DoFs({\mathcal {T}}_h^m)} \chi _{i,m} \equiv 1. \end{aligned}$$
(46)

Proof

Follows immediately from the properties of the finite element functions. \(\square \)
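The PU property (46) can also be checked numerically. The following minimal Python sketch assumes a 1D spatial mesh that is identical on every time interval and represents \(\chi _{i,m}\) as the product of a piecewise linear hat function in space and the indicator function of \(I_m\) in time. The helper names `hat_functions`, `dg0_indicators` and `pu_sum` are hypothetical and only serve this illustration.

```python
import numpy as np

def hat_functions(nodes, x):
    """Evaluate all 1D piecewise linear nodal basis functions at the points x.

    Returns an array of shape (len(nodes), len(x)).
    """
    nodes = np.asarray(nodes, dtype=float)
    x = np.asarray(x, dtype=float)
    values = np.empty((nodes.size, x.size))
    for i in range(nodes.size):
        unit = np.zeros(nodes.size)
        unit[i] = 1.0
        # Linear interpolation of the i-th unit vector is the i-th hat function.
        values[i] = np.interp(x, nodes, unit)
    return values

def dg0_indicators(time_points, t):
    """Indicator functions of I_m = (t_{m-1}, t_m], shape (M, len(t))."""
    tp = np.asarray(time_points, dtype=float)
    t = np.asarray(t, dtype=float)
    return np.array(
        [(t > tp[m - 1]) & (t <= tp[m]) for m in range(1, tp.size)], dtype=float
    )

def pu_sum(nodes, time_points, x, t):
    """Sum of chi_{i,m}(x, t) over all i and m, evaluated on the grid t x x."""
    hats = hat_functions(nodes, x)            # (n_dofs, n_x)
    inds = dg0_indicators(time_points, t)     # (M, n_t)
    # Output indices 'ab' only: einsum sums over m and i.
    return np.einsum("ma,ib->ab", inds, hats)

nodes = [0.0, 0.3, 0.7, 1.0]
time_points = [0.0, 0.25, 1.0]
x = np.linspace(0.0, 1.0, 11)
t = np.linspace(0.05, 1.0, 7)
total = pu_sum(nodes, time_points, x, t)
print(np.allclose(total, 1.0))
```

In the setting of this work the spatial meshes \({\mathcal {T}}_h^m\) may differ between time intervals; the sketch would then need one set of hat functions per interval, but the argument stays the same.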

Remark 4.2

Another common choice for the PU space is \(V_{PU} = X_{kh}^{1,1}\), i.e. a cG(1)cG(1) discretization, for example in native \(d+1\)-dimensional discretizations [17]. In general this ensures a coupling between neighboring temporal elements to address the problem shown in [12]. However, for discontinuous Galerkin discretizations the dominating edge residuals, i.e. jump terms, are explicitly included in the estimator.

Remark 4.3

Clearly, using a dG discretization in time yields a natural decoupling for which the space-time PU reduces effectively to a PU in space. Nonetheless, we formulated our concepts using a space-time PU as the methodology applies (see again [17]) to a larger class of problems. As our work is one of the first in this direction in the literature, our aim is to provide the full methodology in order to have a starting point for further future work.

In the following, \(\tilde{u}\in {\widetilde{X}}\) and \(\tilde{z}\in {\widetilde{X}}\) denote approximations of the exact solutions. In principle, the joint and split estimators differ only in the interpolation difference, i.e. \(\tilde{z}-z_{kh}\) for the joint estimator, or \(\tilde{z}-z_k\) and \(z_k-z_{kh}\) for the split estimator.

For the joint estimator the local contributions are summed over all DoFs of a fixed interval to obtain the corresponding temporal estimator. Subsequent summation over all time intervals yields the global estimator. For the split estimators the spatial estimator is summed over all space-time DoFs and the temporal estimator is summed over all time intervals. The total error estimator is then the sum of these two error parts.
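The summation order described above can be sketched in a few lines of Python. The sketch assumes that the DoF-wise indicators are already available as one NumPy array per time interval (the arrays may have different lengths if the spatial meshes differ); the function names `joint_estimator` and `split_estimator` are hypothetical.

```python
import numpy as np

def joint_estimator(eta_kh):
    """eta_kh: list over time intervals m of 1D arrays holding eta_kh^{i,m}.

    Returns the interval-wise estimators eta_kh^m and their global sum.
    """
    per_interval = np.array([np.sum(e) for e in eta_kh])
    return per_interval, float(np.sum(per_interval))

def split_estimator(eta_k, eta_h):
    """eta_k: 1D array of temporal indicators eta_k^m.
    eta_h: list over time intervals of 1D arrays holding eta_h^{i,m}.

    Returns the temporal part, the spatial part and their sum.
    """
    temporal = float(np.sum(eta_k))
    spatial = float(sum(np.sum(e) for e in eta_h))
    return temporal, spatial, temporal + spatial

# Made-up indicator values, for illustration only.
eta_kh = [np.array([0.1, -0.05]), np.array([0.2, 0.0, 0.05])]
per_interval, eta_joint = joint_estimator(eta_kh)

eta_k = np.array([0.02, 0.03])
eta_h = [np.array([0.1, -0.05]), np.array([0.2, 0.0, 0.05])]
temporal, spatial, eta_split = split_estimator(eta_k, eta_h)
```

Note that the sums are taken over signed indicators; absolute values only enter later, for example in the indicator index.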

Proposition 4.4

(Primal joint error estimator for the heat equation) For the space-time formulation of the heat equation, we have the following a posteriori joint error estimator with partition-of-unity localization:

$$\begin{aligned} |J(u) - J(u_{kh}) | \le&|\eta _{\text {joint}}| {:}{=}\left| \sum \limits _m \eta _{kh}^m \right| , \quad \text {with }\; \eta _{kh}^m{:}{=}\sum \limits _{i\in {\mathcal {T}}_h^m} \eta _{kh}^{i,m}, \end{aligned}$$
(47)

with the error indicators

$$\begin{aligned} \begin{aligned} \eta _{kh}^{i,m}&:= \int \limits _{I_m} \left( f, \left( \tilde{z}-z_{kh} \right) \chi _{i,m} \right) _H \;\textrm{d}t -\int \limits _{I_m} \left( \nabla u_{kh}, \nabla \left( \left( \tilde{z}-z_{kh} \right) \chi _{i,m} \right) \right) _H\;\textrm{d}t\\&\quad -\int \limits _{I_m} \left( \partial _t u_{kh}, \left( \tilde{z}-z_{kh} \right) \chi _{i,m} \right) _H \;\textrm{d}t - \left( \left[ u_{kh} \right] _{m-1}, \left( \tilde{z}^+\left( t_{m-1} \right) -z_{kh}^+ \left( t_{m-1} \right) \right) \chi _{i,m} \right) _H. \end{aligned} \end{aligned}$$
(48)

Proof

We start from Theorem 3.2 with

$$\begin{aligned} J(u)-J(u_{kh}) = \frac{1}{2} \rho (u_{kh})(z-z_{kh}) + \frac{1}{2} \rho ^*(u_{kh},z_{kh})(u-u_{kh})+ {\mathcal {R}}_{kh}, \end{aligned}$$

which yields

$$\begin{aligned} |J(u)-J(u_{kh})| \le |\frac{1}{2} \rho (u_{kh})(z-z_{kh}) + \frac{1}{2} \rho ^*(u_{kh},z_{kh})(u-u_{kh}) + {\mathcal {R}}_{kh}|. \end{aligned}$$

Considering the primal part only (see (45)), gives us

$$\begin{aligned} |J(u)-J(u_{kh})| \le |\eta _{joint}|:= |\rho (u_{kh})(z-z_{kh})|. \end{aligned}$$

Inserting the PU (46) yields

$$\begin{aligned} |J(u)-J(u_{kh})| \le |\eta _{joint}|:= \left| \sum _{m=1}^M \sum _{i=1}^{\#DoFs({\mathcal {T}}_h^m)} \rho (u_{kh})\big ((z-z_{kh})\chi _{i,m}\big ) \right| . \end{aligned}$$

Then, employing the definition of the primal residual leads to

$$\begin{aligned} |J(u)-J(u_{kh})| \le |\eta _{joint}|:= \left| \sum _{m=1}^M \sum _{i=1}^{\#DoFs({\mathcal {T}}_h^m)} F\big ( (z-z_{kh})\chi _{i,m} \big ) - A(u_{kh})\big ( (z-z_{kh})\chi _{i,m} \big ) \right| . \end{aligned}$$

Here, we employ the left hand side and right hand side of the heat equation, namely (13) and (14), respectively, by replacing the test function \(\varphi _{kh}\) by the PU-weighted adjoint sensitivity measure \((z-z_{kh})\chi _{i,m}\). Finally, the (unknown) solution z is approximated by some higher order representation \({\tilde{z}}\) from which we obtain the assertion. \(\square \)

Definition 4.5

(Effectivity and indicator indices) The effectivity index is defined as

$$\begin{aligned} I_{eff}:= \frac{|J(u)-J(u_{kh})|}{|\eta _{joint}|}. \end{aligned}$$

Applying the triangle inequality to \(\eta _{joint}\) yields a stricter criterion, i.e., here

$$\begin{aligned} |\eta _{joint}| = \left| \sum \limits _m \eta _{kh}^m \right| \le \sum \limits _m \sum \limits _{i\in {\mathcal {T}}_h^m} |\eta _{kh}^{i,m}|, \end{aligned}$$

from which the so-called indicator index

$$\begin{aligned} I_{ind}:= \frac{|J(u)-J(u_{kh})|}{\sum \limits _m \sum \limits _{i\in {\mathcal {T}}_h^m} |\eta _{kh}^{i,m}|} \end{aligned}$$

can be defined.
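Both indices are straightforward to evaluate once the signed indicators and a reference value for \(J(u)\) are available. The following Python sketch assumes that the true error \(J(u)-J(u_{kh})\) is known (e.g. from an analytical solution or a fine reference computation) and that the indicators are stored as one array per time interval; the function name `effectivity_indices` is hypothetical.

```python
import numpy as np

def effectivity_indices(true_error, eta):
    """Return (I_eff, I_ind) for signed DoF-wise indicators.

    true_error: J(u) - J(u_kh), assumed to be known.
    eta: list over time intervals of array-likes with eta_kh^{i,m}.
    """
    flat = np.concatenate([np.ravel(e) for e in eta])
    estimator = abs(flat.sum())          # |eta_joint|, signs may cancel
    indicator_sum = np.abs(flat).sum()   # sum of |eta_kh^{i,m}|
    return abs(true_error) / estimator, abs(true_error) / indicator_sum

# Made-up values, for illustration only.
i_eff, i_ind = effectivity_indices(0.3, [[0.4, -0.2], [0.1]])
```

Since the sum of absolute values is never smaller than the absolute value of the sum, \(I_{ind}\le I_{eff}\) always holds in this computation.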

Remark 4.6

Note that we only use the temporal part for marking time steps and calculating the global estimator. Since the indicators for each time step are obtained by summing over all elements in said time step the spatial PU \(\chi _{i,m}\) effectively cancels due to the PU property. As a consequence, the spatial PU can be omitted directly in the computation of the temporal indicators.
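The cancellation of the spatial PU can be illustrated with a small Python sketch. It assumes a 1D mesh, a generic residual density `r` and a weight `w` (standing in for \(\tilde{z}-z_{kh}\)), and a residual functional of the form \(\rho (\varphi )=\int r\,\varphi \,\textrm{d}x\); all of these, including the function name `localized_indicators`, are hypothetical stand-ins and not the residual of the heat equation.

```python
import numpy as np

def localized_indicators(nodes, r, w, n_quad=4):
    """DoF-wise values rho(w * chi_i) and the unlocalized value rho(w).

    rho(phi) = int_0^1 r(x) phi(x) dx is evaluated element by element
    with Gauss-Legendre quadrature.
    """
    nodes = np.asarray(nodes, dtype=float)
    gp, gw = np.polynomial.legendre.leggauss(n_quad)  # rule on [-1, 1]
    eta = np.zeros(nodes.size)
    total = 0.0
    for e in range(nodes.size - 1):
        a, b = nodes[e], nodes[e + 1]
        x = 0.5 * (b - a) * gp + 0.5 * (a + b)   # mapped quadrature points
        wq = 0.5 * (b - a) * gw                  # mapped quadrature weights
        integrand = r(x) * w(x)
        chi_left = (b - x) / (b - a)             # hat function of node e
        chi_right = (x - a) / (b - a)            # hat function of node e+1
        eta[e] += np.sum(wq * integrand * chi_left)
        eta[e + 1] += np.sum(wq * integrand * chi_right)
        total += np.sum(wq * integrand)
    return eta, total

eta, total = localized_indicators([0.0, 0.3, 0.7, 1.0], np.sin, lambda x: x**2)
print(np.isclose(eta.sum(), total))
```

Because the two hat functions add up to one on every element, the DoF-wise contributions sum to the unlocalized value, which is why the PU can be dropped when only the interval-wise sum is needed.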

Proposition 4.7

(Primal split error estimator for the heat equation) For the space-time formulation of the heat equation, we have the following a posteriori split error estimator with partition-of-unity localization:

$$\begin{aligned} |J(u) - J(u_{kh})| \le |\eta _{\text {split}}| {:}{=}\left| \sum \limits _m \left( \eta _k^m + \sum _{i\in {\mathcal {T}}_h^m} \eta _{h}^{i,m} \right) \right| , \end{aligned}$$
(49)

with the temporal error indicators

$$\begin{aligned} \begin{aligned} \eta _k^m \,&:=\int \limits _{I_m} \left( \left( f,\tilde{z}-z_{k} \right) _H - \left( \partial _t u_{kh}, \tilde{z}-z_{k} \right) _H - \left( \nabla u_{kh}, \nabla \left( \tilde{z}-z_{k} \right) \right) _H \right) \;\textrm{d}t\\&\quad - \left( \left[ u_{kh} \right] _{m-1},\tilde{z}^+ \left( t_{m-1} \right) -z_{k}^+ \left( t_{m-1} \right) \right) _H, \end{aligned} \end{aligned}$$
(50)

and the spatial error indicators

$$\begin{aligned} \begin{aligned} \eta _{h}^{i,m}&:= \int \limits _{I_m} \left( f, \left( z_k-z_{kh} \right) \chi _{i,m} \right) _H \;\textrm{d}t -\int \limits _{I_m} \left( \nabla u_{kh}, \nabla \left( \left( z_k-z_{kh} \right) \chi _{i,m} \right) \right) _H\;\textrm{d}t \\&\quad -\int \limits _{I_m} \left( \partial _t u_{kh}, \left( z_k-z_{kh} \right) \chi _{i,m} \right) _H \;\textrm{d}t - \left( \left[ u_{kh} \right] _{m-1}, \left( z_k^+ \left( t_{m-1} \right) -z_{kh}^+ \left( t_{m-1} \right) \right) \chi _{i,m} \right) _H. \end{aligned} \end{aligned}$$
(51)

Proof

For the primal split error estimator we start with Theorem 3.3. Then, we proceed for both parts with the primal estimator:

$$\begin{aligned} |J(u) - J(u_k)|&\le |\rho (u_{k})(z-z_{k})|,\\ |J(u_k) - J(u_{kh})|&\le |\rho (u_{kh})(z_k-z_{kh})|. \end{aligned}$$

Combining the left hand sides to the full error estimator and utilizing the PU in a similar way as the proof of Proposition 4.4 yields

$$\begin{aligned} |J(u) - J(u_{kh})|&\le |\rho (u_{k})(z-z_{k}) + \rho (u_{kh})(z_k-z_{kh})|\\&\le \left| \sum \limits _m \left( \eta _k^m + \sum _{i\in {\mathcal {T}}_h^m} \eta _{h}^{i,m} \right) \right| \\&=: |\eta _{\text {split}}|. \end{aligned}$$

Here the error indicators are obtained as in the proof of Proposition 4.4, which yields the assertion. \(\square \)

Remark 4.8

Note that the basis functions are globally defined, so the DoF-wise errors contain an implicit sum over all elements in practice. However, calculating element based estimators by constraining the spatial integrals to each element K yields the unlocalized estimator, as the sum over all \(\chi _{i,m}\) on a single element is 1 and effectively cancels out. To use well-known element based marking strategies the DoF-estimators have to be calculated globally. Afterwards element wise estimators can be calculated by summing all estimators belonging to the DoFs of the corresponding element:

$$\begin{aligned} \eta _K^m = \sum \limits _{i\in K} \eta _{\bullet }^{i,m}, \end{aligned}$$
(52)

where \(\bullet \) stands for h or kh.
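A minimal Python sketch of (52) is given below. It assumes that the DoF-wise indicators of one time interval are stored in a flat array and that the mesh provides a cell-to-DoF connectivity table with the same number of PU DoFs per element; the function name `element_indicators` and the connectivity layout are hypothetical.

```python
import numpy as np

def element_indicators(eta_dof, cell_dofs):
    """Sum the DoF-wise indicators over the DoFs of each element.

    eta_dof: 1D array of DoF-wise indicators for one time interval.
    cell_dofs: integer array of shape (n_cells, dofs_per_cell).
    """
    eta_dof = np.asarray(eta_dof, dtype=float)
    cell_dofs = np.asarray(cell_dofs, dtype=int)
    return eta_dof[cell_dofs].sum(axis=1)

# 1D mesh with three elements and four vertex DoFs, made-up indicator values.
eta_K = element_indicators([1.0, 2.0, 3.0, 4.0], [[0, 1], [1, 2], [2, 3]])
```

Since interior DoFs belong to several elements, they are counted more than once, so the element-wise values are meant for marking and do not sum up to the global estimator.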

Proposition 4.9

(Adjoint joint error estimator for the heat equation) For the space-time formulation of the heat equation, we have the following a posteriori joint error estimator with partition-of-unity localization:

$$\begin{aligned} \left| J(u) - J(u_{kh}) \right| \le \left| \eta _{\text {joint}}^* \right| {:}{=}\left| \sum \limits _m \eta _{kh}^{m,*} \right| , \quad \text {with }\; \eta _{kh}^{m,*}{:}{=}\sum \limits _{i\in {\mathcal {T}}_h^m} \eta _{kh}^{i,m,*}, \end{aligned}$$
(53)

with the error indicators

$$\begin{aligned} \begin{aligned} \eta _{kh}^{i,m,*}&:= \int \limits _{I_m}J_u' \left( u_{kh} \right) \left( \left( \tilde{u}-u_{kh} \right) \chi _{i,m} \right) \;\textrm{d}t -\int \limits _{I_m} \left( \nabla \left( \left( \tilde{u}-u_{kh} \right) \chi _{i,m} \right) ,\nabla z_{kh} \right) _H\;\textrm{d}t\\&\quad +\int \limits _{I_m} \left( \left( \tilde{u}-u_{kh} \right) \chi _{i,m},\partial _t z_{kh} \right) _H \;\textrm{d}t + \left( \left( \tilde{u}^-(t_m)-u_{kh}^-(t_m) \right) \chi _{i,m}, \left[ z_{kh} \right] _m \right) _H. \end{aligned} \end{aligned}$$
(54)

Proof

We start from Theorem 3.2 with

$$\begin{aligned} J(u)-J(u_{kh}) = \frac{1}{2} \rho (u_{kh})(z-z_{kh}) + \frac{1}{2} \rho ^*(u_{kh},z_{kh})(u-u_{kh})+ {\mathcal {R}}_{kh}, \end{aligned}$$

which yields

$$\begin{aligned} |J(u)-J(u_{kh})| \le |\frac{1}{2} \rho (u_{kh})(z-z_{kh}) + \frac{1}{2} \rho ^*(u_{kh},z_{kh})(u-u_{kh})+ {\mathcal {R}}_{kh}|. \end{aligned}$$

Considering the adjoint part only, gives us

$$\begin{aligned} \left| J(u)-J \left( u_{kh} \right) \right|&\le \left| \eta _{joint}^* \right| := \left| \rho ^* \left( u_{kh},z_{kh} \right) \left( u-u_{kh} \right) \right| \\&= \left| J'(u_{kh})(u-u_{kh}) - A'_u(u_{kh}) \left( u-u_{kh},z_{kh} \right) \right| . \end{aligned}$$

Utilizing the PU as in the proof of Proposition 4.4, inserting the definition of J in (28) and the adjoint form \(A'_u\) of the heat equation, and again approximating u by some higher-order approximation \({\tilde{u}}\) yields the assertion. \(\square \)

Proposition 4.10

(Adjoint split error estimator for the heat equation) For the space-time formulation of the heat equation, we have the following a posteriori split error estimator with partition-of-unity localization:

$$\begin{aligned} |J(u) - J(u_{kh}) | \le \left| \eta _{\text {split}}^* \right| {:}{=}\left| \sum \limits _m \left( \eta _k^{m,*} + \sum _{i\in {\mathcal {T}}_h^m} \eta _{h}^{i,m,*} \right) \right| , \end{aligned}$$
(55)

with the temporal error indicators

$$\begin{aligned} \begin{aligned} \eta _k^{m,*}&:=\int \limits _{I_m} \Big ( J_u'(u_{kh})(\tilde{u}-u_{k}) +(\tilde{u}-u_k,\partial _t z_{kh})_H -(\nabla (\tilde{u}-u_{k}),\nabla z_{kh})_H \Big ) \;\textrm{d}t\\&\quad -(\tilde{u}^-(t_m)-u_{k}^-(t_m),[z_{kh}]_m)_H, \end{aligned} \end{aligned}$$
(56)

and the spatial error indicators

$$\begin{aligned} \begin{aligned} \eta _{h}^{i,m,*}&:= \int \limits _{I_m} J_u'(u_{kh}) \left( (u_k-u_{kh})\chi _{i,m} \right) \;\textrm{d}t -\int \limits _{I_m} \left( \nabla \left( (u_k-u_{kh})\chi _{i,m} \right) ,\nabla z_{kh} \right) _H\;\textrm{d}t \\&\quad +\int \limits _{I_m} \left( (u_k-u_{kh})\chi _{i,m},\partial _t z_{kh} \right) _H \;\textrm{d}t -\left( \left( u_k^-(t_m)-u_{kh}^-(t_m) \right) \chi _{i,m},[z_{kh}]_m \right) _H. \end{aligned} \end{aligned}$$
(57)

Proof

The proof starts as in Proposition 4.7, but now taking the adjoint residual. The rest is then a combination of the previous three propositions and follows conceptually the same lines. \(\square \)

Remark 4.11

We emphasize that in this work, we estimate discretization errors only. The space-time extension of stationary PU-DWR versions such as [18, 44, 52] to linear or nonlinear iteration errors in our current space-time setting is part of future work.

4.2 The Partition-of-Unity Approach for the Combustion Problem

In this section, we state the error estimators for the combustion problem:

Proposition 4.12

(Primal split error estimator for combustion) Let us assume homogeneous boundary conditions on \(\Gamma _N\) as well as for the species concentration on \(\Gamma _R\). For the temperature we have the cooling conditions \(\kappa \theta +\partial _n\theta =0\) on the Robin boundary, such that \(g_N^\theta = g_N^Y = g_R^\theta = g_R^Y \equiv 0\), \(a_R^\theta = \kappa \), \(b_R^\theta =1\) and \(a_R^Y=b_R^Y=0\) hold. Then, we have the following a posteriori primal split error estimator with partition-of-unity localization for the space-time formulation of the time-dependent combustion problem:

$$\begin{aligned} |J(\{\theta ,Y\}) -&J(\{\theta ,Y\}_{kh}) | \le |\eta | := \left| \sum \limits _m \left( \eta _{k}^m +\sum \limits _{i\in {\mathcal {T}}_h^m} \eta _{i,h}^m\right) \right| , \end{aligned}$$
(58)

with the temporal error indicators

$$\begin{aligned} \begin{aligned} \eta _{k}^m&= -\int \limits _{I_m} \left( \left( \partial _t \theta _{kh},z^\theta -z^\theta _k \right) _H + \left( \nabla \theta _{kh},\nabla \left( z^\theta -z^\theta _k \right) \right) _H \right) \textrm{d}t\\&\quad -\int \limits _{I_m} \int \limits _{\Gamma _R}\kappa \theta _{kh} \left( z^\theta -z^\theta _k \right) \textrm{d}s \textrm{d}t\\&\quad -\int \limits _{I_m} \left( \left( \partial _t Y_{kh},z^Y-z^Y_k \right) _H + \left( \nabla Y_{kh},\nabla \left( z^Y-z^Y_k \right) \right) _H \right) \textrm{d}t\\&\quad +\int \limits _{I_m} \left( \left( \omega \left( \theta _{kh},Y_{kh} \right) , z^\theta -z^\theta _k \right) _H - \left( \omega \left( \theta _{kh},Y_{kh} \right) , z^Y-z^Y_k \right) _H \right) \textrm{d}t\\&\quad - \left( \left[ \theta _{kh} \right] _{m-1},z^{\theta ,+} \left( t_{m-1} \right) -z^{\theta ,+}_{k} \left( t_{m-1} \right) \right) _H\\&\quad - \left( \left[ Y_{kh} \right] _{m-1},z^{Y,+} \left( t_{m-1} \right) -z^{Y,+}_{k} \left( t_{m-1} \right) \right) _H, \end{aligned} \end{aligned}$$
(59)

and the spatial error indicators

$$\begin{aligned} \begin{aligned} \eta _{i,h}^m =&-\int \limits _{I_m} \Big ( \left( \partial _t \theta _{kh}, \left( z^\theta _k-z^\theta _{kh} \right) \chi _{i,m} \right) _H + \left( \nabla \theta _{kh},\nabla \left( \left( z^\theta _k-z^\theta _{kh} \right) \chi _{i,m} \right) \right) _H \Big ) \textrm{d}t\\&\, -\int \limits _{I_m}\int \limits _{\Gamma _R}\kappa \theta _{kh} \left( z^\theta _k-z^\theta _{kh} \right) \chi _{i,m}\textrm{d}s\textrm{d}t\\&\, - \int \limits _{I_m} \Big ( \left( \partial _t Y_{kh}, \left( z^Y_k-z^Y_{kh} \right) \chi _{i,m} \right) _H + \left( \nabla Y_{kh},\nabla \left( \left( z^Y_k-z^Y_{kh} \right) \chi _{i,m} \right) \right) _H \Big ) \textrm{d}t \\&\, + \int \limits _{I_m} \Big ( \left( \omega \left( \theta _{kh},Y_{kh} \right) , \left( z^\theta _k-z^\theta _{kh} \right) \chi _{i,m} \right) _H - \left( \omega \left( \theta _{kh},Y_{kh} \right) , \left( z^Y_k-z^Y_{kh} \right) \chi _{i,m} \right) _H \Big ) \textrm{d}t \\&\, - \left( \left[ \theta _{kh} \right] _{m-1}, \left( z^{\theta ,+}_{k} \left( t_{m-1} \right) -z^{\theta ,+}_{kh} \left( t_{m-1} \right) \right) \chi _{i,m} \right) _H \\&\, - \left( \left[ Y_{kh} \right] _{m-1}, \left( z^{Y,+}_k \left( t_{m-1} \right) -z^{Y,+}_{kh} \left( t_{m-1} \right) \right) \chi _{i,m} \right) _H. \end{aligned} \end{aligned}$$
(60)

Proof

We start as in Proposition 4.7 and employ for the primal residual and the right hand side the weak form of the combustion problem, i.e., (24) and (25), respectively. \(\square \)

Proposition 4.13

(Adjoint split error estimator for combustion) Using the same boundary conditions, we have the following a posteriori adjoint split error estimator with partition-of-unity localization for the space-time formulation of the time-dependent combustion problem:

$$\begin{aligned} |J(\{\theta ,Y\}) -&J(\{\theta ,Y\}_{kh}) | \le |\eta | := \left| \sum \limits _m\left( \eta _{k}^{m,*} +\sum \limits _{i\in {\mathcal {T}}_h^m} \eta _{i,h}^{m,*}\right) \right| , \end{aligned}$$
(61)

with the temporal error indicators

$$\begin{aligned} \eta _{k}^{m,*} =&\, J'_\theta (\{\theta _{kh},Y_{kh}\})(\theta -\theta _k) \big |_{I_m}+J'_Y (\{\theta _{kh},Y_{kh}\})(Y-Y_k) \big |_{I_m}\\&-\int \limits _{I_m} \Big ( \left( \theta -\theta _k,-\partial _t z_{kh}^\theta \right) _H + \left( \nabla \left( \theta -\theta _k \right) ,\nabla z_{kh}^\theta \right) _H +\int \limits _{\Gamma _R} \kappa (\theta -\theta _k)z_{kh}^\theta \textrm{d}s \Big ) \textrm{d}t\\&-\int \limits _{I_m} \Big ( \left( Y-Y_k,-\partial _t z_{kh}^Y \right) _H+ \left( \nabla (Y-Y_k),\nabla z_{kh}^Y \right) _H \Big ) \textrm{d}t\\&-\int \limits _{I_m} \left( \omega _\theta ' \left( \theta _{kh},Y_{kh} \right) \left( \theta -\theta _k \right) +\omega _Y' \left( \theta _{kh},Y_{kh} \right) (Y-Y_k),z_{kh}^Y-z_{kh}^\theta \right) _H \textrm{d}t\\&- \left( \theta ^-(t_m)-\theta _{k}^-(t_m),[z_{kh}^\theta ]_m \right) _H - \left( Y^-(t_m)-Y_{k}^-(t_m), \left[ z_{kh}^Y \right] _m \right) _H, \end{aligned}$$

and the spatial indicators

$$\begin{aligned} \eta _{i,h}^{m,*}=&\, J'_\theta \big ( \big \{\theta _{kh},Y_{kh} \big \} \big ) \big ( \big (\theta _k-\theta _{kh} \big )\chi _{i,m} \big ) \big |_{I_m}+J'_Y \big ( \big \{\theta _{kh},Y_{kh} \big \} \big ) \big ( \big (Y_k-Y_{kh} \big )\chi _{i,m} \big ) \big |_{I_m}\\&\, -\int \limits _{I_m} \Big ( \big ( \big (\theta _k-\theta _{kh} \big )\chi _{i,m},-\partial _t z_{kh}^\theta \big )_H + \big (\nabla \big ( \big (\theta _k-\theta _{kh} \big )\chi _{i,m} \big ),\nabla z_{kh}^\theta \big )_H +\int \limits _{\Gamma _R} \kappa \big (\theta _k-\theta _{kh} \big )\chi _{i,m}z_{kh}^\theta \textrm{d}s \Big ) \textrm{d}t\\&\, -\int \limits _{I_m} \Big ( \big ( \big (Y_k-Y_{kh} \big )\chi _{i,m},-\partial _t z_{kh}^Y \big )_H+ \big ( \nabla \big ( \big (Y_k-Y_{kh} \big )\chi _{i,m} \big ),\nabla z_{kh}^Y \big )_H \Big ) \textrm{d}t\\&\, -\int \limits _{I_m} \left( \omega _\theta ' \big (\theta _{kh},Y_{kh} \big ) \big ( \big (\theta _k-\theta _{kh} \big )\chi _{i,m} \big ) +\omega _Y' \big (\theta _{kh},Y_{kh} \big ) \big ( \big (Y_k-Y_{kh} \big )\chi _{i,m} \big ),z_{kh}^Y-z_{kh}^\theta \right) _H\textrm{d}t\\&\, - \left( \big (\theta _{k}^-(t_m)-\theta _{kh}^-(t_m) \big )\chi _{i,m}, \big [z_{kh}^\theta \big ]_m \right) _H - \left( \big (Y_{k}^-(t_m)-Y_{kh}^-(t_m) \big )\chi _{i,m}, \big [z_{kh}^Y \big ]_m \right) _H. \end{aligned}$$

Proof

We proceed as in Proposition 4.10, using the respective definition of J in (28). In contrast to the heat equation, the adjoint form \(A'_u\) is now derived from the combustion system. Approximating \(u=(\theta ,Y)\) by some higher-order approximation \({\tilde{u}} = ({\tilde{\theta }},{\tilde{Y}})\) then yields the assertion. \(\square \)

4.3 Evaluation of the Space-Time PU-DWR

For the practical evaluation we need to properly define the interpolation differences. Depending on the approach, we need interpolations from a high order space into a low order space and reconstructions the other way around. In space we denote them as

$$\begin{aligned} i_h^{(s+1)}:{\widetilde{X}}_{k,h}^{r,s+1}\mapsto {\widetilde{X}}_{k,h}^{r,s} \quad \text {and}\quad i_{2h}^{(s+1)}:{\widetilde{X}}_{k,h}^{r,s}\mapsto {\widetilde{X}}_{k,h}^{r,s+1}. \end{aligned}$$

In time we use

$$\begin{aligned} i_k^{(r+1)}:{\widetilde{X}}_{k,h}^{r+1,s}\mapsto {\widetilde{X}}_{k,h}^{r,s} \quad \text {and}\quad i_{2k}^{(r+1)}:{\widetilde{X}}_{k,h}^{r,s}\mapsto {\widetilde{X}}_{k,h}^{r+1,s}. \end{aligned}$$

In the following, we take a closer look at the interpolation difference for a higher order solution as used in the mixed and equal high order approach, i.e. the interpolations \(i_h^{(s+1)}\) and \(i_k^{(r+1)}\). After that we write down the resulting localized error estimators for each PU-DoF. For good visual representations of the high order reconstructions based on a low order solution see [53, 57] for the spatial part \(i_{2h}^{(s+1)}\) and temporal part \(i_{2k}^{(r+1)}\) respectively.

For our visualization the high order space \({\widetilde{X}}_{k,h}^{1,2}\) and the low order space \({\widetilde{X}}_{k,h}^{0,1}\) are used. Then, \(u_k\) and \(z_k\) are elements of \({\widetilde{X}}_{k,h}^{0,2}\). Furthermore, we notice that in this specific case of piecewise constant discrete solutions the identities \(u_{k}^{-}(t_{m})=u_{k}^{+}(t_{m-1})\) and \(u_{kh}^{-}(t_{m})=u_{kh}^{+}(t_{m-1})\) hold. With this, we obtain the following operators:

Definition 4.14

(Temporal interpolation operators for \(r=0\))

The interpolation operator from piecewise linear elements to piecewise constant elements reads as

$$\begin{aligned} i_k^1 \tilde{z}(t) = {\left\{ \begin{array}{ll} \tilde{z}^{-}(t_m) &{}\text { for } t\in I_m,\\ \tilde{z}^{-}(t_1) &{}\text { for } t = 0. \end{array}\right. } \end{aligned}$$
(62)

The reconstruction of the high order solution, i.e. the other way around, is obtained by linear interpolation

$$\begin{aligned} \tilde{z}|_{I_m}(t)=\frac{t_m-t}{k_m}z_k^{-}(t_{m-1})+\frac{t-t_{m-1}}{k_m} z_k^{-}(t_m). \end{aligned}$$

For the spatial interpolation operator \(i_h^2\) we use a linear finite element ansatz with the vertex DoFs of the spatial triangulation. Using \(i_h^2\) on the temporally interpolated solution \(i_k \tilde{z}\) yields \(i_{kh}\tilde{z}\) and vice versa. To illustrate this, Fig. 1 shows the different interpolation levels for a single 1\(+\)1D finite element.

Fig. 1
figure 1

Different interpolation levels on a single 1 + 1D space-time element
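The two temporal operators of Definition 4.14 can be sketched in Python as follows. The sketch assumes that the time-discrete solution is stored as a list `z_nodal` with `z_nodal[j]` \(= z_k^{-}(t_j)\) for \(j=0,\ldots ,M\) (each entry may be a vector of spatial DoF values); this storage layout and the function names `reconstruct_linear` and `interpolate_constant` are hypothetical.

```python
import numpy as np

def reconstruct_linear(z_nodal, time_points, m, t):
    """Linear reconstruction of z-tilde on I_m = (t_{m-1}, t_m] at time t."""
    t_prev, t_m = time_points[m - 1], time_points[m]
    k_m = t_m - t_prev
    return (t_m - t) / k_m * z_nodal[m - 1] + (t - t_prev) / k_m * z_nodal[m]

def interpolate_constant(z_nodal, time_points, m):
    """i_k^1 applied to the reconstruction: its left limit at t_m."""
    return reconstruct_linear(z_nodal, time_points, m, time_points[m])

# Made-up data: two time intervals, two spatial DoFs.
time_points = [0.0, 0.5, 2.0]
z_nodal = [np.array([1.0, 2.0]), np.array([3.0, 0.0]), np.array([-1.0, 4.0])]

mid = reconstruct_linear(z_nodal, time_points, 2, 1.25)   # midpoint of I_2
back = interpolate_constant(z_nodal, time_points, 2)      # equals z_nodal[2]
```

Applying the interpolation to the reconstruction returns the original piecewise constant value, so in this setting \(i_k^1\) acts as a left inverse of the reconstruction.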

Definition 4.15

(Application of operators depending on the choice of solution spaces) Let \({\hat{u}}\) and \({\hat{z}}\) denote the approximated solutions of the primal and adjoint problems, depending on the choice of finite elements for each problem. Then, we obtain our terms by the following evaluations:

$$\begin{aligned} u_{kh}^{-}(t_m) ={\left\{ \begin{array}{ll} {\hat{u}}^{-}(t_m) &{}\text {for } {\hat{u}}\in {\widetilde{X}}_{k,h}^{0,1},\\ i_k^1{\hat{u}}(t) &{}\text {for } {\hat{u}}\in X_{k,h}^{1,1},\\ i_h^2{\hat{u}}^-(t_m) &{}\text {for } {\hat{u}}\in {\widetilde{X}}_{k,h}^{0,2},\\ i_h^2i_k^1{\hat{u}}(t) &{}\text {for } {\hat{u}}\in X_{k,h}^{1,2}, \end{array}\right. } \end{aligned}$$
(63)
$$\begin{aligned} z_{k}^{-}(t_m) = {\left\{ \begin{array}{ll} i_{2h}^2{\hat{z}}^{-}(t_m) &{}\text {for } {\hat{z}}\in {\widetilde{X}}_{k,h}^{0,1},\\ i_k^1i_{2h}^2{\hat{z}} &{}\text {for } {\hat{z}}\in X_{k,h}^{1,1},\\ {\hat{z}}^{-}(t_m) &{}\text {for } {\hat{z}}\in {\widetilde{X}}_{k,h}^{0,2},\\ i_k^1{\hat{z}} &{}\text {for } {\hat{z}}\in X_{k,h}^{1,2}. \end{array}\right. } \end{aligned}$$
(64)

For \(z_{kh}^{-}(t_m)\) we have the same interpolations on \({\hat{z}}\) as for \(u_{kh}^{-}(t_m)\) on \({\hat{u}}\).

For these finite element spaces we can use the midpoint rule with the quadrature point \(t_0 = (t_{m-1}+t_m)/2\) on \(I_m\) for all temporal integrals when the resulting terms are linear in time. For temporal nonlinearities and higher-order right hand side functions f, higher-order quadrature rules, usually Gauss quadratures, have to be used.
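The difference between the two quadrature choices can be made concrete with a short Python sketch. It compares the midpoint rule, evaluated at the midpoint of the time interval, with a two-point Gauss-Legendre rule for a linear and a quadratic integrand; the function names `midpoint_rule` and `gauss_rule` are hypothetical and the integrands are arbitrary test functions.

```python
import numpy as np

def midpoint_rule(g, t_prev, t_m):
    """Midpoint rule on (t_prev, t_m): exact for integrands linear in t."""
    return (t_m - t_prev) * g(0.5 * (t_prev + t_m))

def gauss_rule(g, t_prev, t_m, n_points=2):
    """Gauss-Legendre rule on (t_prev, t_m), exact up to degree 2*n_points - 1."""
    xi, wi = np.polynomial.legendre.leggauss(n_points)    # rule on [-1, 1]
    t = 0.5 * (t_m - t_prev) * xi + 0.5 * (t_prev + t_m)  # mapped points
    return 0.5 * (t_m - t_prev) * float(np.sum(wi * g(t)))

# Linear integrand 2t + 1 on (1, 3): exact value 10.
linear_mid = midpoint_rule(lambda t: 2.0 * t + 1.0, 1.0, 3.0)

# Quadratic integrand t^2 on (1, 3): exact value 26/3.
quad_mid = midpoint_rule(lambda t: t**2, 1.0, 3.0)
quad_gauss = gauss_rule(lambda t: t**2, 1.0, 3.0)
```

The midpoint rule reproduces the linear integral exactly but misses the quadratic one, whereas the two-point Gauss rule is exact in both cases.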

Remark 4.16

(Reconstructions for the adjoint estimator) For the adjoint estimator we additionally need \(u_k\) and u. The semi-discrete \(u_k\) can be obtained the same way as \(z_k\), but for u we need to change the interpolation direction, which results in

$$\begin{aligned} \tilde{u}|_{I_m}(t)=\frac{t_m-t}{k_m}u_k^{-}(t_m)+\frac{t-t_{m-1}}{k_m}u_k^{-}(t_{m+1}). \end{aligned}$$

Remark 4.17

Finally, we comment on the treatment of pointwise evaluations such as in Proposition 4.10 and Proposition 4.13 in the terms \(u_k^-(t_m)-u_{kh}^-(t_m)\) and \(Y_{k}^-(t_m)-Y_{kh}^-(t_m)\), respectively. As an example, we explain more details for the heat equation. Due to the reverse construction into a higher order space using \(i_{2k}^{(r+1)}\), it holds \(u^-(t_m) = u_k^{-}(t_{m+1})\) for which we deal with a jump at \(t_m\), which in general is not identically equal to zero. Moreover, we note that these jump terms include a spatial integration which is done by Gauss-Legendre quadrature, i.e. with quadrature points that do not lie on the boundaries of the spatial elements. Therefore, \(((u_k^{-}(t_m)-u_{kh}^{-}(t_m))\chi _{i,m},[z_{kh}]_m)\) and corresponding terms will in general be nonzero as well, irrespective of the spatial interpolation. Conversely, if the difference is zero then the low order solution is an exact representation of the high order solution such that no refinement is needed.

4.4 Error Indicators in Space and Time

With the previous evaluations, we can now define the respective indicators in space and time for the heat equation and the combustion problem. Note that the temporal derivative \(\partial _t u_{kh}\) vanishes for \(u_{kh}\in {\widetilde{X}}_{k,h}^{0,s}\), i.e. piecewise constant elements in time.

4.4.1 Natural PU cG(1)dG(0)

Employing the previously introduced PU, namely cG(1)dG(0), yields the following results for the error indicators.

Proposition 4.18

(Joint primal error indicator for the heat equation) We have the following joint error indicator for the heat equation

$$\begin{aligned} \begin{aligned} \eta _{kh,\text {heat}}^{i,m}&= \int \limits _{t_{m-1}}^{t_m} \big (f(t), \big (\tilde{z}(t)-z_{kh}^{-}(t_m) \big )\chi _{i,m} \big )_H\textrm{d}t\\&\quad - \big (u_{kh}^{-}(t_m)-u_{kh}^{-} \big (t_{m-1} \big ), \big (z_k^{-} \big (t_{m-1} \big )-z_{kh}^{-}(t_m) \big )\chi _{i,m} \big )_H\\&\quad - \frac{k_m}{2} \cdot \big (\nabla u_{kh}^{-}(t_m), \big (\nabla z_k^{-} \big (t_{m-1} \big )+\nabla z_k^{-}(t_m)-2\nabla z_{kh}^{-}(t_m) \big )\chi _{i,m} \big )_{H}\\&\quad - \frac{k_m}{2} \cdot (\nabla u_{kh}^{-}(t_m), \big (z_k^{-} \big (t_{m-1} \big )+z_k^{-}(t_m)-2z_{kh}^{-}(t_m) \big )\nabla \chi _{i,m} \big )_H \end{aligned} \end{aligned}$$
(65)

with the time step size \(k_m=t_m-t_{m-1}\).

Proof

Starting from Proposition 4.4 and the representations of \(\eta _{kh}^{i,m}\), we evaluate the temporal integrals and employ the interpolation operators derived in Sect. 4.3 and obtain the joint error indicator \(\eta _{kh,\text {heat}}^{i,m}\). \(\square \)

Accordingly, we have

Proposition 4.19

(Split primal error indicators for the heat equation) The split indicators \(\eta _{k,\text {heat}}^{m}\) and \(\eta _{h,\text {heat}}^{i,m}\) for the heat equation are given by

$$\begin{aligned} \eta _{k,\text {heat}}^{m}&= \int \limits _{t_{m-1}}^{t_m} \big (f(t), \big (\tilde{z}(t)-z_k^{-}(t_m) \big ) \big )_H \textrm{d}t - k_m/2\cdot \big (\nabla u_{kh}^{-}(t_m),\nabla \big (z_k^{-} \big (t_{m-1} \big )-z_k^{-}(t_m) \big ) \big )_H\nonumber \\&\quad - \big (u_{kh}^{-}(t_m)-u_{kh}^{-} \big (t_{m-1} \big ), \big (z_k^{-} \big (t_{m-1} \big )-z_k^{-}(t_m) \big ) \big )_H, \end{aligned}$$
(66)

and

$$\begin{aligned} \eta _{h,\text {heat}}^{i,m}&= \int \limits _{t_{m-1}}^{t_m} \big (f(t), \big (z_k^{-}(t_m)-z_{kh}^{-}(t_m) \big )\chi _{i,m} \big )_H\textrm{d}t\nonumber \\&\quad - k_m\cdot \big (\nabla u_{kh}^{-}(t_m), \big (\nabla z_k^{-}(t_m)-\nabla z_{kh}^{-}(t_m) \big )\chi _{i,m} + \big (z_k^{-}(t_m)-z_{kh}^{-}(t_m) \big )\nabla \chi _{i,m} \big )_H\nonumber \\&\quad - \big (u_{kh}^{-}(t_m)-u_{kh}^{-} \big (t_{m-1} \big ), \big (z_k^{-}(t_m)-z_{kh}^{-}(t_m) \big )\chi _{i,m} \big )_H. \end{aligned}$$
(67)

Proof

Starting from Proposition 4.7 and the representations of \(\eta _{k}^{m}\) and \(\eta _{h}^{i,m}\), we evaluate the temporal integrals and employ the interpolation operators derived in Sect. 4.3 and obtain the error indicators \(\eta _{k,\text {heat}}^{m}\) and \(\eta _{h,\text {heat}}^{i,m}\). \(\square \)

Using the same interpolations we obtain

Proposition 4.20

(Split primal error indicators for combustion) For the combustion problem we have the following primal error indicators

$$\begin{aligned} \begin{aligned} \eta _{k,\text {combustion}}^{m}&= -k_m/2\Big [ \big (\nabla \theta _{kh}^{-}(t_m),\nabla \big (z_k^{\theta ,-} \big (t_{m-1} \big )-z_k^{\theta ,-}(t_m) \big ) \big )_H\\&\quad + \big (\nabla Y_{kh}^{-}(t_m),\nabla \big (z_k^{Y,-} \big (t_{m-1} \big )-z_k^{Y,-}(t_m) \big ) \big )_H \\&\quad + \int \limits _{\Gamma _R}\kappa \theta _{kh}^{-}(t_m) \big (z_k^{\theta ,-} \big (t_{m-1} \big )-z_k^{\theta ,-}(t_m) \big )\textrm{d}s \\&\quad - \big (\omega \big (\theta _{kh}^{-}(t_m),Y_{kh}^{-}(t_m) \big ),z_k^{\theta ,-} \big (t_{m-1} \big ) -z_k^{\theta ,-}(t_m) \big )_H\\&\quad + \big (\omega \big (\theta _{kh}^{-}(t_m),Y_{kh}^{-}(t_m) \big ),z_k^{Y,-} \big (t_{m-1} \big )-z_k^{Y,-}(t_m) \big )_H\Big ]\\&\quad - \big (\theta _{kh}^{-}(t_m)-\theta _{kh}^{-} \big (t_{m-1} \big ),z_k^{\theta ,-} \big (t_{m-1} \big )-z_k^{\theta ,-}(t_m) \big )_H\\&\quad - \big (Y_{kh}^{-}(t_m)-Y_{kh}^{-} \big (t_{m-1} \big ),z_k^{Y,-} \big (t_{m-1} \big )-z_k^{Y,-}(t_m) \big )_H \end{aligned} \end{aligned}$$
(68)

and

$$\begin{aligned} \begin{aligned} \eta _{h,\text {combustion}}^{i,m}&= -k_m\Big [ \big (\nabla \theta _{kh}^{-}(t_m),\nabla \big ( \big (z_k^{\theta ,-}(t_{m})-z_{kh}^{\theta ,-}(t_m) \big )\chi _{i,m} \big ) \big )_H\\&\quad + \big (\nabla Y_{kh}^{-}(t_m),\nabla \big ( \big (z_k^{Y,-}(t_{m})-z_{kh}^{Y,-}(t_m) \big )\chi _{i,m} \big ) \big )_H \\&\quad + \int \limits _{\Gamma _R}\kappa \theta _{kh}^{-}(t_m) \big (z_k^{\theta ,-}(t_{m})-z_{kh}^{\theta ,-}(t_m) \big )\chi _{i,m}\textrm{d}s \\&\quad - \big (\omega \big (\theta _{kh}^{-}(t_m),Y_{kh}^{-}(t_m) \big ), \big (z_k^{\theta ,-}(t_{m})-z_{kh}^{\theta ,-}(t_m) \big )\chi _{i,m} \big )_H\\&\quad + \big (\omega \big (\theta _{kh}^{-}(t_m),Y_{kh}^{-}(t_m) \big ), \big (z_k^{Y,-}(t_{m})-z_{kh}^{Y,-}(t_m) \big )\chi _{i,m} \big )_H\Big ]\\&\quad - \big (\theta _{kh}^{-}(t_m)-\theta _{kh}^{-}\big (t_{m-1} \big ), \big (z_k^{\theta ,-}(t_{m})-z_{kh}^{\theta ,-}(t_m) \big )\chi _{i,m} \big )_H\\&\quad - \big (Y_{kh}^{-}(t_m)-Y_{kh}^{-} \big (t_{m-1} \big ), \big (z_k^{Y,-} \big (t_{m} \big )-z_{kh}^{Y,-}(t_m) \big )\chi _{i,m} \big )_H. \end{aligned} \end{aligned}$$
(69)

Proof

Starting from Proposition 4.12 and the representations of \(\eta _{k}^{m}\) and \(\eta _{h}^{i,m}\), we evaluate the temporal integrals and employ the interpolation operators derived in Sect. 4.3 and obtain the error indicators \(\eta _{k,\text {combustion}}^{m}\) and \(\eta _{h,\text {combustion}}^{i,m}\). \(\square \)

Remark 4.21

(Identity of the indicator variants) Since \(\sum \limits _{i\in {\mathcal {T}}_h^m} \chi _{i,m} \equiv 1\) in Proposition 4.1 holds, we have

$$\begin{aligned} \eta _k^m + \sum \limits _{i\in {\mathcal {T}}_h^m} \eta _{h}^{i,m} = \sum \limits _{i\in {\mathcal {T}}_h^m} \eta _{kh}^{i,m}, \end{aligned}$$
(70)

such that the choice between the two indicator variants is only important for adaptive refinement. If one is only interested in estimating the error then both variants are identical.
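The identity (70) rests only on the partition-of-unity property. As a minimal one-dimensional sketch, the following code localizes an integral with piecewise linear hat functions and checks that the localized contributions sum to the global value. The density `rho`, the mesh and the quadrature are hypothetical stand-ins and are not taken from the estimators above.

```python
# Illustration of Remark 4.21 in 1D: localizing a quantity with a
# partition of unity (PU) does not change its total.
# `rho`, the mesh and the quadrature rule are hypothetical stand-ins.
import math

nodes = [0.0, 0.1, 0.35, 0.6, 1.0]  # non-uniform mesh on (0, 1)


def hat(i, x):
    """Piecewise linear nodal basis function chi_i evaluated at x."""
    xi = nodes[i]
    if i > 0 and nodes[i - 1] <= x <= xi:
        return (x - nodes[i - 1]) / (xi - nodes[i - 1])
    if i < len(nodes) - 1 and xi <= x <= nodes[i + 1]:
        return (nodes[i + 1] - x) / (nodes[i + 1] - xi)
    return 0.0


def rho(x):
    """Stand-in for the integrand of a weighted residual."""
    return math.sin(3.0 * x) * math.exp(x)


def integrate(g, n=2000):
    """Composite midpoint rule on (0, 1)."""
    h = 1.0 / n
    return h * sum(g((j + 0.5) * h) for j in range(n))


# Global value and its PU-localized contributions, one per node.
total = integrate(rho)
localized = [integrate(lambda x, i=i: rho(x) * hat(i, x)) for i in range(len(nodes))]
```

Because the hat functions sum to one at every quadrature point, `sum(localized)` agrees with `total` up to rounding, while the individual entries of `localized` differ and can therefore steer refinement.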

4.4.2 Alternative PU cG(1)cG(1)

To investigate the impact of the choice of the PU space, we derive the split primal indicators for the heat equation based on a cG(1)cG(1) PU. Now, the right hand side integration might need even higher order quadrature rules. Apart from that, the highest temporal order in the temporal indicators is quadratic (constant primal solution, linear adjoint solution and PU), so we apply Simpson's rule instead. Additionally, we obtain two sets of spatial and temporal indicators per interval \(I_m\), i.e.

$$\begin{aligned} \eta _{k,m,\text {heat}}^{m-1}, \quad \eta _{k,m,\text {heat}}^{m}, \quad \eta _{h,m,\text {heat}}^{i,m-1}, \quad \eta _{h,m,\text {heat}}^{i,m}. \end{aligned}$$

The temporal indicators for refinement of \(I_m\) are then obtained by

$$\begin{aligned} \eta _{k,\text {heat},cG(1)}^m = \sum \limits _{i\in V_h^1({\mathcal {T}}_h^{m-1})} \eta _{k,{m-1},\text {heat}}^{m-1} +\sum \limits _{i\in V_h^1({\mathcal {T}}_h^{m})} \eta _{k,{m},\text {heat}}^{m-1} +\eta _{k,{m},\text {heat}}^{m} +\sum \limits _{i\in V_h^1({\mathcal {T}}_h^{m+1})} \eta _{k,{m+1},\text {heat}}^{m}.\nonumber \\ \end{aligned}$$
(71)

For the spatial element indicators we first have to interpolate the indicator vectors \((\eta _{h,m-1,\text {heat}}^{i,m-1})_{i\in V_h^1({\mathcal {T}}_h^{m-1})}\) and \((\eta _{h,m+1,\text {heat}}^{i,m})_{i\in V_h^1({\mathcal {T}}_h^{m+1})}\) to \({\mathcal {T}}_h^m\). Then, the element indicator for \(K\in {\mathcal {T}}_h^m\) is calculated as

$$\begin{aligned} \eta _{h,\text {heat},cG(1)}^{K,m} = \sum \limits _{i\in K} \eta _{h,m-1,\text {heat}}^{i,m-1} + \eta _{h,m,\text {heat}}^{i,m-1}+ \eta _{h,m,\text {heat}}^{i,m} + \eta _{h,m+1,\text {heat}}^{i,m}. \end{aligned}$$
(72)

Employing these derivations for the new choice of the PU, Simpson’s rule for quadrature in time, and then proceeding as in Sect. 4.4.1, we obtain the following results.

Proposition 4.22

(Split primal error indicators for the heat equation with cG(1)cG(1) PU) The split indicators for the heat equation are given by

$$\begin{aligned} \begin{aligned} \eta _{k,m,\text {heat}}^{m-1}&= \int \limits _{t_{m-1}}^{t_m} \left( f(t), \left( \tilde{z}(t)-z_k^{-}(t_m) \right) \frac{t_m-t}{t_m-t_{m-1}} \right) _H \textrm{d}t\\&\quad - 2k_m/6\cdot \left( \nabla u_{kh}^{-}(t_m),\nabla \left( z_k^{-} \left( t_{m-1} \right) -z_k^{-}(t_m) \right) \right) _H\\&\quad - \left( u_{kh}^{-}(t_m)-u_{kh}^{-} \left( t_{m-1} \right) , \left( z_k^{-} \left( t_{m-1} \right) -z_k^{-}(t_m) \right) \right) _H, \end{aligned} \end{aligned}$$
(73)

and

$$\begin{aligned} \begin{aligned} \eta _{k,m,\text {heat}}^{m}&= \int \limits _{t_{m-1}}^{t_m} \left( f(t), \left( \tilde{z}(t)-z_k^{-}(t_m) \right) \frac{t-t_{m-1}}{t_m-t_{m-1}} \right) _H \textrm{d}t\\&\quad - k_m/6\cdot \left( \nabla u_{kh}^{-}(t_m),\nabla \left( z_k^{-}(t_{m-1})-z_k^{-}(t_m) \right) \right) _H, \end{aligned} \end{aligned}$$
(74)

as well as

$$\begin{aligned} \begin{aligned} \eta _{h,m,\text {heat}}^{i,m-1}&= \int \limits _{t_{m-1}}^{t_m} \left( f(t), \left( z_k^{-}(t_m)-z_{kh}^{-}(t_m) \right) \frac{t_m-t}{t_m-t_{m-1}}\chi _{i,m} \right) _H\textrm{d}t\\&\quad - k_m/2\cdot \left( \nabla u_{kh}^{-}(t_m), \left( \nabla z_k^{-}(t_m)-\nabla z_{kh}^{-}(t_m) \right) \chi _{i,m}+ \left( z_k^{-}(t_m)-z_{kh}^{-}(t_m) \right) \nabla \chi _{i,m} \right) _H\\&\quad - \left( u_{kh}^{-}(t_m)-u_{kh}^{-}(t_{m-1}),(z_k^{-}(t_m)-z_{kh}^{-}(t_m))\chi _{i,m} \right) _H, \end{aligned} \end{aligned}$$
(75)

and

$$\begin{aligned} \begin{aligned} \eta _{h,m,\text {heat}}^{i,m}&= \int \limits _{t_{m-1}}^{t_m} \left( f(t), \left( z_k^{-}(t_m)-z_{kh}^{-}(t_m) \right) \frac{t-t_{m-1}}{t_m-t_{m-1}}\chi _{i,m} \right) _H\textrm{d}t\\&\quad - k_m/2\cdot \left( \nabla u_{kh}^{-}(t_m), \left( \nabla z_k^{-}(t_m)-\nabla z_{kh}^{-}(t_m) \right) \chi _{i,m}+ \left( z_k^{-}(t_m)-z_{kh}^{-}(t_m) \right) \nabla \chi _{i,m} \right) _H. \end{aligned} \end{aligned}$$
(76)
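The temporal factors \(2k_m/6\), \(k_m/6\) and \(k_m/2\) in (73) to (76) can be cross-checked with Simpson's rule. The sketch below assumes that, on \(I_m\), the temporal weight \(\tilde{z}(t)-z_k^{-}(t_m)\) reduces to \(z_k^{-}(t_{m-1})-z_k^{-}(t_m)\) times the linear function \((t_m-t)/k_m\), which is what the factor \(k_m/2\) in (66) suggests. The interval and all function names are hypothetical.

```python
# Cross-check of the temporal factors in (66), (67) and (73)-(76).
# Simpson's rule is exact for the (at most quadratic) integrands used here.
from fractions import Fraction


def simpson(g, a, b):
    """Simpson's rule on [a, b]."""
    mid = (a + b) / 2
    return (b - a) / 6 * (g(a) + 4 * g(mid) + g(b))


t0, t1 = Fraction(1, 4), Fraction(1)  # hypothetical interval I_m = (t_{m-1}, t_m)
k = t1 - t0                           # time step size k_m


def pu_left(t):
    """cG(1) PU function attached to t_{m-1}."""
    return (t1 - t) / k


def pu_right(t):
    """cG(1) PU function attached to t_m."""
    return (t - t0) / k


def weight(t):
    """Assumed linear temporal weight: 1 at t_{m-1}, 0 at t_m."""
    return (t1 - t) / k


w_temporal_left = simpson(lambda t: weight(t) * pu_left(t), t0, t1)    # factor in (73)
w_temporal_right = simpson(lambda t: weight(t) * pu_right(t), t0, t1)  # factor in (74)
w_spatial_left = simpson(pu_left, t0, t1)                              # factor in (75)
w_spatial_right = simpson(pu_right, t0, t1)                            # factor in (76)
w_temporal_dg0 = simpson(weight, t0, t1)                               # factor in (66)
```

Under this assumption the two temporal factors add up to \(k_m/2\) and the two spatial factors add up to \(k_m\), i.e. the cG(1) PU distributes the dG(0) factors of (66) and (67) over the two temporal nodes of \(I_m\).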

For the general adjoint estimator we also need \(z_{kh}\) and \(u_k\) from \(I_{m+1}\), which can be obtained by the interpolations described in (63) and (64), respectively. Additionally, we only look at goal functionals of the types

$$\begin{aligned} J_1(u)(\varphi )&= \int \limits _0^T ({\bar{J}}_1(u),\varphi )_H \textrm{d}t, \qquad J_2(u)(\varphi ) = \int \limits _0^T \int \limits _{\partial \Omega } {\bar{J}}_2(u)\varphi \textrm{d}s\textrm{d}t, \end{aligned}$$

which are essentially interchangeable in the following formulas. Therefore, we only write down the indicators for \(J_1\).

Proposition 4.23

(Joint adjoint error indicator for the heat equation) We have the following joint error indicator for the heat equation

$$\begin{aligned} \begin{aligned} \eta _{kh,\text {heat}}^{i,m,*}&= \int \limits _{t_{m-1}}^{t_m} J_u'(u_{kh}) \big ( \big (\tilde{u}(t)-u_{kh}^{-}(t_m) \big )\chi _{i,m} \big )\textrm{d}t\\&\quad + \big ( \big (u_k^{-}(t_{m+1})-u_{kh}^{-}(t_m) \big )\chi _{i,m}, z_{kh}^{-}(t_{m+1})-z_{kh}^{-}(t_{m}) \big )_H \\&\quad - \frac{k_m}{2} \cdot \big ( \big (\nabla u_k^{-} (t_{m+1})+\nabla u_k^{-}(t_m)-2\nabla u_{kh}^{-}(t_m) \big )\chi _{i,m},\nabla z_{kh}^{-}(t_m) \big )_H. \end{aligned} \end{aligned}$$
(77)

Proposition 4.24

(Split adjoint error indicators for the heat equation) The split indicators \(\eta _{k,\text {heat}}^{m,*}\) and \(\eta _{h,\text {heat}}^{i,m,*}\) for the heat equation are given by

$$\begin{aligned} \eta _{k,\text {heat}}^{m,*}&= \int \limits _{t_{m-1}}^{t_m}J_u'(u_{kh}) \big (\tilde{u}(t)-u_{k}^{-}(t_m) \big ) \textrm{d}t - k_m/2\cdot \big (\nabla \big (u_k^{-} \big (t_{m+1} \big )-u_k^{-}(t_m) \big ),\nabla z_{kh}^{-}(t_m) \big )_H\nonumber \\&\quad + \big (u_{k}^{-} \big (t_{m+1} \big )-u_{k}^{-}(t_m),z_{kh}^{-} \big (t_{m+1} \big )-z_{kh}^{-}(t_m) \big )_H, \end{aligned}$$
(78)

and

$$\begin{aligned} \eta _{h,\text {heat}}^{i,m,*}&= \int \limits _{t_{m-1}}^{t_m}J_u'(u_{kh})((u_k^{-}(t_m)-u_{kh}^{-}(t_m))\chi _{i,m})\textrm{d}t\nonumber \\&\quad - k_m\cdot ((\nabla u_k^{-}(t_m)-\nabla u_{kh}^{-}(t_m))\chi _{i,m} +(u_k^{-}(t_m)-u_{kh}^{-}(t_m))\nabla \chi _{i,m},\nabla z_{kh}^{-}(t_m))_H\nonumber \\&\quad +((u_k^{-}(t_m)-u_{kh}^{-}(t_m))\chi _{i,m},z_{kh}^{-}(t_{m+1})-z_{kh}^{-}(t_m))_H. \end{aligned}$$
(79)

All functionals we want to examine for the combustion equation are only dependent on \(u_{kh}\) and the primal weight, which is at most linear in time. Therefore, we can simplify the estimator by also applying the midpoint rule to the functional.

Proposition 4.25

(Split adjoint error indicators for combustion) For the combustion problem we have the following adjoint error indicators

$$\begin{aligned} \begin{aligned} \eta _{k,\text {combustion}}^{m,*}&= k_m/2\Big [ \big (J_{1,\theta }' \big (\theta _{kh}^{-}(t_m),Y_{kh}^{-}(t_m) \big ),\theta _k^{-}(t_{m+1})-\theta _k^{-}(t_m) \big )_H\\&\quad + \big (J_{1,Y}' \big (\theta _{kh}^{-}(t_m),Y_{kh}^{-}(t_m) \big ) ,Y_k^{-} \big (t_{m+1} \big )-Y_k^{-}(t_m) \big )_H\\&\quad - \big (\nabla \big (\theta _k^{-} \big (t_{m+1} \big )-\theta _k^{-}(t_m) \big ) ,\nabla z_{kh}^{\theta ,-}(t_m))_H -\int \limits _{\Gamma _R}\kappa \big (\theta _k^{-} \big (t_{m+1} \big )-\theta _k^{-}(t_m) \big ) z_{kh}^{\theta ,-}(t_m) \textrm{d}s\\&\quad - \big (\nabla \big (Y_k^{-}(t_{m+1})-Y_k^{-}(t_m) \big ),\nabla z_{kh}^{Y,-}(t_m) \big )_H\\&\quad - \big (\omega _\theta ' \big (\theta _{kh}^{-}(t_m),Y_{kh}^{-}(t_m) \big ) \big (\theta _k^{-}(t_{m+1})-\theta _k^{-}(t_m) \big ),z_{kh}^{Y,-}(t_m)-z_{kh}^{\theta ,-}(t_m) \big )_H\\&\quad + \big (\omega _Y' \big (\theta _{kh}^{-}(t_m),Y_{kh}^{-}(t_m) \big ) \big (Y_k^{-}(t_{m+1})-Y_k^{-}(t_m) \big ),z_{kh}^{Y,-}(t_m)-z_{kh}^{\theta ,-}(t_m) \big )_H \Big ]{}\\&\quad + \big (\theta _{k}^{-}(t_{m+1})-\theta _{k}^{-}(t_m),z_{kh}^{\theta ,-}\big (t_{m+1} \big )-z_{kh}^{\theta ,-}(t_m) \big )_H\\&\quad + \big (Y_{k}^{-}(t_{m+1})-Y_{k}^{-}(t_m),z_{kh}^{Y,-}(t_{m+1})-z_{kh}^{Y,-}(t_m) \big )_H, \end{aligned} \end{aligned}$$
(80)

and

$$\begin{aligned} \begin{aligned} \eta _{h,\text {combustion}}^{i,m,*}&= k_m\Big [ \big (J_{1,\theta }' \big (\theta _{kh}^{-}(t_m),Y_{kh}^{-}(t_m) \big ), \big (\theta _k^{-}(t_m)-\theta _{kh}^{-}(t_m) \big )\chi _{i,m} \big )_H\\&\quad + \big (J_{1,Y}' \big (\theta _{kh}^{-}(t_m),Y_{kh}^{-}(t_m) \big ), \big (Y_k^{-}(t_m)-Y_{kh}^{-}(t_m) \big )\chi _{i,m} \big )_H\\&\quad - \big (\nabla \big ( \big (\theta _k^{-}(t_m)-\theta _{kh}^{-}(t_m) \big )\chi _{i,m} \big ),\nabla z_{kh}^{\theta ,-}(t_m) \big )_H \\&\quad - \big (\nabla \big ( \big (Y_k^{-}(t_m)-Y_{kh}^{-}(t_m) \big )\chi _{i,m} \big ),\nabla z_{kh}^{Y,-}(t_m) \big )_H\\&\quad -\int \limits _{\Gamma _R}\kappa (\theta _k^{-}(t_m)-\theta _{kh}^{-}(t_m))\chi _{i,m}z_{kh}^{\theta ,-}(t_m) \textrm{d}s\\&\quad - \big (\omega _\theta ' \big (\theta _{kh}^{-}(t_m),Y_{kh}^{-}(t_m) \big ) \big (\theta _k^{-}(t_m)-\theta _{kh}^{-}(t_m) \big )\chi _{i,m},z_{kh}^{Y,-}(t_m)-z_{kh}^{\theta ,-}(t_m) \big )_H\\&\quad + \big (\omega _Y' \big (\theta _{kh}^{-}(t_m),Y_{kh}^{-}(t_m) \big ) \big (Y_k^{-}(t_m)-Y_{kh}^{-}(t_m) \big )\chi _{i,m},z_{kh}^{Y,-}(t_m)-z_{kh}^{\theta ,-}(t_m) \big )_H \Big ]\\&\quad + \big ( \big (\theta _k^{-}(t_m)-\theta _{kh}^{-}(t_m) \big )\chi _{i,m},z_{kh}^{\theta ,-}(t_{m+1})-z_{kh}^{\theta ,-}(t_m) \big )_H\\&\quad + \big ( \big (Y_k^{-}(t_m)-Y_{kh}^{-}(t_m) \big )\chi _{i,m},z_{kh}^{Y,-} \big (t_{m+1} \big )-z_{kh}^{Y,-}(t_m) \big )_H. \end{aligned} \end{aligned}$$
(81)

4.5 Adaptive Algorithm

There are multiple options for solving the primal and adjoint problems, i.e. time-stepping, time-slabbing and fully simultaneous space-time, which all need adjustments to marking and refinement. However, all simulations shown here are time-stepping based so we limit ourselves to this approach. Again, we note that in a space-time context the adjoint problem runs backward in time. Then, all information is collected to evaluate the error estimators.

Since one of the error components might dominate, we employ an equilibration for time-stepping as proposed in [57]. This will sometimes restrict refinement to space or time. The overall procedure follows the typical loop: solve, estimate, mark, and refine.

For error estimation in time-stepping we obtain Algorithm 1. There, the main choice is whether to use the split or the joint estimators. Note that in the case of the joint estimator, the indicators \(\eta _{kh}^m\) and \(\eta _{kh}^{m,*}\) are calculated by summation of the spatial indicators.

Algorithm 1 ESTIMATE on a single interval \(I_m\), i.e. time-stepping based

Having calculated the estimators we mark and refine elements by following Algorithm 2. There, the equilibration is done by first calculating the global estimators. Note that these coincide in the joint case, such that no equilibration is performed.

Algorithm 2 MARK and REFINE for the time-stepping approach

5 Numerical Tests

In this final section, we substantiate our space-time error estimators and algorithms with the help of three numerical experiments. In the first configuration, a \(2+1D\) heat equation with a manufactured solution is considered. This allows us to investigate effectivity indices in detail. Next, in Configuration 2, a \(2+1D\) heat equation is utilized again, but with a dynamic manufactured solution inspired by Hartmann [29]. In our final configuration, we consider a nonlinear coupled problem, namely nonlinear combustion. Therein, very detailed comparisons of different polynomial degrees and of primal, adjoint and full estimators are undertaken. The computations are based on extensions of the DTM package dwr-diffusion [35], which itself is based on deal.II [3]. The codes are open-source, can be found at https://github.com/jpthiele/pu-dwr-diffusion and https://github.com/jpthiele/pu-dwr-combustion, respectively, and follow good practices of sustainable research software development [2].

5.1 Configuration 1: 2 + 1D Heat Equation with a Simple Manufactured Solution

5.1.1 Problem Statement

To test the 2 + 1D implementation and the derived estimators we prescribe the following solution for the heat equation introduced in Sect. 2.3

$$\begin{aligned} u(t,x,y) = -\frac{\left( x^2-x \right) \left( y^2-y \right) }{4}t. \end{aligned}$$
(82)

Inserting the solution into the PDE yields the right hand side function

$$\begin{aligned} f(t,x,y) = -\frac{ \left( x^2-x \right) \left( y^2-y \right) }{4} + \frac{ \left( x^2-x \right) }{2}t + \frac{ \left( y^2-y \right) }{2}t. \end{aligned}$$
(83)
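The pair (82) and (83) can be checked numerically. The sketch below assumes the heat equation in the form \(\partial _t u - \Delta u = f\) with unit diffusion coefficient, which is the form consistent with (83); the sample points are arbitrary.

```python
# Checks that u from (82) and f from (83) satisfy u_t - Laplace(u) = f,
# assuming the heat equation is written in this form.


def u(t, x, y):
    """Manufactured solution (82)."""
    return -(x * x - x) * (y * y - y) / 4.0 * t


def f(t, x, y):
    """Right hand side (83)."""
    return -(x * x - x) * (y * y - y) / 4.0 + (x * x - x) / 2.0 * t + (y * y - y) / 2.0 * t


def residual(t, x, y, h=1e-3):
    """u_t - u_xx - u_yy - f using central differences.

    u is linear in t and quadratic in x and y, so the differences
    are exact up to rounding.
    """
    u_t = (u(t + h, x, y) - u(t - h, x, y)) / (2 * h)
    u_xx = (u(t, x + h, y) - 2 * u(t, x, y) + u(t, x - h, y)) / h**2
    u_yy = (u(t, x, y + h) - 2 * u(t, x, y) + u(t, x, y - h)) / h**2
    return u_t - u_xx - u_yy - f(t, x, y)


points = [(0.3, 0.2, 0.7), (0.9, 0.5, 0.5), (0.5, 0.1, 0.95)]
max_residual = max(abs(residual(*p)) for p in points)
```

The value of `max_residual` is at the level of rounding errors, so (83) is the right hand side that belongs to (82) under the stated assumption.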

5.1.2 Configuration

The PDE is solved on the unit square and the temporal interval (0, 1), i.e. \(T=1\). Inserting \(x\in \{0,1\}\), \(y\in \{0,1\}\) or \(t=0\) yields \(u = 0\), resulting in homogeneous Dirichlet boundary conditions and an initial condition of \(u^0 \equiv 0\).

5.1.3 Goal Functionals

To test whether the error identity holds, we need a linear goal functional. A simple choice is the averaged solution

$$\begin{aligned} J(u) = \frac{1}{|\Omega |T} \int \limits _0^T \int \limits _{\Omega } u(t,x)\textrm{d}x\textrm{d}t. \end{aligned}$$
(84)

Inserting the analytical solution for an arbitrary T we obtain

$$\begin{aligned} J(u) = -\frac{T}{288}, \end{aligned}$$
(85)

as reference value.
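The reference value (85) follows from integrating the polynomial solution (82) factor by factor. A short exact check in rational arithmetic, with hypothetical helper names, reads:

```python
# Exact evaluation of the goal functional (84) for the solution (82)
# on the unit square, where |Omega| = 1.
from fractions import Fraction


def integrate_monomials(coeffs, upper):
    """Integral over (0, upper) of sum_j coeffs[j] * s**j."""
    upper = Fraction(upper)
    return sum(Fraction(c) * upper ** (j + 1) / (j + 1) for j, c in enumerate(coeffs))


def goal_value(T):
    """J(u) for u(t, x, y) = -(x^2 - x)(y^2 - y) t / 4."""
    T = Fraction(T)
    ix = integrate_monomials([0, -1, 1], 1)  # integral of x^2 - x over (0, 1)
    iy = integrate_monomials([0, -1, 1], 1)  # integral of y^2 - y over (0, 1)
    it = integrate_monomials([0, 1], T)      # integral of t over (0, T)
    return -ix * iy * it / 4 / T


reference = goal_value(1)
```

Both spatial factors equal \(-1/6\) and the temporal factor equals \(T^2/2\), so dividing by \(|\Omega |T = T\) gives \(-T/288\), in agreement with (85).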

5.1.4 Discussion of Findings

We expect (33) and (34) to hold, which is identical to \(I_{\text {eff}}= 1\). We observe the effectivity indices for all three approaches in space defined as

$$\begin{aligned} I_{\text {eff}}^{s/{\widetilde{s}}} = \frac{\eta _{kh}^{s/{\widetilde{s}}}}{J(u)-J(u_{kh})} \end{aligned}$$

with \(\eta _{kh}^{s/{\widetilde{s}}}\) as the general estimator for \({\hat{u}}_{kh}\in {\widetilde{X}}_{k,h}^{0,s}\) and \({\hat{z}}_{kh}\in {\widetilde{X}}_{k,h}^{0,{\widetilde{s}}}\) as defined in Propositions 4.19 and 4.24 for the primal and adjoint estimators respectively. All estimators are computed using Algorithm 1.
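For illustration, the effectivity index can be evaluated as follows. The function name is hypothetical, and the discrete goal value and the estimator value are made-up numbers chosen only to exercise the formula; they are not results from Table 1.

```python
# Effectivity index I_eff = eta / (J(u) - J(u_kh)).
# The discrete goal value and the estimator below are made-up numbers.


def effectivity_index(eta, j_exact, j_discrete):
    """Ratio of the estimated error to the true error in the goal functional."""
    error = j_exact - j_discrete
    if error == 0.0:
        raise ZeroDivisionError("the discrete goal value is exact; I_eff is undefined")
    return eta / error


T = 1.0
j_exact = -T / 288.0                 # reference value (85)
j_discrete = j_exact * (1.0 - 1e-2)  # made-up discrete value, off by 1%
eta = 0.98 * (j_exact - j_discrete)  # made-up estimator value
i_eff = effectivity_index(eta, j_exact, j_discrete)
```

A value of `i_eff` below one corresponds to an estimator that underestimates the true error, a value above one to an overestimation.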

Table 1 shows that all approaches yield effectivity indices very close to 1. We also notice that the biggest difference between the primal and adjoint estimators is obtained for the mixed order approach.

However, this is not surprising as the approach is tailored to the primal estimator and calculating the adjoint estimator could be seen as questionable for the two following reasons. When interpolating the higher order solution for the adjoint problem for \(z_{kh}\) we do not obtain the optimal \(z_{kh}\) compared to solving with bilinear elements directly. Additionally, reconstructing the higher order primal solution yields a worse approximation compared to solving directly with biquadratic finite elements.

Furthermore, we notice that we use a lot more elements in time than in space. In space-time, the estimator is dependent on a good balance between space and time discretization. For the sake of brevity we investigate this further in the following configuration as the solution is much more interesting.

Table 1 Section 5.1: Performance of the primal (left) and adjoint (right) error estimators under global refinement for temporal dG(0) discretization of the adjoint equation

5.2 Configuration 2: 2 + 1D Heat Equation with a Dynamic Manufactured Solution

5.2.1 Problem Statement

This test case was designed in [29]. We again solve the heat equation. The manufactured solution is a rotating hill on a unit-square spatial domain \(\Omega = (0,1)^2\) in the time interval \((0,T),\;T=1\).

The manufactured solution is given as

$$\begin{aligned}&u(x,y,t) = \frac{1}{1+50 \left( \left( x-x_0(t) \right) ^2+ \left( y-y_0(t) \right) ^2 \right) }, \end{aligned}$$
(86)
$$\begin{aligned}&x_0(t) = \frac{1}{2}+\frac{1}{4}\cos (2\pi t), \end{aligned}$$
(87)
$$\begin{aligned}&y_0(t) = \frac{1}{2}+\frac{1}{4}\sin (2\pi t). \end{aligned}$$
(88)

The right hand side of the problem is obtained as in Sect. 5.1 by inserting this solution into the heat equation. Additionally, all definitions for \(\eta _{kh}^{s/{\widetilde{s}}}\) and \(I_{\text {eff}}^{s/{\widetilde{s}}}\) are the same using again Algorithm 1 to compute the indicators.
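A minimal sketch of the rotating hill (86) to (88) and the resulting right hand side is given below. The formula in `f` was differentiated by hand for this example under the assumption that the heat equation reads \(\partial _t u - \Delta u = f\); it is not quoted from [29], and the function names are hypothetical.

```python
# Rotating hill (86)-(88) and the right hand side f = u_t - Laplace(u),
# differentiated by hand under the assumption of unit diffusion.
import math


def center(t):
    """Hill center (x_0(t), y_0(t)) from (87) and (88)."""
    return (0.5 + 0.25 * math.cos(2 * math.pi * t),
            0.5 + 0.25 * math.sin(2 * math.pi * t))


def u(t, x, y):
    """Manufactured solution (86)."""
    x0, y0 = center(t)
    return 1.0 / (1.0 + 50.0 * ((x - x0) ** 2 + (y - y0) ** 2))


def f(t, x, y):
    """Right hand side obtained by inserting (86) into the heat equation."""
    x0, y0 = center(t)
    dx0 = -0.5 * math.pi * math.sin(2 * math.pi * t)  # x_0'(t)
    dy0 = 0.5 * math.pi * math.cos(2 * math.pi * t)   # y_0'(t)
    r2 = (x - x0) ** 2 + (y - y0) ** 2
    g = 1.0 + 50.0 * r2                               # u = 1 / g
    u_t = 100.0 * ((x - x0) * dx0 + (y - y0) * dy0) / g**2
    laplace_u = -200.0 / g**2 + 20000.0 * r2 / g**3
    return u_t - laplace_u


peak_value = u(0.0, 0.75, 0.5)  # the hill starts centered at (0.75, 0.5)
```

The peak of the hill has height one and travels once around the circle of radius \(1/4\) about the domain center during \((0,T)\), which is why the adaptively refined meshes have to change over time.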

5.2.2 Goal Functional

Since we are interested in capturing the local behaviour of the solution, we choose the \(L_2\)-error as functional of interest, i.e.

$$\begin{aligned} J(u_{kh}) = (u-u_{kh},u-u_{kh})^{1/2}. \end{aligned}$$
(89)

5.2.3 Comparison of the Different Spatial Approaches

We start by looking at the exact error and the resulting estimators for different choices of \(M_{\text {initial}}\). We see that the estimators in Tables 2, 3, 4 and 5 are converging to relatively stable values with rising \(M_{\text {initial}}\). We can also see that since the temporal elements are only doubled while the spatial elements are quadrupled, the initial number of temporal elements has to be large enough for uniform refinement. In adaptive simulations we can control this better as we can choose different fractions of spatial and temporal elements to be marked for refinement.

The equal low order estimators (Table 3) are underestimating the error with an effectivity of roughly \(0.34\) to \(0.40\) for the coarsest mesh. This gets much better after the two refinements where the estimator is close to the exact \(L_2\)-error. However, as the reconstruction effectively works on an even coarser mesh this is not surprising.

The mixed order estimators (Table 4) are less dependent on the spatial mesh size but overestimate the error with an effectivity of \(1.20\) to \(1.48\). We can also see that the overestimation gets smaller with rising \(M_{\text {initial}}\) but this is of course an additional cost factor.

The equal high order estimators (Table 5) are also less dependent on the spatial mesh size, but they also use double the amount of degrees of freedom for the primal problem. They also overestimate the error but for large enough \(M_{\text {initial}}\) the effectivity is less than 1.10. However, Table 6 shows that solving the primal problem with biquadratic elements and using a bilinear interpolation between the vertices does not recover the best approximation \(u_{kh}\). In practice this means that the primal problem should be solved natively with bilinear elements to obtain the solution for which the error is actually estimated, which leads to additional costs. This gets especially expensive for nonlinear problems. Note that in this particular case the resulting adjoint solution and consequently the estimators would also be different as the \(L_2\) error itself factors into \(J'_u\), so the accuracy of the estimator could be better.

In conclusion, both reconstructing a higher order solution and natively solving the primal or adjoint problem with higher order elements in space work well for linear problems, but the reconstruction is cheaper and leads to a better estimator for fine enough meshes.

How the low and mixed order approaches perform for higher order finite elements and corresponding errors could be the subject of further studies.

Table 2 Section 5.2: The exact \(L_2\) error under global refinement for different initial (uniform) temporal grids and bilinear finite elements in space
Table 3 Section 5.2: The primal equal low order error estimator \(\eta _{kh}^{1/1}\) under global refinement for different initial (uniform) temporal grids
Table 4 Section 5.2: The primal mixed order error estimator \(\eta _{kh}^{1/2}\) under global refinement for different initial (uniform) temporal grids
Table 5 Section 5.2: The primal equal high order error estimator \(\eta _{kh}^{2/2}\) under global refinement for different initial (uniform) temporal grids
Table 6 Section 5.2: The approximated \(L_2\) error under global refinement for different initial (uniform) temporal grids and biquadratic finite elements in space, where the solution is interpolated down to bilinear elements

5.2.4 Comparison of Adaptive Refinement to the Original Computations

For this configuration we want to compare our results to those described by Hartmann [29], where the manufactured solution was first formulated. To our knowledge, this configuration has not been reproduced and published so far, so the original thesis is the only point of comparison. There, the classical estimator is used, which is obtained by integration by parts, leading to a strong form with jump terms in space. As in our computations, \(Q_1\) elements in space and dG(0) elements in time are used, but it is unknown which quadrature formula was used in time for the nonlinear \(f\). We used the right box rule, as it corresponds to the implicit Euler scheme and brought our error (\(1.92\cdot 10^{-2}\)) closest to the error of the original results (\(1.75\cdot 10^{-2}\)). Table 7 shows our results with the split and joint estimators. These perform very well in comparison to the original Hartmann results from [29, Table 3.4].
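The correspondence between the right box rule and the implicit Euler scheme can be illustrated on a scalar model problem \(u' + \lambda u = f(t)\), for which one dG(0) step reads \(u_m - u_{m-1} + k_m\lambda u_m = \int _{I_m} f\,\textrm{d}t\). The model problem, the source term and all names in the following sketch are hypothetical.

```python
# dG(0) in time for the scalar model problem u' + lam * u = f(t):
# with the right box rule for the source term it coincides with implicit Euler.
import math


def right_box(f, a, b):
    """Right-sided box rule: (b - a) * f(b)."""
    return (b - a) * f(b)


def midpoint(f, a, b):
    """Midpoint rule, shown for comparison."""
    return (b - a) * f(0.5 * (a + b))


def dg0_step(u_prev, a, b, lam, f, quadrature):
    """One dG(0) step on (a, b) with the source integrated by `quadrature`."""
    k = b - a
    return (u_prev + quadrature(f, a, b)) / (1.0 + k * lam)


def implicit_euler_step(u_prev, a, b, lam, f):
    """One implicit Euler step on (a, b)."""
    k = b - a
    return (u_prev + k * f(b)) / (1.0 + k * lam)


def source(t):
    """Hypothetical source term that is nonlinear in time."""
    return math.exp(-t) * math.sin(5.0 * t)


u_box = dg0_step(1.0, 0.0, 0.1, 2.0, source, right_box)
u_mid = dg0_step(1.0, 0.0, 0.1, 2.0, source, midpoint)
u_ie = implicit_euler_step(1.0, 0.0, 0.1, 2.0, source)
```

Here `u_box` and `u_ie` coincide, whereas `u_mid` differs because the source term is nonlinear in time, so the choice of temporal quadrature affects the reported error.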

Even though the marking strategy used by Hartmann is unknown, we got close to the number of temporal elements and the maximum number of spatial elements with fixed rate marking in Algorithm 2. In comparison, our estimators localize the error better, as we obtain comparable errors with one fewer loop that additionally has a smaller maximum number of spatial elements. This can be seen in Fig. 2, where the original results perform only roughly as well as our computation with uniform refinement, while both PU-DWR estimators yield better convergence. We plotted the \(L_2\) error against \(M\cdot N_{\max }\), which is an upper bound for the actual number of space-time elements, as no further information was available from the original computations. However, Table 8 shows that at least for our simulations the actual number is not too far from the upper bound. Additionally, we can see that, as expected, the split estimator outperforms the joint estimator.

Finally, Fig. 3 shows that the local refinement nicely matches the corresponding solution of Fig. 4 and that the meshes are indeed changing over time.

Table 7 Section 5.2: Our results with the equal low order primal split (top) and joint (bottom) PU-DWR estimator with fixed rate marking of \(95\%\) in time and \(40\%\) in space
Fig. 2 Section 5.2: Error convergence of the Hartmann test case

Table 8 Section 5.2: Comparison of the actual number of space-time elements and the estimate \(M\cdot N_{\max }\)

Fig. 3 Section 5.2: Grid after 4 refinement loops with the split PU-DWR estimator at \(t=i/4\), \(i\in \{1,2,3,4\}\)

Fig. 4 Section 5.2: Solution after 4 refinement loops with the split PU-DWR estimator at \(t=i/4\), \(i\in \{1,2,3,4\}\)

5.2.5 Comparison of PU Spaces

Here, we want to examine whether the additional coupling with a cG(1) partition-of-unity in time is beneficial for adaptive refinement. For this, we performed multiple adaptive simulations with the corresponding low, mixed and high order estimators. Figures 5, 6 and 7 show the exemplary results with an initial temporal mesh of 1600 elements with fixed rate marking of \(60\%\) in time and \(40\%\) in space. In all three cases the dG(0) PU performs better than the cG(1) PU, with the lowest difference for the equal high order estimator. For the equal high order case the \(L^2\)-error is calculated on the interpolated primal solution, which does not recover the actual best approximation solution of directly solving the primal solution in the low order space, such that the initial error is higher than for uniform refinement. Finally, Fig. 8 shows the results for the mixed order estimator with a finer initial temporal mesh, which are qualitatively the same as for Fig. 6.

Overall we conclude that for discontinuous Galerkin discretizations in time the dG(0) partition-of-unity is completely sufficient in addition to being cheaper to compute and easier to implement.

Fig. 5 Section 5.2: Error convergence of the Hartmann test case for \(\eta ^{1/1}\) and \(M_\text {init} = 1600\)

Fig. 6 Section 5.2: Error convergence of the Hartmann test case for \(\eta ^{1/2}\) and \(M_\text {init} = 1600\)

Fig. 7 Section 5.2: Error convergence of the Hartmann test case for \(\eta ^{2/2}\) and \(M_\text {init} = 1600\)

Fig. 8 Section 5.2: Error convergence of the Hartmann test case for \(\eta ^{1/2}\) and \(M_\text {init} = 3200\)

5.3 Configuration 3: Nonlinear Combustion

5.3.1 Problem Statement

The final test case is as described in [57] (originally based on [36]) and some preliminary results were published in our prior work [67]. Here, we solve the nonlinear combustion equations described in Sect. 2.4.

5.3.2 Configuration

The reaction is simulated in a rectangular channel of length \(L=60\) and height \(H=16\) in which two cooled rods of length L/4 and height H/4 are inserted into both channel walls at L/4. The reaction is solved for a total of \(T=60\) with 256 time and 896 space DoFs initially.

The cooling of \(\Gamma _R\) is described by the Robin boundary condition \(\partial _n\theta = -0.1\theta \), with homogeneous Neumann conditions for the species concentration. The left wall \(\Gamma _D\) is kept at a constant temperature of \(\theta _D = 1\) without any combustible species \(Y_D = 0\). All other walls \(\Gamma _N\) are described by homogeneous Neumann boundary conditions. An initial flame front is described by

$$\begin{aligned} \theta ^0 = {\left\{ \begin{array}{ll} 1,&{} x\le 9\\ \exp (9-x),&{} x > 9 \end{array}\right. }\end{aligned}$$
(90)
$$\begin{aligned} Y^0 = {\left\{ \begin{array}{ll} 0, &{} x\le 9\\ 1-\exp (Le(9-x)), &{} x > 9 . \end{array}\right. } \end{aligned}$$
(91)
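The initial flame front (90) and (91) translates directly into code. The sketch below uses hypothetical function names and the Lewis number \(Le = 1\) of this configuration as default.

```python
# Initial conditions (90) and (91) for the combustion problem.
import math

LE = 1.0  # Lewis number used in this configuration


def theta0(x):
    """Initial temperature (90)."""
    return 1.0 if x <= 9.0 else math.exp(9.0 - x)


def y0(x, le=LE):
    """Initial species concentration (91)."""
    return 0.0 if x <= 9.0 else 1.0 - math.exp(le * (9.0 - x))


# Sample the front along the channel length L = 60.
samples = [(x, theta0(x), y0(x)) for x in (0.0, 9.0, 12.0, 60.0)]
```

Both profiles are continuous at \(x=9\), and for \(Le=1\) they satisfy \(\theta ^0+Y^0=1\) along the whole channel.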

5.3.3 Parameters

The reaction parameters are a Lewis number of \(Le = 1\), a gas expansion of \(\alpha = 0.8\) and a dimensionless energy of \(\beta = 10\).

5.3.4 Goal Functionals

The first functional we investigate is the average reaction rate, i.e.

$$\begin{aligned} J_1(\theta ,Y) = \frac{1}{T|\Omega |} \int \limits _0^T\int \limits _\Omega \omega (\theta ,Y) \textrm{d}x\textrm{d}t. \end{aligned}$$
(92)

This nonlinear functional is defined on the whole space-time domain. For the second functional we calculate the average species concentration on the cooled rods \(\Gamma _R\), i.e.

$$\begin{aligned} J_2(\theta ,Y) = \frac{1}{T|\Gamma _R|} \int \limits _0^T\int \limits _{\Gamma _R} Y \textrm{d}s\textrm{d}t. \end{aligned}$$
(93)

This is a linear functional, but it is only defined on part of the boundary.

5.3.5 Discussion of Findings for \(J_1\)

For both functionals the indicators for \(\eta _{kh}^{{s/{\widetilde{s}}}}\) are computed using Propositions 4.20 and 4.25 and Algorithm 1 with \({{\widetilde{u}}}\), \(u_k\), \(u_{kh}\) and \({{\widetilde{z}}}\), \(z_k\), \(z_{kh}\) following from (63) and (64) with \({\hat{u}}_{kh}\in {\widetilde{X}}^{0,s}_{k,h}\) and \({\hat{z}}_{kh}\in {\widetilde{X}}^{0,{{\widetilde{s}}}}_{k,h}\). Then, the full estimators are computed by taking the averages of the primal and adjoint indicators at each space-time DoF and summing over all DoFs.
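The averaging step for the full estimator can be sketched as follows. The function name and the indicator values are made up for this example and are not taken from the tables.

```python
# Full estimator: average of primal and adjoint indicators at each
# space-time DoF, followed by summation over all DoFs.
# The indicator values below are made up.


def full_estimator(primal, adjoint):
    """Return the averaged indicators and their sum."""
    if len(primal) != len(adjoint):
        raise ValueError("primal and adjoint indicators must share the same DoFs")
    indicators = [0.5 * (p + a) for p, a in zip(primal, adjoint)]
    return indicators, sum(indicators)


primal = [0.25, -0.5, 1.0]
adjoint = [0.75, 0.5, -1.0]
indicators, eta_full = full_estimator(primal, adjoint)
```

In this made-up example the second and third indicators cancel exactly, which shows how over- and underestimation of the primal and adjoint parts can offset each other in the full estimator.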

Tables 9, 10 and 11 show the behaviour of the primal, adjoint and full estimators for \(J_1\) respectively. We can see that all estimators with the mixed order and equal high order approach behave similarly. For both, the adjoint estimator and the resulting full estimator are overestimating the error by about two orders of magnitude. However, the primal estimators are not too far off. The equal low order approach yields the best results on the finest level, and the primal and adjoint estimators are comparable. It can be inferred that there is no benefit from calculating the full estimator in this case. Therefore, we use Algorithm 2 with fixed rate marking refining \(50\%\) of all temporal elements and on each interval \(30\%\) of all spatial elements based only on the primal indicators.

Figure 9 shows the error convergence under adaptive refinement when using only the primal estimator. When only the primal unknowns are counted, the results for the equal low order and mixed order approaches are comparable, and both perform better than global refinement. However, when the DoFs of the adjoint problem and of the PU are also taken into account, the low order approach clearly outperforms the mixed order approach. We can also see that the initial disadvantage of having the same error on the coarsest mesh with more DoFs is rectified by the first adaptive refinement step.

Table 9 Section 5.3: Performance of the primal error estimators under global refinement for \(J_1\)
Table 10 Section 5.3: Performance of the adjoint error estimators under global refinement for \(J_1\)
Table 11 Section 5.3: Performance of the full error estimators under global refinement for \(J_1\)
Fig. 9 Section 5.3: Error convergence for the reaction rate functional. On the left only the number of unknowns for the primal problem and on the right all unknowns are taken into account

Figure 10 shows the reaction rate and the corresponding grids at two different time points. The grid evolves with the solution and follows the combustion reaction. This shows that our localization captures the physics and refines accordingly.

Fig. 10 Section 5.3: Reaction rate and grid at \(t=20\) (left) and \(t=60\) (right)

5.3.6 Discussion of Findings for \(J_2\)

Tables 12, 13 and 14 show the behaviour of the primal, adjoint and full estimators for \(J_2\), respectively. The equal low order approach behaves similarly to the previous functional. However, for the mixed order and high order approaches, the overestimation by the adjoint estimators is less severe than for the nonlinear functional. On the other hand, their respective primal estimators underestimate the error considerably. In the full estimators, these over- and underestimations largely cancel, so that the full estimator is more useful here. For this reason, the simulations for Fig. 11 were performed using the full estimator for adaptivity in Algorithm 2 with the same marking strategy as before. As with the first functional, both approaches yield comparable results when only the memory cost of solving the primal problem is counted. Both approaches also eventually outperform uniform refinement when the total cost is taken into account. Nevertheless, the equal low order approach remains the best choice.

Table 12 Section 5.3: Performance of the primal error estimators under global refinement for \(J_2\)
Table 13 Section 5.3: Performance of the adjoint error estimators under global refinement for \(J_2\)
Table 14 Section 5.3: Performance of the full error estimators under global refinement for \(J_2\)
Fig. 11 Section 5.3: Error convergence for the species concentration functional. On the left only the number of unknowns for the primal problem and on the right all unknowns are taken into account

Figure 12 shows the species concentration and the refined grids based on \(J_2\) at different time steps. Along the cooled rods, the grid again follows the combustion reaction, which fits well since this is where the concentration changes. We also see that the mesh is refined around the cooled rods once the reaction has moved past them. Together with the convergence behaviour, this shows again that the novel localization works well.

Fig. 12 Section 5.3: Rod species concentration and corresponding grid at \(t=20\) (left) and \(t=60\) (right)

6 Conclusions

In this work, we proposed partition-of-unity (PU) dual-weighted residual a posteriori error estimators and space-time adaptivity for linear and nonlinear partial differential equations. From the algorithmic side, the main novelties are the extension of the PU localization to space-time Galerkin finite element discretizations and the realization of split and joint error estimators. From the implementation side, despite starting from pre-implementations in the DTM package dwr-diffusion [35] and deal.II [3], extensive code development and debugging were necessary, greatly exceeding existing implementations, specifically for nonlinear features such as the nonlinear combustion PDE and nonlinear goal functionals. In three numerical examples, we studied in detail the computational performance for the linear heat equation and for a nonlinear low Mach number combustion problem. We found that the equal low order approach yielded the best estimation and adaptive performance across the board and that a cG(1)dG(0) PU is sufficient for cG(s)dG(r) discretizations of the primal problem. Furthermore, an immediate practical application of our framework can be found within the excellence cluster PhoenixD, in which space-time methods and goal-oriented error estimation are of interest for the efficient solution of multiphysics problems and where the heat equation and the Navier–Stokes equations are needed.