Interpolation operators for parabolic problems

We introduce interpolation operators with approximation and stability properties suited for parabolic problems in primal and mixed formulations. We derive localized error estimates for tensor product meshes (occurring in classical time-marching schemes) as well as locally in space-time refined meshes.


Introduction
In recent years, simultaneous space-time variational formulations for parabolic problems have become increasingly popular. Besides practical aspects like highly parallelizable computations [DGZ18; NS19; HLNS19; VW21], the ansatz offers analytical advantages including quasi-optimality of the discrete solution [TV16] (also called symmetric error estimates in [DL02; CW06]). This property motivates adaptive time stepping [Fei22], adaptive wavelet schemes [RS19], adaptive wavelet-in-time and finite-element-in-space approaches [SVW22], and even adaptive mesh refinements locally in space-time [LS20; LSTY21; DS22; GS22]. While numerical experiments suggest superiority of the latter approach for singular solutions, theoretical results are restricted to plain convergence [GS21] and do not verify optimal convergence rates as they do for elliptic problems [Ste07; CFPP14]. Motivated by the extension of such optimality results to parabolic problems, this paper introduces and investigates a main ingredient in the analysis of adaptive schemes for parabolic problems like the heat equation in a time-space cylinder $Q = J \times \Omega$, namely interpolation operators suited for the norm of the space $X = L^2(J; H^1_0(\Omega)) \cap H^1(J; H^{-1}(\Omega))$. Additionally, we introduce an interpolation operator for first-order formulations of the heat equation satisfying a beneficial commuting diagram property. On tensor product meshes the interpolation operators are stable and have optimal approximation properties. We give upper bounds for the interpolation errors and emphasize the need for parabolic scaling if the solution is rough in time. The localization of the interpolation error in space leads to unavoidable weights in terms of negative powers of the local mesh size. Under realistic regularity assumptions we can overcome these negative powers due to parabolic scaling. Unfortunately, this strategy cannot be applied to the interpolation error on adaptively refined meshes. In fact, we illustrate that any (local) interpolation operator experiences these difficulties.
Overall, this paper's main contributions are the following.
• We present approximation properties suited for parabolic problems in Sections 3-4.
• We introduce an interpolation operator with optimal approximation properties on tensor product meshes in Section 5.1.
• We introduce an interpolation operator suited for first-order formulations with optimal approximation properties on tensor product meshes and a commuting diagram property in Section 5.2.
• We introduce an interpolation operator for locally in space-time refined meshes and discuss its stability in Section 6.

Bochner spaces and their discretization
This section introduces Bochner spaces, suitable discretizations by finite elements, and their underlying partitions.
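2.1. Bochner spaces. Our analysis is motivated by the approximation of parabolic problems like the heat equation. Given a time-space cylinder $Q = J \times \Omega$ with bounded time interval $J = [0, T] \subset \mathbb{R}$ and bounded Lipschitz domain $\Omega \subset \mathbb{R}^d$, this problem seeks with given right-hand side $f : Q \to \mathbb{R}$ and initial data $u_0 : \Omega \to \mathbb{R}$ the solution $u : Q \to \mathbb{R}$ to
$\partial_t u - \Delta_x u = f$ in $Q$, $u = 0$ on $J \times \partial\Omega$, $u(0) = u_0$ in $\Omega$.
A suitable analytical setting relies on Sobolev-Bochner spaces. Therefore, we set the space $H^{-1}(\Omega)$ as the dual of the Sobolev space $H^1_0(\Omega)$ equipped with norm $\|\nabla_x \cdot\|_{L^2(\Omega)}$ and dual pairing $\langle \cdot, \cdot \rangle_\Omega := \langle \cdot, \cdot \rangle_{H^{-1}(\Omega), H^1_0(\Omega)}$, which equals the $L^2$ inner product for smooth functions. Given $V \in \{H^1_0(\Omega), L^2(\Omega), H^{-1}(\Omega)\}$, we set
$\|p\|^2_{L^2(J;V)} := \int_J \|p(s)\|^2_V \, ds$ for all $p : J \to V$,
$\|v\|^2_{H^1(J;V)} := \|v\|^2_{L^2(J;V)} + \|\partial_t v\|^2_{L^2(J;V)}$ for all $v : J \to V$.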
Remark 1 (Tensor spaces). Bochner spaces can be seen as the closure of algebraic tensor product spaces [EG21b, Rem. 64.24], i.e., for
We are particularly interested in the space
Lemma 2 (Embedding). We have for all $v \in X$ and $t \in J = [0, T]$
Proof. This is a known result, which we prove to stress the dependency on $T$ that is often hidden in textbooks. Let $v \in X$ and $t \in J$. The fundamental theorem of calculus [EG21b, Thm. 64.31] reveals for all $\tau \in J$
Thus, (3) leads to the bound (5)
Notice that elliptic regularity results imply for convex or smooth domains $\Omega$
The estimates in (4)-(6) provide some reasonable regularity assumptions.

2.2. Triangulation. Rather than using simplicial partitions of the time-space cylinder $Q$, we use cylindrical partitions. The following considerations motivate the use of such partitions.
• A special case of cylindrical partitions are tensor product meshes, which typically occur in time-marching schemes and are thus of great interest. Such refinements can easily be achieved with cylindrical meshes.
• The faces of each time-space cell in a cylindrical partition $\mathcal{T}$ are either parallel or perpendicular to the time axis. This allows for the design of finite elements that are better suited for approximations in spaces like $L^2(J; H(\operatorname{div}_x, \Omega))$, where $\operatorname{div}_x$ denotes the divergence in space. This leads to significantly improved rates of convergence compared to finite elements on simplicial meshes; see [GS22].
Throughout this paper we suppose that the partition $\mathcal{T}$ of $Q = J \times \Omega$ consists of time-space cells $K = K_t \times K_x \subset \mathbb{R}^{d+1}$ with shape-regular $d$-simplices $K_x$. A special class of meshes satisfying these assumptions are tensor product meshes. Given conforming partitions $\mathcal{T}_t$ and $\mathcal{T}_x$ of the time interval $J$ and the domain $\Omega$ into shape-regular simplices, these meshes read
$\mathcal{T}_\otimes = \mathcal{T}_t \otimes \mathcal{T}_x := \{K_t \times K_x : K_t \in \mathcal{T}_t,\ K_x \in \mathcal{T}_x\}$. (7)
Besides these tensor product meshes, we discuss adaptively refined meshes with hanging vertices in Section 6.
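The tensor product structure of such meshes can be sketched in a few lines of Python. The function name `tensor_product_mesh` and the representation of cells as tuples are illustrative assumptions and not part of the paper; the spatial cells are passed in as given, so the sketch is independent of the space dimension.

```python
from itertools import product

def tensor_product_mesh(time_nodes, space_cells):
    """Return all time-space cells K = K_t x K_x as pairs (K_t, K_x).

    time_nodes:  increasing time points 0 = t_0 < ... < t_N = T
    space_cells: list of spatial simplices, each given by its vertices
    (hypothetical data layout chosen for illustration only)
    """
    # Time cells are the intervals between consecutive time nodes.
    time_cells = list(zip(time_nodes[:-1], time_nodes[1:]))
    # Every time cell is combined with every spatial cell.
    return [(K_t, K_x) for K_t, K_x in product(time_cells, space_cells)]

# Example with d = 1: four time cells on J = [0, 1], two intervals in space.
mesh = tensor_product_mesh([0.0, 0.25, 0.5, 0.75, 1.0], [(0.0, 0.5), (0.5, 1.0)])
print(len(mesh))
```

An irregular (locally refined) cylindrical mesh cannot be written as such a product, which is why it is treated separately.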
2.3. Finite element spaces. Let $\mathcal{T}$ be a partition of $Q$ as described in the previous subsection. For all cells $K = K_t \times K_x \in \mathcal{T}$ and polynomial degrees $k \in \mathbb{N}_0$ we set for $L \in \{K, K_t, K_x\}$ the space of polynomials
Given polynomial degrees $k, \ell \in \mathbb{N}$, we discretize the space $X$ in (2) by
A special class of meshes included in our analysis are tensor product meshes $\mathcal{T}_\otimes = \mathcal{T}_t \otimes \mathcal{T}_x$ introduced in (7). We set the spaces
If $\mathcal{T} = \mathcal{T}_\otimes$ is a tensor product mesh, the ansatz space in (8) equals

Local estimates
In this section we introduce several local estimates for functions on a time-space cell This definition and Friedrichs' inequality lead to the upper bound The following lemma shows that these two terms are equivalent for polynomials.
Lemma 3 (Inverse estimate). Let $k \in \mathbb{N}_0$. We have the upper bound
for all $f_h \in P_k(K_x)$. The hidden constant depends solely on the degree $k$ and the shape regularity of $K_x$.
The following result is of crucial importance for the analysis of parabolic problems. It involves the integral mean
The result is stated in a very general formulation in [DSSV17, Lem. 2.9]. Rather than using the more general result, we give an alternative direct proof.
More generally, we have for $k, \ell \in \mathbb{N}_0$
The hidden constant depends solely on the polynomial degrees $k$ and $\ell$ as well as the shape regularity of $K_x$.
The proof of the theorem splits the approximation of v by a polynomial on K into the approximation by a polynomial in time and a polynomial in space.While approximation properties of the latter are well understood, we state approximation properties of functions in for all m = 0, . . ., k.
Proof.This result follows directly from the tensor product structure in Remark 1 and approximation properties of polynomials in H m (K t ).A detailed proof (for general L p spaces with p ∈ [1, ∞]) can be found in the appendix of [DST21].
Let $I^{L^2}_t : L^2(K_t) \to P_k(K_t)$ be an $L^2(K_t)$ stable projection onto the space of polynomials of maximal degree $k \in \mathbb{N}_0$ in time. Its application everywhere in space leads to a mapping for functions on the entire time-space cell $K$, that is,
Proof. This result follows by classical arguments using Lemma 5. See [DST21, Thm. 24] for a detailed proof.
With these two results we are able to verify Theorem 4.
Proof of Theorem 4. We denote the $L^2(K_t)$ orthogonal projection in time and the $H^{-1}(K_x)$ orthogonal projection in space onto constant functions by
By applying them everywhere in time or space they extend to semi-discrete maps
Since their composition maps onto constant functions, we have for all $f \in L^2(K)$
These two estimates (with $g = f - f_{K_x}$) and Poincaré's inequality show
The second addend in (11) is bounded due to the inverse estimate in Lemma 3, stability of $\Pi_{H^{-1}(K_x)}$ in $H^{-1}(K_x)$, and approximation properties in Lemmas 5-6 by
Combining (11)-(13) concludes the proof of the first inequality in the theorem. Similar arguments yield the second inequality.
If the function $v$ in Theorem 4 additionally satisfies $\partial_t v \in L^2(K)$, an application of (10) to the first estimate leads to the Poincaré inequality
In this regard, Theorem 4 can be seen as a weaker version of Poincaré's inequality that is better suited for parabolic problems. For example, the regularity stated in (4) does not yield $\partial_t \nabla_x u \in L^2(K)$ for the solution to the heat equation, preventing an application of (14). However, Theorem 4 applies and yields with parabolic scaling $h_t \eqsim h_x^2$ the convergence result
The need for parabolic scaling for irregular solutions is further illustrated by the numerical experiment in [DS22, Sec. 7.4].
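To give a feeling for the cost of parabolic scaling on a uniform mesh, the following Python sketch compares the number of time cells needed for equal scaling (time-cell length proportional to $h_x$) and parabolic scaling (proportional to $h_x^2$). The function names, the constant `c`, and the chosen values of $h_x$ are assumptions made for illustration and do not come from the paper.

```python
import math

def time_step(h_x, exponent=2.0, c=1.0):
    """Time-cell length h_t = c * h_x**exponent.

    exponent = 2 models parabolic scaling, exponent = 1 equal scaling.
    The constant c is a free (hypothetical) proportionality factor.
    """
    return c * h_x ** exponent

def num_time_cells(T, h_x, exponent=2.0, c=1.0):
    """Number of uniform time cells of length at most h_t covering [0, T]."""
    return math.ceil(T / time_step(h_x, exponent, c))

# Compare both scalings on J = [0, 1] for a sequence of spatial mesh sizes.
for h_x in (0.25, 0.125, 0.0625):
    equal = num_time_cells(1.0, h_x, exponent=1.0)
    parabolic = num_time_cells(1.0, h_x, exponent=2.0)
    print(f"h_x = {h_x}: equal scaling {equal} cells, parabolic scaling {parabolic} cells")
```

Halving $h_x$ doubles the number of time cells under equal scaling but quadruples it under parabolic scaling, which is one reason why local refinements in space-time are attractive for solutions that are rough only in parts of the cylinder.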
Remark 7 (Sharp estimate).Inverse estimates show that the bound in Theorem 4 must be sharp.More precisely, let v = v t v x with polynomials v t ∈ P k (K t ) and

Interpolation in space or time
The main idea in this paper's design of interpolation operators in space-time is to exploit the tensor product structure of Bochner spaces like $H^1(J; H^{-1}(\Omega))$. This allows us to apply an interpolation operator in time to the $H^1(J)$ component and in space to the $H^{-1}(\Omega)$ component.

4.1. Interpolation operator in space. We utilize the $H^{-1}(\Omega)$ stable interpolation operator
for conforming and shape-regular partitions $\mathcal{T}_x$ of $\Omega$ with $\ell \in \mathbb{N}$. Throughout this subsection we assume that $\mathcal{T}_x$ is such a partition. Let $\mathcal{N}_x$ denote the set of vertices in $\mathcal{T}_x$ and set for all $j \in \mathcal{N}_x$ the corresponding vertex patch
We denote the nodal basis functions by $(\varphi_{x,j})_{j \in \mathcal{N}_x}$.
Proof. Let $\xi \in H^{-1}(\Omega)$. The partition of unity $1 = \sum_{j \in \mathcal{N}_x} \varphi_{x,j}$ leads for all $w \in H^1_0(\Omega)$ to the upper bound
This concludes the proof of the upper bound. The lower bound follows with standard arguments (see for example [DST21, Lem. 11]).
The upper bound in Lemma 8 is indeed sharp, as one can see by localizing the $H^{-1}(\Omega)$ norm of the constant function $\xi = 1 \in H^{-1}(\Omega)$. The operator $I_x$ allows for a localization of the $H^{-1}(\Omega)$ norm without any additional weights. In particular, we have the following result involving the patches
Theorem 9 (Interpolation operator $I_x$). The operator
Moreover, it satisfies for all $\xi \in L^2(\Omega)$ and $K \in \mathcal{T}$
Proof. This result is shown in [DST21, Thm. 1].
Remark 10 (Boundary data). It is possible to modify the design of $I_x$ in order to replace the space $L^1_{\ell,0}(\mathcal{T}_x)$ equipped with zero boundary data by the space $L^1_\ell(\mathcal{T}_x)$ without zero boundary data; see [DST21] for details.
An application of $I_x$ everywhere in time extends the operator to a mapping
for almost all $s \in J$. (16)
4.2. Interpolation operators in time. Besides the interpolation operator $I_x$ in space introduced in the previous subsection, we utilize an interpolation operator $I_t : H^1(J) \to L^1_k(\mathcal{T}_t)$ with polynomial degree $k \in \mathbb{N}$ and partition $\mathcal{T}_t$ of the time interval $J$. We set the operator locally for each time interval $K_t = [a, b] \in \mathcal{T}_t$. Its definition involves the bubble function $b_{K_t} \in P_2(K_t)$ with $\int_{K_t} b_{K_t} \, ds = 1$ and $b_{K_t}(a) = 0 = b_{K_t}(b)$. For $v \in H^1(K_t)$ we set the operator as follows. Let $I^1_{K_t} v \in P_1(K_t)$ denote the nodal interpolation defined by
We set the interpolation of $v$ as
Theorem 11 (Interpolation operator $I_{K_t}$). The operator

an integration by parts and the definition of $I^2_{K_t}$ yield for all $w_h$
This proves the commuting diagram property. The commuting diagram property yields the best-approximation property and leads to the projection property.
By applying the operator everywhere in $\Omega$, the operator extends to a mapping
Applying $I_{K_t}$ on each time cell $K_t \in \mathcal{T}_t$ leads to the operator
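A commuting interpolation in time on a single cell can be sketched numerically as follows. This is not the bubble-function construction of this subsection. It is an alternative sketch under the assumption that the commuting diagram property takes the form $\partial_t I v = \Pi \partial_t v$, where $\Pi$ denotes the $L^2$ orthogonal projection onto $P_{k-1}(K_t)$, together with the nodal condition $(I v)(a) = v(a)$. The name `interpolate_in_time`, its arguments, and the quadrature order are hypothetical choices.

```python
import numpy as np
from numpy.polynomial import legendre as leg

def interpolate_in_time(v, dv, a, b, k, n_quad=20):
    """Return a callable polynomial I v of degree k on [a, b] with
    (I v)(a) = v(a) and (I v)' = L2 projection of dv onto degree k - 1.

    v, dv: callables for the function and its time derivative (assumed given).
    """
    # Gauss-Legendre nodes and weights on the reference interval [-1, 1].
    x, w = leg.leggauss(n_quad)
    half = 0.5 * (b - a)
    t = a + half * (x + 1.0)
    g = dv(t)
    # Legendre coefficients of the L2 projection of dv onto degree <= k - 1.
    coef = np.array([
        (2 * n + 1) / 2.0 * np.sum(w * g * leg.legval(x, np.eye(k)[n]))
        for n in range(k)
    ])
    # Antiderivative vanishing at the left endpoint; 'half' is dt/dx.
    anti = half * leg.legint(coef, lbnd=-1.0)
    v_a = v(a)

    def Iv(s):
        xs = (np.asarray(s, dtype=float) - a) / half - 1.0
        return v_a + leg.legval(xs, anti)

    return Iv

# Degree k = 1 reproduces the nodal interpolant of sin on [0, 1].
Iv = interpolate_in_time(np.sin, np.cos, 0.0, 1.0, k=1)
print(Iv(1.0), np.sin(1.0))
```

Since the projection onto $P_{k-1}(K_t)$ preserves the integral mean of $\partial_t v$, the sketch also matches $v$ at the right endpoint $b$, and it reproduces polynomials of degree at most $k$.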

Tensor product meshes
This section introduces interpolation operators for special cylindrical partitions of Q, namely tensor product meshes T = T t ⊗ T x with a partition T t of the time interval J and a conforming simplicial partition T x of the domain Ω. Such partitions are of special interest since classical time-marching schemes can be seen as a space-time ansatz using such meshes and ansatz spaces X h = L 1 (T t ; L 1 ,0 (T x )) as well as some specific discretization of the test space L 2 (J ; H 1 0 (Ω)); see for example [UP14; Fei22] for the Crank-Nicolson scheme. We introduce and investigate a suitable interpolation operator in the first subsection. The second subsection introduces and investigates an interpolation operator for mixed schemes.

5.1. Interpolation operator $I^\otimes_X$. Due to the tensor product structure of the mesh $\mathcal{T} = \mathcal{T}_t \otimes \mathcal{T}_x$, the discrete space $X_h$ defined in (8) equals
This allows for the direct application of the operators
More precisely, we set the interpolation operator $I^\otimes_X : X \to X_h$ as the composition
This operator has the following beneficial properties involving the local mesh sizes
Moreover, we have for all $v \in H^1(J; H^{-1}(\Omega))$ the upper bound
Proof. Let $v \in X$. The triangle inequality yields
The approximation properties displayed in Theorem 9 yield for the first addend
Due to inverse estimates (Lemma 3) and Theorem 11 the second addend satisfies
This proves the first inequality in the theorem. Let $v \in L^2(J; H^{-1}(\Omega))$. Since $I_x \partial_t = \partial_t I_x$, we have
An application of Theorem 9 to the first addend yields
The $H^{-1}(\Omega)$ stability of $I_x$ and the approximation properties of $I_t$ yield
Combining the estimates concludes the proof.
Due to the continuous embedding X → C 0 (J ; L 2 (Ω)) in Lemma 2, we have for all v ∈ X and t ∈ J = [0, T ] the upper bound The following result improves this bound.We set the diameters h t (K t ) := |K t | and h x (K x ) := diam(K x ) for all K t ∈ T t and K x ∈ T x .
Then we have The arguments in the proof of Theorem 12 lead to the bound Combining this estimate with the approximation properties displayed in Theorem 12 concludes the proof.
We conclude this subsection with two remarks.
Remark 14 (Stability in $L^2(J; H^1_0(\Omega))$). While the operator $I^\otimes_X$ is always stable in $H^1(J; H^{-1}(\Omega))$, its (uniform) stability in $X$ requires the parabolic scaling $h_t(K) \eqsim h_x(K)^2$ for all $K \in \mathcal{T}$. This is due to the change of the norm
It is possible to avoid this change of norms when $I_t$ is replaced by some $L^2$ stable projection operator and neighboring time cells $K_t, K_t' \in \mathcal{T}_t$ are assumed to be of equivalent size. A similar proof as in Theorem 12 leads for all $v \in L^2(J; H^1_0(\Omega))$ to
Furthermore, it satisfies for all $v \in H^1(J; H^{-1}(\Omega))$
Note that this operator has an increased domain of dependence with respect to the time direction compared to the operator in Theorem 12.
Remark 15 (Localization of the H 1 (K t ; H −1 (Ω)) norm).While the interpolation error for localizes only in time.Lemma 8 shows that it is possible to localize further but at the cost of negative powers of the local mesh size, that is for all v ∈ H 1 (J ; This upper bound is indeed sharp, as the following consideration shows.Suppose there exists with some s < 2 for all v ∈ H 1 (J ; H −1 (Ω)) an estimate The estimate holds in particular for functions w = w t w x with w t ∈ H 1 (J ) and Hence, we have This proves convergence of ∂ t (w t − I t w t ) L 2 (J ) independent of the time discretization, which cannot be possible.

5.2. Commuting interpolation operator
Simultaneous space-time minimal residual methods [FK21; GS21; GS22; DS22] and time-marching schemes in mixed form [JT81; BRK17; KP22] involve the time-space divergence $\operatorname{div}(v, \tau) := \partial_t v + \operatorname{div}_x \tau$ and the related space
We set for all $(v, \tau) \in \Lambda$ the squared norm
In particular, we have with $\Sigma := L^2(J; H(\operatorname{div}_x, \Omega))$ and
Let $\mathcal{T} = \mathcal{T}_t \otimes \mathcal{T}_x$ be a tensor product mesh with conforming triangulations $\mathcal{T}_t$ of $J$ and $\mathcal{T}_x$ of $\Omega$. Set the Raviart-Thomas finite element space $RT_\ell(\mathcal{T}_x)$, which reads with identity mapping $\mathrm{id} : \Omega \to \Omega$
Theorem 16 (Commuting interpolation operator $I_{RT}$). There exists an interpolation operator
Moreover, it has for all $p \in L^2(\Omega; \mathbb{R}^d)$ the approximation property
Proof. A suitable operator is investigated for example in [EG21a, Sec. 23].
We set the discrete subspace
Moreover, we denote the $L^2$ orthogonal projections onto the space of piecewise polynomials in time $L^0_{k-1}(\mathcal{T}_t)$ and piecewise polynomials in space $L^0_\ell(\mathcal{T}_x)$ by
We set the interpolation operator
We have the commuting diagram property
The triangle inequality yields
The approximation properties of the semi-discrete operators $\Pi_{L^0_{k-1}(\mathcal{T}_t)}$ and $I_{RT}$ lead to the approximation property in the theorem. Theorem 16 implies the commuting diagram property.
We set the discrete subspace $\Lambda_h := X_h \otimes \Sigma_h$. By exploiting the tensor product structure we can define for each $(v, \tau) \in \Lambda$ an interpolation $(I^\otimes_X v, I^\otimes_\Sigma \tau) \in \Lambda_h$ with good approximation properties. In fact, a similar interpolation operator has been suggested in [GS22]. We modify this ansatz to additionally achieve a commuting diagram property. The modification involves the application of the inverse Laplacian $(-\Delta_x)^{-1} : L^2(J; H^{-1}(\Omega)) \to L^2(J; H^1_0(\Omega))$ everywhere in time, defined for all $\xi \in L^2(J; H^{-1}(\Omega))$ as solution operator to
for all $w \in L^2(J; H^1_0(\Omega))$. We set for all $(v, \tau) \in \Lambda$ the interpolation operator
Moreover, we control for all $(v, \tau) \in X \times \Sigma$ the interpolation error by
The approximation property follows from an application of the triangle inequality and the $L^2$ stability of
Remark 19 (Smoothing rough right-hand sides). The papers [FHK21; DST21; Füh22] suggest smoothing of the right-hand side in least-squares and mixed formulations for the Poisson model problem to conclude optimal rates of convergence even with right-hand sides in $H^{-1}(\Omega)$. The key ingredients in the proof are suitable properties of the smoothing operator and the commuting diagram property of the operator $I_{RT}$. Using the commuting diagram property of the operator $I_\Lambda$ and a suitable smoother (which results from the composition of $\Pi_{L^0_k(\mathcal{T}_t)}$ and the smoother for the Poisson model problem in space) leads to the same results for least-squares and mixed schemes for the heat equation.

Irregular meshes
In this section we introduce an operator $I_X : X \to X_h$ with locally in space-time refined underlying triangulation $\mathcal{T}$. Such local refinements lead to irregular partitions as for example displayed in Figure 1. In order to have some local support of the nodal basis functions $(\varphi_j)_{j \in \mathcal{N}} \subset X_h$, where $\mathcal{N}$ denotes the set of degrees of freedom in $X_h$, we need additional assumptions like the 1-irregular rule in [GS22].
To avoid technicalities, we do not discuss the impact of these properties and rather state the following assumption.
• (Shape regularity) All $d$-simplices $K_x$ with $K = K_t \times K_x \in \mathcal{T}$ are shape regular.
• (Local grading) Let $K \in \mathcal{T}$ and set $\mathcal{N}(K) := \{j \in \mathcal{N} : K \subset \operatorname{supp}(\varphi_j)\}$. We define the patch $\omega_K = \omega_{K,t} \times \omega_{K,x} \supset \bigcup_{j \in \mathcal{N}(K)} \operatorname{supp}(\varphi_j)$ as the smallest cylinder that contains the support of all basis functions $\varphi_j$ with $j \in \mathcal{N}(K)$. We assume that simplices
Lemma 20 (Operator $\Pi_K$). Let $K \in \mathcal{T}$ and $v \in X$. We have
Proof. Let $v \in X$ and let $K = K_t \times K_x \in \mathcal{T}$. Recall the $L^2(K_t)$ orthogonal projector $\Pi_{P_k(K_t)}$ defined in (18). Let $\Pi_{P_\ell(K_x)}$ denote the $L^2(K_x)$ orthogonal projection onto $P_\ell(K_x)$.
Using the approximation properties of the semi-discrete projection operators leads to the first estimate in the lemma. Similar arguments yield due to the identity $\Pi_K v = \Pi_{P_\ell(K_x)} \Pi_{P_k(K_t)} v$ the second estimate.
We assign to each degree of freedom $j \in \mathcal{N}$ a cell $K(j) \in \mathcal{T}$ with $j \in K(j)$ and set the operator $I_X : X \to X_h$ with $(I_X v)(j) := (\Pi_{K(j)} v)(j)$ for all $v \in X$ and $j \in \mathcal{N}$.
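The principle behind this definition (assign a cell to each degree of freedom, project locally, evaluate the projection at the degree of freedom) can be illustrated by a one-dimensional analogue in Python. The sketch below works on a partition of an interval with piecewise affine functions only, ignores the time-space structure and the zero boundary values, and uses the hypothetical rule that each node is assigned to the cell on its right (the last node to the cell on its left); none of these choices are taken from the paper.

```python
import numpy as np

def local_l2_projection_p1(f, a, b, n_quad=8):
    """L2(a, b) orthogonal projection of f onto affine functions, as a callable."""
    x, w = np.polynomial.legendre.leggauss(n_quad)
    half = 0.5 * (b - a)
    fx = f(a + half * (x + 1.0))
    c0 = 0.5 * np.sum(w * fx)        # coefficient of the Legendre polynomial P_0
    c1 = 1.5 * np.sum(w * fx * x)    # coefficient of the Legendre polynomial P_1
    return lambda s: c0 + c1 * ((s - a) / half - 1.0)

def interpolate(f, nodes):
    """Nodal values (I f)(j) := (Pi_{K(j)} f)(j) on a 1D partition.

    K(j) is the cell to the right of node j, except for the last node,
    which uses the cell to its left (an arbitrary illustrative assignment).
    """
    n = len(nodes)
    values = []
    for j in range(n):
        i = min(j, n - 2)            # index of the assigned cell [nodes[i], nodes[i+1]]
        p = local_l2_projection_p1(f, nodes[i], nodes[i + 1])
        values.append(p(nodes[j]))
    return np.array(values)

# Affine functions are reproduced exactly, so the operator is a projection
# onto continuous piecewise affine functions on this partition.
print(interpolate(lambda t: 2.0 * t + 1.0, [0.0, 0.3, 1.0]))
```

Because only local $L^2$ projections are evaluated, the nodal values are well defined for functions without point values, which is the reason for using local projections instead of nodal interpolation.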
Theorem 21 (Interpolation operator $I_X$). The operator $I_X$ is a projection onto $X_h$ that satisfies for all $v \in X$
Proof. The projection property of $I_X$ follows directly from its definition. Let $v \in X$, $v_h \in X_h$, and
Let $\mathcal{N}_{loc}(K)$ denote the degrees of freedom in $P_k(K_t) \otimes P_\ell(K_x)$ with associated basis functions $b_\gamma = b_{\gamma,t} b_{\gamma,x}$, where $b_{\gamma,t} \in P_k(K_t)$ and $b_{\gamma,x} \in P_\ell(K_x)$ are such that $b_\gamma(\beta) = \delta_{\gamma,\beta}$ for all $\gamma, \beta \in \mathcal{N}_{loc}(K)$. Then there exists a dual basis
Hence, we have
The values of $I_X \delta$ at the local degree of freedom $\gamma \in \mathcal{N}_{loc}(K)$ read as follows.
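• If $\gamma$ is not on the boundary $J \times \partial\Omega$, the value of $I_X$ at $\gamma$ depends on the values of $I_X$ at some degrees of freedom $(j^m_\gamma)_{m=1}^{N_\gamma} \subset \mathcal{N}(K)$ with some uniformly bounded number $N_\gamma \in \mathbb{N}$. More precisely, there exist coefficients $\alpha^m_\gamma \in \mathbb{R}$ with
$(I_X w)(\gamma) = \sum_{m=1}^{N_\gamma} \alpha^m_\gamma (I_X w)(j^m_\gamma)$ for all $w \in X$. (28)
For each basis function $j^m_\gamma \in \mathcal{N}(K)$ there exist by definition of $I_X$ dual weight functions $(b^m_\gamma)^* = (b^m_{\gamma,t})^* (b^m_{\gamma,x})^*$ with $(b^m_{\gamma,t})^* \in P_k(K_t(j^m_\gamma))$ and $(b^m_{\gamma,x})^* \in P_\ell(K_x(j^m_\gamma))$ such that
• If the local degree of freedom $\gamma$ is on the boundary $J \times \partial\Omega$, we have $(I_X \delta)(\gamma) = 0$ and set $N_\gamma := 0$.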
Case 1 (No dofs on boundary).We suppose that a local degree of freedom γ ∈ N loc (K) is neither on the boundary J × ∂Ω nor it depends on some degree of freedom j ∈ N (K) on the boundary.Then (28) as well as the fact that I X preserves constant functions (away from the boundary) and j∈N (K) We set the integral means δ ω K,t := − ω K,t δ ds and δ ω K,x := x Ω and scaling arguments yield Case 2 (dof on boundary).Suppose that a degree of freedom j ∈ N (K) is on the boundary J × ∂Ω.Then the patch ω K shares a face with the boundary and so scaling arguments and Friedrichs' inequality lead for all γ ∈ N loc (K) to Combining ( 26)-( 31) and Lemma 20 leads to the first bound in the lemma.The second follows similarly.
We proceed with a comparison of the interpolation error estimates on tensor product and irregular meshes. Let $K \in \mathcal{T}$ be a time-space cell with $h_t := h_t(K)$ and $h_x := h_x(K)$ in a tensor product mesh (if we apply $I^\otimes_X$) or in an irregular mesh. Due to Theorems 4, 12, and 21 we have the stability and approximation properties
Despite a smaller domain of dependency with respect to time (neglected in the comparison above), the advantages of the operator $I^\otimes_X$ are restricted to stability properties in $X$ rather than in $L^2(J; H^1_0(\Omega)) \cap H^1(J; L^2(\Omega))$. However, under reasonable regularity assumptions like in (4) and (5) both operators lead to the same approximation properties. These properties suggest the following mesh scalings.
• If we only have (4), the results suggest the parabolic scaling $h_t \eqsim h_x^2$.
Notice, however, that similar arguments as in the proof of Theorem 12 allow us to conclude upper bounds for the $L^2(Q)$ norm of the interpolation error $\partial_t(v - I^\otimes_X v)$ as well. This leads to the following comparison for $I \in \{I^\otimes_X, I_X\}$, where the values in brackets are solely needed if $I = I_X$:
Under the regularity assumptions in (4) and (5) the estimates show for both operators a reduced rate of convergence compared to (32). This can be expected, since we investigate the error with respect to the stronger $L^2(Q)$ norm. The combination of the error estimates with the regularity properties in (4) and (5) suggests a scaling $h_t \eqsim h_x^{3/2}$. If we grade the mesh too strongly, for example $h_t \eqsim h_x^2$, the operator $I_X$ experiences, unlike the operator $I^\otimes_X$, stability issues due to the terms
Notice that, unlike for operators on tensor meshes, such terms, which do not depend on the time derivative $\partial_t v$, must occur in bounds for the interpolation error $\partial_t(v - I_X v)$, since on irregular meshes the interpolated function $I_X v$ might vary in time even though $v$ might be constant in time, that is, the property $\partial_t v = 0$ does in general not imply $\partial_t I_X v = 0$. The following remark investigates this aspect in more detail.
Remark 22 (Parabolic scaling vs. local refinements).Let I : X → X h be some locally defined interpolation operator with first order ansatz space X h = X 1,1 h and basis functions (ϕ j ) j∈N .For simplicity we assume that the operator has weights ϕ * j ∈ X with supp(ϕ * j ) ⊂ supp(ϕ j ) and Iv = j∈N v, ϕ * j ϕ j for all v ∈ X.
Moreover, we assume that these weights solely depend on the shape of the element patch.Let the underlying mesh result from refining a uniform tensor mesh T t ⊗ T x with 0 < h t = |K t | for all K t ∈ T t and 0 < h x = diam(K x ) for all K x ∈ T x in every fourth time interval K t (4), K t (8), K t (12), • • • ∈ T t as depicted in Figure 1.We can find a function v ∈ X with ∂ t v = 0 such that (Iv)| Kt(4m−2)×Ω = 0 equals zero on every (4m − 2)-th time interval in T t with m ∈ N and ϕ(j) = 1 for all degrees of freedom j ∈ N inside the refined area, that is inside j ∈ int(K t (4m) × Ω).Scaling arguments lead to x .By definition we have ∂ t Iv L 2 (J ;H −1 (Ω)) ∂ t Iv L 2 (Q)) h −1 t .Thus, the interpolation error reads In this regard the terms in (33) cannot be avoided in interpolation error estimates for the time derivative on irregular meshes.

Conclusion
This paper introduces interpolation operators and investigates their stability and (localized) approximation properties displayed in Theorems 12, 18, and 21. Their derivation led to the following observations.
• While it is possible to localize interpolation errors in the $H^{-1}(\Omega)$ norm as for example done in [DST21], it is not possible to localize the $L^2(J; H^{-1}(\Omega))$ error in space without introducing a negative power of the local mesh size as weight; see Remark 15.
• The parabolic Poincaré inequality in Theorem 4 suggests a parabolic scaling $h_t(K) \eqsim h_x^2(K)$ for the interpolation of irregular functions $v \in X$. This scaling also occurs when we change the norm in our interpolation error estimates like in (23). Roughly speaking, this change of norms reads
On irregular meshes, we have to use the $L^2(J; L^2(\Omega))$ norm to localize the error in the approximation of the time derivative. If we change the norm (which we have to do according to Remark 22), we observe roughly speaking
This indicates that parabolic scaling occurs naturally for tensor product meshes but causes difficulties for irregular meshes. Remark 22 underlines the latter observation.
All in all, we have shown that the localization of the $L^2(J; H^{-1}(\Omega))$ norm in space leads to some unavoidable difficulties, which can partially be overcome by assuming additional smoothness of the underlying function. However, interpolation operators $I : X \to X_h$ cannot have the same beneficial properties as interpolation operators for elliptic problems. It is likely that similar difficulties occur in the numerical analysis of simultaneous space-time variational formulations, in particular when the underlying mesh is irregular. For example, to the authors' knowledge there exists no numerical scheme that leads to quasi-optimal approximations with respect to the norm in $X$ with underlying meshes that do not have some kind of tensor product structure; cf.
[SW21].An exception are minimal residual methods [FK21;GS21; Nγ m=1 ⊂ N (K) with some uniformly bounded number N γ ∈ N.More precisely, there exist coefficientsα m γ ∈ R with (I X w)(γ) = Nγ m=1 α m γ (I X w)(j m γ ) for all w ∈ X.(28)For each basis function j m γ ∈ N (K) there exist by definition ofI X dual weight functions (b m γ ) * = (b m γ,t ) * (b m γ,x ) * with (b m γ,t ) * ∈ P k (K t (j m γ )) and (b m γ,x ) * ∈ P (K x (j m γ )) such that (I X w)(γ)

• The parabolic Poincaré inequality in Theorem 4 suggests the use of parabolically scaled meshes for irregular solutions. Thus, we want to allow for local mesh refinements such that the diameter of local cells in space direction $h_x$ and the length of cells in time direction $h_t$ satisfy $h_t \eqsim h_x^2$ if we scale parabolically and $h_t \eqsim h_x$ if we scale equally.

• If we only have (4) and (5), the results suggest the scaling $h_t \eqsim h_x$.
• If we additionally have $\partial_t^2 v \in L^2(Q)$, the results suggest the scaling $h_t \eqsim h_x$.