Oscillation in a posteriori error estimation

In a posteriori error analysis, the relationship between error and estimator is usually spoiled by so-called oscillation terms, which cannot be bounded by the error. To remedy this, we devise a new approach in which the oscillation has the following two properties. First, it is dominated by the error, irrespective of mesh fineness and of the regularity of the data and the exact solution. Second, it captures in terms of data the part of the residual that, in general, cannot be quantified with finite information. The new twist in our approach is a locally stable projection onto discretized residuals.


Introduction
Finite element methods are a successful and well-established technique for the solution of partial differential equations. A key tool for the quality assessment of a given finite element approximation and the application of adaptive techniques are so-called a posteriori error estimators. These are functionals that are computable in terms of data and the finite element approximation and aim at quantifying the approximation error. For all known estimators, their actual relationship to the error is spoiled by oscillation, i.e., by some additive terms measuring distances between non-discrete and discrete data. Remarkably, oscillation may be even greater than the error. This flaw directly interferes with the quality assessment and, on top of that, it weakens results on adaptive methods and complicates their proofs.
In this article we introduce a new approach to a posteriori error estimation, where oscillation is error-dominated, i.e. it is bounded by the error of the finite element approximation, up to a multiplicative constant depending on the shape-regularity of the underlying mesh.
We illustrate this new approach in the simplest case, where the weak solution $u \in H^1_0(\Omega)$ of the Dirichlet-Poisson problem

(1.1)  $-\Delta u = f$ in $\Omega$, $\quad u = 0$ on $\partial\Omega$

is approximated by the Galerkin approximation $U$ that is continuous and piecewise affine over some simplicial mesh $\mathcal{M}$ of $\Omega$. It is instructive to start by recalling the a posteriori error bounds in terms of the standard residual estimator

(1.2)  $\mathcal{E}_R(U,f,\mathcal{M}) := \Big( \sum_{K\in\mathcal{M}} h_K \|J(U)\|^2_{L^2(\partial K\setminus\partial\Omega)} + h_K^2 \|f\|^2_{L^2(K)} \Big)^{1/2}$;

see, e.g., Ainsworth and Oden [2] or Verfürth [25]. If $f \in L^2(\Omega)$, then the energy norm error $\|u-U\|_{H^1_0(\Omega)}$ and the estimator are almost equivalent. More precisely, we have

(1.3)  $\|u-U\|_{H^1_0(\Omega)} \lesssim \mathcal{E}_R(U,f,\mathcal{M}) \lesssim \|u-U\|_{H^1_0(\Omega)} + \mathrm{osc}_0(f,\mathcal{M})$,

where the interfering oscillation is given by

(1.4)  $\mathrm{osc}_0(f,\mathcal{M})^2 := \sum_{K\in\mathcal{M}} h_K^2 \|f - P_{0,\mathcal{M}} f\|^2_{L^2(K)}$ with $P_{0,\mathcal{M}} f|_K := \frac{1}{|K|}\int_\Omega f\,\chi_K$ for $K \in \mathcal{M}$.

Let us discuss the relationship of this classical $L^2$-oscillation and the energy norm error; for the proofs of the nontrivial statements, see §3.8. Customarily, oscillation is associated with higher order. This idea is supported by the following observation: if $f$ is actually in $H^1(\Omega)$, then $\mathrm{osc}_0(f,\mathcal{M}) = O(h_{\mathcal{M}}^2)$ as $h_{\mathcal{M}} := \max_{K\in\mathcal{M}} h_K \searrow 0$. On any fixed mesh, however, the oscillation $\mathrm{osc}_0(f,\mathcal{M})$ may be arbitrarily greater than the energy norm error $\|u-U\|_{H^1_0(\Omega)}$. This is a consequence of the fact that the $L^2$-norm is strictly stronger than the $H^{-1}$-norm. The use of the $L^2$-norm in (1.4) can be traced back to its use in the element residual $h_K\|f\|_{L^2(K)}$ in (1.2) and so it can be motivated by the request for the computability of the estimator. In fact, in contrast to an element residual based upon some local $H^{-1}$-norm of $f$, this form reduces to the (approximate) computation of an integral.
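The higher-order behaviour of the classical oscillation for smooth data can be observed numerically. The following Python sketch is an illustration only and not part of the article: it assumes a one-dimensional analogue on the uniform mesh of $(0,1)$ with $n$ elements, takes the oscillation in the usual form $\sum_K h_K^2 \|f - \text{mean}_K f\|^2_{L^2(K)}$, and uses the sample load $f(x) = \sin(\pi x)$. The helper names `integrate` and `osc0` are hypothetical.

```python
import math

# 3-point Gauss-Legendre rule on [-1, 1] (exact for polynomials up to degree 5)
_GAUSS = [(-math.sqrt(0.6), 5.0 / 9.0), (0.0, 8.0 / 9.0), (math.sqrt(0.6), 5.0 / 9.0)]


def integrate(g, a, b):
    """Approximate the integral of g over [a, b] with the 3-point Gauss rule."""
    mid, half = 0.5 * (a + b), 0.5 * (b - a)
    return half * sum(w * g(mid + half * t) for t, w in _GAUSS)


def osc0(f, n):
    """Classical L2-oscillation of f on the uniform mesh of (0, 1) with n elements."""
    h = 1.0 / n
    total = 0.0
    for j in range(n):
        a, b = j * h, (j + 1) * h
        mean = integrate(f, a, b) / h  # elementwise mean value of f
        total += h**2 * integrate(lambda x: (f(x) - mean) ** 2, a, b)
    return math.sqrt(total)


f = lambda x: math.sin(math.pi * x)  # smooth sample load (assumption)
# Ratio of oscillations when the mesh size is halved
ratios = [osc0(f, n) / osc0(f, 2 * n) for n in (8, 16, 32)]
```

Under these assumptions, halving the mesh size should reduce the oscillation by a factor close to 4, in line with the rate $O(h_{\mathcal{M}}^2)$ for $f \in H^1(\Omega)$.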
One may think that the use of the $L^2$-norm is the only reason for the possible relative largeness of oscillations like $\mathrm{osc}_0(f,\mathcal{M})$. Yet, Cohen, DeVore and Nochetto present in [11] a striking example which entails that even the $H^{-1}$-oscillation

(1.5)  $\min_{g\in\mathbb{P}_0(\mathcal{M})} \|f-g\|^2_{H^{-1}(\Omega)}$ with $\mathbb{P}_0(\mathcal{M}) := \{ g \in L^\infty(\Omega) \mid \forall K\in\mathcal{M}\ \ g|_K \text{ is constant} \}$

from Braess [7] and Stevenson [22] may converge slower than the error; see Lemma 21 below. Notice that this contradicts the aforementioned idea that $\mathrm{osc}_0(f,\mathcal{M})$ is always of higher order and, moreover, in view of $\mathrm{osc}_0(f,\mathcal{M}) \lesssim \mathcal{E}_R(U,f,\mathcal{M})$, it entails that also the estimator $\mathcal{E}_R(U,f,\mathcal{M})$ may decrease slightly slower than the error.
The key tool to overcome the shortcomings of the above oscillations is a new projection operator $P_{\mathcal{M}}$ enjoying the following properties; see §§3.3-3.5:
• $P_{\mathcal{M}} f$ is discrete for any functional $f \in H^{-1}(\Omega)$. In comparison to $P_{0,\mathcal{M}}$, the image of $P_{\mathcal{M}}$ is enriched by the span of the face-supported Dirac distributions and so contains true functionals.
• $P_{\mathcal{M}} f$ is computable in a local manner. Here computable means that it can be determined from the information available in the linear systems for finite element approximations.
• The local dual norms of the new oscillation $f - P_{\mathcal{M}} f$ are dominated by corresponding local errors. This property hinges on the face-supported Dirac distributions and on local $H^{-1}$-stability of $P_{\mathcal{M}} f$.
• In contrast to the local dual norms of the residual $f + \Delta U$, the local dual norms of the discretized residual $P_{\mathcal{M}} f + \Delta U$ can be estimated from below and above in a computable manner.
Thanks to these properties, we derive in §§3.6-3.7 abstract a posteriori bounds such that the oscillation is bounded by the error. In §4 we provide several realizations leading to hierarchical estimators and estimators based on local problems or based on equilibrated fluxes. Furthermore, in §4.2 we show that an extension of the standard residual estimator (1.2) onto the image of $P_{\mathcal{M}}$ satisfies

(1.6)  $\|u-U\|^2_{H^1_0(\Omega)} \eqsim \mathcal{E}_R(U,P_{\mathcal{M}} f,\mathcal{M})^2 + \sum_{z\in\mathcal{V}} \|f - P_{\mathcal{M}} f\|^2_{H^{-1}(\omega_z)}$,

where $\mathcal{V}$ stands for the set of vertices of $\mathcal{M}$ and $\omega_z$ is the star around the vertex $z$.
A comparison with (1.3) immediately yields:
• Both $\mathcal{E}_R(U,f,\mathcal{M})$ and the right-hand side of (1.6) bound the energy norm error in terms of $U$, $f$, and $\mathcal{M}$. However, while the latter is free of overestimation, the former may overestimate, even asymptotically.
• Since $P_{\mathcal{M}} f$ is discrete and computable in the aforementioned sense, we have that $\mathcal{E}_R(U,P_{\mathcal{M}} f,\mathcal{M})$ is also computable, while $\mathcal{E}_R(U,f,\mathcal{M})$ is not.
• Equivalence (1.6) thus splits the estimation of the error in two parts, reflecting the spirit of Verfürth [25, Remark 1.8] and Ainsworth [1, Section 3.1]: One part is computable and related to the underlying differential operator. The other one depends solely on data; its computation, or rather estimation, hinges on a priori knowledge.

Model problem and discretization
In order to exemplify our new approach to a posteriori error estimation, we consider the homogeneous Dirichlet problem for Poisson's equation and the energy norm error of the associated linear finite element solution. The purpose of this section is to recall the relevant properties of this boundary value problem and discretization.
We shall use the following notation associated with a (Lebesgue) measurable set $\omega$ of $\mathbb{R}^d$, $d \in \mathbb{N}$. Given $m \in \mathbb{N}$, we let $L^2(\omega;\mathbb{R}^m)$ denote the Lebesgue space of square integrable functions over $\omega$ with values in $\mathbb{R}^m$. We write $\langle v,w\rangle_\omega$ and $\|\cdot\|_\omega$ for its scalar product and its induced norm. For $m = 1$, we abbreviate $L^2(\omega;\mathbb{R})$ to $L^2(\omega)$.
If $\omega \subset \mathbb{R}^d$ is non-empty and open, $H^1(\omega)$ stands for the Sobolev space of all functions in $L^2(\omega)$ whose distributional gradient is also in $L^2(\omega;\mathbb{R}^d)$. Moreover, we let $H^1_0(\omega)$ be the closure in $H^1(\omega)$ of the smooth functions with compact support in $\omega$ and equip its dual space $H^{-1}(\omega)$ with the norm $\|\ell\|_{H^{-1}(\omega)} := \sup\{ \langle \ell,w\rangle_\omega \mid w \in H^1_0(\omega),\ \|\nabla w\|_\omega \le 1 \}$, where the dual brackets $\langle \ell,w\rangle_\omega := \ell(w)$, $w \in H^1_0(\omega)$, extend-restrict the scalar product in $L^2(\omega)$. If $D \subset \mathbb{R}^d$ is a set such that its interior $\mathring{D}$ is suitable for one of the preceding notations, we also use $D$ instead of the more cumbersome $\mathring{D}$, e.g. we also write $H^1(D)$ instead of $H^1(\mathring{D})$.
Let $\Omega$ be an open, bounded and connected subset of $\mathbb{R}^d$ whose closure can be subdivided into simplices. We shall omit $\Omega$ in the notation of dual pairings and norms. The weak formulation of (1.1) reads as follows:

(2.2)  Given $f \in H^{-1}(\Omega)$, find $u = u_f \in H^1_0(\Omega)$ such that $\forall v \in H^1_0(\Omega)\ \ \langle \nabla u, \nabla v\rangle = \langle f, v\rangle$.

In other words: we are looking for the Riesz representation of $f$ in $H^1_0(\Omega)$. Notice that the Riesz representation theorem establishes an isomorphism between the space $H^1_0(\Omega)$ of solutions and the space $H^{-1}(\Omega)$ of loads. In particular, a unique solution exists not only for $f \in L^2(\Omega)$ but for all $f \in H^{-1}(\Omega)$. This fact suggests that, at least conceptually, an approximation method for (2.2), along with its a posteriori analysis, should also cover loads in $H^{-1}(\Omega)$.
In order to approximate the solution of (2.2), we use a Galerkin approximation based upon finite elements.For the sake of simplicity, we restrict ourselves to simplicial meshes and lowest order.
Let $\mathcal{M}$ be a simplicial, face-to-face (conforming) mesh of the domain $\Omega$. Given an element $K \in \mathcal{M}$, we denote by $h_K := \operatorname{diam} K := \sup_{x,y\in K} |x-y|$ its diameter and by $\rho_K := \sup\{\operatorname{diam} B \mid B \text{ ball in } K\}$ the maximal diameter of inscribed balls. In what follows, '$\lesssim$' stands for '$\le C$', where the generic constant $C$ may depend on $d$ and the shape coefficient $\sigma(\mathcal{M}) := \max_{K\in\mathcal{M}} h_K/\rho_K$. In the case of both inequalities '$\lesssim$' and '$\gtrsim$', we shall use '$\eqsim$' as shorthand.
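To make the quantities $h_K$ and $\rho_K$ concrete, the following Python sketch evaluates them for triangles ($d = 2$). It is an illustration under assumptions: the shape coefficient is taken as the usual ratio $\max_K h_K/\rho_K$, and the function names `element_sizes` and `shape_coefficient` as well as the sample triangles are hypothetical.

```python
import math


def element_sizes(p0, p1, p2):
    """Return (h_K, rho_K) for the triangle with vertices p0, p1, p2."""
    edges = [math.dist(p0, p1), math.dist(p1, p2), math.dist(p2, p0)]
    h_K = max(edges)  # the diameter of a triangle is its longest edge
    area = 0.5 * abs((p1[0] - p0[0]) * (p2[1] - p0[1])
                     - (p2[0] - p0[0]) * (p1[1] - p0[1]))
    # inradius = 2 * area / perimeter, so the incircle diameter is 4 * area / perimeter
    rho_K = 4.0 * area / sum(edges)
    return h_K, rho_K


def shape_coefficient(triangles):
    """Assumed definition: maximum of h_K / rho_K over all elements."""
    return max(h / rho for h, rho in (element_sizes(*t) for t in triangles))


mesh = [
    ((0.0, 0.0), (3.0, 0.0), (0.0, 4.0)),               # right triangle, legs 3 and 4
    ((0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)),  # equilateral triangle
]
sigma = shape_coefficient(mesh)
```

For the right triangle with legs 3 and 4 one has $h_K = 5$ and $\rho_K = 2$, so the ratio is 2.5; the equilateral triangle attains the smaller value $\sqrt{3}$.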
An interelement face of M is a simplex F with d vertices arising as the intersection F " K 1 X K 2 of two uniquely determined elements K 1 , K 2 P M. Its associated patch is We let F " F pMq denote the set of all pd ´1q-dimensional interelement faces of M. Given F P F and K P M with F Ă K, we write for the height of K over F .Furthermore, V " VpMq stands for the set of all vertices of M. To any vertex z P V, we associate the sets for which we have If K P M with K Ă ω z for some z P V, then the diameter h z of ω z verifies Moreover, if e is a direction, i.e. e P R d with |e| " 1, we write h z;e for the maximal length of a line segment in ω z with direction e.Then (2.7) ρz :" inf Let P k be the space of polynomials of degree at most k P N over R d and let P k pMq :" V P L 8 pΩq | V | K P P k pKq for all K P M ( be its piecewise counterpart over M. The space of continuous, piecewise affine functions over M is then Its nodal basis tφ z u zPV is defined by φ z P VpMq such that φ z pyq :" δ zy for all z, y P V.
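This basis provides the nodal value representation $V = \sum_{z\in\mathcal{V}} V(z)\,\phi_z$ for any $V \in \mathbb{V}(\mathcal{M})$ and the partition of unity

(2.9)  $\sum_{z\in\mathcal{V}} \phi_z = 1$ in $\Omega$,

where, for each vertex $z \in \mathcal{V}$, we have $\operatorname{supp}\phi_z = \omega_z$, with skeleton $\sigma_z$. Finally, we recall that, for any element $K \in \mathcal{M}$ and any powers $\alpha_z \in \mathbb{N}_0$, $z \in \mathcal{V}\cap K$, we have

(2.10)  $\int_K \prod_{z\in\mathcal{V}\cap K} \phi_z^{\alpha_z} = \dfrac{d!\,\prod_{z\in\mathcal{V}\cap K} \alpha_z!}{\big(d + \sum_{z\in\mathcal{V}\cap K}\alpha_z\big)!}\,|K|$.

The finite element functions satisfying the boundary condition in (2.2) form the space $\mathbb{V}_0(\mathcal{M}) := \{V \in \mathbb{V}(\mathcal{M}) \mid V(z) = 0 \text{ for all } z \in \mathcal{V}\cap\partial\Omega\} = \mathbb{P}_1(\mathcal{M}) \cap H^1_0(\Omega)$. The associated Galerkin approximation $U = U_{f;\mathcal{M}}$ is characterized by

(2.11)  $U \in \mathbb{V}_0(\mathcal{M})$ such that $\forall V \in \mathbb{V}_0(\mathcal{M})\ \ \langle\nabla U,\nabla V\rangle = \langle f, V\rangle$.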
Notice that the right-hand side, and so $U$, is well-defined also for $f \in H^{-1}(\Omega)$, thanks to the conformity of $\mathbb{V}_0(\mathcal{M})$. Céa's lemma states that the Galerkin approximation is the best approximation with respect to the energy norm, i.e.,

(2.12)  $\|\nabla u - \nabla U\|_\Omega \le \|\nabla u - \nabla V\|_\Omega$ for all $V \in \mathbb{V}_0(\mathcal{M})$.

In order to determine the Galerkin approximation $U$, one usually obtains its values at the interior vertices $\mathcal{V}_0 := \mathcal{V}\cap\Omega$ by solving the symmetric positive definite linear system $M\alpha = F$, where

(2.13)  $\alpha = (U(z))_{z\in\mathcal{V}_0}$, $\quad M = \big(\langle\nabla\phi_z,\nabla\phi_y\rangle\big)_{y,z\in\mathcal{V}_0}$, $\quad F = (\langle f,\phi_y\rangle)_{y\in\mathcal{V}_0}$.

We thus see that the Galerkin approximation $U$ is computable whenever the load evaluations

(2.14)  $\langle f,\phi_y\rangle$, $\ y \in \mathcal{V}_0$,

are known exactly.
Strictly speaking, these evaluations are in general not computable. In fact, even if $f \in L^2(\Omega)$ is a function, the evaluation of $\langle f,\phi_y\rangle = \int_\Omega f\phi_y$ requires the computation of an integral, which in general can be done only approximately by means of numerical integration. Notwithstanding, error analyses of approximations like (2.11) have proved very useful for the theoretical understanding and underpinning of finite element methods and are therefore very common. Accordingly, we shall suppose that the evaluations (2.14) are known to us. In §3.6 below, we will discuss which kind of additional information is used in our a posteriori analysis.
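The linear system (2.13) can be sketched in a few lines for a one-dimensional analogue. The following Python code is an illustration under assumptions, not the article's implementation: it takes $\Omega = (0,1)$ with a uniform mesh of $n$ elements, so that the stiffness matrix is tridiagonal with entries $2/h$ and $-1/h$, and it expects the exact load evaluations $\langle f,\phi_y\rangle$ as input through the hypothetical callable `load`.

```python
def galerkin_1d(load, n):
    """Solve -u'' = f on (0, 1), u(0) = u(1) = 0, with P1 elements on n uniform cells.

    `load(y)` must return the evaluation <f, phi_y> for the hat function of the
    interior vertex y. Returns the nodal values of U at the interior vertices.
    """
    h = 1.0 / n
    m = n - 1                      # number of interior vertices
    diag = [2.0 / h] * m           # diagonal of the stiffness matrix
    off = -1.0 / h                 # sub- and super-diagonal entries
    rhs = [load((k + 1) * h) for k in range(m)]
    # Thomas algorithm: forward elimination ...
    for k in range(1, m):
        w = off / diag[k - 1]
        diag[k] -= w * off
        rhs[k] -= w * rhs[k - 1]
    # ... and back substitution
    alpha = [0.0] * m
    alpha[-1] = rhs[-1] / diag[-1]
    for k in range(m - 2, -1, -1):
        alpha[k] = (rhs[k] - off * alpha[k + 1]) / diag[k]
    return alpha


n = 8
# For f = 1 the exact load evaluations are <f, phi_y> = h = 1/n.
alpha = galerkin_1d(lambda y: 1.0 / n, n)
```

In this one-dimensional setting with exact load evaluations, the computed nodal values coincide with those of the exact solution $u(x) = x(1-x)/2$ up to rounding, which gives a simple check of the assembly.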

A posteriori analysis with error-dominated oscillation
We present our new approach to a posteriori error analysis by deriving bounds for the energy norm error of the Galerkin approximation (2.11). The key feature of these bounds is that all involved terms are dominated by the error.
3.1. Residual norms. Given some load $f \in H^{-1}(\Omega)$ and a Galerkin approximation $U_{f;\mathcal{M}}$, we want to quantify the energy norm error $\|\nabla(u_f - U_{f;\mathcal{M}})\|$, where the exact solution $u_f$ of (2.2) is typically unknown to us.
Our starting point is the so-called residual $\operatorname{Res}(f;\mathcal{M}) \in H^{-1}(\Omega)$ given by $\langle\operatorname{Res}(f;\mathcal{M}),v\rangle := \langle f,v\rangle - \langle\nabla U_{f;\mathcal{M}},\nabla v\rangle$ for all $v \in H^1_0(\Omega)$. It is defined in terms of data and the computable Galerkin approximation and vanishes if and only if the latter equals the exact solution. The following lemma shows that appropriately measuring the size of the residual relates to the error.
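Lemma 1 (Error, residual and load). We have

$\|\nabla(u_f - U_{f;\mathcal{M}})\| = \|\operatorname{Res}(f;\mathcal{M})\|_{H^{-1}(\Omega)} \le \|f\|_{H^{-1}(\Omega)}$.

Proof. Thanks to the differential equation in (2.2), we have, for all $v \in H^1_0(\Omega)$,

(3.1)  $\langle\operatorname{Res}(f;\mathcal{M}),v\rangle = \langle\nabla(u_f - U_{f;\mathcal{M}}),\nabla v\rangle = \langle -\Delta(u_f - U_{f;\mathcal{M}}),v\rangle$,

where $-\Delta$ indicates the distributional Laplacian. Consequently, the claimed equality follows from the fact that $-\Delta : H^1_0(\Omega) \to H^{-1}(\Omega)$ is an isometry (which follows from the Cauchy-Schwarz inequality in $L^2(\Omega)$ and from testing with $v = u_f - U_{f;\mathcal{M}}$). The claimed inequality follows by invoking also (2.12) with $V = 0$: $\|\nabla(u_f - U_{f;\mathcal{M}})\| \le \|\nabla u_f\| = \|f\|_{H^{-1}(\Omega)}$. Thus, we aim now at quantifying the dual norm $\|\operatorname{Res}(f;\mathcal{M})\|_{H^{-1}(\Omega)}$. The following simple observation shows that this task requires much more information than computing the Galerkin approximation.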
Lemma 2 (Bounding residual norms). Without any a priori information on the load $f \in H^{-1}(\Omega)$, the residual norm $\|\operatorname{Res}(f;\mathcal{M})\|_{H^{-1}(\Omega)}$ cannot be bounded in terms of a finite number of adaptive evaluations of the form $\langle f,v\rangle$ with $v \in H^1_0(\Omega)$. Proof. Suppose that the claim is false. Then, for each $f \in H^{-1}(\Omega)$, there is a number $B(f) \ge \|\operatorname{Res}(f;\mathcal{M})\|_{H^{-1}(\Omega)}$ which is given in terms of evaluations $\langle f,v_i\rangle$, $i = 1,\dots,n_f$, where the choice of $v_i$ may depend deterministically on the previous evaluations $\langle f,v_1\rangle,\dots,\langle f,v_{i-1}\rangle$. Fix some functional $0 \ne \ell \in H^{-1}(\Omega)$. Since $H^1_0(\Omega)$ is infinite-dimensional, we can choose a normalized $w \in H^1_0(\Omega)$ that is perpendicular to $\mathbb{V}_0(\mathcal{M})$ and all test functions $v_i$, $i = 1,\dots,n_\ell$, associated with $\ell$. Set $\delta := 3B(\ell)(-\Delta)w$ and observe that $U_{\delta;\mathcal{M}} = 0$ and $\langle\delta,v_i\rangle = 0$ for all $i = 1,\dots,n_\ell$. Therefore $\langle\ell+\delta,v_i\rangle = \langle\ell,v_i\rangle$ and we obtain the contradiction

Remark 3 (Load evaluations vs exact integrals).
A similar yet simpler argument shows that, without any a priori information on $f \in L^2(\Omega)$, also $\|f\|$ cannot be bounded in terms of adaptive evaluations $\int_\Omega f v$ with $v \in L^2(\Omega)$.
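The mechanism behind this remark can be illustrated numerically. The following Python sketch is an assumption-laden toy example, not the argument of the article: it fixes (non-adaptively) the hat functions of a uniform mesh of $(0,1)$ as test functions and adds to the load $f = 1$ a perturbation that oscillates once per element. The names `simpson`, `hat`, `evaluations` and `l2_norm` are hypothetical helpers.

```python
import math


def simpson(g, a, b, m=200):
    """Composite Simpson rule on [a, b] with m (even) subintervals."""
    step = (b - a) / m
    s = g(a) + g(b)
    for i in range(1, m):
        s += (4 if i % 2 else 2) * g(a + i * step)
    return s * step / 3.0


n = 8
h = 1.0 / n


def hat(k, x):
    """Hat function of the vertex k*h on the uniform mesh with mesh size h."""
    return max(0.0, 1.0 - abs(x - k * h) / h)


f = lambda x: 1.0
# Perturbation with one full sine period per element (amplitude 10 is arbitrary)
delta = lambda x: 10.0 * math.sin(2.0 * math.pi * n * x)
g = lambda x: f(x) + delta(x)


def evaluations(fun):
    """Integrals of fun against all interior hat functions, element by element."""
    return [simpson(lambda x: fun(x) * hat(k, x), (k - 1) * h, k * h)
            + simpson(lambda x: fun(x) * hat(k, x), k * h, (k + 1) * h)
            for k in range(1, n)]


def l2_norm(fun):
    """L2(0, 1)-norm of fun, integrated element by element."""
    return math.sqrt(sum(simpson(lambda x: fun(x) ** 2, j * h, (j + 1) * h)
                         for j in range(n)))
```

In this example the evaluations of `f` and `g` agree, because the perturbation is odd about every vertex while each hat function is even, whereas the norms differ markedly: $\|f\| = 1$ and $\|g\| = \sqrt{51}$. No bound built only from these evaluations can therefore be valid for both loads.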
Before discussing in §3.3 repercussions of Lemma 2, it is useful to take into account a further requirement for a posteriori bounds.

3.2. Localized residual norm. Adaptive mesh refinement is an important application of a posteriori bounds. It is usually based upon the comparison of local quantities. Therefore, it is of interest to split a posteriori bounds, or the residual norm itself, into local contributions.
Such a localization appears implicitly, e.g., in the a posteriori error analysis of Babuška and Miller [3].It is based upon the W 1,8 -partition of unity (2.9) and the orthogonality property: We thus introduce the subclass of residuals associated with Galerkin approximations.Recall that supp φ z " ω z and that H ´1pω z q is a shorthand for H ´1pω z q.
where the hidden constant depends only on d and the shape coefficient σpMq.
(ii) We have ÿ zPV }ℓ} 2 H ´1pωz q ď pd `1q}ℓ} where the reals c z P R are given by c z :" ´ωz vφ z dx ´ωz φ z dx for z P V 0 , and c z " 0 for z P VzV 0 .
To prove (ii), we let v z P H 1 0 pω z q with }∇v z } ωz ď 1 for any node z P V and set v " and, with the help of two Cauchy-Schwarz inequalities, Consequently, we conclude (ii) by taking the suprema over all v z for all z P V.
Thus, in the context of adaptive mesh refinement, we are also interested in quantifying the single terms of the localized residual norm Of course, we face the same problem for the local residual norms as for the global one.
Corollary 5 (Bounding local residual norms). Without any a priori information on $f \in H^{-1}(\Omega)$, each local residual norm $\|\operatorname{Res}(f;\mathcal{M})\|_{H^{-1}(\omega_z)}$, $z \in \mathcal{V}$, cannot be bounded in terms of a finite number of adaptive evaluations of $f$.
Proof. Replace the domain $\Omega$ by $\omega_z$ in the proof of Lemma 2 and extend functionals in $H^{-1}(\omega_z)$ by 0 on the orthogonal complement of $H^1_0(\omega_z)$ in $H^1_0(\Omega)$.

3.3. Towards error-dominated oscillation. In view of Lemma 2 and Corollary 5, a posteriori bounds for the residual norm or its localized variant require knowledge of the load $f$ beyond a finite number of evaluations. The actual knowledge of $f$ can be of different nature and, accordingly, may require different techniques. Here we want to address only aspects of a posteriori error estimation that are independent of the nature of this knowledge. Correspondingly, we split the residual into a discretized residual and data approximation:

$\operatorname{Res}(f;\mathcal{M}) = (P_{\mathcal{M}} f + \Delta U_{f;\mathcal{M}}) + (f - P_{\mathcal{M}} f)$

such that
• $\|P_{\mathcal{M}} f + \Delta U_{f;\mathcal{M}}\|_{H^{-1}(\mathcal{M})}$ can be bounded with the help of a finite number of evaluations of $f$ and
• the task of bounding $\|f - P_{\mathcal{M}} f\|_{H^{-1}(\mathcal{M})}$ hinges only on knowledge of the load $f$; this task may be viewed as a matter of approximation theory since, apart from the choice of the norm, it is independent of the boundary value problem (2.2).
Here we have used the localized dual norm $\|\cdot\|_{H^{-1}(\mathcal{M})}$ in order to allow for applications in mesh adaptivity. It is then desirable that both parts are dominated by the error, i.e., we have

(3.6a)  $\|P_{\mathcal{M}} f + \Delta U_{f;\mathcal{M}}\|_{H^{-1}(\mathcal{M})} \lesssim \|\nabla(u_f - U_{f;\mathcal{M}})\|$,
(3.6b)  $\|f - P_{\mathcal{M}} f\|_{H^{-1}(\mathcal{M})} \lesssim \|\nabla(u_f - U_{f;\mathcal{M}})\|$.

In view of Lemma 1 and Lemma 4, the two conditions are equivalent.
The construction of a suitable mapping $P_{\mathcal{M}}$ is the new twist in our approach. In order to get first hints on this, let us test several candidates against necessary conditions arising from (3.6b).
The proof of Corollary 5 suggests that the problem lies in the fact that f is taken from an infinite-dimensional space.The projection P 0,M into discrete data from (1.4) is thus a candidate for P M .This choice, however, does not verify (3.6).In fact, Lemma 1, Lemma 4 (ii), and (3.6b) imply the stability estimate while P 0,M f is not even defined for a general f P H ´1pΩq (and cannot be continuously extended; cf.Lemma 20).This flaw is easily remedied.For any element K P M, we replace in (1.4) the characteristic function χ K of K by the weighted mean thanks to (2.10) and consider Since ψ K P H 1 0 pKq Ă H 1 0 pΩq is an admissible test function, the operator P0,M is defined for all functionals in H ´1pΩq and satisfies the stability estimate (3.7); see Remark 11 below.
But still, the new operator P0,M does not verify (3.6).To see this, consider f " ´∆V with V P V 0 pMq arbitrary.We then have u f " U f ;M and therefore Respf ; Mq " 0 and property (3.6b) entails (3.10) @V P V 0 pMq P M p∆V q " ∆V.
In addition, integration by parts yields that, for all v P H 1 0 pΩq, where ds indicates the pd ´1q-dimensional Hausdorff measure in R d and JpV q is the jump in the normal flux ∇V ¨n across interelement sides.More precisely, if F " K 1 X K 2 is the intersection of the elements K 1 , K 2 P M with respective outer normals n 1 , n 2 , then JpV q| F :" ∇V | K1 ¨n1 `∇V | K2 ¨n2 P R. If V ‰ 0, then we have also ∆V ‰ 0, while (3.11) yields P0,M p∆V q " 0, in contradiction with (3.10).Hence (3.6) does not hold for P0,M .The two conditions (3.7) and (3.10) are central to our goals.Although they can be checked without involving the Galerkin approximation (2.11), they are also sufficient for (3.6), Incidentally, they imply that P M has to be a near best 'interpolation' operator in light of the Lebesgue lemma.
The failure of (3.10) for P0,M is not related to the choice of the test functions ψ K , K P M, but to its range.In fact, (3.11) and the fundamental lemma of calculus of variation show that ∆V R L 2 pΩq whenever V ‰ 0, while P0,M pV 0 pMqq Ă L 2 pΩq.In other words: to remedy, we have to change the range.
Finally, it is desirable that P M is a local operator for two reasons.First, this comes in useful when evaluating P M .Second, since ´∆ is a local operator, we have the following lower bound for the local error: which follows from testing (3.1) with all v from H 1 0 pω z q.This bound can be exploited if we strengthen (3.6) to the local conditions for all z P V. We shall therefore demand the stability (3.7) and invariance (3.10) in a suitable local manner.
In order to formulate local invariance, let us introduce the following notations associated with an open subset ω Ă Ω.If ℓ 1 , ℓ 2 P H ´1pΩq, we say ℓ 1 " ℓ 2 on ω whenever ℓ 1 pvq " ℓ 2 pvq for all v P H 1 0 pωq.Moreover, we write ℓ 1 P DpMq on ω when additionally ℓ 2 can be chosen such that ℓ 2 P DpMq.Notice that, thanks to the fundamental lemma of the calculus of variations, these notions reduce to the usual ones if ℓ P L 2 pΩq, i.e. ℓpvq " ´Ω gv for all v P H 1 0 pΩq.Let us summarize our discussion by a list of desired properties for the operator P M and its range DpMq Ă H ´1pΩq, which corresponds to the set of all possible discretized residuals.This list provides the guidelines for our approach and choices.Denoting by ∆pV 0 pMqq " t∆V | V P V 0 pMqu the image of V 0 pMq under the distributional Laplacian, we aim for the following properties: ∆pV 0 pMqq Ă DpMq, (3.14a) if ℓ P DpMq on ωz , then }ℓ} H ´1pωz q is quantifiable with a finite number (3.14b) of evaluations of ℓ, Regarding the above discussion, we have that conditions (3.14f), (3.14e) and (3.14a) are equivalent to (3.13); cf.§3.7.Conditions (3.14d) and (3.14b) allow to quantify the local dual norms of the approximate residual P M f `∆U f ;M P DpMq in a computable manner; compare also with §3.6 below.
In the next three sections we construct two operators P M fulfilling (3.14).
3.4. Discretized residuals and a locally stable biorthogonal system. We present a possible choice of the set $\mathbb{D}(\mathcal{M})$ of discretized residuals and introduce an associated biorthogonal system, which is instrumental in constructing a suitable operator $P_{\mathcal{M}}$ with range $\mathbb{D}(\mathcal{M})$. We set

(3.15)  $\mathbb{D}(\mathcal{M}) := \Big\{ \ell \in H^{-1}(\Omega) \;\Big|\; \langle\ell,v\rangle = \sum_{K\in\mathcal{M}} c_K \int_K v\,dx + \sum_{F\in\mathcal{F}} c_F \int_F v\,ds$ for all $v \in H^1_0(\Omega)$ with $c_K, c_F \in \mathbb{R}$ for $K \in \mathcal{M}$, $F \in \mathcal{F} \Big\}$.

Every functional $\ell \in \mathbb{D}(\mathcal{M})$ is thus constant on each element and on each face. Obviously, condition (3.14a) is verified. More precisely, $\mathbb{D}(\mathcal{M})$ is a strict superset of $\Delta(\mathbb{V}_0(\mathcal{M}))$, since in $\Delta(\mathbb{V}_0(\mathcal{M}))$ only certain linear combinations of the constants $c_F$, $F \in \mathcal{F}$, are allowed. The fact that these constants are independent in $\mathbb{D}(\mathcal{M})$ facilitates the definition of $P_{\mathcal{M}}$. Moreover, we have added the contributions given by the constants $c_K$, $K \in \mathcal{M}$, for comparability with the classical oscillations and a posteriori error estimators and because similar contributions will appear for higher order elements; cf. Kreuzer and Veeser [15]. In spite of these enlargements, we still have $\dim\mathbb{D}(\mathcal{M}) < \infty$. Consequently, an argument as in the proof of Lemma 2, which hinges on infinite dimension, is ruled out.
Let us associate a biorthogonal system with $\mathbb{D}(\mathcal{M})$. To this end, we introduce the surface Dirac distributions

(3.16)  $\langle\chi_F,v\rangle := \int_F v\,ds$, $\quad v \in H^1_0(\Omega)$, $F \in \mathcal{F}$,

and we identify the characteristic functions $\chi_K$, $K \in \mathcal{M}$, with their associated distributions $\langle\chi_K,v\rangle := \int_K v\,dx$, $v \in H^1_0(\Omega)$. Notice that the definitions of $\chi_F$ and $\chi_K$ involve different measures for integration: the $(d-1)$-dimensional Hausdorff measure for $\chi_F$ and the $d$-dimensional Lebesgue measure for $\chi_K$. Correspondingly, each $\chi_K$ is absolutely continuous and each $\chi_F$ is singular with respect to the $d$-dimensional Lebesgue measure. We collect all elements and interelement faces in the index set $\mathcal{I} = \mathcal{I}(\mathcal{M}) := \mathcal{M}\cup\mathcal{F}$ and derive in the next lemma the properties of the functionals $\chi_i$, $i \in \mathcal{I}$, that are of interest to us.
Lemma 6 (Basis and scaling).The functionals χ i , i P I, are a basis of DpMq.For any element K P M and any face F P F containing a vertex z P V, we have Proof.We will use the Friedrichs inequality (3.17) @v P H 1 0 pω z q }v} ωz ď ρz }∇v} ωz and the following trace theorem: if F P F with F Q z and n denotes a normal of F , then (3.18) @w P W 1,1 0 pω z q }w} L 1 pF q ď 1 2 }∇w ¨n} L 1 pωzq .
Given K P M with K Q z and any v P H 1 0 pω z q, the Cauchy-Schwarz inequality and (3.17) yield which verifies the first claimed inequality.To show the second one, fix F P F with F Q z and let again v P H 1 0 pω z q.Using (3.18) with w " v 2 and then again (3.17), we derive and also the second claimed inequality is proved.
In order to complete the basis of Lemma 6 to a biorthogonal system, we use the following test functions: Given any element K P M, take Given any interelement face F P F , let z i , i " 1, 2, be the vertices in the patch ω F , see (2.3), that are opposite to F and set Let us verify that the basis χ i , i P I and the test functions ψ i , i P I, actually form a biorthogonal system with a crucial stability condition.
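Lemma 7 (Locally stable biorthogonal system). Together with the basis $\chi_i$, $i \in \mathcal{I}$, the test functions $\psi_i$, $i \in \mathcal{I}$, form a locally stable biorthogonal system: (i) We have $\forall i,j \in \mathcal{I}\ \ \langle\chi_i,\psi_j\rangle = \delta_{ij}$. (ii) Let $\mathcal{I}_z := \{i \in \mathcal{I} \mid i \ni z\}$ denote the elements and faces containing a vertex $z \in \mathcal{V}$. Then $\forall i \in \mathcal{I}_z\ \ \|\chi_i\|_{H^{-1}(\omega_z)}\,\|\nabla\psi_i\|_{\omega_z} \le C_\psi$, where the stability constant $C_\psi$ only depends on $d$ and the shape coefficient $\sigma(\mathcal{M})$.
Proof. To show (i), we consider the cases of elements $j \in \mathcal{M}$ and faces $j \in \mathcal{F}$ separately. First, let $K \in \mathcal{M}$ be an element. As already seen in (3.8), we have $\langle\chi_K,\psi_K\rangle = \int_K \psi_K = 1$. Moreover, since $\psi_K = 0$ in $\Omega\setminus\mathring{K}$, we infer $\langle\chi_{K'},\psi_K\rangle = 0$ for any $K' \in \mathcal{M}\setminus\{K\}$ and $\langle\chi_F,\psi_K\rangle = 0$ for any $F \in \mathcal{F}$.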
Second, fix a face F P F .Using (2.10), we obtain From ψ F " 0 in Ωz 8 ω F , where ω F is the patch of the two elements containing the face F , we infer xχ F 1 , ψ F y " 0 for any F 1 P F ztF u and xχ K , ψ F y " 0 for any K P M with K Č F .Last, let K P M such that K Ą F .Using again (2.10), we deduce For (ii), we again treat elements and faces separately.Let K P M be an element containing z.The well-known inverse estimate }∇ψ Combining this with the first inequality in Lemma 6 and (2.8), we obtain the claimed inequality for elements: Let F P F be an interelement face containing z and write F " K 1 X K 2 , where K 1 , K 2 P M are the two elements containing F .Proceeding as before, we deduce In what follows, we shall rely only on the properties of the test functions ψ i , i P I expressed in Lemma 7. In other words: what counts is not their special form, but the fact that they form a stable biorthogonal system with the basis χ i , i P I, of DpMq.

3.5. Construction and properties of $P_{\mathcal{M}}$. We now propose a possible choice for the projection operator $P_{\mathcal{M}}$ and verify the desired properties (3.14). Set

(3.21)  $P_{\mathcal{M}} f := \sum_{i\in\mathcal{I}} \langle f,\psi_i\rangle\,\chi_i$, $\quad f \in H^{-1}(\Omega)$,

where the functionals $\chi_i$, $i \in \mathcal{I}$, are given by (3.16) and the test functions $\psi_i$, $i \in \mathcal{I}$, by (3.19). Clearly, $P_{\mathcal{M}}$ is linear and $P_{\mathcal{M}} f$ is locally computable in terms of a finite number of evaluations of $f$, i.e., we have (3.14c) and (3.14d). The biorthogonality of these functionals and test functions implies the following local counterparts of the algebraic condition (3.10).
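Theorem 8 (Local invariance). For any functional $\ell \in H^{-1}(\Omega)$, element $K \in \mathcal{M}$, and side $F \in \mathcal{F}$, the operator $P_{\mathcal{M}}$ does not change the following discrete restrictions: (i) If $\ell \in \mathbb{D}(\mathcal{M})$ on $K$, then $P_{\mathcal{M}}\ell = \ell$ on $K$. (ii) If $\ell \in \mathbb{D}(\mathcal{M})$ on $\omega_F$, then $P_{\mathcal{M}}\ell = \ell$ on $\omega_F$.
Proof. Let $\ell = c\chi_K$ on $K$ with $c \in \mathbb{R}$. For any $i \in \mathcal{I}$, we have $\langle\ell,\psi_i\rangle = c\int_K\psi_i = c\,\delta_{K,i}$ by means of Lemma 7 (i). Consequently, $P_{\mathcal{M}}\ell = c\chi_K$ on $K$, which proves (i).
To show (ii), let $K_1, K_2 \in \mathcal{M}$ be the two elements containing $F$ and let $\ell = c\chi_F + \sum_{i=1,2} c_i\chi_{K_i}$ on $\omega_F$ with $c, c_1, c_2 \in \mathbb{R}$. Using again Lemma 7 (i), we observe $\langle\ell,\psi_F\rangle = c$, $\langle\ell,\psi_{K_i}\rangle = c_i$ for $i = 1,2$, and $\langle\ell,\psi_i\rangle = 0$ for all $i \in \mathcal{I}\setminus\{F,K_1,K_2\}$. Consequently, $P_{\mathcal{M}}\ell = c\chi_F + \sum_{i=1,2} c_i\chi_{K_i} = \ell$ on $\omega_F$, and also (ii) is verified.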
Theorem 8 implies in particular (3.14e).Moreover, it has the following global consequences.
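Corollary 9 (Global invariance). The operator $P_{\mathcal{M}}$ is a linear projection onto the discretized residuals $\mathbb{D}(\mathcal{M})$ from (3.15). In particular, we have $P_{\mathcal{M}}(\Delta V) = \Delta V$ and $P_{\mathcal{M}}(f) = f$ for any $V \in \mathbb{V}_0(\mathcal{M})$ and any $\mathcal{M}$-piecewise constant function $f \in \mathbb{P}_0(\mathcal{M})$.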
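The projection property can be checked in a one-dimensional toy setting. The following Python sketch is an illustration under assumptions and not the construction of the article: it reads the operator as $P_{\mathcal{M}} f = \sum_i \langle f,\psi_i\rangle\chi_i$, works on a mesh of $(0,1)$ where the interelement faces are the interior vertices (so that $\chi_F$ is a point evaluation), and uses its own hypothetical test functions, namely scaled element bubbles `psi_elem` and hat functions corrected by bubbles `psi_face`, chosen only so that biorthogonality holds.

```python
import math

# 3-point Gauss-Legendre rule, exact for the piecewise quadratics used below
_GAUSS = [(-math.sqrt(0.6), 5.0 / 9.0), (0.0, 8.0 / 9.0), (math.sqrt(0.6), 5.0 / 9.0)]


def integrate(g, a, b):
    mid, half = 0.5 * (a + b), 0.5 * (b - a)
    return half * sum(w * g(mid + half * t) for t, w in _GAUSS)


nodes = [0.0, 0.2, 0.5, 0.6, 1.0]        # non-uniform mesh of (0, 1)
elements = list(range(len(nodes) - 1))   # element j = [nodes[j], nodes[j+1]]
faces = list(range(1, len(nodes) - 1))   # interior vertices play the role of faces


def hat(k, x):
    """Nodal basis function of the vertex nodes[k]."""
    if k > 0 and nodes[k - 1] <= x <= nodes[k]:
        return (x - nodes[k - 1]) / (nodes[k] - nodes[k - 1])
    if k < len(nodes) - 1 and nodes[k] <= x <= nodes[k + 1]:
        return (nodes[k + 1] - x) / (nodes[k + 1] - nodes[k])
    return 0.0


def psi_elem(j, x):
    """Element bubble scaled to have integral 1 over element j."""
    a, b = nodes[j], nodes[j + 1]
    if not a <= x <= b:
        return 0.0
    lam = (x - a) / (b - a)
    return 6.0 / (b - a) * lam * (1.0 - lam)


def psi_face(k, x):
    """Hat function minus bubbles, so that its mean vanishes on both neighbours."""
    corr = sum(0.5 * (nodes[j + 1] - nodes[j]) * psi_elem(j, x) for j in (k - 1, k))
    return hat(k, x) - corr


def discrete_functional(c_elem, c_face):
    """Functional v -> sum_K c_K * int_K v + sum_F c_F * v(z_F)."""
    def ell(v):
        vol = sum(c_elem[j] * integrate(v, nodes[j], nodes[j + 1]) for j in elements)
        pts = sum(c_face[k] * v(nodes[k]) for k in faces)
        return vol + pts
    return ell


def project(ell):
    """Coefficients <ell, psi_i> of the projection with respect to the basis chi_i."""
    return ([ell(lambda x, j=j: psi_elem(j, x)) for j in elements],
            {k: ell(lambda x, k=k: psi_face(k, x)) for k in faces})


c_elem = [1.0, -2.0, 0.5, 3.0]
c_face = {1: 4.0, 2: -1.0, 3: 2.5}
p_elem, p_face = project(discrete_functional(c_elem, c_face))
```

With these choices the computed coefficients `p_elem` and `p_face` reproduce `c_elem` and `c_face` up to rounding, which is the invariance of discrete functionals in this simplified setting.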
Next, we verify the local stability (3.14f) of P M . As a by-product, we also obtain the local stability of the operator P0,M , which was left open in §3.3.
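Theorem 10 (Local stability). The linear projection $P_{\mathcal{M}}$ is locally $H^{-1}$-stable: for any functional $\ell \in H^{-1}(\Omega)$ and any vertex $z \in \mathcal{V}$, we have $\|P_{\mathcal{M}}\ell\|_{H^{-1}(\omega_z)} \lesssim \|\ell\|_{H^{-1}(\omega_z)}$, where the hidden constant depends only on $d$ and $\sigma(\mathcal{M})$.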
Proof.Given v P H 1 0 pω z q, we derive where we used Lemma 7 (ii) and #I z À 1.
Remark 11 (Stability of P0,M ).The argument in the proof of Theorem 10 also shows that P0,M is locally H ´1-stable.In fact, one simply replaces P M by P0,M and the index set I z by I z X M.
Let us conclude this section with the following further remarks on the linear projection P M .
Remark 12 (Orthogonality). For any $\ell \in H^{-1}(\Omega)$, the functional $\ell - P_{\mathcal{M}}\ell$ is orthogonal to $\operatorname{span}\{\psi_i \mid i \in \mathcal{I}\}$. This is an immediate consequence of Lemma 7 (i).
Remark 13 (Adjoint of $P_{\mathcal{M}}$). Formally, the adjoint of $P_{\mathcal{M}}$ is given by $P_{\mathcal{M}}^* v = \sum_{i\in\mathcal{I}} \langle\chi_i,v\rangle\,\psi_i$. Here Lemma 7 (i) implies

(3.22)  $\int_K P_{\mathcal{M}}^* v = \int_K v$ and $\int_F P_{\mathcal{M}}^* v = \int_F v$

for all elements $K \in \mathcal{M}$ and interelement faces $F \in \mathcal{F}$. The operator $P_{\mathcal{M}}^*$ and these conditions, which characterize it, were used in Veeser [23] to derive an a posteriori error upper bound in terms of a hierarchical estimator. That argument, as well as Morin, Nochetto, and Siebert [18, Theorem 3.6] and Verfürth [24, (3.14)], is closely related to Theorem 15 below.
3.6. Required a priori information, an alternative to $P_{\mathcal{M}}$, and quantification of the discretized residual. The purpose of this section is twofold. First, we illustrate which type of a priori information on $f$ in (2.2) is needed to carry out our approach, presenting also a possible alternative to $P_{\mathcal{M}}$. Second, we show that a stable biorthogonal system is not only useful to construct $P_{\mathcal{M}}$, but also to quantify the local dual norms of discretized residuals.
Clearly, the operator $P_{\mathcal{M}}$ of §3.5 can be applied to the right-hand side $f$ of (2.2) whenever

(3.23)  $\langle f,\psi_i\rangle$, $\ i \in \mathcal{I}$,

are known exactly.
In order to ensure a meaningful discretized residual, this information goes beyond (2.14), the information necessary for the Galerkin approximation (2.11) on the mesh $\mathcal{M}$; it is available, e.g., when one is able to compute the counterpart of (2.11) of order $d+1$ over $\mathcal{M}$.
There are other possibilities to obtain a meaningful discretized residual.The following one fits particularly well to (2.14) in the context of mesh adaptivity.Suppose that we are given an initial mesh and a refinement procedure such that the set M of all refined meshes form a shape-regular family.Furthermore, suppose that, for any mesh M P M, there is a refinement Ă M P M with vertices Vp Ă Mq that satisfies the following properties: Let us now fix a mesh M P M and a refinement Ă M P M satisfying (3.24).For any i P IpMq, using (3.24b), we fix a vertex r z P Vp Ă Mq interior to i and denote by r φ r z its associated hat function in Vp Ă Mq.We then obtain counterparts r ψ i , i P I, of the test functions ψ i , i P I, by using these hat functions with a suitable scaling in place of the element and faces bubble functions in (3.19) such that the following lemma holds.We skip the technical details, referring to Morin, Nochetto and Siebert [17] and Veeser [23].
Lemma 14 (Another locally stable biorthogonal system).Together with the basis χ i , i P I, the test functions r ψ i , i P I, form a locally stable biorthogonal system: (i) We have (ii) Let I z " ti P I | i Q zu denote the elements and faces containing a vertex z P V. Then @i P where the stability constant C ψ only depends on d and the shape coefficient σpMq.
Thus, the operator defines an alternative to P M and the properties (3.14) without (3.14b)can be established as for P M .The operator r P M can be evaluated on any mesh M P M whenever where t r φ z u zPV0p Ă Mq denotes the nodal basis of V 0 p Ă Mq.This is exactly (2.14) for all meshes in M. Consequently, it is also needed to ensure that an adaptive algorithm with the above refinement procedure can always compute the Galerkin approximation (2.11).
Let us now turn to the quantification of the discretized residual and verify (3.14b), considering a general locally stable biorthogonal system.
Theorem 15 (Quantifying local dual norms).Let ψ i , i P I, be the test functions from Lemma 7 or Lemma 14.If ℓ P DpMq on a star ω z , then the corresponding local dual norm can be quantified by a finite number of evaluations: where the hidden constant depends on d, σpMq, and C ψ .
Proof. Let us first prove the lower bound, which holds for an arbitrary functional $\ell \in H^{-1}(\Omega)$. Indeed, the definition of the dual norm readily yields it.

To show the upper bound, we need to assume that $\ell \in D(\mathcal{M})$ on $\omega_z$. Given $v \in H^1_0(\omega_z)$, we can then write $\langle \ell, v \rangle = \sum_{i \in \mathcal{I}_z} c_i \langle \chi_i, v \rangle$ with $c_i \in \mathbb{R}$.
In light of the biorthogonality, we have $c_i = \langle \ell, \psi_i \rangle$. Using also the local stability of the biorthogonal system, we infer a bound for $|\langle \ell, v \rangle|$ in terms of a sum over $i \in \mathcal{I}_z$. Since the solid angle of every simplex containing $z$ is bounded away from $0$ in terms of $d$ and the shape coefficient $\sigma(\mathcal{M})$, we have $\#\mathcal{I}_z \le C_{\sigma(\mathcal{M})}$. Consequently, the Cauchy-Schwarz inequality on the sum implies the desired upper bound.
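The last step of the proof can be spelled out as follows. This is only a sketch: the nonnegative numbers $a_i$, $i \in \mathcal{I}_z$, are an assumption introduced here to stand for the summands of the bound on $|\langle \ell, v \rangle|$, whose exact form is not repeated. The Cauchy-Schwarz inequality in $\mathbb{R}^{\#\mathcal{I}_z}$ gives

\[
\sum_{i \in \mathcal{I}_z} a_i
= \sum_{i \in \mathcal{I}_z} 1 \cdot a_i
\le \bigl( \#\mathcal{I}_z \bigr)^{1/2} \Bigl( \sum_{i \in \mathcal{I}_z} a_i^2 \Bigr)^{1/2}
\le C_{\sigma(\mathcal{M})}^{1/2} \Bigl( \sum_{i \in \mathcal{I}_z} a_i^2 \Bigr)^{1/2},
\]

so that a plain sum over $\mathcal{I}_z$ is controlled by the corresponding $\ell^2$-sum, with a constant depending only on the bound for $\#\mathcal{I}_z$.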
Theorem 15 implies the missing (3.14b) for both operators $P_{\mathcal{M}}$ and $\widetilde{P}_{\mathcal{M}}$ and, in accordance with §3.3, we have splittings of the local residual norms with the desired properties. Notice that, in view of the discussion of this section and Corollary 5, bounding the oscillation terms of these splittings cannot be done in general with a finite number of evaluations of the load $f$. Notably, these terms involve only the load, and the discretized residuals can be quantified with finite information, which, in light of Remark 3, is less than the information required for evaluating local $L^2$-norms of the load $f$.

3.7. A posteriori error bounds. We now summarize our preceding results by deriving a posteriori error bounds. The resulting bounds are defined for any load $f \in H^{-1}(\Omega)$ and the oscillation is dominated by the error.
The following statements remain correct if $P_{\mathcal{M}}$ is replaced by $\widetilde{P}_{\mathcal{M}}$ from (3.25).
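Theorem 16 (Abstract upper bound). For any functional $f \in H^{-1}(\Omega)$ and any conforming mesh $\mathcal{M}$, the error $\|\nabla(u_f - U_{f;\mathcal{M}})\|$ is bounded, up to a multiplicative constant, in terms of the local dual norms of the discretized residual and of the oscillation over the stars $\omega_z$, $z \in \mathcal{V}$. Each local dual norm $\|P_{\mathcal{M}} f + \Delta U_{f;\mathcal{M}}\|_{H^{-1}(\omega_z)}$ of the discretized residual can be quantified with a finite number of evaluations of $f$, while the quantification of the local dual norms $\|P_{\mathcal{M}} f - f\|_{H^{-1}(\omega_z)}$ of the oscillation requires additional a priori information on $f$.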
Proof. Lemma 1, Lemma 4 and a triangle inequality imply the claimed bound. Recalling that $P_{\mathcal{M}} f + \Delta U_{f;\mathcal{M}} \in D(\mathcal{M})$, Theorem 15 and Corollary 5 ensure the statements about the quantification of the two parts of the bound.
In contrast to previous results available in the literature, the complete upper bound in Theorem 16 is also a lower bound, even locally.
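Theorem 17 (Abstract local lower bounds). For any functional $f \in H^{-1}(\Omega)$ and any conforming mesh $\mathcal{M}$, the discretized residual and the oscillation are locally dominated by the error: for every vertex $z \in \mathcal{V}$, both $\|P_{\mathcal{M}} f + \Delta U_{f;\mathcal{M}}\|_{H^{-1}(\omega_z)}$ and $\|P_{\mathcal{M}} f - f\|_{H^{-1}(\omega_z)}$ are bounded, up to a multiplicative constant, by the local error $\|\nabla(u_f - U_{f;\mathcal{M}})\|_{\omega_z}$.

Proof. In light of (3.12), the first claimed inequality follows from the triangle inequality and the second one. The latter is a consequence of Theorems 8 and 10 and (3.12).

Squaring and summing, we readily get the global lower bounds.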
Corollary 18 (Abstract global lower bounds). For any functional $f \in H^{-1}(\Omega)$ and any conforming mesh $\mathcal{M}$, the discretized residual and the oscillation are globally dominated by the error $\|\nabla(u_f - U_{f;\mathcal{M}})\|$.

To summarize: if we are able to quantify the oscillation terms $\|P_{\mathcal{M}} f - f\|_{H^{-1}(\omega_z)}$, $z \in \mathcal{V}$, then the right-hand side in Theorem 16 is a truly equivalent a posteriori error estimator.
Remark 19 (Surrogate oscillation). The quantification of the local dual norms $\|P_{\mathcal{M}} f - f\|_{H^{-1}(\omega_z)}$, $z \in \mathcal{V}$, of the oscillation appears to be a difficult matter. In [11, Section 7], Cohen, DeVore, and Nochetto consider similar terms for special $f$ and resort to surrogates that can be approximated with the help of numerical integration. Those surrogates hinge on additional regularity of $f$, which entails the risk of overestimation; cf. Lemma 20 below.
3.8. Classical versus error-dominated oscillation. In this section we compare the error-dominated oscillation $\bigl( \sum_{z \in \mathcal{V}} \|P_{\mathcal{M}} f - f\|^2_{H^{-1}(\omega_z)} \bigr)^{1/2}$ with the classical $L^2$- and $H^{-1}$-oscillation, $\operatorname{osc}_0(f, \mathcal{M})$ and $\min_{g \in P_0(\mathcal{M})} \|f - g\|_{H^{-1}(\Omega)}$, from (1.4) and (1.5) in the introduction. Doing so, we verify statements of the introduction and substantiate the advantages of the stability and invariance properties of the operator $P_{\mathcal{M}}$.

Let us first show that the error-dominated oscillation is always smaller, up to a multiplicative constant, than both classical oscillations. To this end, let $f \in H^{-1}(\Omega)$ and let $g \in P_0(\mathcal{M})$ be an arbitrary piecewise constant approximation over $\mathcal{M}$. The local invariance and stability properties of $P_{\mathcal{M}}$ in Theorems 8 and 10 imply, for all $z \in \mathcal{V}$, a local bound on $\|P_{\mathcal{M}} f - f\|_{H^{-1}(\omega_z)}$ in terms of $f - g$. Combining this with Lemma 4 (ii) and minimizing over $g$, we obtain the bound in terms of the classical $H^{-1}$-oscillation. To show the other bound, suppose $f \in L^2(\Omega)$. Making use of the orthogonality of $P_{0,\mathcal{M}}$ and Poincaré inequalities in the elements of $\omega_z$, we deduce a local estimate which, together with (3.29), gives the bound in terms of the $L^2$-oscillation.

The converse bounds of (3.30) do not hold. For the classical $L^2$-oscillation, this applies even on a fixed mesh and is in particular due to stability issues. The following lemma provides an illustration, relating the classical $L^2$-oscillation directly to the error, which in turn dominates the error-dominated oscillation.
Lemma 20 (Overestimation of classical $L^2$-oscillation). For any conforming mesh $\mathcal{M}$, there exists a sequence $(f_k)_k \subset L^2(\Omega)$ with the following properties. On the one hand, the energy norm errors $\|\nabla(u_{f_k} - U_{f_k;\mathcal{M}})\|$ are uniformly bounded with respect to $k$. On the other hand, in view of $\lim_{k \to \infty} \|f_k\|_{L^2(\Omega)} = \infty$, the oscillation $\operatorname{osc}_0(f_k, \mathcal{M})$ becomes arbitrarily large for $k \to \infty$.
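The bound in terms of the $L^2$-oscillation discussed before Lemma 20 rests on a standard argument, which the following sketch spells out for a single star. The notation is assumed here for illustration and not fixed by the text: $\bar v_K$ is the mean value of $v$ on the element $K$, $h_K$ is the diameter of $K$, $C_P$ is a Poincaré constant with $\|v - \bar v_K\|_{L^2(K)} \le C_P h_K \|\nabla v\|_{L^2(K)}$, and $P_{0,\mathcal{M}} f$ is taken to be the elementwise mean of $f$. For $f \in L^2(\Omega)$ and $v \in H^1_0(\omega_z)$,

\[
\begin{aligned}
\langle f - P_{0,\mathcal{M}} f, v \rangle
&= \sum_{K \subset \omega_z} \int_K (f - P_{0,\mathcal{M}} f)(v - \bar v_K)\,dx \\
&\le \sum_{K \subset \omega_z} \|f - P_{0,\mathcal{M}} f\|_{L^2(K)}\, C_P h_K \|\nabla v\|_{L^2(K)} \\
&\le C_P \Bigl( \sum_{K \subset \omega_z} h_K^2 \|f - P_{0,\mathcal{M}} f\|_{L^2(K)}^2 \Bigr)^{1/2} \|\nabla v\|_{L^2(\omega_z)} .
\end{aligned}
\]

The first identity uses that $f - P_{0,\mathcal{M}} f$ has vanishing mean on every element, the second line the Cauchy-Schwarz and Poincaré inequalities on each $K$, and the last line the Cauchy-Schwarz inequality for the sum. Taking the supremum over $v$ with $\|\nabla v\|_{L^2(\omega_z)} \le 1$ bounds $\|f - P_{0,\mathcal{M}} f\|_{H^{-1}(\omega_z)}$ by a local, $h_K$-weighted $L^2$-quantity.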
In the case of the classical $H^{-1}$-oscillation, (3.30a) cannot be inverted because of invariance issues. Let us illustrate this again by the relationship to the Galerkin error. Consider $f = -\Delta V$ for some $V \in \mathbb{V}_0(\mathcal{M}^\dagger) \setminus \{0\}$ (3.31), where $\mathcal{M}^\dagger$ is some conforming simplicial mesh of $\Omega$. For any conforming refinement $\mathcal{M}$ of $\mathcal{M}^\dagger$, we then have $u_f = V = U_{f;\mathcal{M}}$ and $f \notin P_0(\mathcal{M})$. Hence the error vanishes, whereas the classical $H^{-1}$-oscillation is positive; it can be made arbitrarily large for a given $\mathcal{M}$ but decreases to $0$ under suitable refinement.

One could argue that the (neighborhoods of the) loads (3.31) are very special, in particular because the optimal convergence rate of (3.31) is formally $\infty$. Here is another example based upon Cohen, DeVore, and Nochetto [11, Section 6.4], where the optimal nonlinear convergence rate for the error is finite and often encountered in practice.

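$\cdots\bigr)^{1/2} \ge L_n\, n^{-1/2}$ (3.33) hold. It thus remains to establish (3.32b). To this end, we fix temporarily an arbitrary vertex $z \in \mathcal{V}$ of a conforming mesh $\mathcal{M}$ and let $g \in P_0(\mathcal{M})$. The inverse triangle inequality and (3.12) yield

\[
\begin{aligned}
\|f - g\|_{H^{-1}(\omega_z)}
&\ge \|\Delta U_{f;\mathcal{M}} + g\|_{H^{-1}(\omega_z)} - \|f + \Delta U_{f;\mathcal{M}}\|_{H^{-1}(\omega_z)} \\
&\ge \|\Delta U_{f;\mathcal{M}} + g\|_{H^{-1}(\omega_z)} - \|\nabla(u_f - U_{f;\mathcal{M}})\|_{\omega_z} .
\end{aligned}
\]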
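The two inequalities of this step can be checked directly. The following is a sketch that uses only the weak formulation of (1.1), with test functions $v \in H^1_0(\omega_z)$ extended by zero to $\Omega$ (an assumption on how the local test functions are embedded). Writing $\Delta U_{f;\mathcal{M}} + g = (f + \Delta U_{f;\mathcal{M}}) - (f - g)$, the triangle inequality for $\|\cdot\|_{H^{-1}(\omega_z)}$ gives

\[
\|\Delta U_{f;\mathcal{M}} + g\|_{H^{-1}(\omega_z)}
\le \|f - g\|_{H^{-1}(\omega_z)} + \|f + \Delta U_{f;\mathcal{M}}\|_{H^{-1}(\omega_z)} ,
\]

which is the first inequality after rearranging. For the second one, the weak formulation and the Cauchy-Schwarz inequality yield

\[
\langle f + \Delta U_{f;\mathcal{M}}, v \rangle
= \int_{\omega_z} \nabla(u_f - U_{f;\mathcal{M}}) \cdot \nabla v \,dx
\le \|\nabla(u_f - U_{f;\mathcal{M}})\|_{L^2(\omega_z)} \|\nabla v\|_{L^2(\omega_z)} ,
\]

so that $\|f + \Delta U_{f;\mathcal{M}}\|_{H^{-1}(\omega_z)} \le \|\nabla(u_f - U_{f;\mathcal{M}})\|_{L^2(\omega_z)}$.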
By Lemma 7, we have, for all K P M, and, for all F P F and Theorem 15 therefore implies Exploiting also Lemma 4, we arrive at ˜ÿ zPV
Remark 22 (Overestimation of $H^{-1}$-variant of standard residual estimator). As pointed out by Cohen, DeVore, and Nochetto [11], the example of Lemma 21 entails that the right-hand side of a variant of the standard residual estimator defined for all loads $f \in H^{-1}(\Omega)$ is overestimating. In §4.2 below, we propose through our new approach another variant that is free of overestimation.

Realizations with classical techniques
The a posteriori error bounds in §3.7 are abstract in that they are given in terms of the local dual norms $\|\cdot\|_{H^{-1}(\omega_z)}$, $z \in \mathcal{V}$, of the discretized residual and the oscillation. For the norms $\|P_{\mathcal{M}} f + \Delta U_{f;\mathcal{M}}\|_{H^{-1}(\omega_z)}$, $z \in \mathcal{V}$, of the discretized residual, we required a quantification in terms of finite information on the load and provided a possible realization in Theorem 15. In this section we discuss a selection of alternative realizations. All realizations are motivated by classical approaches to a posteriori analysis and cover two explicit and two implicit techniques. It is worth making the following observations:

As a consequence, we also have the following counterpart of (4.2):

In order to prove the converse bound, we may proceed with the help of $P_{\mathcal{M}}$ as in [23]. However, having Theorem 15 at our disposal, it is simpler to exploit (4.3). We immediately see We therefore obtain $\|\nabla\psi_F\|^{-1} \|\nabla\lambda_i\| \lesssim |F|$ and (4.5b).

Summing up, the hierarchical estimator quantifies the local discretized residual, and we have the following a posteriori bounds.
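Theorem 23 (Hierarchical estimator with error-dominated oscillation). For any functional $f \in H^{-1}(\Omega)$ and any conforming mesh $\mathcal{M}$, the error is globally equivalent to the hierarchical estimator combined with the error-dominated oscillation, and the corresponding local lower bounds hold for every $z \in \mathcal{V}$. The hidden constants depend only on $d$ and $\sigma(\mathcal{M})$.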

4.2. An improved standard residual estimator. The standard residual estimator applies suitably scaled norms to the jump and element residual; see, e.g., Verfürth [25, Section 1.4]. In the case of the discretized residual, this leads to face and element indicators, where $h_F$ and $h_K$ denote, respectively, the diameters of $F$ and $K$, and computability is given in terms of $U_{f;\mathcal{M}}$ and (3.23). These indicators actually quantify the discretized residual, and in a way that is closely tied to Theorem 15; see (4.6), where the hidden constants depend only on $d$ and $\sigma(\mathcal{M})$. To see (4.6a), let $F \in \mathcal{F}$ be any interelement face. Lemma 7 (i), the trace inequality (3.18) for $w = \psi_F^2$ and the Friedrichs inequality (3.17) for $v = \psi_F$, both with $\omega_F$ in place of $\omega_z$, give the required estimate.

Inserting the combination of Theorem 15 and (4.6) in the abstract a posteriori analysis of §3.7, we obtain the following result.
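Theorem 24 (Standard residual estimator with error-dominated oscillation). For any functional $f \in H^{-1}(\Omega)$ and any conforming mesh $\mathcal{M}$, the error is globally equivalent to the standard residual estimator built from the indicators of this section combined with the error-dominated oscillation, and the corresponding local lower bounds hold for every $z \in \mathcal{V}$. The hidden constants depend only on $d$ and $\sigma(\mathcal{M})$.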
Theorem 24 relies on key features of the approach in §3, which the following remark elaborates on.
Remark 25 (Classical vs new standard residual estimator). In contrast to the classical standard residual estimator (1.2) and its $H^{-1}$-variant in Remark 22, the variant of Theorem 24 is completely equivalent to the error. The reason for this improvement lies in a suitable correction of the original jump residual. To elucidate this, remember that both the classical standard residual estimator and its $H^{-1}$-variant in Remark 22 do not discretize the residual, and therefore compare them to a sum which also does not split off an infinite-dimensional part of the load $f$. The corrections $\langle f, \psi_F \rangle$, $F \in \mathcal{F}$, of the jump residual make sure that the new jump residual has the invariance properties necessary for avoiding overestimation, i.e., it vanishes whenever the exact solution happens to be discrete. Corrections with this property have been used previously. For example, Nochetto [19] considers the special case $f = f_1 + \operatorname{div} f_2$, where $f_1$, $f_2$ are suitable functions, and assigns $(\operatorname{div} f_2)|_K$, $K \in \mathcal{M}$, to the element residual, while the jumps in the normal trace of $f_2$ across interelement sides correct the jump residual. Similarly, in standard residual estimators for the Stokes problem, pressure jumps correct the jump residual associated with the velocity. The novelty is that the corrections $\langle f, \psi_F \rangle$, $F \in \mathcal{F}$, are defined for an arbitrary $f \in H^{-1}(\Omega)$ and are also locally $H^{-1}$-stable, and so fulfill the second necessary condition to avoid local overestimation. Notably, the latter entails that, even if $f$ is a smooth function, the jump residual will be corrected significantly in certain cases.

where the functions $\psi_i$ and $\lambda_i$ are defined, respectively, in (3.19) and (4.1). Given a vertex $z \in \mathcal{V}$, the indicator is then $\mathcal{E}_L(f, \mathcal{M}, z) := \|\nabla \nu_z\|$, where $\nu_z \in U_z$ is such that

\[
\forall \lambda \in U_z \qquad \int_\Omega \nabla \nu_z \cdot \nabla \lambda \,dx = \langle \operatorname{Res}(f; \mathcal{M}), \lambda \rangle .
\]
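Thus, $\nu_z$ is computable in terms of $U_{f;\mathcal{M}}$ and, e.g., (3.23). The indicator $\mathcal{E}_L(f, \mathcal{M}, z)$ may be viewed as an implicit counterpart of $\bigl( \sum_{i \in \mathcal{I}_z} \mathcal{E}_H(f, \mathcal{M}, i)^2 \bigr)^{1/2}$ from §4.1. Taking $\lambda = \nu_z$, we immediately obtain the constant-free lower bound

\[
\mathcal{E}_L(f, \mathcal{M}, z) \le \|\operatorname{Res}(f; \mathcal{M})\|_{H^{-1}(\omega_z)} , \tag{4.7}
\]

which slightly improves upon (4.2). Notice that, in light of Remark 12, the solution $\nu_z$ can be interpreted also as a lift of the discretized residual $P_{\mathcal{M}} f + \Delta U_{f;\mathcal{M}}$. Consequently, the first inequality in

\[
\mathcal{E}_L(f, \mathcal{M}, z) \le \|P_{\mathcal{M}} f + \Delta U_{f;\mathcal{M}}\|_{H^{-1}(\omega_z)} \lesssim \mathcal{E}_L(f, \mathcal{M}, z) \tag{4.8}
\]

is correct. The second one follows from Remark 13 and Theorem 10 in the spirit of Morin, Nochetto and Siebert [18]. In fact, for $v \in H^1_0(\omega_z)$, we have

\[
\langle P_{\mathcal{M}} f + \Delta U_{f;\mathcal{M}}, v \rangle
= \langle \operatorname{Res}(f; \mathcal{M}), P_{\mathcal{M}} v \rangle
= \int_{\omega_z} \nabla \nu_z \cdot \nabla P_{\mathcal{M}} v \,dx
\le \|\nabla \nu_z\| \, \|\nabla P_{\mathcal{M}} v\|_{\omega_z}
\lesssim \mathcal{E}_L(f, \mathcal{M}, z) \, \|\nabla v\|_{\omega_z} .
\]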
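The constant-free bound (4.7) can be verified in two lines. The sketch below assumes that the functions in $U_z$ vanish outside $\omega_z$, so that $\nu_z$ is an admissible test function in $H^1_0(\omega_z)$; this is an assumption made here for illustration. Taking $\lambda = \nu_z$ in the local problem and using the definition of the dual norm,

\[
\|\nabla \nu_z\|^2
= \langle \operatorname{Res}(f; \mathcal{M}), \nu_z \rangle
\le \|\operatorname{Res}(f; \mathcal{M})\|_{H^{-1}(\omega_z)} \, \|\nabla \nu_z\| .
\]

If $\nabla \nu_z \ne 0$, dividing by $\|\nabla \nu_z\|$ gives $\mathcal{E}_L(f, \mathcal{M}, z) = \|\nabla \nu_z\| \le \|\operatorname{Res}(f; \mathcal{M})\|_{H^{-1}(\omega_z)}$; otherwise the bound is trivial. No constant depending on the mesh enters, which is what "constant-free" refers to.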
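Theorem 26 (Estimator based on local problems with error-dominated oscillation). For any functional $f \in H^{-1}(\Omega)$ and any conforming mesh $\mathcal{M}$, we have the global equivalence of the error with the estimator formed by the indicators $\mathcal{E}_L(f, \mathcal{M}, z)$, $z \in \mathcal{V}$, and the oscillation, as well as the following local lower bounds: for every $z \in \mathcal{V}$,

\[
\mathcal{E}_L(f, \mathcal{M}, z) \le \|\nabla(u_f - U_{f;\mathcal{M}})\|_{\omega_z}
\quad\text{and}\quad
\|P_{\mathcal{M}} f - f\|_{H^{-1}(\omega_z)} \lesssim \|\nabla(u_f - U_{f;\mathcal{M}})\|_{\omega_z} .
\]

The hidden constants depend only on $d$ and $\sigma(\mathcal{M})$.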

4.4. An estimator based on flux equilibration. While indicators based on local problems provide constant-free local lower bounds, estimators based on flux equilibration aim for a constant-free, or at least explicit, global upper bound. This is achieved with the help of other, more sophisticated liftings within the framework of the fundamental theorem of Prager and Synge [21], which, for the homogeneous Dirichlet problem (1.1), can be formulated as follows: For any $v \in H^1_0(\Omega)$, we have

\[
\|\nabla(v - u)\| = \min \bigl\{ \|\xi\| \;\big|\; \xi \in L^2(\Omega; \mathbb{R}^d) \text{ with } \operatorname{div} \xi = \Delta v + f \text{ in } H^{-1}(\Omega) \bigr\} . \tag{4.9}
\]

Realizations of this idea in Ainsworth [1], Braess and Schöberl [9], Ern, Smears and Vohralík [13,14], and Luce and Wohlmuth [16] make use of some classical oscillation. Its replacement by an error-dominated oscillation requires some adjustment to the approach of §3.
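The inequality behind (4.9) can be sketched in a few lines. The sketch uses only the weak formulation of (1.1) and the distributional definitions $\langle \Delta v, w \rangle = -\int_\Omega \nabla v \cdot \nabla w \,dx$ and $\langle \operatorname{div} \xi, w \rangle = -\int_\Omega \xi \cdot \nabla w \,dx$ for $w \in H^1_0(\Omega)$; these sign conventions are assumed here. For any admissible $\xi$ and any $w \in H^1_0(\Omega)$,

\[
\int_\Omega \nabla(v - u) \cdot \nabla w \,dx
= -\langle \Delta v + f, w \rangle
= -\langle \operatorname{div} \xi, w \rangle
= \int_\Omega \xi \cdot \nabla w \,dx
\le \|\xi\| \, \|\nabla w\| .
\]

Choosing $w = v - u$ yields $\|\nabla(v - u)\| \le \|\xi\|$. The particular choice $\xi = \nabla(v - u)$ is admissible, because $\operatorname{div} \nabla(v - u) = \Delta v - \Delta u = \Delta v + f$, so the minimum is attained and equals the error.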
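The upper bound in the localization of Lemma 4 involves a non-explicit multiplicative constant. In order to improve on this, we replace the local spaces $H^1_0(\omega_z)$, $z \in \mathcal{V}$, with

\[
H_z :=
\begin{cases}
\{ v \in H^1(\omega_z) \mid \int_{\omega_z} v = 0 \}, & \text{if } z \in \mathcal{V}_0 = \mathcal{V} \cap \Omega, \\
\{ v \in H^1(\omega_z) \mid v|_{\partial\omega_z \cap \partial\Omega} = 0 \}, & \text{if } z \in \mathcal{V} \setminus \mathcal{V}_0,
\end{cases}
\]

equip them with the norm $\|\nabla \cdot\|_{\omega_z}$, and denote the respective dual spaces by $H_z^*$.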
(i) If $\ell \in R_{\mathcal{M}}$, then

Splitting the residual up into discretized residual and oscillation, we then obtain the following abstract error bounds; we do not state the global lower bound as it is an immediate consequence of the local one.

4.3. An estimator based on local problems. A local problem lifts the residual to a local extension of the given finite element space and so provides a local correction, the norm of which is used as an error indicator; cf. Babuška and Rheinboldt [4]. While computability requires finite-dimensional extensions, the higher cost with respect to the previous explicit estimators is tied up with the hope of improved accuracy. The following instance from Verfürth [25, Section 1.7.1 and Remark 1.21] is vertex-based and uses the local extensions

\[
U_z := \operatorname{span}\{ \lambda_i \mid i \in \mathcal{I}_z \} = \operatorname{span}\{ \psi_i \mid i \in \mathcal{I}_z \}, \qquad z \in \mathcal{V},
\]
$\cdots H^1(\omega)$ of all infinitely differentiable functions with compact support in $\omega$. If the boundary $\partial\omega$ of $\omega$ is sufficiently regular (e.g., Lipschitz), these are all functions in $H^1(\omega)$ with vanishing trace on the boundary $\partial\omega$. Thanks to Friedrichs' inequality, $H^1_0(\omega)$ is a Hilbert space with scalar product $\langle \nabla\cdot, \nabla\cdot \rangle_\omega$ and norm $\|\nabla\cdot\|_\omega$. As usual, $H^{-1}(\omega)$ indicates the dual space of $H^1_0(\omega)$, i.e. the space of linear and continuous functionals on $H^1_0(\omega)$. We identify $L^2(\omega)$ with its dual space

$\cdots\bigr\rangle\bigr| \le \|\ell\|_{H^{-1}(\operatorname{supp}\psi_i)}$ for any $i \in \mathcal{I}_z$. Notice that the essential supremum of $x \mapsto \#\{ i \in \mathcal{I}_z \mid \operatorname{supp}\psi_i \ni x \}$ is bounded by $d + 1$. Arguing as in the proof of Lemma 4 (ii), we therefore obtain

Lemma 21 (Another overestimation of classical $H^{-1}$-oscillation). Let $\Omega = (0,1)^2$. There is a functional $f \in H^{-1}(\Omega)$ and a sequence $(L_n)_n$ with $\log n \gtrsim L_n \to \infty$ as $n \to \infty$ for which the bounds (3.32) hold, where $\mathcal{M}$ varies in all meshes created by recursive or iterative newest vertex bisection of some conforming initial mesh $\mathcal{M}_0$ of $\Omega$.

Proof. In [11, Section 6.4] Cohen, DeVore and Nochetto construct some function $u_f \in H^1_0(\Omega)$ and a sequence $L_n$ as claimed for which (3.32a) and

• Hierarchical estimators and estimators based upon local problems implicitly introduce a splitting of the residual like the one proposed in §3.3.
• The overestimation of the standard residual estimator in Remark 22 can be cured with the help of the splitting of the residual in §3.3.
• Employing different local dual norms, the approach of §3 can be extended to estimators based on flux equilibration.
• Each realization quantifies a local dual norm of the discretized residual by a computable, equivalent norm. Both equivalence and computability hinge on the finite-dimensional nature of the discretized residual.

and computable in terms of $U_{f;\mathcal{M}}$ and the evaluations $\langle f, \lambda_i \rangle$, $i \in \mathcal{I}$. This definition implies the constant-free local lower bounds $\mathcal{E}_H(f, \mathcal{M}, i) \le \|\operatorname{Res}(f; \mathcal{M})\|_{H^{-1}(\operatorname{supp}\lambda_i)}$ and therefore, cf. (3.28), we have that, for every $z \in \mathcal{V}$ and $\mathcal{I}_z = \{ i \in \mathcal{I} \mid i \ni z \}$,

local counterpart of the global lower bound in Veeser [23, Lemma 3.3]. This estimator is very closely related to the discretized residuals of §3.4 and Theorem 15. Indeed, if $K \in \mathcal{M}$ and $F \in \mathcal{F}$, $K_1, K_2 \in \mathcal{M}$ such that $F = K_1 \cap K_2$, Hence $\operatorname{span}\{ \psi_i \mid i \in \mathcal{I} \} = \operatorname{span}\{ \lambda_i \mid i \in \mathcal{I} \}$ and Remark 12 yields $\langle f, \lambda_i \rangle = \langle P_{\mathcal{M}} f, \lambda_i \rangle$, $i \in \mathcal{I}$, and the indicators may be viewed also as evaluations of the discretized residual: for $i \in \mathcal{I}$, $\mathcal{E}_H(f, \mathcal{M}, i) = \bigl| \bigl\langle P \cdots$