Residual estimates for post-processors in elliptic problems

In this work we examine a posteriori error control for post-processed approximations to elliptic boundary value problems. We introduce a class of post-processing operator that `tweaks' a wide variety of existing post-processing techniques to enable efficient and reliable a posteriori bounds to be proven. This ultimately results in optimal error control for all manner of reconstruction operators, including those that superconverge. We showcase our results by applying them to two classes of very popular reconstruction operators, the Smoothness-Increasing Accuracy-Enhancing filter and Superconvergent Patch Recovery. Extensive numerical tests are conducted that confirm our analytic findings.

1. Introduction. Post-Processing techniques are often used in numerical simulations for a variety of reasons from visualisation purposes [BMBS95] to designing superconvergent approximations [BS77] through to becoming fundamental building blocks in constructing numerical schemes [GP18,GZZ18,CGZZ17]. Another application of these operators is that they are a very useful component in the a posteriori analysis for approximations of partial differential equations (PDEs) [AO00,Ver96]. The goal of an a posteriori error bound is to computationally control the error committed in approximating the solution to a PDE. In order to illustrate the ideas, let u denote the solution to some PDE and let u h denote a numerical approximation. Then, the simplest possible use of post-processing in a posteriori estimates is to compute some u * from u h and to use u − u h ≈ u * − u h as an error estimator. However, a key observation here (and in several more sophisticated approaches) is that u * must be at least as good of an approximation of the solution u as u h is. In fact, in many cases, u * is actually expected to be a better approximation. This raises a natural question: If an adaptive algorithm computes (on any given mesh) not only u h but also u * and u * is a better approximation of u than u h is, why is u h and not u * considered as the "primary" approximation of u? Indeed, the focus of this paper is to consider u * as the primary approximation of u. We are therefore interested in control of the error u − u * and in adaptivity based on an a posteriori estimator for u − u * . Specifically, we aim to provide reliable and efficient error control for u − u * .
Note that our goal is not to try to construct "optimal" superconvergent post-processors. Rather we try to determine, from the a posteriori viewpoint, the accuracy of some given post-processed solution and to determine how this is useful for the construction of adaptive numerical schemes based on an error tolerance for u − u * .
There are several examples of superconvergent post-processors, includig SIAC and superconvergent patch recovery. More details on the history, properties and implementation of these methods will be provided in Sections 4.1 and 4.2. However, our a posteriori analysis, aims at being applicable for a wide variety of post-processors and, therefore, we avoid special assumptions that are only valid for specific post-processors. Indeed, our analysis makes only very mild assumptions on the post-processing operator. Specifically, we only require that: 1. The post-processed solution u * belongs to a finite dimensional space that contains piecewise polynomials, although it does not necessarily need to be piecewise polynomial itself. 2. The post-processed solution should be piecewise smooth over the same triangulation, or a subtriangulation, of the finite element approximation. Given a post-processor, u * that satisfies these rather mild assumptions, we perturb it slightly and call the result u * * . This is to ensure an orthogonality condition holds which then allows us to show various desirable properties including: 2 1. The orthogonal post-processor provides a better approximation than the original post-processor, i.e. u − u * * A h ≤ u − u * A h in the energy norm, see Lemma 3.4. 2. The orthogonal post-processor has an increased convergence order in the L 2 norm. Practically, this is not always the case for the original post-processor, see Lemma 3.6. 3. Efficient and reliable a posteriori bounds are available for the error committed by the orthogonal post-processor. This, motivates us to consider u * * (and not u * or u h ) as the primary approximation. Since the improved accuracy of u * * , compared to u h , stems from superconvergence it is much more sensitive with respect to smoothness of the exact solution, i.e. in regions where the exact solution is C ∞ we expect u * * to be much more accurate than u h whereas in places where the exact solution is less regular, e.g. has kinks, u h and u * * are expected to have similar accuracy. Therefore, meshes constructed based on error estimators for u − u h will usually not be optimal when used for computing u * * in the sense that the ratio of degrees of freedom to error u − u h would be much better for other meshes, this is elaborated upon in Section 5.
We will demonstrate the good approximation properties of our modification strategy for post-processors and the benefits of basing mesh adaptation on an estimator for u − u h in a series of numerical experiments. In order to highlight the versatility of our approach, we conduct experiments based on two popular postprocessing techniques: The Smoothness Increasing Accuracy Conserving (SIAC) filter and superconvergent patch recovery (SPR). Background on these methods is provided in Sections 4.1 and 4.2 respectively.
The rest of the paper is structured as follows: In §2 we introduce the model elliptic problem and its dG approximation. We also recall some standard results for this method. In §3, for a given reconstruction, we perturb it so it satisfies Galerkin orthogonality and show some a priori type results. We then study a posteriori results and give upper and lower bounds for a residual type estimator. In §4 we describe the two families of post-processor that we consider in this work. Finally, in §5 we perform extensive numerical tests on the SIAC and SPR post-processors to show the performance of the a posteriori bounds, the effect of smoothness of the solution on the post-processors and to study adaptive methods driven by these estimators.
For f ∈ L 2 (Ω) we consider the problem where D : Ω → R d×d is a uniformly positive definite diffusion tensor and D ∈ H 1 (Ω) ∩ L ∞ (Ω) d×d . Weakly, the problem reads: find u ∈ H 1 0 (Ω) such that Let T be a triangulation of Ω into disjoint simplicial or box-type (quadrilateral/hexahedral) elements K ∈ T such that Ω = K∈T K. Let E be the set of edges which we split into the set of interior edges E i and the set of boundary edges E b .
We introduce the standard broken Sobolev spaces. For s ∈ N 0 we define and we will use the notation as an elementwise norm for the broken space.
3 For p ∈ N we denote the set of all polynomials over K of total degree at most p by P p (K). For p ≥ 1, we consider the finite element space Let v ∈ H 1 (T ) be an arbitrary scalar function. For any interior edge e ∈ E i there are two adjacent triangles K − , K + and we can consider the traces v ± of v from K ± respectively. We denote the outward normal of K ± by n ± and define average and jump operators for one E i by (2.6) For boundary edges there is only one trace of v and one outward pointing normal vector n and we define For vector valued functions v ∈ [H 1 (T )] d we define jumps and averages on interior edges by As before, for boundary edges, we define jumps and averages using traces from the interior only. Note that For any triangle K ∈ T we define h K := diam K and collect these values into an element-wise constant function h : Ω → R with h| K = h K . We denote the radius of the largest ball inscribed in K by ρ K . For every edge e we denote by h e ={ { h } }, i.e., the mean of diameters of adjacent triangles. For our analysis we will assume that T belongs to a family of triangulations which is quasi-uniform and shape-regular. Let us briefly recall the definitions of these two notions: The triangulation T is called • shape-regular if there exists C > 0 so that Note that for shape-regular triangulations we have inverse and trace inequalities [DPE12, Lemmas 1.44, 1.46]. Note that the quasi-uniformity assumption is only required for the first part of our analysis, in §3.1 and can be relaxed in §3.2. In this work we will consider a standard interior penalty method to approximate solutions of (2.2). We consider the Galerkin method to seek u h ∈ V p h such that Note that the bilinear form (2.12) is stable provided σ = σ(D) is large enough, see [ABCM02].
Remark 2.1 (Continuous Galerkin methods). Note that if we restrict test and trial functions to V p h ∩H 1 (Ω) then all jumps on interior edges vanish and (2.11), (2.12) reduces to a (continuous) finite element method with weakly enforced boundary data. Our analysis is equally valid in this case. 4 We introduce two dG norms (2.13) which are equivalent provided σ > 0 is sufficiently large and conclude this section by stating a-priori estimates for the Galerkin method as is standard in the literature [ABCM02,KP03].
Theorem 2.2 (Error bounds for the dG approximation). Let u ∈ H s (Ω) for s ≥ 2 be the solution of (2.1) and u h ∈ V p h be the unique solution to the problem (2.11). Then, Further, for u ∈ H 1 (Ω), we have the a posteriori error bound . (2.16) and C 1 is a constant depending on the shape-regularity and quasi-uniformity constants of T and C 2 depends only upon the shape-regularity. Here R h is a computable residual that we refer to during our numerical simulations.
3. The orthogonal reconstruction, a priori and a posteriori error estimates. In this section, we derive robust and efficient error estimates. We make the assumption that we have access to a computable reconstruction, u * ∈ V * h ⊂ H 2 (T ) generated from our numerical solution u h , where V * h is required to contain the original finite element space, that is V p h ⊂ V * h . We are unable to provide reliable a posteriori error estimates for u * directly, but we can modify, and, as we shall demonstrate, improve any such reconstruction such that a robust and efficient error estimate can be obtained for the modified reconstruction.
We split this section into two parts, the first subsection contains the definition of the improved reconstruction and some of its properties. In particular, we study this from an a priori viewpoint, show that it satisfies Galerkin orthogonality as well as some desirable a priori bounds. Throughout this subsection we assume that u ∈ H 2 (Ω) and the underlying mesh is quasi-uniform. In the second part we derive reliable and efficient a posteriori estimates under the weaker assumption that u ∈ H 1 (Ω) and the mesh is shape regular.
3.1. Improved reconstruction. In the following assume that u ∈ H 2 (Ω) solves (2.2) and let u * ∈ V * h ⊂ H 2 (T ) be a reconstruction of the discrete solution u h , e.g. a SIAC reconstruction as described in Section 4.1 or obtained by some patch recovery operator as described in Section 4.2.
Definition 3.1. Let R : H 2 (T ) → V p h denote the Ritz projection with respect to A h (·, ·), i.e., We define the improved reconstruction as We work under the assumption that a post-processor u * is already being computed. To realise u * * we are required to solve the original elliptic problem a second time with a different forcing term. This means the improved reconstruction u * * is computable at a small additional cost to u * . Once u * has been computed, u * * can be computed by solving a discrete elliptic problem over V p h . A typical scenario is that the user already has a good scheme for computing u h , and that the cost of computing u * * (after the post-processing to obtain u * ) is just that of solving the same system as that for u h with a different right hand side. This means the assembly and preconditioning, perhaps ILU or AMG, can be reused without change. Estimating the cost of computing u * is more complicated and will depend on the method used and the implementation. While our implementation for solving u h and the correction are optimized (and implemented in C++) the computation of u * * is a proof of concept implementation in Python and is therefore not competitive. The improved reconstruction u * * is computable at a small additional cost to u * . Once u * has been computed, u * * can be computed by solving a discrete elliptic problem over V p h . A typical scenario is that the user already has a good scheme for computing u h , and that the cost of computing u * * (after the post-processing to obtain u * ) is just that of solving the same system as that for u h with a different righthand side, e.g if an LU decomposition of the system matrix was determined for computing u h this LU decomposition can be reused. 3. Note that where id is the identity mapping, i.e., the error of u * * is the Ritz-projection of the error of u * onto the orthogonal complement of V p h . 4. Even if u * is continuous, this does not necessarily hold for u * * as V p h may contain discontinuous functions.
One of the key properties of the improved reconstruction is that it satisfies a Galerkin orthogonality result.
Lemma 3.3 (Galerkin orthogonality). The reconstruction u * * from (3.2) satisfies Galerkin orthogonality, i.e., Proof. For any v h ∈ V p h , we have using (3.3) by definition of the Ritz projection, as required. Now, we show that with respect to · A h the new reconstruction u * * indeed improves upon u * : Lemma 3.4 (Better approximation of the improved reconstruction). Let u * * be defined by (3.2), then the following holds: In (3.6) the inequality is an equality if and only if u * * = u * , i.e., if the original reconstruction u * itself satisfies Galerkin orthogonality.
Proof. Since the images of R and (id − R) are orthogonal with respect to A h (·, ·), Pythagoras' theorem implies We have used (3.3) in the third step. Note that if u * is not Galerkin orthogonal then R(u − u * ) A h > 0 leading to a strict inequality in the first step. This completes the proof.
6 Remark 3.5. One appealing feature of the new reconstruction that results from Galerkin orthogonality is that if the reconstruction u * has some superconvergence properties in the energy norm this is inherited by u * * and also immediately implies an additional order of accuracy in L 2 . This results from an Aubin-Nitsche trick being available.
Lemma 3.6 (Dual bounds). Let Ω be a convex polygonal domain and let u * * be defined by (3.2), then there exists a constant C > 0 (only depending on the shape regularity of the mesh) such that for any ψ h ∈ V p h where the last equality follows from Galerkin orthogonality (3.4). Thus, choosing ψ h as the best approximation of ψ in the piecewise linear subspace of V p h , we obtain by elliptic regularity of the dual problem, concluding the proof.

3.2.
A posteriori error estimates. Now that we have shown some fundamental results on the improved reconstruction, we relax the regularity requirements on u in this subsection allowing for weak solutions to (2.1), that is, u ∈ H 1 (Ω). With that in mind we modify the definition of A h (·, ·) such that it is a suitable extension over H 1 (T ) × H 1 (T ) to The lifting operators satisfy the stability estimate, [DPE12, Lemma 4.34], (3.14) , For test and trial functions in V * h (which contains V p h by assumption) the new definition of A h (·, ·) is equivalent to the one given in (2.12). Therefore for any function v * ∈ V * h the Ritz projection given in Definition 3.1 remains the same still satisfying h . But note that we no longer have u h = Ru and Galerkin orthogonality for u * * no longer holds in general, it only holds for a H 1 (Ω) conforming Proof. By definition of u * * we have that (Ω) and, hence, continuous, as required. Let a quantity of interest be given by the linear functional J ∈ H −1 (T ), the dual space of H 1 0 (T ). Note that H −1 (T ) ⊂ H −1 (Ω) where the latter is the dual space of H 1 0 (Ω). We begin by deriving an error representation formula. Following [HSW05], we split u * * into a continuous part u * * Theorem 3.8 (Dual error representation). Let u ∈ H 1 0 (Ω) be the solution of (2.2) and let u * * be given by (3.2), then where we made use of Galerkin orthogonality, Lemma 3.7, in the last step.
Theorem 3.9 (Primal error estimate). There exists some constant C A > 0 depending on mesh geometry and polynomial degree such that is bounded by the right hand side of (3.22). In Theorem 3.8 we may choose Note that, by definition, z ∈ H 1 0 (Ω) so that, if V p h contains discontinuous functions, z = u−u * * . Nevertheless, z satisfies the stability estimate . (3.27) Integrating by parts in (3.27) and using (3.14) we obtain with a constant C P > 0 which is independent of h but depends on the shape regularity of the mesh and the polynomial degree and we also note that . 9 We insert (3.29) and (3.30) into (3.28) and apply trace inequality and Cauchy-Schwarz inequality and obtain . (3.31) (3.32) and insert (3.32) into (3.31) to obtain the assertion of the theorem.
The error estimator, derived in Theorem 3.9, is locally efficient in the following sense: Theorem 3.10 (Local efficiency). Assume f and D are piecewise polynomial on T . Then, there exists a constant C > 0 independent of h such that for any K ∈ T and any e ∈ E the following estimates hold: where K e denotes the union of cells sharing common edge e.
Proof. Both proofs are standard and follow [Ver96].
Remark 3.11 (Data oscillation). In case f or D are not polynomial the right hand side of (3.33) contains additional data oscillation terms.

Post-Processors.
In order to show the versatility of our results, we consider two families of reconstruction operators. Namely, the Smoothness-Increasing Accuracy-Conserving (SIAC) post-processing [Tho77, BS77, Rya15] as well as patch reconstruction via the Zienkiewicz and Zhu [ZZ92, ZZ98] Superconvergent Patch Recovery (SPR) technique. Below we outline the procedure for performing these reconstructions as well as error estimates for the ideal case.
4.1. SIAC post-processors. One example of a superconvergent post-processor that we examine is the Smoothness Increasing Accuracy Conserving (SIAC) filter. The SIAC filter has its roots in an accuracyenhancing post-processor developed by Bramble and Schatz [BS77]. The original analysis was done for finite element approximations for elliptic equations. This technique has desirable qualities including its locality, allowing for efficient parallel implementations, and its effectiveness in almost doubling the order of accuracy rather than increasing the order of accuracy by one or two orders. This post-processor was also explored from a Fourier perspective and for derivative filtering by Thomeé [Tho77] and Ryan and Cockburn [Rya09].
SIAC filters are an extension of the above ideas and have traditionally been used to reduce the error oscillations and recover smoothness in the solution and its derivatives for visualization purposes [MRK10,WRKH09,SCKR08] or to extract accuracy out of existing code [Rya05]. It has been extended to a variety of PDEs as well as meshes [JXR12]. A quasi-interpolant perspective on SIAC can be found in [MRK16]. The important property of these filters is that, in addition to increasing the smoothness, for smooth initial data and linear problems, the filtered solution is more accurate than the DG solution. To combat the high in [DSRMK17].
For ease of presentation the following discussion only details the design of the filter and presents a-priori error estimates for the case of a smooth solution. Although the discussion is limited to one-dimension, it can be extended to Cartesian meshes in more than one space dimension using a tensor product approach. More advanced applications of the multi-dimensional SIAC post-processor are the Hexagonal SIAC [MJRK17] or Line SIAC [DSRMK17].
The basic idea is that the reconstruction is done via convolution post-processing: where h is the mesh size of the numerical scheme and H is the scaling of the post-processor. The convolution kernel, K 2r+1,m+1 (·), is defined as This is a linear combination of 2r+1 shifted copies of some function, ψ (m+1) (x). The function weights are real scalars, c 2r+1,m+1 γ ∈ R. For the kernel, r is chosen to satisfy consistency as well as 2r moment requirements, i.e., polynomial reproduction conditions, which are necessary for preserving the accuracy of the Galerkin scheme and m is chosen for smoothness requirements. We focus on kernels built from B-splines which are defined via the B-Spline recurrence relation: for m ≥ 1.
Remark 4.1 (Kernel scaling.). For cartesian grids, the kernel scaling is typically chosen to be the element size, H = h. In adaptive meshes or structured triangular meshes and tetrahedral meshes, H is typically chosen to be the length of the mesh pattern [MJRK11]. For Line SIAC, the kernel scaling is taken to be the element diagonal [DSRMK17], for unstructured meshes, the kernel scaling is taken to be largest element side [MKRK13,MRK14].
It can be shown that when the solution is sufficiently smooth the post-processed numerical solution u * is a superconvergent approximation.
In particular, if u ∈ C ∞ (Ω), then the Galerkin solution converges in Theorem 2.2 as If we choose r = p and m = p − 2 then see Theorem 1 in [Tho77,BS77], which for p ≥ 2 constitutes an improvement. It is possible to obtain the same estimates in H 1 by taking higher order B-Splines.
In this paper, in order to apply the post-processor globally, we mirror the underlying approximation as an odd function at the boundary as discussed in [BS77].
Remark 4.2 (Impact of in-cell regularity of u * * ). If u * , u * * ∈ H 2 (T ) Theorem 3.9 does not hold. Still, as long as u * ∈ H 1 (T ) similar results can be obtained by slightly modifying the proof of Theorem 3.9. One interesting example is SIAC reconstruction with m = 0. In this case, for any K ∈ T the restriction u * * | K contains several kinks, i.e. there are hypersurfaces (points for d = 1, lines for d = 2) across which u is continuous but not differentiable. However, for any K there exists a triangulation of T K of K, such that u * * | K ∈ H 2 (T K ). If we follow the steps of the proof of Theorem 3.9 we realise that integration by parts can only be carried out on elements of ∪ K∈T T K and each term h K (f + ∆u * * ) 2 L 2 (K) in the error bound needs to be replaced by where E K denotes the set of interior edges of T K . Efficiency of this modified estimator can be shown along the same lines as in Theorem 3.10 but bubble functions with respect to the elements and edges in the sub-triangulation T K need to be used.

Superconvergent Patch
Recovery. The second post-processing operator we study is based on the superconvergent patch recovery (SPR) technique. This was originally studied numerically and showed a type of superconvergence for elliptic equations using finite element approximations [ZZ87]. The mathematical theory behind this recovery technique was addressed by Zhang and Zhu [ZZ95] for the two-point boundary value problems and for two-dimensional problems and extended to parabolic problems in [LW06,LP12]. The superconvergent patch recovery method works by recovering the derivative approximation values for one element from patches surrounding the nodes of that element using a least squares fitting of the superconvergent values at the nodes and edges. In typical derivative recovery, the derivative approximation is a continuous piecewise polynomial of some given degree. For overlapping patches, the recovered derivative is just an average of the approximations obtained on the surrounding patches. Unlike SIAC post-processing, this recovery technique does not rely on translation invariance for the high-order recovery. The superconvergent patch recovery technique has been shown to work well for elliptic equations that have a smooth solution, and for less smooth solutions with a suitably refined mesh.
The usual application of this technique is for gradient recovery. However, in this article we apply this technique to recover function values.
As mentioned, we suitably modify the algorithm to construct a function u * ∈ V * h with V * h = V 2p+1 h ∩C 0 (Ω). The construction of u * , given some finite element function u h , is carried out in two steps: 1. Construct a polynomial q i of order 2p at each node v i of the mesh using a least squares fitting of function values of u h evaluated at suitable points in elements surrounding v i . 2. Given an element K we use linear interpolation of the values of q i for the three nodes of K to compute u * ∈ V * h . There are many approaches for constructing the polynomials q i in the first step at a given node v i with surrounding triangles K . For our experiments, we use the following approach. For p = 1, we construct a quadratic polynomial q i by fitting the values of u h at the nodes of all K . For a piecewise quadratic u h (p = 2) we also use the midpoints of all edges of the K . Finally for our tests with p = 3 we evaluate u h at two points on each edge chosen symmetrically around the midpoint of the edge (we use the Lobatto points with local coordinates 1 2 ± √ 5 10 ) and also add the value of u h at the barycentre of the K . This is depicted in Figure 4.1. To guarantee that we have enough function values to compute the least squares fits, we add a second layer of triangles around v i if necessary, e.g., at boundary nodes.
Note that this procedure is similar, although not the same, as the approach investigated in [ZN05]. Another related procedure was proposed in [Ova07].

Numerical Results.
In this section we study the numerical behaviour of the error indicators proposed for the SIAC and SPR post-processing operators. We compare this behaviour with the true error on some typical model problems. The computational work was done in the DUNE package [BBD + 08] based on the new Python frontend for the DUNE-FEM module [DKNO10,DN18].

Smoothness-Increasing
Accuracy-Conserving post-processors. The implementation of the post-processor is done through simple matrix-vector multiplication and is discussed in [Mir12].
We first investigate the behaviour of the error and the residual estimator for the problem (2.1) in one space dimension with D = 1, i.e. the Laplace problem where the forcing function f is chosen so that the exact solution is (5.2) u(x) = sin (6πx) 2 cos Å 9 2 πx ã on the interval (0, 1). We show both the L 2 and H 1 errors for the Galerkin approximation u h , the SIAC postprocessed approximation, u * and the orthogonal postprocessor, u * * . We also show the two residual indicators R h (from Theorem 2.2) and R * * from (Theorem 3.9). For the basis we consider the continuous Lagrange polynomials for u h and impose the boundary conditions weakly with a penalty parameter 10p 2 h , where p is the polynomial degree and h is the grid spacing. Additional experiments were conducted using a discontinuous Galerkin approximation, but no significant differences in the outcome where found and therefore do not include the results. We solve the resulting linear system using an exact solver [Dav04] to avoid issues with stopping tolerances.
We will mainly focus on p = 2 but also show results for p = 1 and p = 3. The SIAC postprocessing is constructed using a continuous B-spline, m = 1, as well as setting r = p+1 2 . This leads to an inner stencil of 2 r + 1 2 − 1 + 1 = 2 p+1 2 + 3 elements. We also tested other choices of r, m for p = 2 but the above choice provided the best results and these are the results shown.
In Figure 5.1 we show the errors for p = 2 for a series of grid refinement levels starting with 20 intervals and doubling that number on each level. In Figure 5.1 we plot the corresponding Experimental Orders of Convergence (EOCs). As can clearly be seen, SIAC postprocessing (u * ) improves the convergence rate in H 1 from 2 to 3 and in L 2 from 3 to 4. While in H 1 the Galerkin orthogonality trick only leads to a small improvement in the error, in L 2 we see an improvement of a full order leading to a convergence rate of 5. As expected from the theory the residual indicators follow the H 1 errors of u h and u * * closely. The efficiency index is comparable between R h and R * * .
For a better understanding of how the error is reduced by utilizing SIAC postprocessing and the Galerkin orthogonality treatment, we show the pointwise errors of the approximations in Figure 5.2. It is evident that the function values are much smoother when applying SIAC. The move from u * and u * * does reintroduce small scale errors, but at a far lower level compared to the original approximation, u h . As expected from the errors, the differences in H 1 are less pronounced.
In Figures 5.3 we show errors and EOCs for p = 3. Due to the very low errors on the final grid the actual convergence rates for u * and u * * is not clear. However, the improvement due to the Galerkin orthogonality trick, especially in L 2 is quite noticable as it reduced the error by two orders of magnitude.
We next show results for p = 1 in Figure 5.4. Again, there is a clear reduction in the values of the errors from u h to u * to u * * in H 1 together with an improvement in the convergence rate due to the SIAC postprocessing. This improvement in the convergence rate is about 1 order. While the convergence rate going from u h to u * * seems to only be half an order on the higher grid resolution, it is important to note that the error using u * * is still significantly smaller than the error between the exact solution and u * by at least a factor of 2. Hence the results do not contradict the theory. In L 2 , SIAC leads to no improvement while the convergence rate of the error using u * * is at least half an order higher. Overall the improvement in the convergence rate is not quite as good as for the higher polynomial degrees. The following tests summarized in Figure 5.5 show that the weak form of the boundary conditions is responsible for the reduced order improvement. The figure shows results using a hyperpenalty of the form 10p 2 h 2 . Applying this hyperpenalty term leads to improvements that are again more in line with our observations for higher order polynomials. We note that strong enforcement of the boundary conditions also lead to similar results.
We summarize our results for the smooth problem in Table 5.1. It can be clearly seen that the step from u * to u * * , which requires solving one additional low order problem, is quite advantageous and increases the convergence rate in the L 2 norm by at least one. In the linear case this improves by two and by one in the H 1 norm. This makes it highly efficient in this case, at least when implementing the hyperpenalization or strong constraints to enforce the Dirichlet boundary conditions. The reason for this restriction will be investigated further in future work. For p = 3 the actual EOCs of the postprocessed solutions are difficult to determine and therefore we provide approximate numbers. In this case, SIAC shows a higher order in L 2 compared to the H 1 norm. The Galerkin orthogonalty trick does not improve the rate further, but note that the overall error is still a factor of 100 smaller. Additionally note that in the other cases where there is no improvement in the rate, the error is reduced by enforcing Galerkin orthogonality, e.g., in the H 1 norm with p = 2 the error is still reduced by about a factor of two. In addition, the orthogonality of u * * allows us to compute a reliable and efficient error estimator with a comparable efficiency index to the error estimator for u h using R h .
We conclude our investigations of the SIAC reconstruction and the residual estimates by studying problems with less smooth solutions. We change the forcing function so that the exact solution is of the form . Errors and convergence rates for H 1 (left two) and L 2 (right two) for polynomial degree p = 1 using SIAC reconstruction. Table 5.1 Experimental rates of convergence for the smooth problem using different values for the polynomial degree p. The convergence rates are shown for the three approximations, i.e., u h , u * , u * * . p = 1 p = 2 p = 3 EOC(u h ) EOC(u * ) EOC(u * * ) EOC(u h ) EOC(u * ) EOC(u * * ) EOC(u h ) EOC(u * ) EOC(u * * ) L 2 -error 2 2 4 3 4 5 4 6 6 H 1 -error  1  2  3  2  4  4  3  5  5 where (5.3) w(s) = sin (6πs) 2 cos Å 9 2 πs ã is the smooth function from previous studies. We show results for polynomial degree p = 2. We again iplement a simple O(h −1 ) penalty term at the boundary. Note that the solution is in C 2 \ C 3 at x = 0.7 and only in C 1 \ C 2 for x = 0.3. Overall the solution is an element of H 2 (0, 1) but not of H 3 (0, 1), i.e., it is not smooth enough to achieve optimal convergence rates for p = 2 when the mesh is not aligned. Even when the mesh is aligned, as in our experiments, we do not expect an increase of the convergence rate using the SIAC reconstruction as can be seen in Figure 5.6. The local loss of regularity at x = 0.3 and x = 0.7 is clearly visible for the pointwise errors of the two reconstructions as shown in Figure 5.7. Examining the errors in the original approximation, u h , the reduced smoothness is hardly visible. However, in both of the reconstructions a jump in the error is clearly visible. At x = 0.7, where the solution is still in C 2 the error in u * * increases approximately by two orders while at x = 0.3 it is close to four orders of magnitude larger since the solution is only C 1 at this point. The lack of smoothness is also identified by the residual indicator R * * , the spatial distribution of which is shown in Figure 5.8 together with the distribution of R h . It is worthwhile to note that the region of the 'reduced smoothness' is better isolated by R * * than by R h . Hence it would be easier for an adaptive algorithm to separate these different smoothness regions which would lead to more optimal meshes. The picture clearly shows that R h does not 'see' the kink so that an adaptive algorithm would either refine the whole non-constant region or nothing at all depending on the tolerance. In contrast, with R * * (and the right algorithm) refinement could be isolated to the kinks.

Superconvergent Patch Recovery.
In the following we solve in a two dimensional domain Ω where the forcing function f is chosen by prescribing an exact solution u. This function is also used to prescribe Dirichlet boundary conditions on all of ∂Ω. In the first example we chose a smooth exact solution u with a scalar diffusion coefficient D = I 2 |x| 2 + 1 2 , while for the second test we use a solution with a corner singularity and D = I 2 .
In the following we show results using a DG scheme on a triangular grid. The grid is refined by splitting each element into four elements. In the final examples with local adaptivity, this leads to a grid with hanging nodes. We also carried out experiments using a continuous ansatz space with very simular results.
Note that in all figures depicting errors and EOCs, the x-axis shows the number of degrees of freedom for u h . While the other approximations have a larger number of degrees of freedom, the global problem that has . Errors and convergence rates for H 1 (left two) and L 2 (right two) for polynomial degree p = 2 using SIAC for solution with reduced smoothness to be solved, i.e. solving the linear system for u h and for Ru * , scales with the number of degrees of freedom for u h and thus this seems a reasonable indication of the computational complexity.
For our first test we choose u(x, y) = sin (πx/(0.25 + xy)) sin (π(x + y)), and Ω = (0, 1) 2 . We start with an initial grid which is slightly irregular as shown in Figure 5.9. This is to avoid any superconvergence effects due to a structured layout of the triangles. Figure 5.10 shows L 2 and H 1 errors and EOCs for the three approximations u h , u * , u * * with polynomial degrees p = 1, 2, 3. It can be seen that, in general, the postprocessor u * improves the EOC by an order of 1 in the H 1 norm and that the EOC of the improved postprocessor u * * is at least as good. While the actual error of u * can be larger on coarser grids than the error computed with u h , the error using u * * is significantly better in all cases. Focusing now on the L 2 norm, we see that when computing the error using u * * , the EOC is one order better then the convergence rate in the H 1 norm, as expected. For p = 2, 3, this is also true when using u * , while for p = 1 the EOC is only 2 in this case, and an increase to 3 is only achieved with the improved postprocessor u * * . The same observation can be made when using the SIAC postprocessor in the previous section.
Using the same problem setting, we investigate the performance of an adaptive algorithm in Figure 5.11. We use a modified equal distribution strategy where elements are marked for refinement when the local indicator η K exceeds η K #elements . We compute the local indicator on either u h or on the improved reconstruction u * * . The advantage of basing the marking strategy on u * * is clearly demonstrated. While marking with respect to u h and then using the postprocessor only on the final solution (filled upward triangles) leads to a significant reduction of the final error, the difference in the convergence rate between R h and R * * results   in a finer grid than necessary for a given tolerance. A reduction in the number of degrees of freedom by a factor of 10 to 100 can be easily achieved by using R * * .
For our final test we study a reentrant corner type problem, i.e., Ω = (−1, 1) 2 \ ([0, 1] × [−1, 0]) using a regular triangulation. First we choose the well known exact solution u ∈ H   the simplicity of the solution away from the corner, the postprocessing does not only not improve the EOC but can even lead to a slight increase in the overall error clearly noticeable in the H 1 error for the p = 2 case. This is even more obvious when the postprocessor, u * , is used directly. Alternatively, going from u * to u * * leads to an approximation which is very close to the original u h in all cases. Although the results for the globally refined grid are not that promising, the postprocessing nevertheless has considerable benefits when adapting the grid using the residual indicator based on u * * . Indeed Figure 5.13 shows that, for a given number of dofs, mesh adaptation based on R * * produces an approximation u * * which has a much smaller error than u h (on a mesh constructed using R h ).
For a more challenging test, especially for p = 3, we construct the forcing function so that the exact solution is u(x, y) = ω(x, y)u corner (x, y), where u corner is the solution to the above corner problem and ω(x, y) = − sin 3 2 π(1 − x 2 )(1 − y 2 ) . The function u still has the same corner singularity but is also smooth. However, the challenging nature of this solution is that it has large gradients towards the outer boundaries. Results for p = 2, 3 are summarized in Figures 5.14 and 5.16. The final grids for p = 3 are shown in Figure  5.15 using R h and R * * to mark cells for refinement. In both cases 22 steps were needed and the resulting grids have 1597 and 4540 cells (20725 and 7381 degrees of freedom), respectively. While the corner is highly refined in both cases, the regions that are smooth but strongly varying in their solution are far less refined when using R * * . When using R h the final errors are u − u h dG ≈ 7.4 · 10 −4 and u − u * * dG ≈ 6.8 · 10 −4 while adaptivity based on R * * results in errors of the size u−u h dG ≈ 3.9·10 −3 and u−u * * dG ≈ 7.0·10 −4 . Because of the corner singularity, using the postprocessor after finishing the refinement (based on R h ) does not lead to a significant improvement while basing the adaptive process on R * * leads to an almost identical error while requiring only 35% of the cells. Figure 5.17 shows the efficiency index for all three test cases on globally refined grids. The results seem to indicate that there is only a slight increase in the efficiency index 6. Summary Discussion. In this article, we provide a strategy for improving existing post-processing strategies for numerical solutions of a model elliptic problem. The main idea is to modify the post-processed solution so that it satisfies Galerkin orthogonality. We prove various a priori type results showing desirable convergence properties of the orthogonal post-processor including an increased order of accuracy in the L 2 norm. We supported the analysis with numerical examples using two types of post-processors -that of SIAC and SPR -approximating smooth and non-smooth solutions.
In addition to the a priori results, we provide a reliable and efficient a posteriori error estimator for the orthogonal post-processed solution, it should be noted tht no such estimator is available for u − u * . We demonstrate in several examples that much more efficient meshes are obtained when adaptation is based on R * * than when refinement is based on R h and the post-processor is only applied to the numerical solution on the final mesh.      . Efficiency index on globally refined grids for smooth problem with p = 1, 2, 3 (left), simple corner problem with p = 1, 2 (middle), and extended corner problem with p = 2, 3 (right)