1 Introduction

Discontinuous Galerkin (DG) methods are a staple in the cascade of Galerkin approaches for the numerical solution of PDEs in divergence form. For the discretization of elliptic operators, the class of interior penalty DG (IPDG) methods is arguably the most popular choice due to its versatility and simplicity in implementation. The consistency of the IPDG method is achieved by the inclusion of specific terms involving integrals of solution fluxes on element interfaces. The basic, yet revolutionary, idea in IPDG methods is the further addition of penalization terms involving the jump of the approximate solution (or its derivatives for higher order elliptic PDEs) to control the departure from the PDE solution space and, therefore, to achieve stability in a consistent fashion. The IPDG method design concept was presented in full by Baker [4] and, shortly afterwards, by Wheeler [28] and Arnold [2], thereby providing a consistent extension to the classical work of Nitsche [23] for the weak imposition of boundary conditions and to the classical inconsistent finite element method with penalty of Babuška [3]. These classical IPDG methods and their hp-version variants [21, 25] are widely used to this day.

Classical interior penalty discontinuous Galerkin (IPDG) methods for diffusion problems require a number of assumptions on the local variation of mesh-size, polynomial degree, and of the diffusion coefficient to determine the values of the, so-called, discontinuity-penalization parameter and/or to perform error analysis. Variants of IPDG methods involving weighted averages of the normal flux functions of the approximate solution have been proposed [6, 15, 16] in the context of high-contrast diffusion coefficients to mitigate the dependence on the contrast in the stability and in the error analysis. More recently, weighted averages involving local meshsizes have been proposed for treating interface problems [1, 20] on non-matching grids on either side of the interface.

To fix ideas, we shall focus on the basic second order elliptic boundary value problem on an open domain \(\Omega \subset \mathbb {R}^d\), \(d\in \mathbb {N}\): find \(u\in H^1(\Omega )\) such that

$$\begin{aligned} \begin{aligned} -\nabla \cdot (a\nabla u) =&f \quad \text { in } \Omega ;\\ u=&g \quad \text { on } \partial \Omega , \end{aligned} \end{aligned}$$
(1.1)

with data \(f\in L_2(\Omega )\), \(g\in L_2(\partial \Omega )\) and a positive definite diffusion tensor \(a\in [L_\infty (\Omega )]^{d\times d}\). Although considerable extensions to the problem can be incorporated immediately into the developments presented below, we refrain from exhausting the generalization possibilities in the interest of clarity of exposition.

Classical IPDG methods for the problem (1.1) involve a user-defined quantity, the so-called penalty or discontinuity-penalization parameter. The judicious choice of this parameter is crucial for the stability of the IPDG method, but also for its practical behaviour regarding approximation quality and conditioning. In particular, it is well known that an excessively large penalty parameter typically leads to increased ill-conditioning of the IPDG stiffness matrix. At the same time, the penalty parameter has to be chosen large enough to ensure stability of the method.

In this work, we focus on the behaviour of interior penalty methods in the presence of highly locally variable meshes, local polynomial degrees, and/or diffusion coefficients a. As we will see below, in such extreme, often combined, local variation scenarios of discretization parameters and/or coefficients, the classical IPDG method typically requires excessively large penalization to retain stability. In fact, it is possible that classical IPDG penalty parameter choices degenerate to infinity while the dimension of the approximation spaces remains finite. This, in turn, may be detrimental to the conditioning of the problem and, in certain cases, to the quality of the approximation.

To resolve this state of affairs, we present a new weighted-type interior penalty discontinuous Galerkin method which is provably stable with a new choice of the discontinuity-penalization parameter and is robust with respect to extreme local variations in the discretization parameters and in the diffusion coefficient. The method, henceforth termed robust IPDG (RIPDG), replaces the classical average-of-the-normal-flux term(s) in the IPDG by suitable, explicitly constructed weighted averages. This modification results in a robust and concrete selection of the discontinuity-penalization parameter, without resulting in excessive penalization in extreme combined local mesh, polynomial degree and coefficient variation scenarios. We note that a balanced choice for the discontinuity-penalization parameter is particularly important when no conforming subspace is available. Moreover, the new RIPDG method allows, for the first time in the interior penalty literature, for the proof of a priori error bounds without any local-quasi-uniformity/local bounded variation assumptions for the discretization parameters.

The modifications required to implement RIPDG starting from an available IPDG code are minimal and trivial. We envisage that the RIPDG method will be effective in the numerical approximation of complex, multiscale problems characterized by extreme local physical features necessitating highly non-uniform Galerkin approximation spaces. In particular, as we shall see in the numerical experiments below, RIPDG is particularly pertinent in scenarios where no conforming subspace of sufficient approximation capabilities is available. This is often the case when employing IPDG methods on meshes consisting of general polygonal/polyhedral (i.e., polytopic) elements, typically with many faces per element. These meshes can arise, among other processes, via mesh agglomeration procedures [5, 9,10,11]. Using elements resulting from the agglomeration of finer unstructured meshes, in an effort to reduce computational complexity, leads naturally to Galerkin spaces with inherent significant non-conformity. In such cases, the effect of discontinuity-penalization becomes evident and, in some extreme cases, appears to dominate the convergence behaviour (cf., Example 1 with the “zigzag” mesh below).

At the other end of the spectrum, when using uniform meshes and constant polynomial degree to solve problems with constant diffusion coefficients, RIPDG is identical to the classical IPDG. Moreover, within the usual setting of locally quasi-uniform meshes, even with locally bounded variable polynomial degree and small/medium contrast in the diffusion coefficient, RIPDG and IPDG offer essentially identical numerical solutions.

Indeed, numerical experiments indicate the favourable performance of the new RIPDG method over the classical version in terms of conditioning and error. In most cases the difference between IPDG and RIPDG is not significant, due to the fact that a sufficiently rich conforming subspace is present in the approximation. However, in certain extreme scenarios RIPDG offers significantly better approximation and/or conditioning. Although we expect that cases with IPDG outperforming RIPDG exist, despite extensive testing we have not yet been able to identify an example whereby the RIPDG method is inferior in terms of conditioning and/or error in various norms when compared to the classical IPDG method.

The remainder of this work is structured as follows. In Sect. 2 we review the classical IPDG method and the weighted variant from [6, 15, 16], along with the basic argument for proving coercivity and continuity of the respective bilinear form. In Sect. 3, we present the new RIPDG method and discuss its construction and properties, as well as the motivating differences in the choice of the penalty parameter. In Sects. 4 and 6 we provide extensions of RIPDG to polytopic meshes and to degenerate elliptic problems, respectively. In Sect. 7, we present the basic Strang’s-type Lemma satisfied by IPDG and RIPDG, showcasing that in the latter case the resulting a priori error bound is independent of any local variation in meshsize, polynomial degree and diffusion contrast. We conclude with a series of numerical experiments showcasing the benefits in using weighted averages in scenarios with high such local variations.

2 Interior Penalty Discontinuous Galerkin Methods

We consider meshes \(\mathcal {T}\) consisting of mutually disjoint open simplicial and/or box-type elements \(K\in \mathcal {T}\) so that \(\cup _{K\in \mathcal {T}}\bar{K} =\bar{\Omega }\). Let also \(h_{K}:={\text {diam}}(K)\) be the diameter of \(K\in \mathcal {T}\), and define the mesh-function \(\mathbf {h}:\cup _{K\in \mathcal {T}}K\rightarrow \mathbb {R}_+\) by \(\mathbf {h}|_{K}=h_{K}\), \(K\in \mathcal {T}\). Let also \(m_{K}\) denote the number of \((d-1)\)-dimensional faces of the element \(K\in \mathcal {T}\), i.e., \(m_K=d+1\) for a d-dimensional simplex and \(m_K= 2d\) for a d-dimensional box-type element, and define \(\mathbf {m}:\cup _{K\in \mathcal {T}}K\rightarrow \mathbb {R}_+\) by \(\mathbf {m}|_{K}=m_{K}\), \(K\in \mathcal {T}\). Further, we let \(\Gamma :=\cup _{K\in \mathcal {T}}\partial K\) denote the mesh skeleton and set \(\Gamma _{\mathrm{int}}:=\Gamma \backslash \partial \Omega \). The mesh skeleton \(\Gamma \) is decomposed into \((d-1)\)-dimensional simplicial or quadrilateral faces F, shared by at most two elements. These are distinct from elemental interfaces I, herein defined as the simply-connected components of the intersection between the boundaries of two neighbouring elements. As such, hanging nodes/edges are naturally permitted: an interface between two elements may consist of more than one face residing on the same hyperplane in the presence of hanging nodes. Moreover, as we will see below, we may wish to view each quadrilateral (inter)face of a three-dimensional box-type element mesh as the union of two simplicial faces. Let also \(\nabla _{\mathcal {T}}^{}\) denote the broken gradient defined as \(\nabla _{\mathcal {T}}^{}v|_K:= \nabla (v|_K)\), \(K\in \mathcal {T}\).

The discontinuous Galerkin space \(S_\mathcal {T}^{\mathbf{p}} \) with respect to \(\mathcal {T}\) is defined as

$$\begin{aligned} S_\mathcal {T}^{\mathbf{p}} :=\{v\in L_2(\Omega ) :v|_{K}\in \mathcal {P}_{p_K^{}}(K),\,K\in \mathcal {T}\}, \end{aligned}$$

with \(\mathcal {P}_{p}(K)\) denoting the space of d-variate polynomials of total degree up to p on \(K\), and \(\mathbf {p}:\cup _{K\in \mathcal {T}}K\rightarrow \mathbb {R}_+\) with \(\mathbf {p}|_{K}=p_{K}^{}\), \(K\in \mathcal {T}\). The local elemental polynomial spaces employed within \(S_\mathcal {T}^{\mathbf{p}} \) are defined in the physical coordinate system, i.e., without mapping from a given reference or canonical frame (cf.  [9,10,11]). All developments discussed below are also valid for the more ‘classical’ case of the discontinuous Galerkin spaces constructed through the usual mapping \(\mathbf {F}_K:\hat{K}\rightarrow K\) from a suitable reference element \(\hat{K}\), viz.,

$$\begin{aligned} S_{\mathcal {T},map}^{\mathbf{p}} :=\{v\in L_2(\Omega ) :v|_K\circ \mathbf {F}_{K}\in \mathcal {P}_{p_K^{}}(\hat{K}),\,K\in \mathcal {T}\}. \end{aligned}$$

Let \(K_+\) and \(K_-\) be two adjacent elements of \(\mathcal {T}\) sharing a face \(F\subset \partial K_+ \cap \partial K_- \subset \Gamma _{\mathrm{int}}\). For v and \(\mathbf {q}\) element-wise continuous scalar- and vector-valued functions, respectively, we define the weighted average across F by

$$\begin{aligned} \{\!\{v\}\!\}_{w_F^{}}|_F:=w_+v_+|_F+w_-v_-|_F, \quad \{\!\{\mathbf {q}\}\!\}_{w_F^{}}|_F:=w_+\mathbf {q}_+|_F+w_-\mathbf {q}_-|_F, \end{aligned}$$

for \(w_F:=(w_+,w_-)\in \mathbb {R}_+^2\) with \(w_+ + w_-=1\), respectively, with \(v_\pm |_F\) denoting the trace of v from the element \(K_\pm \) on F, and correspondingly for \(\mathbf {q}\). Also, we define the jump across F by

$$\begin{aligned}{}[\![v]\!]|_F :=v_+\mathbf {n}_+ +v_-\mathbf {n}_- , \quad [\![\mathbf {q}]\!]|_F := \mathbf {q}_+\cdot \mathbf {n}_+ +\mathbf {q}_-\cdot \mathbf {n}_-, \end{aligned}$$

with \(\mathbf {n}_\pm \) denoting the unit outward normal of \(K_\pm \) on F. We collect all weight pairs in a weight function \(\mathbf {w}:\Gamma _\mathrm{int}\rightarrow \mathbb {R}_+^2\), with \(\mathbf {w}|_F=w_F^{}\), for each face \(F\subset \Gamma _\mathrm{int}\). On a boundary face \(F\subset \partial \Omega \cap \partial K\), we set \(\{\!\{v\}\!\}:=v,\) \({\{\!\{\mathbf {q}\}\!\}:=\mathbf {q},}\) \([\![v]\!] :=v\mathbf {n} ,\) and \( [\![\mathbf {q}]\!] := \mathbf {q}\cdot \mathbf {n} \), respectively, with \(\mathbf {n}\) the unit outward normal to \( \partial \Omega \). In view of the latter, we extend \(\mathbf {w}\) to \(\Gamma \) by setting \(\mathbf {w}|_{\partial \Omega }=(1,0)\) with \(K_-=\emptyset \) there.
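For illustration only, the trace operators just defined can be evaluated pointwise on a face as in the following minimal Python sketch. The function names and the tuple representation of vectors are assumptions made here and are not part of the method's description; the sketch also uses \(\mathbf {n}_-=-\mathbf {n}_+\) on interior faces.

```python
def weighted_average(q_plus, q_minus, w_plus):
    """{{q}}_w = w_+ q_+ + w_- q_- with w_- = 1 - w_+ (vectors given as tuples)."""
    w_minus = 1.0 - w_plus
    return tuple(w_plus * qp + w_minus * qm for qp, qm in zip(q_plus, q_minus))


def jump_scalar(v_plus, v_minus, n_plus):
    """[[v]] = v_+ n_+ + v_- n_- = (v_+ - v_-) n_+, a vector."""
    return tuple((v_plus - v_minus) * n for n in n_plus)


def jump_vector(q_plus, q_minus, n_plus):
    """[[q]] = q_+ . n_+ + q_- . n_- = (q_+ - q_-) . n_+, a scalar."""
    return sum((qp - qm) * n for qp, qm, n in zip(q_plus, q_minus, n_plus))
```

On a boundary face, calling these helpers with `w_plus = 1.0` and zero exterior traces reproduces the conventions \(\{\!\{\mathbf {q}\}\!\}=\mathbf {q}\) and \([\![v]\!]=v\mathbf {n}\) stated above.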

The (weighted) interior penalty discontinuous Galerkin method reads: find \(u_h\in S_\mathcal {T}^{\mathbf{p}} \), (or in \(S_{\mathcal {T},map}^{\mathbf{p}} \),) such that

$$\begin{aligned} B(u_h,v_h)=\ell (v_h),\qquad \text {for all } v_h\in S_\mathcal {T}^{\mathbf{p}} , \end{aligned}$$
(2.1)

with

$$\begin{aligned} \begin{aligned} B(u_h,v_h):=&\ \int _\Omega a\nabla _{\mathcal {T}}^{}u_h\cdot \nabla _{\mathcal {T}}^{}v_h\,\mathrm {d}x +\int _\Gamma \sigma [\![ u_h]\!]\cdot [\![v_h]\!]\,\mathrm {d}s \\&-\int _\Gamma \big (\{\!\{a\nabla u_h\}\!\}_{\mathbf {w}}\cdot [\![v_h]\!]+\theta \{\!\{a\nabla v_h\}\!\}_{\mathbf {w}}\cdot [\![u_h]\!]\big )\,\mathrm {d}s, \end{aligned} \end{aligned}$$

and

$$\begin{aligned} \begin{aligned} \ell (v_h):=&\ \int _\Omega f v_h\,\mathrm {d}x +\int _{\partial \Omega } g(\sigma v_h-\theta a\nabla v_h\cdot \mathbf {n}) \,\mathrm {d}s , \end{aligned} \end{aligned}$$

with \(\theta \in [-1,1]\) and \(\sigma :\Gamma \rightarrow \mathbb {R}\) the, so-called, discontinuity-penalization or penalty function, whose precise definition, along with that of the weights \(\mathbf {w}\) is a central concern in this work. Note that, selecting \(w_F^{}=(1/2,1/2)\), for all \(F\subset \Gamma _\mathrm{int}\), we retrieve the classical IPDG method of Wheeler [28] and of Arnold [2], while the case \(w_F^{}=(1,0)\) or \(w_F^{}=(0,1)\) for all \(F\subset \Gamma _\mathrm{int}\) gives the original method of Baker [4]. Moreover, in [6, 15], the weighted average choice \(w_F^{}=(\delta _+,\delta _-)\) with

$$\begin{aligned} \delta _\pm := \frac{\alpha _\mp }{\alpha _++\alpha _-}, \end{aligned}$$

with \(\alpha _\pm :=a|_{K_\pm }\), was proposed for the case of element-wise constant scalar diffusion coefficients, while in [16] the selection \(\alpha _\pm :=\mathbf {n}_F^Ta|_{K_\pm }\mathbf {n}_F\) for element-wise constant diffusion tensors was provided, by combining the ideas from [6, 15] and [18]. More recently, weighted averages involving also local meshsizes have been proposed for treating interface problems [1, 20] on non-matching grids on either side of the interface. These choices result in a robust a priori error analysis with respect to the local relative size of the diffusion and/or the relative meshsize across interfaces. The main contribution of this work is the proposition of a different recipe for \(\mathbf {w}\), which will yield robust dependence not only with respect to the local variation of the diffusion, but also with respect to the local direction-wise variations in meshsize and polynomial degree. This will, in turn, yield a robust and rigorous choice for the discontinuity-penalization parameter \(\sigma \).
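As a small illustration of the diffusion-dependent weights \((\delta _+,\delta _-)\) recalled above, the following Python sketch evaluates them for scalar coefficients or, when a face normal is supplied, for tensors given as nested lists. The helper names `normal_diffusion` and `diffusion_weights` are hypothetical and are introduced only for this example.

```python
def normal_diffusion(a, n_F=None):
    """alpha = a for a scalar coefficient; alpha = n_F^T a n_F for a tensor."""
    if n_F is None:
        return float(a)
    dim = len(n_F)
    return sum(n_F[i] * a[i][j] * n_F[j] for i in range(dim) for j in range(dim))


def diffusion_weights(a_plus, a_minus, n_F=None):
    """Return (delta_+, delta_-) with delta_± = alpha_∓ / (alpha_+ + alpha_-)."""
    alpha_p = normal_diffusion(a_plus, n_F)
    alpha_m = normal_diffusion(a_minus, n_F)
    total = alpha_p + alpha_m
    return alpha_m / total, alpha_p / total
```

Note that \(\delta _+\alpha _+=\delta _-\alpha _-=\alpha _+\alpha _-/(\alpha _++\alpha _-)\), so the weighted normal flux coefficient never exceeds the smaller of the two diffusion values; this is the mechanism behind the robustness with respect to the contrast.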

The choice \(\theta =1\) in (2.1) yields the symmetric version, while the choice \(\theta =-1\) yields the non-symmetric version of the weighted IPDG method. Although we shall focus on the symmetric version in this work, as the most popular choice, the developments presented below are also valid for the whole range \(\theta \in [-1,1]\), with trivial modifications of the arguments and of the formulas.

To highlight the importance of the careful selection of \(\sigma \), we consider a d-dimensional simplicial triangulation \(\mathcal {T}\), we set \(\theta =1\), \(w_F=(1/2,1/2)\) for all faces \(F\subset \Gamma _\mathrm{int}\), and we study the coercivity of the bilinear form \(B(\cdot ,\cdot )\) on \(S_\mathcal {T}^{\mathbf{p}} \). To that end, for \(v_h\in S_\mathcal {T}^{\mathbf{p}} \), we have

$$\begin{aligned} \begin{aligned} B(v_h,v_h)=&\ \Vert {\sqrt{a}\nabla _{\mathcal {T}}^{}v_h}\Vert _{L_2(\Omega )}^2 +\Vert {\sqrt{\sigma } [\![ v_h]\!]}\Vert _{L_2(\Gamma ) }^2 -2\int _\Gamma \{\!\{a\nabla v_h\}\!\}\cdot [\![v_h]\!]\,\mathrm {d}s, \end{aligned} \end{aligned}$$
(2.2)

with \(\{\!\{\cdot \}\!\}:=\{\!\{\cdot \}\!\}_{(1/2,1/2)}\) the usual arithmetic average used in standard IPDG methods. To further estimate the last term on the right-hand side of (2.2), we employ a trace inverse estimate, such as the one below.

Lemma 2.1

Let \(K\in \mathcal {T}\) be a d-dimensional simplex and let \(F\subset \partial K\) be one of its faces. For any \(v \in \mathcal {P}_p(K)\), we have

$$\begin{aligned} \Vert {v}\Vert _{L_2(F)}^2\le \frac{(p+1)(p+d)}{d}\frac{|F|}{|K|}\Vert {v}\Vert _{L_2(K)}^2. \end{aligned}$$
(2.3)

Proof

This follows immediately by combining a careful scaling argument with the main result from [27]. \(\square \)

We refer to [9] for a generalization of the above estimate to general curved polygonal/polyhedral elements with arbitrary number of faces. Note that for shape-regular elements, (2.3) yields the familiar version \( \Vert {v}\Vert _{L_2(\partial K)}^2\le c_\mathrm{inv} p^2/h_K\Vert {v}\Vert _{L_2(K)}^2, \) with the constant \(c_\mathrm{inv}>0\) depending on the aspect ratio of the element \(K\) and the dimension d. If, instead, we have \(v_h\in S_{\mathcal {T},map}^{\mathbf{p}} \), then \(c_\mathrm{inv}\) will also depend on the elemental maps \(F_K\) if the latter are non-affine.
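The estimate (2.3) can be checked in the simplest setting \(d=1\), where \(K=(0,h)\) and F is an endpoint. The Python sketch below is a sanity check under the assumption that \(|F|=1\) for a point face (counting measure), so that the right-hand side constant in (2.3) reads \((p+1)^2/h\). It computes the sharp constant \(\sup _v v(h)^2/\Vert {v}\Vert _{L_2(0,h)}^2\) exactly as \(e^TM^{-1}e\), with M the mass matrix of the monomial basis and e the vector of basis values at the endpoint; the function name is hypothetical.

```python
from fractions import Fraction


def sharp_trace_constant_1d(p, h=Fraction(1)):
    """Exact sup of v(h)^2 / ||v||^2_{L2(0,h)} over polynomials of degree <= p."""
    h = Fraction(h)
    n = p + 1
    # Mass matrix of the monomials 1, x, ..., x^p on (0, h).
    M = [[h ** (i + j + 1) / (i + j + 1) for j in range(n)] for i in range(n)]
    e = [h ** i for i in range(n)]  # monomials evaluated at the endpoint x = h
    # Solve M y = e by Gauss-Jordan elimination in exact rational arithmetic.
    A = [row[:] + [rhs] for row, rhs in zip(M, e)]
    for k in range(n):
        piv = next(r for r in range(k, n) if A[r][k] != 0)
        A[k], A[piv] = A[piv], A[k]
        A[k] = [x / A[k][k] for x in A[k]]
        for r in range(n):
            if r != k and A[r][k] != 0:
                A[r] = [x - A[r][k] * y for x, y in zip(A[r], A[k])]
    y = [A[i][n] for i in range(n)]
    return sum(ei * yi for ei, yi in zip(e, y))  # e^T M^{-1} e
```

For this one-dimensional configuration the computed value coincides with \((p+1)^2/h\), so the constant in (2.3) is attained there.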

Returning to (2.2), elementary calculations, along with (2.3), for faces \(F\subset \partial K_+\cap \partial K_-\) give

$$\begin{aligned} \begin{aligned}&\Big |2\int _\Gamma \{\!\{a\nabla u_h\}\!\}\cdot [\![v_h]\!]\,\mathrm {d}s \Big |\\&\quad \le {\sum _{F\subset \Gamma }\sum _{*\in \{+,-\}} \int _F |a\nabla u_h|_{K_*} [\![v_h]\!]|\,\mathrm {d}s}\\&\quad \le \sum _{F\subset \Gamma }\sum _{*\in \{+,-\}} C_\mathrm{inv,*}\Vert a\mathbf {n}|_{K_*}\Vert _{L_\infty (F)}\Vert {\nabla u_h}\Vert _{L_2(K_*)} \Vert {[\![v_h]\!]}\Vert _{L_2(F)}\\&\quad \le \sum _{F\subset \Gamma }\sum _{*\in \{+,-\}} C_{\mathrm{inv},*}\Vert a\mathbf {n}|_{K_*}\Vert _{L_\infty (F)}\Vert a^{-\frac{1}{2}}\Vert _{L_\infty (K_*)}\Vert {\sqrt{a}\nabla u_h}\Vert _{L_2(K_*)} \Vert {[\![v_h]\!]}\Vert _{L_2(F)}\\&\quad \le \frac{1}{2} \Vert {\sqrt{a}\nabla _{\mathcal {T}}^{}u_h}\Vert _{L_2(\Omega )}^2+\frac{1}{2}\sum _{F\subset \Gamma }\tau _F\Vert {[\![v_h]\!]}\Vert _{L_2(F)}^2, \end{aligned} \end{aligned}$$

with \(C_{\mathrm{inv},*}:=\sqrt{p_{K_*}^{} (p_{K_*}^{}+d-1)|F|/(d|K_*|)}\),

$$\begin{aligned} \tau _F:=2\max _{*\in \{+,-\}} m_{K_*}^{}C_{\mathrm{inv},*}^2\Vert a\mathbf {n}|_{K_*}\Vert _{L_\infty (F)}^2\Vert a^{-1}\Vert _{L_\infty (K_*)}, \end{aligned}$$
(2.4)

and \(m_{K_*}\) the number of faces of the element \(K_*\), defined above; note that \(\nabla _{\mathcal {T}}^{}u_h\in [S_\mathcal {T}^{\mathbf{p}-\mathbf{1}} ]^d\), i.e., the discontinuous Galerkin space defined by the same mesh having polynomial basis of one degree less than \(S_\mathcal {T}^{\mathbf{p}} \).
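To make the dependence of (2.4) on the local discretization parameters concrete, the following Python sketch evaluates \(\tau _F\) from per-element data. The dictionary keys (`p`, `m`, `F`, `K`, `a_n`, `a_inv`) and the function names are assumptions made for this example; `a_n` stands for \(\Vert a\mathbf {n}|_{K_*}\Vert _{L_\infty (F)}\) and `a_inv` for \(\Vert a^{-1}\Vert _{L_\infty (K_*)}\).

```python
def c_inv_sq(p, d, face_measure, elem_measure):
    """C_inv^2 = p (p + d - 1) |F| / (d |K|), as defined above."""
    return p * (p + d - 1) * face_measure / (d * elem_measure)


def tau_F(sides, d):
    """tau_F from (2.4); `sides` holds one dict per element sharing the face."""
    values = []
    for s in sides:
        c2 = c_inv_sq(s["p"], d, s["F"], s["K"])
        values.append(s["m"] * c2 * s["a_n"] ** 2 * s["a_inv"])
    return 2.0 * max(values)


# Example: two neighbours with very different sizes; the small one dominates.
big = {"p": 2, "m": 4, "F": 1.0, "K": 0.75, "a_n": 1.0, "a_inv": 1.0}
small = {"p": 2, "m": 4, "F": 1.0, "K": 0.25, "a_n": 1.0, "a_inv": 1.0}
tau = tau_F([big, small], d=2)
```

Since (2.4) takes the maximum over the two neighbouring elements, the element with the largest ratio \(|F|/|K_*|\), the highest degree, or the most unfavourable diffusion alone dictates the penalty on the shared face.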

These developments, in conjunction with (2.2), imply

$$\begin{aligned} \begin{aligned} B(v_h,v_h)\ge&\frac{1}{2}\Vert {\sqrt{a}\nabla _{\mathcal {T}}^{}u_h}\Vert _{L_2(\Omega )}^2 +\sum _{F\subset \Gamma }\int _F \big (\sigma -\frac{\tau _F}{2}\big ) [\![ u_h]\!]^2\,\mathrm {d}s, \end{aligned} \end{aligned}$$

upon setting \(v_h=u_h\). Therefore, the selection \(\sigma |_F = \tau _F\), for all faces \(F\subset \Gamma \) gives the coercivity estimate

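$$\begin{aligned} B(u_h,u_h)\ge \frac{1}{2}\Big (\Vert {\sqrt{a}\nabla _{\mathcal {T}}^{}u_h}\Vert _{L_2(\Omega )}^2 +\Vert {\sqrt{\sigma } [\![ u_h]\!]}\Vert _{L_2(\Gamma ) }^2\Big ). \end{aligned}$$
(2.5)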

The above choice of the discontinuity-penalization parameter \(\sigma \) is, to the best of our knowledge, the sharpest general estimate for simplicial meshes. An analogous choice of \(\sigma \) was proposed in [9, 11] for significantly more complex element shapes. We note that hanging nodes are permissible by viewing a simplex as a polytopic element with many co-planar faces and by modifying \(m_K\) accordingly. The case of box-type elements will be discussed in more detail in Sect. 5 below.

Let us define now an extension of the bilinear form \(B:S_\mathcal {T}^{\mathbf{p}} \times S_\mathcal {T}^{\mathbf{p}} \rightarrow \mathbb {R}\) for the special case at hand, to accept arguments from the larger space \(\mathcal {V}:=H^1(\Omega )+S_\mathcal {T}^{\mathbf{p}} \), viz., \(B:\mathcal {V}\times \mathcal {V}\rightarrow \mathbb {R}\) with

$$\begin{aligned} \begin{aligned} B({z},v):=&\int _\Omega a\nabla _{\mathcal {T}}^{}{z}\cdot \nabla _{\mathcal {T}}^{}v\,\mathrm {d}x +\int _\Gamma \sigma [\![{z}]\!]\cdot [\![v]\!]\,\mathrm {d}s \\&-\int _\Gamma \big (\{\!\{a\Pi _\mathbf{{p-1}} \nabla {z}\}\!\}\cdot [\![v]\!]+ \{\!\{a\Pi _\mathbf{{p-1}} \nabla v\}\!\}\cdot [\![{z}]\!]\big )\,\mathrm {d}s, \end{aligned} \end{aligned}$$

for all \({z},v\in \mathcal {V}\), with \(\Pi _\mathbf{{p-1}}: [L_2(\Omega )]^d \rightarrow [S_\mathcal {T}^{\mathbf{p}-\mathbf{1}} ]^d\) denoting the orthogonal \(L_2\)-projection operator onto \([S_\mathcal {T}^{\mathbf{p}-\mathbf{1}} ]^d\). Completely analogous arguments to the ones presented for the proof of (2.5), along with the stability of \(\Pi _\mathbf{{p-1}}\) in the \(L_2\)-norm, result in the continuity bound

(2.6)

Remark 2.2

If a Galerkin space other than \(S_\mathcal {T}^{\mathbf{p}} \) is used, one has to modify the coercivity and continuity analysis and, correspondingly, the definition of \(\sigma \). In particular, it may no longer be the case that \(\nabla _{\mathcal {T}}^{}u_h \in [S_\mathcal {T}^{\mathbf{p}-\mathbf{1}} ]^d\), but rather that \(\nabla _{\mathcal {T}}^{}u_h\) resides in the Galerkin space itself.

3 A Robust Interior Penalty Discontinuous Galerkin Method

We now propose a robust interior penalty discontinuous Galerkin method by taking advantage of appropriately selected weights in (2.1). In particular, we will select diffusion tensor, mesh and local polynomial degree dependent weights. To that end, upon considering (2.1), we have

$$\begin{aligned} \begin{aligned} B(v_h,v_h)=&\Vert {\sqrt{a}\nabla _{\mathcal {T}}^{}v_h}\Vert _{L_2(\Omega )}^2 +\Vert {\sqrt{\sigma } [\![ v_h]\!]}\Vert _{L_2(\Gamma ) }^2 -2\int _\Gamma \{\!\{a\nabla v_h\}\!\}_{\mathbf {w}}\cdot [\![v_h]\!]\,\mathrm {d}s. \end{aligned} \end{aligned}$$
(3.1)

Setting \(C_{\mathrm{inv},*}:=\sqrt{p_{K_*}^{} (p_{K_*}^{}+d-1)|F|/(d|K_*|)}\), using (2.3), and working as before, we get, for \(F\subset \partial K_+\cap \partial K_-\),

$$\begin{aligned} \begin{aligned}&\Big |2\int _\Gamma \{\!\{a\nabla u_h\}\!\}_{\mathbf {w}}\cdot [\![v_h]\!]\,\mathrm {d}s \Big |\\&\quad \le \ 2 \sum _{F\subset \Gamma }\int _F \Big (\sum _{*\in \{+,-\}} w_*\Vert a\mathbf {n}|_{K_*}\Vert _{L_\infty (F)}|\nabla u_h|_{K_*}| \Big )|[\![v_h]\!]|\,\mathrm {d}s\\&\quad \le 2 \sum _{F\subset \Gamma }\sum _{*\in \{+,-\}} C_{\mathrm{inv},*}w_*\Vert a\mathbf {n}|_{K_*}\Vert _{L_\infty (F)}\Vert a^{-\frac{1}{2}}\Vert _{L_\infty (K_*)}\Vert {\sqrt{a}\nabla u_h}\Vert _{L_2(K_*)} \Vert {[\![v_h]\!]}\Vert _{L_2(F)}. \end{aligned} \end{aligned}$$

Setting now

$$\begin{aligned} \zeta _*:=\Big (2\sqrt{m_{K_*}}C_{\mathrm{inv},*}\Vert a\mathbf {n}|_{K_*}\Vert _{L_\infty (F)}\Vert a^{-\frac{1}{2}}\Vert _{L_\infty (K_*)}\Big )^{-1}, \end{aligned}$$

for \(*\in \{+,-\}\), and selecting

$$\begin{aligned} w_* =\frac{\zeta _*}{\zeta _++\zeta _-}, \end{aligned}$$
(3.2)

we deduce

$$\begin{aligned} \begin{aligned} \Big |2\int _\Gamma \{\!\{a\nabla u_h\}\!\}_{\mathbf {w}}\cdot [\![v_h]\!]\,\mathrm {d}s \Big | \le&\ \frac{1}{2} \Vert {\sqrt{a}\nabla _{\mathcal {T}}^{}u_h}\Vert _{L_2(\Omega )}^2+\frac{1}{2}\sum _{F\subset \Gamma }(\zeta _++\zeta _-)^{-2}\Vert {[\![v_h]\!]}\Vert _{L_2(F)}^2. \end{aligned} \end{aligned}$$
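For the reader's convenience, we spell out the cancellation on which this step rests; it follows directly from the definition of \(\zeta _*\) and from (3.2):

$$\begin{aligned} 2 w_* C_{\mathrm{inv},*}\Vert a\mathbf {n}|_{K_*}\Vert _{L_\infty (F)}\Vert a^{-\frac{1}{2}}\Vert _{L_\infty (K_*)} =\frac{w_*}{\sqrt{m_{K_*}}\,\zeta _*} =\frac{1}{\sqrt{m_{K_*}}\,(\zeta _++\zeta _-)},\qquad *\in \{+,-\}. \end{aligned}$$

Hence the contributions from both sides of F carry the same factor \((\zeta _++\zeta _-)^{-1}\), while the remaining factor \(m_{K_*}^{-1/2}\) accounts for each element being visited once per face when Young's inequality is applied and the result is summed over \(F\subset \Gamma \).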

A reasonable selection for the discontinuity-penalization parameter is, therefore,

$$\begin{aligned} \sigma |_F^{}=(\zeta _++\zeta _-)^{-2}, \end{aligned}$$
(3.3)

with \(\zeta _-=0\) in the case of a boundary face.

Definition 3.1

The robust interior penalty discontinuous Galerkin method (RIPDG) is the Galerkin procedure defined by (2.1) with \(\mathbf {w}=(w_+,w_-)\) for \(w_*\) given by (3.2), \(*\in \{+,-\}\) and the discontinuity-penalization parameter selected as in (3.3).
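A possible implementation of the face data entering Definition 3.1 is sketched below in Python, under the assumption that the element-wise quantities (degree, number of faces, measures, and the two diffusion norms) are available as plain numbers; the function names and argument names are hypothetical. The degree is assumed to satisfy \(p\ge 1\), since \(C_{\mathrm{inv},*}\) vanishes for \(p=0\).

```python
import math


def zeta(p, d, m, face_measure, elem_measure, a_n=1.0, a_inv_sqrt=1.0):
    """zeta_* = (2 sqrt(m) C_inv ||a n||_inf ||a^{-1/2}||_inf)^{-1}, for p >= 1."""
    c_inv = math.sqrt(p * (p + d - 1) * face_measure / (d * elem_measure))
    return 1.0 / (2.0 * math.sqrt(m) * c_inv * a_n * a_inv_sqrt)


def ripdg_face_data(zeta_plus, zeta_minus=0.0):
    """Weights (3.2) and penalty (3.3); zeta_minus = 0 marks a boundary face."""
    total = zeta_plus + zeta_minus
    weights = (zeta_plus / total, zeta_minus / total)
    sigma = total ** -2
    return weights, sigma


# Two identical neighbours: the weights reduce to the arithmetic average.
z = zeta(p=1, d=2, m=4, face_measure=1.0, elem_measure=0.5)
w_int, sigma_int = ripdg_face_data(z, z)
w_bnd, sigma_bnd = ripdg_face_data(z)  # boundary face
```

On a boundary face the sketch returns the weights \((1,0)\), in agreement with the extension of \(\mathbf {w}\) to \(\partial \Omega \) introduced in Sect. 2.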

Remark 3.2

The elementary inequality \((\zeta _++\zeta _-)^{-2}\le (\max _{*\in \{+,-\}} \zeta _*)^{-2}\) results in the bound

$$\begin{aligned} \sigma |_F^{}\le 4\min _{*\in \{+,-\}} m_{K_*}C^2_{\mathrm{inv},*}\Vert a\mathbf {n}|_{K_*}\Vert ^2_{L_\infty (F)}\Vert a^{-\frac{1}{2}}\Vert ^2_{L_\infty (K_*)}. \end{aligned}$$
(3.4)

The estimate (3.4) shows that the RIPDG discontinuity-penalization parameter grows proportionally with \(\min _{*\in \{+,-\}} C_{\mathrm{inv},*}\) and is effectively independent of the local size of the diffusion coefficient. In contrast, for IPDG, the discontinuity-penalization parameter grows proportionally with \(\max _{*\in \{+,-\}} C_{\mathrm{inv},*}\) and is not robust with respect to the contrast; cf., (2.4).

Remark 3.3

The above argument still applies for different choices of element-wise polynomial Galerkin spaces by replacing suitably \(C_{\mathrm{inv},*}\) by an available upper bound of the respective inverse estimate constant for the space at hand.

Remark 3.4

(On inverse estimate constants) For different choices of element-wise Galerkin spaces, we may end up with unspecified/hard to estimate universal constants in the inverse estimate. We note carefully that any such constants cancel out in the definition of the weights \(\mathbf {w}\) and, thus, the weights are independent from such constants, ensuring the practical selection of weights in various approximation space scenarios. Of course, as is the case for the standard IPDG method, the RIPDG penalty parameter will depend proportionally on such universal constants.
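The cancellation described in this remark is easy to observe numerically. In the Python sketch below, `c_plus` and `c_minus` stand for the known parts of \(\sqrt{m_{K_*}}C_{\mathrm{inv},*}\Vert a\mathbf {n}|_{K_*}\Vert _{L_\infty (F)}\Vert a^{-\frac{1}{2}}\Vert _{L_\infty (K_*)}\) on either side, and `const` models an unknown universal constant multiplying the inverse estimate on both sides; all three names are assumptions made for illustration.

```python
def weights_and_penalty(c_plus, c_minus, const=1.0):
    """Weights (3.2) and penalty (3.3) when both inverse constants carry `const`."""
    z_plus = 1.0 / (2.0 * const * c_plus)
    z_minus = 1.0 / (2.0 * const * c_minus)
    total = z_plus + z_minus
    return (z_plus / total, z_minus / total), total ** -2


w_ref, sigma_ref = weights_and_penalty(2.0, 10.0)
w_scaled, sigma_scaled = weights_and_penalty(2.0, 10.0, const=7.5)
```

The weights are unchanged by `const`, whereas the penalty is multiplied by `const ** 2`.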

The proof of (2.6) is also then immediate upon considering the inconsistent formulation

$$\begin{aligned} \begin{aligned} B({z},v):=&\ \int _\Omega a\nabla _{\mathcal {T}}^{}{z}\cdot \nabla _{\mathcal {T}}^{}v\,\mathrm {d}x +\int _\Gamma \sigma [\![{z}]\!]\cdot [\![v]\!]\,\mathrm {d}s \\&-\int _\Gamma \big ({\{\!\{a\Pi _\mathbf{{p-1}} \nabla {z}\}\!\}_{\mathbf {w}}}\cdot [\![v]\!]+ {\{\!\{a\Pi _\mathbf{{p-1}} \nabla v\}\!\}_{\mathbf {w}}}\cdot [\![{z}]\!]\big )\,\mathrm {d}s, \end{aligned} \end{aligned}$$

for all \({z},v\in \mathcal {V}\), resulting in the continuity bound

(3.5)

We now compare the effects on the size of \(\sigma \) of the classical choice \(\mathbf {w}=(1/2,1/2)\) and of the one proposed herein for the averaging operator. Any significantly different behaviour is bound to occur in extreme mesh and/or polynomial degree local variation scenarios. To that end, we consider the problem (1.1) for \(d=2\) with \(a=I_{2\times 2}\), the identity matrix. The latter’s solution is approximated via (2.1) for \(\mathbf {w}=(1/2,1/2)\) and also via (2.1) with the choice (3.2), over a mesh consisting of two quadrilateral elements

$$\begin{aligned} K_1=(0,1-\delta )\times (0,1),\qquad K_2=(1-\delta ,1)\times (0,1), \end{aligned}$$

for \(\delta <1/2\) and with uniform local polynomial degree p. The discontinuity-penalization parameter will differ for the two variants only on the interior face \(F=\{1-\delta \}\times (0,1)\). For the classical IPDG method ((2.1) with \(\mathbf {w}=(1/2,1/2)\)), we calculate:

$$\begin{aligned} \sigma _F\equiv \sigma _F^\mathrm{IP}=4p(p+1)/\delta , \end{aligned}$$

whereas for the choice (3.2), we get

$$\begin{aligned} \sigma _F\equiv \sigma _F^\mathrm{RIP}=\frac{8p(p+1)}{(\sqrt{1-\delta }+\sqrt{\delta })^2} . \end{aligned}$$

So, as \(\delta \rightarrow 0\), we have \(\sigma _F^\mathrm{IP}\rightarrow \infty \), while \( \sigma _F^\mathrm{RIP}\rightarrow 8 p(p+1). \) This extreme scenario highlights the vastly different penalization patterns of the two variants. With the new choice of the weighted version of IPDG ((2.1) with (3.2)), we observe robust behaviour with respect to the local mesh variation.

Completely analogously, we now set \(\delta =1/2\), so that \(K_1\) and \(K_2\) are identical in terms of shape, and we set \(p_{K_1}^{}=1\) and \(p_{K_2}^{}=p>1\). We then compute

$$\begin{aligned} \sigma _F^\mathrm{IP}=8p(p+1)\rightarrow \infty , \end{aligned}$$

as \(p\rightarrow \infty \), whereas

$$\begin{aligned} \sigma _F^\mathrm{RIP}= 16\big ( (p(p+1))^{-1/2}+2^{-1/2}\big )^{-2}\rightarrow 32. \end{aligned}$$
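Both comparisons can be reproduced with a few lines of Python. The sketch below evaluates (2.4) and (3.3) on the interior face of the two-element mesh with \(a=I_{2\times 2}\); the values \(m_K=4\) and the constant \(C_{\mathrm{inv},*}\) with \(d=2\) are assumptions made here because they reproduce the displayed formulas, and the function name is hypothetical.

```python
import math


def penalties(p1, p2, delta, d=2, m=4):
    """Return (sigma_IP, sigma_RIP) on F = {1 - delta} x (0, 1) with a = I."""
    # |F| = 1, |K_1| = 1 - delta, |K_2| = delta.
    sides = ((p1, 1.0 - delta), (p2, delta))
    mc2 = [m * p * (p + d - 1) * 1.0 / (d * K) for p, K in sides]
    sigma_ip = 2.0 * max(mc2)  # sigma = tau_F from (2.4)
    zetas = [1.0 / (2.0 * math.sqrt(v)) for v in mc2]
    sigma_rip = sum(zetas) ** -2  # (3.3)
    return sigma_ip, sigma_rip


# Mesh variation: shrink K_2 at fixed degree p = 2.
for delta in (0.25, 1e-2, 1e-4, 1e-6):
    print(delta, penalties(2, 2, delta))

# Degree variation: delta = 1/2, p_{K_1} = 1, growing p_{K_2}.
for p in (2, 8, 32, 128):
    print(p, penalties(1, p, 0.5))
```

In the first loop the classical penalty grows like \(1/\delta \) while the RIPDG penalty approaches \(8p(p+1)\); in the second loop the classical penalty grows like \(p^2\) while the RIPDG penalty stays below 32.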

Having the penalty parameter converging to infinity may become an issue in terms of the conditioning of the resulting linear systems. Therefore, when a conforming subspace of the DG space with the same approximation properties exists, e.g., for the case of a simplicial mesh, one expects only conditioning differences between the two variants. However, in the more practically interesting case whereby there is no conforming subspace of full order, e.g., for the case of \(S_\mathcal {T}^{\mathbf{p}} \) on a mesh \(\mathcal {T}\) consisting of box-type (quadrilateral/hexahedral) elements, \(\sigma _F^\mathrm{IP}\rightarrow \infty \) may also compromise the approximation.

4 Extension to General Polytopic Elements

The classical IPDG has been put forward as a method of choice for the development of Galerkin methods on meshes consisting of extremely general polygonal/polyhedral elements; see, e.g., [5, 9,10,11] and the references therein. A key development for practical applications has been their ability to admit provably stable approximations on elements \(K\) with arbitrary number of faces F, possibly with \(|F|\ll |K|^{1/d}\). The weight selection in the robust IPDG method presented above depends on the number of elemental faces. Applying the method in the form (2.1) with the selection (3.2) may lead to spurious, excessive over-penalization as \(m_{K_*}\rightarrow \infty \).

Fortunately, it is often possible to use non-overlapping simplicial subdivisions of the element \(K\) to effectively arrive at a bounded \(m_K\). More specifically, a number of extremely mild geometric conditions are presented in [9] (cf. also [10] for earlier ideas in this vein) so that a polygonal/polyhedral element \(K\), possibly containing curved faces, admits an inverse estimate of the form

$$\begin{aligned} \Vert {v}\Vert _{L_2(F_i)}^2\le C^2_{\mathrm{inv}}(p,F_i,K)\Vert {v}\Vert _{L_2(K)}^2, \end{aligned}$$
(4.1)

for any \(v \in \mathcal {P}_p(K)\) and for \(\{F_i\}_{i=1}^{m_K}\) a mutually disjoint subdivision of \(\partial K\), with \(C_{\mathrm{inv}}(p,F_i,K)>0\) a concretely determined expression; see [9, Lemma 4.21] for the exact formula. A key idea in [9] is that each \(F_i\) is star-shaped with respect to one interior point \(x_i\) of \(K\), \(i=1,\dots ,m_K\), and may contain an arbitrary number of (possibly very small in size) \((d-1)\)-dimensional faces. Following the same line of argument as in Sect. 3, we can define a robust IPDG method on meshes consisting of general, possibly curved, polygonal/polyhedral elements by selecting

$$\begin{aligned} \zeta _*:=\Big (2\sqrt{m_{K_*}^{}}C_{\mathrm{inv}}(p,F_i,K)\Vert a\mathbf {n}|_{K_*}\Vert _{L_\infty (F)}\Vert a^{-\frac{1}{2}}\Vert _{L_\infty (K_*)}\Big )^{-1}, \end{aligned}$$

and retaining the definitions of (3.2) and (3.3). We note carefully that in the case of general polytopic meshes (4.1) may contain an unknown constant which, nevertheless, cancels out in the definition of the weights as discussed in Remark 3.4, yielding an explicit choice for the weights \(\mathbf {w}\).

5 On Box-Type Elements

In Sect. 3, the discussion focused on simplicial meshes, for which RIPDG weights and discontinuity-penalization parameters that are free of unknown constants were provided. The main technical tool has been the availability of a trace inverse estimate with explicitly known constants for the simplicial reference element [27], also presented in Lemma 2.1 for an arbitrary simplex.

For box-type (quadrilateral, hexahedral, etc.) element meshes, classical discontinuous Galerkin literature and popular available implementations in software libraries propose the use of ‘Q-type’ element bases (i.e., polynomial spaces of degree p in each variable) in conjunction with nonlinear element mappings. In other words, we are in the setting of \(S_{\mathcal {T},map}^{\mathbf{p}} \) defined above, in complete analogy with the classical developments in conforming finite element methods. In this case, the trace inverse estimate becomes

$$\begin{aligned} \Vert {v}\Vert _{L_2(F)}^2\le (p_K^{}+1)^2\Vert J_{\mathbf {F}_K}|_{F\circ \mathbf {F}_K}\Vert _{L_\infty (F\circ \mathbf {F}_K)}^{}\Vert J_{\mathbf {F}_K}^{-1}\Vert _{L_\infty (K)}^{}\Vert {v}\Vert _{L_2(K)}^2, \end{aligned}$$
(5.1)

with \(J_{\mathbf {F}_K}\) denoting the Jacobian of the mapping \(\mathbf {F}_K:(0,1)^d\rightarrow K\), for a box-type element \(K\in \mathcal {T}\). The proof follows by employing Fubini’s Theorem in conjunction with Lemma 2.1 for \(d=1\) to show the trace inverse estimate

$$\begin{aligned} \Vert {v\circ \mathbf {F}_K}\Vert _{L_2(F\circ \mathbf {F}_K)}^2\le (p_K^{}+1)^2\Vert {v\circ \mathbf {F}_K}\Vert _{L_2((0,1)^d)}^2, \end{aligned}$$

for any \((d-1)\)-dimensional interface \(F\circ \mathbf {F}_K\) of the reference hypercube \((0,1)^d\), along with a standard scaling argument. This setting was alluded to in Remark 3.4 above. Therefore, employing (5.1) to construct the weights from RIPDG will involve the Jacobian of the element maps; this may be somewhat inconvenient in certain practical scenarios.
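The one-dimensional ingredient of this argument can be checked directly. The following is a minimal sketch, not part of the original analysis: it computes the sharp constant in \(|v(1)|^2\le C\Vert v\Vert _{L_2(0,1)}^2\) over polynomials of degree \(p\) in exact rational arithmetic, using the monomial basis, whose mass matrix on \((0,1)\) is the Hilbert matrix. The function names `solve_exact` and `sharp_trace_constant` are illustrative choices.

```python
from fractions import Fraction


def solve_exact(A, b):
    """Solve A x = b by Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row with a non-zero pivot and move it into place.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        piv = M[col][col]
        M[col] = [entry / piv for entry in M[col]]
        # Eliminate the current column from every other row.
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]


def sharp_trace_constant(p):
    """Return max |v(1)|^2 / ||v||^2_{L2(0,1)} over polynomials v of degree <= p.

    With v = sum_i c_i x^i, the quotient is (c.b)^2 / (c^T M c), where
    M_ij = 1/(i+j+1) is the mass matrix and b_i = 1 is the trace at x = 1.
    Its maximum over c equals b^T M^{-1} b.
    """
    n = p + 1
    mass = [[Fraction(1, i + j + 1) for j in range(n)] for i in range(n)]
    trace = [Fraction(1)] * n
    return sum(solve_exact(mass, trace))


for p in range(5):
    print(p, sharp_trace_constant(p), (p + 1) ** 2)
```

In this sketch the computed value coincides with \((p+1)^2\), which suggests that the constant in the reference estimate above cannot be improved for \(d=1\).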

An alternative point of view was proposed in [11] (see also [7,8,9,10, 13, 14] for further results), whereby the finite element space is defined via \(S_\mathcal {T}^{\mathbf{p}} \) for box-type elements also; that is, element-wise polynomials of total degree p are employed even for box-type elements, without the use of non-linear mappings. As was shown numerically in [8, 11] and proven in [14], this choice results in improved spectral convergence under p-refinement in various settings, since fewer basis functions per element are used. In this spirit, we may view a box-type element \(K\in \mathcal {T}\) as the union of d non-overlapping simplicial elements and, correspondingly, each \((d-1)\)-dimensional box-type interface can be viewed as a union of more than one \((d-1)\)-dimensional simplicial faces. In such a setting, we can employ the developments of Sect. 4 to define the RIPDG weights and discontinuity-penalization parameter, exactly as in the case of a general polytopic element. The advantage of this approach is that nonlinear mappings are not involved in the constants, at the expense of a “notional subdivision” of the box-type element into non-overlapping, simplicial subelements. Note that this process does not increase the number of degrees of freedom: it is performed locally only for the definition of the weights and of \(\sigma \).
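The saving in basis functions per element can be quantified with the standard dimension formulas for total-degree and tensor-product polynomial spaces. The short sketch below is illustrative only; the function names are hypothetical.

```python
from math import comb


def dim_total_degree(p, d):
    """Dimension of P_p: polynomials of total degree at most p in d variables."""
    return comb(p + d, d)


def dim_tensor_degree(p, d):
    """Dimension of Q_p: polynomials of degree at most p in each of d variables."""
    return (p + 1) ** d


# Compare the two local space dimensions in two and three space dimensions.
for d in (2, 3):
    for p in (1, 2, 4, 8):
        print(f"d={d}, p={p}: P_p has {dim_total_degree(p, d)}, "
              f"Q_p has {dim_tensor_degree(p, d)} basis functions")
```

For large p the ratio of the two dimensions approaches \(1/d!\), so the reduction becomes more pronounced in higher dimensions.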

6 Robust IPDG for Degenerate Elliptic Problems

It is possible to have well-posed elliptic problems of the form (1.1) with diffusion tensors a for which \(a\mathbf {n}=\mathbf {0}\), or even \(a=\mathbf {0}\), on lower-dimensional manifolds of \(\Omega \) [24]. In such cases, the choice of weights from (3.2) has to be revisited. To that end, we consider the inconsistent robust interior penalty discontinuous Galerkin method reading: find \(u_h\in S_\mathcal {T}^{\mathbf{p}} \), (or in \(S_{\mathcal {T},map}^{\mathbf{p}} \),) such that

$$\begin{aligned} \tilde{B}(u_h,v_h)=\tilde{\ell }(v_h),\qquad \text {for all } v_h\in S_\mathcal {T}^{\mathbf{p}} , \end{aligned}$$
(6.1)

with

$$\begin{aligned} \begin{aligned} \tilde{B}(u_h,v_h):=&\int _\Omega a\nabla _{\mathcal {T}}^{}u_h\cdot \nabla _{\mathcal {T}}^{}v_h\,\mathrm {d}x +\int _\Gamma \sigma [\![ u_h]\!]\cdot [\![v_h]\!]\,\mathrm {d}s \\&-\int _\Gamma \big (\{\!\{\sqrt{a}\Pi _{\mathbf {p-1}}(\sqrt{a}\nabla u_h)\}\!\}_{\mathbf {w}}\cdot [\![v_h]\!]+\theta \{\!\{\sqrt{a}\Pi _{\mathbf {p-1}}(\sqrt{a}\nabla v_h)\}\!\}_{\mathbf {w}}\cdot [\![u_h]\!]\big )\,\mathrm {d}s, \end{aligned} \end{aligned}$$

and

$$\begin{aligned} \begin{aligned} \tilde{\ell }(v_h):=&\int _\Omega f v_h\,\mathrm {d}x +\int _{\partial \Omega } g\big (\sigma v_h-\theta \sqrt{a}\Pi _{\mathbf {p-1}}(\sqrt{a}\nabla v_h\cdot \mathbf {n})\big ) \,\mathrm {d}s , \end{aligned} \end{aligned}$$

for \(\theta \in [-1,1]\), with \(\Pi : [L_2(\Omega )]^d \rightarrow [S_\mathcal {T}^{\mathbf{p}} ]^d\) (or, \( \Pi : [L_2(\Omega )]^d \rightarrow [S_{\mathcal {T},map}^{\mathbf{p}} ]^d\),) denoting the orthogonal \(L_2\)-projection operator onto the dG space. The method was first presented in [18] for the case \(\mathbf {w}=(1/2,1/2)\) and with \(\Pi _{\mathbf {p-1}}\) replaced by \(\Pi _{\mathbf {p}}\). For simplicity of the presentation, we shall again focus on the symmetric case \(\theta =1\); the cases \(\theta \in [-1,1)\) follow via trivial modifications of the arguments given below.

From (2.3), and working as before, we get, for \(F\subset \partial K_+\cap \partial K_-\),

$$\begin{aligned} \begin{aligned}&\Big |2\int _\Gamma \{\!\{\sqrt{a}\Pi _{\mathbf {p-1}}(\sqrt{a}\nabla u_h)\}\!\}_{\mathbf {w}}\cdot [\![v_h]\!]\,\mathrm {d}s \Big |\\&\quad \le 2 \sum _{F\subset \Gamma }\int _F \Big (\sum _{*\in \{+,-\}} w_*\Vert \sqrt{a}\mathbf {n}|_{K_*}\Vert _{L_\infty (F)}|\Pi _{\mathbf {p-1}}(\sqrt{a}\nabla u_h)|_{K_*}| \Big )| [\![v_h]\!]|\,\mathrm {d}s\\&\quad \le 2 \sum _{F\subset \Gamma }\sum _{*\in \{+,-\}} C_{\mathrm{inv},*}w_*\Vert \sqrt{a}\mathbf {n}|_{K_*}\Vert _{L_\infty (F)}\Vert {\sqrt{a}\nabla u_h}\Vert _{L_2(K_*)} \Vert {[\![v_h]\!]}\Vert _{L_2(F)}, \end{aligned} \end{aligned}$$

using the stability of the orthogonal \(L_2\)-projection operator. Setting now

$$\begin{aligned} \tilde{\zeta }_*:=\Big (2\sqrt{m_{K_*}}C_{\mathrm{inv},*}\Vert \sqrt{a}\mathbf {n}|_{K_*}\Vert _{L_\infty (F)}\Big )^{-1}, \end{aligned}$$

for \(*\in \{+,-\}\), and selecting, again,

$$\begin{aligned} w_* =\frac{\tilde{\zeta }_*}{\tilde{\zeta }_++\tilde{\zeta }_-}, \end{aligned}$$
(6.2)

we deduce

$$\begin{aligned} \begin{aligned} \Big |2\int _\Gamma \{\!\{\sqrt{a}\Pi _{\mathbf {p-1}}(\sqrt{a}\nabla u_h)\}\!\}_{\mathbf {w}}\cdot [\![v_h]\!]\,\mathrm {d}s \Big | \le&\ \frac{1}{2} \Vert {\sqrt{a}\nabla _{\mathcal {T}}^{}u_h}\Vert _{L_2(\Omega )}^2 +\frac{1}{2}\sum _{F\subset \Gamma }\sigma |_F\Vert {[\![v_h]\!]}\Vert _{L_2(F)}^2, \end{aligned} \end{aligned}$$

with \(\sigma |_F:=(\tilde{\zeta }_++\tilde{\zeta } _-)^{-2}, \) for every face \(F\subset \Gamma \). If the limit \(\sqrt{a}\mathbf {n}|_{K_*}\rightarrow \mathbf {0}\) (from within the element interior to the face) is valid a.e. on F for exactly one of \(*\in \{+,-\}\), we have \(w_*\rightarrow 1\), with the other weight in the pair tending to zero; also, in this case we have \(\sigma \rightarrow 0\). Further, if we have \(\sqrt{a}\mathbf {n}|_{K_\pm }\rightarrow \mathbf {0}\) from within the element interior to the face for both traces a.e. on F, we adopt the convention that both \(w_\pm \rightarrow 0\) and \(\sigma \rightarrow 0\) on F (i.e., we no longer enforce that \(w_++w_-=1\)).
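The selection of weights and penalty on a single face, including the two degenerate limits just described, can be sketched as follows. This is a minimal illustration under the assumption that the quantities \(m_{K_*}\), \(C_{\mathrm{inv},*}\) and \(\Vert \sqrt{a}\mathbf {n}|_{K_*}\Vert _{L_\infty (F)}\) are supplied by the caller; the function names are hypothetical, and a vanishing flux norm is represented by an infinite \(\tilde{\zeta }_*\).

```python
import math


def zeta_tilde(m_K, c_inv, flux_norm):
    """Return (2 sqrt(m_K) C_inv ||sqrt(a) n||)^(-1); infinite if the flux norm vanishes."""
    denom = 2.0 * math.sqrt(m_K) * c_inv * flux_norm
    return math.inf if denom == 0.0 else 1.0 / denom


def face_weights_and_penalty(zeta_plus, zeta_minus):
    """Return (w_plus, w_minus, sigma) on one interior face, following (6.2)."""
    plus_degenerate = math.isinf(zeta_plus)
    minus_degenerate = math.isinf(zeta_minus)
    if plus_degenerate and minus_degenerate:
        # Convention for degeneracy on both sides: weights and penalty vanish.
        return 0.0, 0.0, 0.0
    if plus_degenerate:
        # Limit of (6.2) as zeta_plus grows without bound.
        return 1.0, 0.0, 0.0
    if minus_degenerate:
        return 0.0, 1.0, 0.0
    total = zeta_plus + zeta_minus
    return zeta_plus / total, zeta_minus / total, total ** -2


# Example: m_K = 3 faces per element, with different constants on the two sides.
z_plus = zeta_tilde(3, 2.0, 1.0)
z_minus = zeta_tilde(3, 5.0, 0.1)
print(face_weights_and_penalty(z_plus, z_minus))
```

In the non-degenerate branch the weights sum to one and satisfy \(w_*\sigma ^{-1/2}=\tilde{\zeta }_*\) by construction.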

7 A Priori Error Analysis

We now show that standard a priori error bounds hold for all the robust IPDG methods described in the previous sections, assuming only sufficient regularity of the exact solution. In particular, no ‘local bounded variation/local quasi-uniformity’ mesh and polynomial degree assumptions are required, in contrast to what is standard in the error analysis of the classical IPDG method with \(\mathbf {w}=(1/2,1/2)\).

More specifically, let \(u\in H^{3/2+\epsilon }(\Omega )\), \(\epsilon >0\). Elementary calculations imply

$$\begin{aligned} B(u_h,w_h) = B(u,w_h) + \int _\Gamma r(u) \cdot [\![w_h]\!]\,\mathrm {d}s, \qquad w_h\in S_\mathcal {T}^{\mathbf{p}} , \end{aligned}$$
(7.1)

with \(r(u):=\{\!\{a\nabla u-\Pi _{\mathbf {p-1}}(a\nabla u)\}\!\}_{\mathbf {w}}\) for the robust IPDG from Sect. 3, or \(r(u):=\{\!\{a\nabla u-\sqrt{a}\Pi _{\mathbf {p-1}}(\sqrt{a}\nabla u)\}\!\}_{\mathbf {w}}\) for the variant from Sect. 6.

Then, coercivity and continuity, along with standard calculations, imply

(7.2)

To estimate the inconsistency term, assuming further that \(u\in H^2(\Omega )\), we make use of the best approximation estimate from [12] (see also [17, 21] for earlier results on box-type elements): there exists a constant \(C_\mathrm{ap}>0\), independent of v and of p such that

$$\begin{aligned} \Vert {v-\Pi _{\mathbf {p}}v}\Vert _{{L_2(\hat{F})}}^2\le C_\mathrm{ap} (p+1)^{-1}\Vert {\nabla v}\Vert _{L_2(\hat{K})}^2 , \end{aligned}$$
(7.3)

for any face \(\hat{F}\subset \partial \hat{K}\) of a reference simplicial element \(\hat{K}\). Employing a standard scaling argument, (7.3) implies for an element \(K\in \mathcal {T}\) the following best approximation estimate:

$$\begin{aligned} \Vert {v-\Pi _{\mathbf {p}}v}\Vert _{{L_2(F)}}^2\le C_\mathrm{ap} \frac{|F|h_K^2}{|K|(p+1)}\Vert {\nabla v}\Vert _{L_2(K)}^2. \end{aligned}$$
(7.4)

Upon observing the identity \(w_*\sigma ^{-1/2}=\zeta _*\) (or \(w_*\sigma ^{-1/2}=\tilde{\zeta }_*\) for the method of Sect. 6), we have, respectively, for any \(\mathbf {v}_h\in [S_\mathcal {T}^{\mathbf{p}-\mathbf{1}} ]^d\),

$$\begin{aligned} \begin{aligned}&\int _\Gamma r(u)\cdot [\![w_h]\!]\,\mathrm {d}s\\&\quad \le \sum _{F\subset \Gamma }\sum _{*\in \{+,-\}} \zeta _*\Vert a\mathbf {n}|_{K_*}\Vert _{L_\infty (F)}\Vert {\big (\nabla u -\mathbf {v}_h-\Pi _{\mathbf {p-1}}(\nabla u -\mathbf {v}_h)\big )|_{K_*}}\Vert _{{L_2(F)}} \Vert {\sqrt{\sigma }[\![w_h]\!]}\Vert _{{L_2(F)}}\\&\quad \le \sum _{F\subset \Gamma }\sum _{*\in \{+,-\}} \frac{\sqrt{C_{\mathrm{ap}}|F|}h_{K_*}}{\sqrt{|K_*|p_{K_*}}}\zeta _*\Vert a\mathbf {n}|_{K_*}\Vert _{L_\infty (F)}|\nabla u-\mathbf {v}_h|_{H^1(K_*)} \Vert {\sqrt{\sigma }[\![w_h]\!]}\Vert _{{L_2(F)}}\\&\quad \le \sum _{F\subset \Gamma }\sum _{*\in \{+,-\}} \frac{\Vert \sqrt{a}\Vert _{L_\infty (K_*)}\sqrt{dC_{\mathrm{ap}}}}{{2}\sqrt{m_{K_*}}\sqrt{ p_{K_*}^{}+d-1}}\frac{h_{K_*}}{p_{K_*}^{}}|\nabla u-\mathbf {v}_h|_{H^1(K_*)} \Vert {\sqrt{\sigma }[\![w_h]\!]}\Vert _{{L_2(F)}}\\&\quad \le \Big (\frac{dC_{\mathrm{ap}}}{{2}}\sum _{K\in \mathcal {T}} \Vert \sqrt{a}\Vert _{L_\infty (K)}^2\frac{h^2_{K}}{p_{K}^{3}}|\nabla u-\mathbf {v}_h|_{H^1(K)}^2\Big )^{\frac{1}{2}} \Vert {\sqrt{\sigma }[\![w_h]\!]}\Vert _{{L_2(\Gamma )}}, \end{aligned} \end{aligned}$$

using (7.4) in the first step and the definition of \(\zeta _*\) in the penultimate step. The last estimate concerns the robust IPDG method from Sect. 3 and holds under the regularity assumption \(u\in H^2(\Omega )\).

Correspondingly, for the method of Sect. 6, assuming that a is such that \(\sqrt{a}\nabla u \in [H^1(\Omega )]^d\) (instead of \(u\in H^2(\Omega )\) as before), we arrive at

$$\begin{aligned} \begin{aligned} \int _\Gamma r(u)\cdot [\![w_h]\!]\,\mathrm {d}s \le&\ \Big ( \frac{dC_{\mathrm{ap}}}{{2}}\sum _{K\in \mathcal {T}} \frac{h^2_{K}}{p_{K}^{3}}|\sqrt{a}\nabla u-\mathbf {v}_h|_{H^1(K)}^2\Big )^{\frac{1}{2}} \Vert {\sqrt{\sigma }[\![w_h]\!]}\Vert _{{L_2(\Gamma )}}. \end{aligned} \end{aligned}$$

Combining the above developments with (7.2), we deduce

(7.5)

for the robust IPDG of Sect. 3, or

for the method of Sect. 6, respectively. These basic estimates, together with standard hp-version best approximation results, give rise to standard a priori error bounds. Although not considered in detail here in the interest of brevity, following essentially the same steps as above, it is also possible to prove a priori error bounds for the robust IPDG method on polygonal/polyhedral elements from Sect. 4, with the possible exception of considering simplicial coverings of the elements \(K\in \mathcal {T}\) along with extension operators; we refer to [9,10,11] for details.

The crucial development in the a priori error bounds above, compared to all respective bounds we are aware of in the literature, is that the best approximation estimates are completely localized element-wise. This is in contrast to standard IPDG error bounds requiring local quasi-uniformity/bounded variation assumptions for the mesh size and local polynomial degree. This property is expected to be of importance on approximation space scenarios admitting extreme local variation.

Remark 7.1

In the above best approximation results we have taken a somewhat “classical” viewpoint of assuming sufficient regularity of the exact solution. More recent developments, e.g., the so-called medius analysis [19] or the use of recovery operators [26], can provide quasi-optimality under minimal regularity assumptions on the exact solution. These approaches, however, are currently available only for the h-version IPDG under standard local quasi-uniformity mesh and diffusion coefficient bounded local variation assumptions, for which the robust IPDG method presented in this work offers little or no advantage. Nonetheless, the extension of such “optimal” error analysis frameworks to the RIPDG setting is certainly of interest.

8 Numerical Experiments

We now present a series of numerical experiments showcasing the comparative performance between the classical IPDG and RIPDG. All numerical experiments below use the locally total degree p approximation space per element \(S_\mathcal {T}^{\mathbf{p}} \) on box-type or general polytopic element shapes, noting carefully that no conforming subspace of \(S_\mathcal {T}^{\mathbf{p}} \) of the same degree is available. This choice of approximation spaces has been advocated in [9,10,11] leading to optimal approximation yet reduced complexity for box-type elements.

Given that the effect of high contrast diffusion coefficient has been discussed extensively in [6, 15, 16], we confine our study to the effects of extreme mesh and polynomial degree variation. Nonetheless, we expect that the inclusion of high contrast diffusion coefficient in the spirit of [6, 16] in conjunction with extreme local variation scenarios will only augment the behaviours presented below. We stress that in all comparisons presented below both methods have exactly the same numerical degrees of freedom as they are applied to the same approximation space; the two respective implementations differ only in the choice of the weights \(\mathbf {w}\) and, correspondingly, in the choice of the penalty parameter. Therefore, any differences observed are due to the two different variants of the IPDG method implemented in the otherwise identical program, and not due to any other algorithmic factors (e.g., choice of quadrature or condition number estimation routines, etc.).

8.1 Example 1: Singularly Perturbed Reaction-Diffusion Problem

Fig. 1. Example 1. The 9 element mesh

We begin by testing the method on anisotropic rectangular elements in the context of the layer-adapted hp-version IPDG/RIPDG method. Consider the problem

$$\begin{aligned} -\epsilon \Delta u + u =f, \qquad \text {on } \Omega := (-1,1)^2, \end{aligned}$$

with homogeneous Dirichlet boundary conditions and f an analytic function chosen so that the exact solution reads

$$\begin{aligned} u(x,y):= \Big (1-\frac{\cosh (x/\sqrt{\epsilon })}{\cosh (1/\sqrt{\epsilon })} \Big ) \Big (1-\frac{\cosh (y/\sqrt{\epsilon })}{\cosh (1/\sqrt{\epsilon })}\Big ). \end{aligned}$$
(8.1)

This example has been studied extensively in [17, 22]. The solution exhibits boundary layers of thickness \(\mathcal {O}(\sqrt{\epsilon })\).
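Since the solution (8.1) has the product form \(u=X(x)Y(y)\) with \(-\epsilon X''=1-X\) (and likewise for \(Y\)), the corresponding right-hand side simplifies to \(f=X+Y-XY\). The following sketch evaluates \(u\) and \(f\); the function names are illustrative, and the ratio of hyperbolic cosines is rewritten with exponentials as an implementation choice to avoid overflow for small \(\epsilon \).

```python
import math


def layer_profile(t, eps):
    """Evaluate 1 - cosh(t/sqrt(eps)) / cosh(1/sqrt(eps)) for |t| <= 1 without overflow."""
    s = math.sqrt(eps)
    # cosh(a)/cosh(b) = exp(a-b) * (1 + exp(-2a)) / (1 + exp(-2b)), with a = |t|/s, b = 1/s.
    ratio = (math.exp((abs(t) - 1.0) / s)
             * (1.0 + math.exp(-2.0 * abs(t) / s))
             / (1.0 + math.exp(-2.0 / s)))
    return 1.0 - ratio


def exact_solution(x, y, eps):
    """Exact solution (8.1)."""
    return layer_profile(x, eps) * layer_profile(y, eps)


def source_term(x, y, eps):
    """f = -eps * Laplace(u) + u, which reduces to X + Y - X*Y for u = X(x) * Y(y)."""
    X = layer_profile(x, eps)
    Y = layer_profile(y, eps)
    return X + Y - X * Y


print(exact_solution(0.0, 0.0, 1e-5), source_term(0.0, 0.0, 1e-5))
```

Away from the layers both \(u\) and \(f\) are close to one, while \(u\) vanishes on the boundary, in line with the homogeneous Dirichlet condition.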

To resolve the layers, we first use a layer-adapted anisotropic 9-element mesh with characteristic width \(l:=\min \{\lambda p \sqrt{\epsilon },0.5\}\), and some constant \(\lambda >0\); see Fig. 1 for an illustration. It was shown in [22] that conforming Galerkin methods on such meshes offer a robust exponential convergence under p refinement. Note that the maximum meshsize local variation constant r for the 9-element mesh is given by \( r:= {2(1-l)}/{l}. \) For \(\epsilon \ll 1\), we have \( r\approx 2/ ( \lambda p\sqrt{\epsilon } ). \)
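The growth of the mesh-size variation constant with decreasing \(\epsilon \) can be tabulated directly from the two formulas above. The sketch below uses hypothetical function names; the values \(\epsilon =10^{-5}\) and \(\lambda =0.9\) are those of the experiment reported in Fig. 2.

```python
import math


def layer_width(p, eps, lam):
    """Characteristic width l = min(lam * p * sqrt(eps), 0.5) of the layer elements."""
    return min(lam * p * math.sqrt(eps), 0.5)


def mesh_variation(p, eps, lam):
    """Maximum local mesh-size variation r = 2 (1 - l) / l of the 9-element mesh."""
    l = layer_width(p, eps, lam)
    return 2.0 * (1.0 - l) / l


eps, lam = 1e-5, 0.9
for p in range(1, 8):
    r = mesh_variation(p, eps, lam)
    r_asymptotic = 2.0 / (lam * p * math.sqrt(eps))
    print(f"p={p}: l={layer_width(p, eps, lam):.5f}, r={r:.1f}, "
          f"2/(lam*p*sqrt(eps))={r_asymptotic:.1f}")
```

Since \(r=2/l-2\), the asymptotic expression overestimates \(r\) by exactly 2 whenever \(l<0.5\).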

We compare the symmetric versions of the classical IPDG against the RIPDG method on the same meshes for polynomial degrees \(p=1,\dots ,7\). In Fig. 2, it is observed that both methods converge exponentially in both dG-norm and broken \(H^1\)-seminorm \(|\cdot |_{H^1(\Omega ,\mathcal {T})}:=\Vert \nabla _{\mathcal {T}}^{}\cdot \Vert _{L_2(\Omega )}\) against the square root of the total numerical degrees of freedom (DoFs) under p refinement for \(\epsilon =10^{-5}\) and \(\lambda =0.9\). The RIPDG appears to outperform IPDG in both measures.

Fig. 2. Example 1. Convergence in dG-norm (left) and broken \(H^1\)-seminorm (right) for \(\epsilon =10^{-5}\) and \(p=1,\dots ,7\)

Fig. 3. Example 1. Maximum of the penalty parameter (left) and the condition number of the linear system (right) for \(\epsilon =10^{-5}\) and \(p=1,\dots ,7\)

Fig. 4. Example 1. “Zigzag” meshes for \(p=2\) (top left), \(p=3\) (top right), \(p=5\) (bottom left), and \(p=7\) (bottom right)

Fig. 5. Example 1. “Zigzag” mesh. Maximum of the penalty parameter (left) and the condition number of the linear system (right) for \(\epsilon =10^{-3}\) and \(p=1,\dots ,8\)

Fig. 6. Example 1. “Zigzag” mesh. Convergence in dG-norm (left), broken \(H^1\)-seminorm (right), for \(\epsilon =10^{-3}\) and \(p=1,\dots ,8\)

Fig. 7. Example 2. u for \(\alpha =100\) (left) and mesh and local polynomial degree distribution for the results of Table 1 (right)

In Fig. 3, we compare the magnitudes of the global maximum of the penalty parameter values \(\max _{F\subset \Gamma }\sigma _F^\mathrm{IP}\) and \(\max _{F\subset \Gamma }\sigma _F^\mathrm{RIP}\), respectively (left), as well as the condition numbers for each method based on the same choice of basis (right). As expected, RIPDG requires a far smaller penalty parameter in theory for stability compared to the standard IPDG. For \(p=1\), \(\max _{F\subset \Gamma }\sigma _F^\mathrm{IP}\) is 120 times larger than \(\max _{F\subset \Gamma }\sigma _F^\mathrm{RIP}\). This difference in the definition of the penalty parameter also explains the improvement in the condition number for RIPDG compared to IPDG by a factor of about 1.3 for \(p=1,\dots ,7\); see Fig. 3 (right).

In the above experiment, we observe only a modest improvement in the error and the conditioning by using RIPDG as opposed to IPDG. We highlight that the hp-version spaces used for convergence admit a conforming subspace of the same approximation capabilities.

We now employ a “zigzag” polygonal version of the 9 element mesh shown in Fig. 4 for \(p=2,3,5,7\). In particular, we replace the interior faces from Fig. 1 by “zigzag” curves of amplitude l/6 with respect to the length scale \(l=\min \{\lambda p\sqrt{\epsilon },0.5\}\); the latter still characterises the distance of each interface to the non-intersecting portion of the boundary. On this 9 element polytopic mesh, we solve the same problem using the RIPDG and IPDG methods, on physical (i.e., unmapped) element-wise polynomial spaces of uniform degree p for \({p=2},\dots , 8\) and \(\epsilon =10^{-3}\), with the penalty choice informed by the inverse estimate (4.1). We observe that in this mesh no conforming subspace of sufficient approximation capabilities is available.

Table 1. Example 2. \(\alpha =100\). Comparison on the mesh and polynomial degree distribution given in Fig. 7 (right)

Fig. 8. Mesh used for the solutions in Fig. 9 with \(p=1\) on all ‘small’ elements and \(p=5,8\) in the central ‘large’ element

Fig. 9. Solution profiles for RIPDG (left) and IPDG (right) for \(\alpha =10\), \(p=1\) on all ‘small’ elements and \(p=3\) (first line), \(p=5\) (second line), and \(p=8\) (last line) in the central ‘large’ element

Fig. 10. 37 agglomerated elements made by 131072 triangular meshes

Fig. 11. Exponential convergence in dG-norm (left) and broken \(H^1\)-norm (right) with \(p=1,\dots ,5\)

Fig. 12. The maximum penalty parameters (left) and the condition number of the linear system (right) \(p=1,\dots ,5\)

In Fig. 5, we compare the magnitudes of the global maximum of the penalty parameter values and the condition numbers for each method based on the same choice of basis. In Fig. 6, we compare the respective errors for \(p=1,\dots ,8\): it is observed that both methods converge exponentially in the dG-norm and in the broken \(H^1\)-seminorm against the square root of the total numerical degrees of freedom (DoFs) under p refinement for \(\epsilon =10^{-3}\) and \(\lambda =0.9\). The RIPDG appears to significantly outperform IPDG for higher p; the IPDG appears to stagnate for \(p=7,8\). We believe that the reason behind this behaviour is the IPDG’s penalization magnitude in conjunction with the essential non-conformity of the approximation space. On the other hand, RIPDG’s p-convergence appears not to be affected.

8.2 Example 2: Poisson Problem with Gaussian Solutions

We now test the behaviour of each method against rapidly changing local polynomial degree p. To that end, in \(\Omega : =(-1,1)^2\), we consider the Poisson problem, with \(a=I_{2\times 2}\) the \(2\times 2\)-identity matrix, with Dirichlet boundary conditions, and we choose f such that the exact solution u is given by

$$\begin{aligned} u(x,y): = \exp (-\alpha (x^2+y^2)), \end{aligned}$$
(8.2)

for some \(\alpha >0\), which is analytic; in Fig. 7 (left) u for \(\alpha =100\) is depicted.
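For reference, differentiating (8.2) twice gives \(\Delta u=(4\alpha ^2(x^2+y^2)-4\alpha )u\), so that \(f=-\Delta u=4\alpha (1-\alpha (x^2+y^2))u\). The sketch below evaluates both; the function names are hypothetical, and the Dirichlet data would presumably be taken as the trace of \(u\) on \(\partial \Omega \).

```python
import math


def gaussian_solution(x, y, alpha):
    """Exact solution (8.2): u = exp(-alpha (x^2 + y^2))."""
    return math.exp(-alpha * (x * x + y * y))


def gaussian_source(x, y, alpha):
    """Right-hand side f = -Laplace(u) = 4 alpha (1 - alpha r^2) exp(-alpha r^2)."""
    r2 = x * x + y * y
    return 4.0 * alpha * (1.0 - alpha * r2) * math.exp(-alpha * r2)


# The source changes sign on the circle r^2 = 1/alpha.
for alpha in (10.0, 100.0):
    print(alpha, gaussian_source(0.0, 0.0, alpha), gaussian_solution(1.0, 1.0, alpha))
```

For \(\alpha =100\) the solution is negligible near the boundary, so almost all variation is concentrated in the central element of the mesh considered next.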

For the first numerical experiment, we employ a 9-element uniform square mesh with polynomial degree \(p=2\) in the eight boundary elements and polynomial degree \(p=30\) on the interior element \(\tilde{K}:=(-1/3,1/3)^2\); this approximation space is presented in Fig. 7 (right). In Table 1, we present the comparison between IPDG and RIPDG in terms of the maximum penalty used, the condition number of the respective stiffness matrices, and three different error measures. As we can see, RIPDG outperforms IPDG in all respects, with the most striking improvement being in the stiffness matrix condition number.

The significant difference in the maximum of the penalty parameter confirms the respective discussion in Sect. 3. As a consequence, we can see that IPDG’s condition number is about 10 times larger than RIPDG’s for exactly the same problem and approximation space.

Finally, we demonstrate the qualitative difference in behaviour between the two methods resulting from the significant differences in the penalization magnitude. To that end, we consider the same problem with \(\alpha =10\), resulting in a significantly less sharp Gaussian bump at the centre of the domain. This problem is approximated on the mesh given in Fig. 8, with each elemental polynomial degree also given.

In Fig. 9 we provide the solution profiles for RIPDG (left) and IPDG (right) for \(\alpha =10\), \(p=1\) on all ‘small’ elements and \(p=3\) (first line), \(p=5\) (second line), and \(p=8\) (last line) in the central ‘large’ element. We observe significantly different behaviour at the ‘large’ central element interfaces, due to the substantially smaller penalization required by the RIPDG method. Interestingly, however, we also observe modestly different behaviour further away from the ‘large’ element interfaces, especially for \(p=3\).

Of course, one may not encounter or use approximation spaces such as the ones considered in this example. Nonetheless, such mesh and polynomial degree distributions may arise locally in the context of hp-adaptive procedures. In any case, this study reveals how extreme local variation of the approximation space may result in possibly inferior performance of classical approaches.

8.3 Example 3: Highly Agglomerated Meshes

We now consider the Poisson problem with Dirichlet boundary conditions admitting the smooth solution \(u:=\sin (\pi x)\sin (\pi y)\) on highly agglomerated meshes: we start from a fine simplicial mesh with 131,072 triangles, which is further agglomerated in a random fashion using standard partitioning tools into the 37 complicated polygonal elements shown in Fig. 10. The discontinuity penalization parameter is selected as described briefly in Sect. 4; we refer to [9] for details. On this fixed mesh, we compute the numerical solution for each method for uniform \(p=1,\dots ,5\); the resulting errors are presented in Fig. 11. As expected, the errors in all norms decay exponentially. Even though the polynomial degree is uniform and neighbouring elements are of ‘similar’ sizes, we can still observe an improvement in the errors for RIPDG against IPDG of about 25%. In Fig. 12, we can see that the maximum penalty and condition numbers for IPDG compared to RIPDG are approximately 4 and 2.5 times larger, respectively.