1 Introduction

The virtual element method (VEM) is a numerical framework introduced in [7] for the approximation of solutions of partial differential equations (PDEs). Key attributes of VEM are its ability to permit the use of meshes with very general polygonal/polyhedral elements [2, 4, 8, 9, 12, 18, 21] and the seamless incorporation of approximation spaces with arbitrary global regularity [12]. There has been a strong interest in recent years in the development of numerical methods on general polygonal/polyhedral meshes [7, 10, 19, 20, 22, 25,26,27, 30, 39, 44], not least due to the potential appeal of such mesh generality in the context of Lagrangian and/or adaptive refinement/coarsening algorithms. The virtual element method revolves around a virtual element space of trial and test functions, defined implicitly through local PDE problems on each element. The local spaces are designed to contain a space of (physical frame) polynomials, ultimately responsible for the accuracy of the method, as well as a complementary space of more general non-polynomial functions. In this respect, VEM belongs to the wide family of Generalised FEM [5], as do other approaches to general meshes such as the Polygonal FEM [39], numerical multiscale methods (see, e.g. [1, 28, 33] and the references therein), and, so-called, Trefftz-type methods in general [6, 29, 31, 38]. Quite differently from all the above approaches, though, the virtual element method does not require the evaluation of non-polynomial functions, even in a rough fashion. Instead, to produce fully computable and accurate VEM formulations on general meshes, the method’s degrees of freedom are carefully chosen so that relevant projections of the local virtual element functions into the local subspace of polynomials are computable. A crucial consequence of this approach is that the VEM computed solution is not available in the form of a (virtual element) function. Rather, the solution is represented via the values of its degrees of freedom, from which we can access, for instance, the piecewise polynomial projection of the corresponding complete virtual element function.

Given the virtual nature of the method, the design and analysis of fully computable a posteriori error bounds for VEM is a challenging task. In [13], a posteriori error bounds for the \(C^1\)-conforming VEM for the two-dimensional Poisson problem are proven. The \(C^1\)-continuity of the VEM space was employed to circumvent the fact that the inter-element normal fluxes of the virtual basis functions are not computable in the more standard \(C^0\)-conforming method. Furthermore, the analysis of [13] relies on a Clément-type interpolant construction requiring quadratic (or higher-order) virtual element spaces. To the best of our knowledge, [13] is the only a posteriori error analysis for VEM currently available in the literature.

In this work, we present a new residual-type a posteriori error analysis for the \(C^0\)-conforming VEM introduced in [21] for the discretization of second order linear elliptic reaction-convection-diffusion problems with non-constant coefficients in two and three dimensions. We circumvent the fact that the VEM solution normal fluxes are not computable by replacing them by a suitable projection of the fluxes instead, resulting in the introduction of virtual inconsistency terms in the a posteriori error estimator to account for this replacement error. Moreover, the analysis is based on a new Clément-type VEM interpolant in two and three dimensions, which, crucially, allows for minimal regularity interpolation by linear VEM functions. This new interpolant, which may be of independent interest, is constructed starting from the standard finite element Clément interpolant on a regular sub-triangulation; cf. also [35] for a related idea for a two-dimensional VEM interpolant. In two spatial dimensions, the resulting constants in the Clément interpolation estimate are dependent on the respective FEM Clément interpolant on a regular sub-triangulation, which are in principle available [40, 42], along with other computable quantities. In the three-dimensional case, when general polygonal element interfaces are present in the mesh, a second, not easily computable in general, constant appears; see Remark 12 for a detailed discussion. Once equipped with the above developments, a posteriori bounds are derived by careful treatment of the inconsistency terms, whereby appropriate projection operators are introduced into the discrete problem formulation; we refer to [37] for a related general framework for a posteriori analysis of inconsistent discontinuous Galerkin methods. Lemma 18 gives a lower bound for these inconsistency terms, indicating that they are of correct order up to data oscillation. Although the focus of this work is the VEM introduced in [21], the proof of the a posteriori error bounds is quite general and can be adapted in a straightforward manner to other VEM approaches, such as the VEM proposed in [9], cf. the discussion in Sect. 7 below.

Adaptive mesh refinement driven by a posteriori error estimators is a well established tool for the efficient numerical solution of PDEs exhibiting local, numerically challenging, solution features. In this context, the extreme mesh flexibility allowed by the VEM approach offers a number of potential advantages. For instance, locally adapted meshes do not require any local post-processing: very general polygonal/polyhedral meshes are admissible due to the physical frame polynomial subspaces included in the VEM space, therefore removing any restrictions posed by maximum angle conditions or mesh-distortion, as is the case for standard adaptive FEM. Moreover, VEM avoids the need to introduce additional degrees of freedom for hanging node/face removal (‘green refinement’) during mesh refinement: hanging nodes introduced by the refinement of a neighbouring element are simply treated as new nodes since adjacent co-planar elemental interfaces are perfectly acceptable. Furthermore, in the VEM context, coarsening becomes trivial and inexpensive to implement as node removal does not necessitate any further local mesh modification. The latter is particularly attractive in the context of numerical solution of evolution PDEs where mesh-coarsening is standard practice to track evolving fronts and singularities efficiently. Indeed, apart from making mesh change straightforward to implement, the mesh flexibility offered by VEM may have the potential to provide complexity reduction with respect to standard FEM on traditional simplicial or box-type meshes. At the time of writing, no results in this direction are available.

The remainder of this work is structured as follows. In Sect. 2, we describe the model problem and in Sect. 3 we introduce the virtual element method. Some fundamental approximation results are presented in Sect. 4, which are used to prove upper and lower bounds for an a posteriori error estimator in Sect. 5. This estimator is then used in Sect. 6 within an automatic mesh adaptivity algorithm in a series of numerical examples, confirming numerically its optimality. Finally, in Sect. 7 we give some concluding remarks.

Below, we shall use standard notation for the relevant function spaces. For a Lipschitz domain \(\omega \subset {\mathbb {R}}^d\), \(d=2,3\), we denote by \(H^s(\omega )\) the Hilbert space of index \(s\ge 0\) of real–valued functions defined on \(\omega \), endowed with the seminorm \(|\cdot |_{s,\omega }\) and norm \(\Vert \cdot \Vert _{s,\omega }\); further \((\cdot ,\cdot )_\omega \) stands for the standard \(L^2\) inner-product. Finally, \(|\omega |\) denotes the d–dimensional Hausdorff measure of \(\omega \).

2 The continuous problem

Let \(\varOmega \subset \mathbb {R}^d\) be a polygonal domain for \(d=2\) or a polyhedral domain for \(d=3\) and consider the linear second order elliptic boundary value problem

$$\begin{aligned} -\nabla \cdot (\varvec{\kappa }(\varvec{x})\nabla u) + \varvec{\beta }(\varvec{x}) \cdot \nabla u+ {\gamma }(\varvec{x}) u= & {} f(\varvec{x})\phantom {0} \quad \text {in}~\varOmega ,\nonumber \\ u= & {} 0\phantom {f(\varvec{x})}\quad \text {on}~\partial {\varOmega }. \end{aligned}$$

We assume that \( {\gamma }, f\in L^{\infty }(\varOmega )\), \(\varvec{\beta }\in [W^{1,\infty }(\varOmega )]^d\) and that \(\varvec{\kappa }\in [L^{\infty }(\varOmega )]^{d\times d}\) is a strongly elliptic symmetric diffusion tensor, i.e. there exist \(\kappa _*,\kappa ^*> 0\), independent of \(\mathbf {v}\) and \(\varvec{x}\), such that

$$\begin{aligned} \kappa _*{|\mathbf {v}(\varvec{x})|}^2 \le \mathbf {v}(\varvec{x}) \cdot \varvec{\kappa }(\varvec{x}) \mathbf {v}(\varvec{x}) \le \kappa ^*{|\mathbf {v}(\varvec{x})|}^2, \end{aligned}$$

for almost every \(\varvec{x}\in \varOmega \) and for any \(\mathbf {v} \in [H^1_0(\varOmega )]^{d}\), with \({|\cdot |}\) denoting the standard Euclidean norm on \(\mathbb {R}^d\). Finally, we assume that, for almost every \(\varvec{x}\in \varOmega \), there exists a constant \(\mu _0\) such that

$$\begin{aligned} \mu (\varvec{x}) := {\gamma }(\varvec{x}) - \frac{1}{2} \nabla \cdot \varvec{\beta }(\varvec{x}) \ge \mu _0> 0. \end{aligned}$$

Problem (2.1) can be written in variational form: Find \(u\in H^1_0(\varOmega )\) such that

$$\begin{aligned} (\varvec{\kappa }\nabla u, \nabla v) + (\varvec{\beta }\cdot \nabla u, v) + ({\gamma }u, v) = (f, v) \quad \quad \forall v\in H^1_0(\varOmega ), \end{aligned}$$

with \((\cdot ,\cdot )\) denoting the (standard) \(L^2\) inner-product over \(\varOmega \). Following [21], we split the bilinear form on the left-hand side of (2.4) into its symmetric and skew-symmetric parts

$$\begin{aligned} a(u,v)&:= (\varvec{\kappa }\nabla u, \nabla v)+\left( \mu u, v\right) , \end{aligned}$$
$$\begin{aligned} b(u,v)&:= \frac{1}{2} \left[ \left( \varvec{\beta }\cdot \nabla u, v\right) - \left( u, \varvec{\beta }\cdot \nabla v\right) \right] , \end{aligned}$$

and we consider the problem written in the equivalent form: find \(u\in H^1_0(\varOmega )\) such that

$$\begin{aligned} A(u,v) :=\, a(u,v) + b(u,v) = (f, v) \quad \quad \forall v\in H^1_0(\varOmega ). \end{aligned}$$

Rewriting the bilinear form in this fashion is a useful step in view of preserving the coercivity of A at the virtual (discrete) level, independently of the mesh size. An alternative VEM based on the original variational form (2.4) and without assuming coercivity is presented in [9], whose well-posedness relies on selecting sufficiently small mesh size. The a posteriori error analysis presented below can be also applied with to the method of [9], with minor modifications, cf. Sect. 7.

3 The virtual element method

3.1 Polygonal and polyhedral partitions

Let \(\{{\mathcal {T}}_h\}\) be a family of partitions of the domain \(\varOmega \) into non-overlapping simple polygonal/polyhedral elements with maximum size h; a polygon/polyhedron is termed simple when its boundary is not self-intersecting. We further assume that the boundary \(\partial E\) of each element \(E\in {\mathcal {T}}_h\) is made of a uniformly bounded number of interfaces: line segments if \(d=2\) and planar polygons with a uniformly bounded number of straight edges if \(d=3\). Elemental interfaces are either part of the boundary of \(\varOmega \) or shared with another element in the decomposition. By \(s\) we shall denote the generic \((d-1)\)-dimensional mesh interface (either an edge when \(d= 2\), or a face when \(d= 3\)) of a mesh element \(E\in {\mathcal {T}}_h\); the set of all mesh interfaces in \({\mathcal {T}}_h\) will be denoted by \({\mathcal {S}}_h\), which is subdivided into the set of boundary interfaces \({\mathcal {S}}_h^{\text {bdry}}:= \{ s\in {\mathcal {S}}_h: s\subset \partial {\varOmega }\}\) and the set of internal interfaces \({\mathcal {S}}_h^{\text {int}}:= {\mathcal {S}}_h{\setminus } {\mathcal {S}}_h^{\text {bdry}}\). Also, \(\nu _{E}\) will be the (uniformly bounded) number of interfaces \(s\in \partial {E}\).

We note that, in particular, partitions including non-convex elements are allowed, as also are elements with consecutive co-planar edges/faces, such as those typical of locally refined meshes with hanging nodes. We also make the following mesh regularity assumptions which are standard in this context, cf. [2, 21].

Assumption 1

(Mesh regularity) We assume the existence of a constant \(\rho > 0\) such that

  1. 1.

    Every element \(E\) of \({\mathcal {T}}_h\) is star-shaped with respect to a ball of radius \(\rho h_{E}\);

  2. 2.

    For every element \(E\) of \({\mathcal {T}}_h\) and every interface \(s\) of \(E\), \(h_{s} \ge \rho h_{E}\);

  3. 3.

    For \(d=3\), every interface \(s\in {\mathcal {S}}_h\) viewed as a 2-dimensional element satisfies assumptions 1 and 2 above.

Remark 2

(Global shape regularity) As in the a priori setting [7, 21], the a posteriori error analysis presented below extends in a straightforward fashion to the case of polygonal/polyhedral elements which result from simply connected finite union of sub-elements each satisfying Assumption 1. Moreover, the extension of the VEM a priori error analysis recently presented in [11] for the \(d=2\) case, indicates that it may be possible to relax the condition on the size of the interfaces. This hypothesis is explored numerically in Sect. 6. Therein, by not imposing any restrictions on the size of the edges in the mesh, we show that the performance of the method or the estimators are not affected in practice.

An immediate consequence of the above, simplifying, mesh regularity assumptions is that each element \(E\) admits a sub-triangulation \({\mathcal {T}}_h^{E}\), a partition of \(E\) into triangles when \(d=2\) and tetrahedra when \(d=3\), in such a way that the resulting global triangulation \(\widehat{{\mathcal {T}}_h}:= \bigcup _{E\in {\mathcal {T}}_h} {\mathcal {T}}_h^{E}\) is shape regular. For \(d=2\) this is obtained by joining each vertex of \(E\) with a point with respect to which \(E\) is starred. For \(d=3\) the same procedure can be applied starting from the corresponding triangulation of each face.

Throughout the paper, we denote by \(\varPi ^0_{\ell } : L^2(E) \rightarrow \mathcal {P}_{\ell }(E)\) the \(L^2(E)\)-orthogonal projection onto the space \(\mathcal {P}_{\ell }(E)\) of polynomials with total degree \(\ell \), for any \(E\in {\mathcal {T}}_h\) and \(\ell \in \mathbb {N}\cup \{0\}\).

3.2 Virtual element spaces

We begin by recalling the construction of the conforming virtual element space from [21]. For each \({\mathcal {T}}_h\) and \(p\in {\mathbb N}\), we shall construct a virtual element space \(V_h\subset H^1_0(\varOmega )\) of order \(p\) in an element-wise fashion; \(V_h\) will be of order \(p\in {\mathbb N}\) if, for each element \(E\in {\mathcal {T}}_h\), the space \(V_h^E:= V_h|_{E}\) contains the space \(\mathcal {P}_{p}(E)\) of polynomials of degree \(p\) on \(E\). In general, the space \(V_h^E\) will also contain non-polynomial functions. However, the distinctive idea of VEM is that of computability based on degrees of freedom, which stems from the view that the complement of \(\mathcal {P}_{p}(E)\) in \(V_h^E\) is made up of functions which are deemed expensive to evaluate.

Definition 3

(Computability) A term is computable if it may be evaluated using the data of the problem, the degrees of freedom, and the polynomial component of the virtual element space only.

We shall consider two types of degrees of freedom: nodal values and polynomial moments.

Definition 4

(Degrees of freedom) Let \(\omega \subset \mathbb {R}^d\), \(1\le d\le 3\), be an \(d\)-dimensional polytope, that is, a line segment, polygon, or polyhedron, respectively. For any regular enough function v on \(\omega \), we define the following sets of degrees of freedom:

  • \(\mathcal {N}^{\omega }\) are the nodal values. For a vertex \(\mathbf{z}\) of \(\omega \), \(\mathcal {N}^{\omega }_\mathbf{z}(v):=v(\mathbf{z})\) and \(\mathcal {N}^{\omega }:=\{\mathcal {N}^{\omega }_\mathbf{z}: \mathbf{z}\text { is a vertex}\}\);

  • \({\mathcal {M}}^{\omega }_{l}\) are the polynomial moments up to order l. For \(l\ge 0\),

    $$\begin{aligned}{\mathcal {M}}^{\omega }_{\varvec{\alpha }}(v)=\frac{1}{{|\omega |}} (v, m_{\alpha })_\omega \quad \text { with}\quad m_{\varvec{\alpha }} := \left( \frac{ \varvec{x}- \varvec{x}_{\omega }}{h_{\omega }}\right) ^{\varvec{\alpha }} \text { and}\quad {|\varvec{\alpha }|}\le l, \end{aligned}$$

    where \(\varvec{\alpha }\) is a multi-index with \({|\varvec{\alpha }|} := \alpha _1 +\cdots +\alpha _d\) and \(x^{\varvec{\alpha }} := x_1^{\alpha _1} \dots x_d^{\alpha _d}\) in a local coordinate system, and \(x_\omega \) denoting the barycentre of \(\omega \). Further, \({\mathcal {M}}^{\omega }_{l}=\{{\mathcal {M}}^{\omega }_{\varvec{\alpha }}:{|\varvec{\alpha }|}\le l\}\). The definition is extended to \(l=-1\) by setting \({\mathcal {M}}^{\omega }_{-1}:=\emptyset \).

The local virtual element space is constructed recursively in space dimensions. We first consider the case \(d=2\). On each edge interface \(s\in \partial {E}\) we take \(V_h^s:=\mathcal {P}_{p}(s)\) and we define the auxiliary space \(\mathcal {W}_h^{E}\) as

$$\begin{aligned} \mathcal {W}_h^{E}:= \left\{ v_h\in H^1(\varOmega ) : \varDelta v_h\in \mathcal {P}_{p}(E) \text { and } v_h|_{s} \in V_h^s\text { for all } s\in \partial {E}\right\} , \end{aligned}$$

noting that \(\mathcal {P}_{p}(E) \subset \mathcal {W}_h^{E}\subset C^0(\overline{E})\). The elements of \(\mathcal {W}_h^{E}\) can be uniquely identified by the following set of degrees of freedom [2, 21]:

$$\begin{aligned}\text {DoF}(\mathcal {W}_h^{E}) := \mathcal {N}^{E} \cup \{{\mathcal {M}}^{s}_{p-2} \text { for each edge interface } s\in \partial {E}\} \cup {\mathcal {M}}^{E}_{p}. \end{aligned}$$

These degrees of freedom make the terms \(\varPi ^0_{p} v_h\) and \(\varPi ^0_{p-1} \nabla v_h\) computable for any \(v_h\in \mathcal {W}_h^{E}\) [21]; for instance, the projection \(\varPi ^0_{p} v_h\) is given directly by the internal degrees of freedom \({\mathcal {M}}^{E}_{p}\).

The crucial property \(\mathcal {P}_{p}(E) \subset \mathcal {W}_h^{E}\) would still be satisfied by the smaller space obtained by requiring \( \varDelta v_h\in \mathcal {P}_{p-2}(E)\) instead of \( \varDelta v_h\in \mathcal {P}_{p}(E)\) in (3.1). This is, indeed, the original virtual element space introduced in [7]. However, since the elemental projection \(\varPi ^0_{p}\) is not computable in the original VEM space of [7], a different subspace of \(\mathcal {W}_h^{E}\) with the same dimension as the original space of [7] was introduced [2]. In the latter subspace definition, the \(L^2\)-projection onto \(\mathcal {P}_{p}(E)\) is computable using the extra higher-order moments.

The crucial observation is that, in the polynomial space \(\mathcal {P}_{p}(E)\subset \mathcal {W}_h^{E}\), the moments in \({\mathcal {M}}^{E}_{p}{\setminus } {\mathcal {M}}^{E}_{p-2}\) are redundant. Hence, it is possible to construct \(L^2\)-stable projection operators \(\varPi ^{*}_{p}:\mathcal {W}_h^{E}\rightarrow \mathcal {P}_{p}(E)\) which only depend on the reduced set of degrees of freedom

$$\begin{aligned} \text {DoF}(V_h^E) := \mathcal {N}^{E} \cup \{{\mathcal {M}}^{s}_{p-2} \text { for each edge interface } s\in \partial {E}\} \cup {\mathcal {M}}^{E}_{p-2}. \end{aligned}$$

In particular, and following [21], this can simply be taken as the projection corresponding to the Euclidean inner product on \(\text {DoF}(V_h^E)\). This is, indeed, the choice used in all numerical tests presented in Sect. 6.

Given any such projection operator \(\varPi ^{*}_{p}\), we can define a local virtual element space \(V_h^E\subset \mathcal {W}_h^{E}\) by clamping the internal higher-order moments:

$$\begin{aligned} V_h^E:= \Big \{ v_h\in \mathcal {W}_h^{E}: \left( v_h, m_\alpha \right) _{E} = \left( \varPi ^{*}_{p}v_h, m_\alpha \right) _{E} \,\,\, \forall m_\alpha \text { with } p-1\le |\alpha |\le p\Big \}.\nonumber \\ \end{aligned}$$

By construction, the space \(V_h^E\) is identified by the degrees of freedom \(\text {DoF}(V_h^E)\) of (3.2), used to define \(\varPi ^{*}_{p}\). A counting argument shows that the cardinality of the above sets of degrees of freedom is \(N_{E}= \nu _{E}+ \nu _{E}N_{1,p-2} + N_{2,p-2}\), where \(N_{d, k} := \dim \mathcal {P}_{k}(\mathbb {R}^{d})\). Representative examples are illustrated in Fig. 1. Further, we note that it is also possible to compute \(\varPi ^0_{p} v_h\) and \(\varPi ^0_{p-1} \nabla v_h\) for each \(v_h\in V_h^E\) just from the reduced set of degrees of freedom since we can access the higher order moments through \(\varPi ^{*}_{p}\); we refer to [21, Section 4.1] for the details.

Fig. 1
figure 1

The VEM degrees of freedom for a hexagonal mesh element for \(p=1,2,3,4\); nodal values and edge moments are marked by a circle; internal moments are marked by a square. The number of moments per edge is given by \(N_{1,p-2}=0,1,2,3\) and the number of internal moments is given by \(N_{2,p-2}=0,1,3,6\) for \(p=1,2,3,4\), respectively

For the case \(d=3\), we first (re-)define \(V_h^s\) on each face \(s\in \partial {E}\) to be the 2-dimensional virtual element space given by (3.3). The construction of the local virtual element space on \(E\) now follows by defining the auxiliary space \(\mathcal {W}_h^{E}\) of (3.1) and final space \(V_h^E\) of (3.3) in exactly the same way as the 2-dimensional case. In the 3-dimensional case, \(V_h^E\) is identified by the following set of degrees of freedom [21]:

$$\begin{aligned} \text {DoF}(V_h^E):= & {} \mathcal {N}^{E} \cup \{{\mathcal {M}}^{e}_{p-2} \text { for each edge } e\in \partial {E}\} \cup \{{\mathcal {M}}^{s}_{p-2} \nonumber \\&\text { for each face } s\in \partial {E}\} \cup {\mathcal {M}}^{E}_{p-2}. \end{aligned}$$

Therefore, the dimension of the local space for \(d= 3\) is \(N_{E}= \nu _{E}''+\nu _{E}'N_{1,p-2}+\nu _{E}N_{2,p-2}+N_{3,p-2}\) where \(\nu _{E}''\) and \(\nu _{E}'\) denote, respectively, the number of vertices and edges of \(E\), cf. [21]. Representative examples are illustrated in Fig. 2.

Fig. 2
figure 2

The VEM degrees of freedom for a cubic mesh element for \(p=1,2,3,4\); nodal values and edge moments are marked by a circle; face moments are marked by a hexagon; internal moments are marked by a square. Only the internal degrees of freedom and those of the visible faces and edges are marked. The numeric labels indicate the number of degrees of freedom when there are more than 1. The number of moments per edge is given by \(N_{1,p-2}=0,1,2,3\), the number of interface moments is given by \(N_{2,p-2}=0,1,3,6\), and the number of internal moments is given by \(N_{3,p-2}=0,1,4,10\) for \(p=1,2,3,4\), respectively

Finally, the global space is constructed from these local spaces as

$$\begin{aligned} V_h:= \left\{ v_h\in H^1_0(\varOmega ) : v_h|_{E} \in V_h^E\quad \forall E\in {\mathcal {T}}_h\right\} , \end{aligned}$$

and the global degrees of freedom are obtained by collecting the local ones, with the nodal and interface degrees of freedom corresponding to internal entities counted only once. Those on the boundary are fixed to be equal to zero in accordance with the ambient space \(H^1_0(\varOmega )\).

3.3 Discrete formulation

We shall now recall the VEM for (2.6) introduced in [21]. For every \(E\in {\mathcal {T}}_h\), let \(a^E\) and \(b^E\) be the elemental continuous forms obtained by restriction of the forms in (2.5a) and (2.5b) onto the element \(E\), respectively. A virtual bilinear form \(A_h: V_h\times V_h\rightarrow \mathbb {R}\), is constructed elementwise as

$$\begin{aligned}A_h(u_h, v_h) = \sum _{E\in {\mathcal {T}}_h} A_h^E(u_h, v_h)=\sum _{E\in {\mathcal {T}}_h}a_h^E(u_h, v_h) + b_h^E(u_h, v_h), \end{aligned}$$

for any \(u_h, v_h\in V_h\). Here, \(A_h^E\) is a bilinear form over the space \(V_h^E\), which is split into the symmetric and skew-symmetric discrete bilinear forms \(a_h^E\) and \(b_h^E\) corresponding to the continuous forms \(a^E\) and \(b^E\), respectively. To define \(A_h^E\) precisely, we begin by introducing the concept of admissible stabilising forms.

Definition 5

(Admissible stabilising forms) Let \(E\in {\mathcal {T}}_h\). Two computable (in the sense of Definition 3), symmetric, and positive definite bilinear forms \(S^{E}_{1},S^{E}_{0}:V_h^E/ \mathcal {P}_{p}(E)\times V_h^E/ \mathcal {P}_{p}(E)\rightarrow \mathbb {R}\) are said to be local admissible bilinear forms for stabilising the diffusion and reaction terms in (2.5a) if they respectively satisfy

$$\begin{aligned} C_{{\text {stab}}}^{-1} \int _E|\sqrt{\varvec{\kappa }} \nabla v_h|^2 {\text {d}}\varvec{x}\le&S^{E}_{1}(v_h, v_h) \le C_{{\text {stab}}}\int _E|\sqrt{\varvec{\kappa }} \nabla v_h|^2 {\text {d}}\varvec{x}, \\ C_{{\text {stab}}}^{-1} \int _E\mu v_h^2 {\text {d}}\varvec{x}\le&S^{E}_{0}(v_h, v_h)\le C_{{\text {stab}}}\int _E\mu v_h^2 {\text {d}}\varvec{x}, \end{aligned}$$

for all \(v_h\in V_h^E/ \mathcal {P}_{p}(E)\) for some constant \(C_{{\text {stab}}}\) independent of \(E\) and h.

A practical choice of admissible stabilising bilinear forms is given in (6.1). We note here a trivial consequence of the above definition which will be useful in the analysis: for all \(v_h\in V_h^E/ \mathcal {P}_{p}(E)\), we have

$$\begin{aligned} {||\nabla v_h||}_{0,E}^2&\le \frac{C_{{\text {stab}}}}{\kappa _*^{E}} S^{E}_{1}( v_h, v_h), \qquad {|| v_h||}_{0,E}^2 \le \frac{C_{{\text {stab}}}}{\mu _0^{E}} S^{E}_{0}( v_h, v_h), \end{aligned}$$

where \(\kappa _*^{E}, \mu _0^{E}\) are the local counterparts of \(\kappa _*, \mu _0\), respectively.

A virtual element stabilising term \(S^{E}\) may then be defined as the linear combination of any pair of admissible diffusion and reaction stabilising forms:

$$\begin{aligned} S^{E}:= s_1 S^{E}_{1} + s_0 S^{E}_{0}, \end{aligned}$$

with \(S^{E}_{1}\), \(S^{E}_{0}\) admissible stabilising bilinear forms and \(s_1\), \(s_0\) positive constants. We prefer to keep the dependence of the stabilising form on the constants \(s_1\) and \(s_0\) explicit, to be able to study their influence on the constants in the a posteriori bounds below.

For every \(E\in {\mathcal {T}}_h\), the local symmetric and skew-symmetric discrete bilinear forms \(a_h^E\) and \(b_h^E\) are defined by

$$\begin{aligned} a_h^E(u_h, v_h)&:= (\varvec{\kappa }\varPi ^0_{p-1} \nabla u_h, \varPi ^0_{p-1} \nabla v_h)_{E} + (\mu \varPi ^0_{p} u_h, \varPi ^0_{p} v_h) \nonumber \\ {}&\qquad + S^{E}(({\text {I}}- \varPi ^0_{p})u_h, ({\text {I}}- \varPi ^0_{p})v_h), \end{aligned}$$
$$\begin{aligned} b_h^E(u_h, v_h)&:= \frac{1}{2} \left[ (\varvec{\beta }\cdot \varPi ^0_{p-1}\nabla u_h, \varPi ^0_{p}v_h) - (\varPi ^0_{p} u_h, \varvec{\beta }\cdot \varPi ^0_{p-1} \nabla v_h)\right] , \end{aligned}$$

respectively, for all \(u_h, v_h\in V_h^E\). Notice that all of the terms in (3.7) and (3.8) are computable since \(\varPi ^0_{p}v_h\) and \(\varPi ^0_{p-1}\nabla v_h\) are computable for any \(v_h\in V_h^E\), and \(S^{E}\) is computable by assumption.

Remark 6

(Polynomial consistency and stability) If \(p,q\in \mathcal {P}_{p}(E)\subset V_h^E\), then \(a_h^E(p, q)=a^E(p, q)\) and \(b_h^E(p, q)=b^E(p, q)\). This property is referred to as polynomial consistency in the VEM literature [7]. Furthermore, Definition 5 ensures the following stability property: there exists a positive constant \(C_{{\text {stab}}}\), independent of h and the mesh element \(E\), such that

$$\begin{aligned} (C_{{\text {stab}}})^{-1} a^E(v_h, v_h) \le \, a_h^E(v_h, v_h) \le C_{{\text {stab}}}a^E(v_h, v_h), \end{aligned}$$

for all \(v_h\in V_h^E\). This, together with the obvious identity \(b_h^E(v_h, v_h) = 0\) for all \(v_h\in V_h^E\) yields the coercivity of \(A_h\). We refer to [21] for the details.

The virtual element method (VEM) then reads: find \(u_h\in V_h\) such that

$$\begin{aligned} A_h(u_h, v_h) =({{f}_h}, v_h)\quad \forall v_h\in V_h. \end{aligned}$$

where \({{f}_h}:= \varPi ^0_{p-1} f\).

It may be shown that the problem (3.10) possesses a unique solution whenever (3.9) is satisfied [21, Theorem 1], along with optimal order a priori error bounds for the VEM solution in the \(H^1\) and \(L^2\) norms [21, Theorems 5 & 6].

4 Approximation properties

The conforming virtual element space introduced above satisfies optimal properties for approximating sufficiently smooth functions. In particular, the theory in [17] for star-shaped domains may be used to prove the following theorem regarding the approximation properties of the \(L^2(E)\)-orthogonal projection to polynomials.

Theorem 7

(Approximation using polynomials) Suppose that Assumption 1 is satisfied. Let \(E\in {\mathcal {T}}_h\) and let \(\varPi ^0_{\ell } : L^2(E) \rightarrow \mathcal {P}_{\ell }(E)\), for \(\ell \ge 0\), denote the \(L^2(E)\)-orthogonal projection onto the polynomial space \(\mathcal {P}_{\ell }(E)\). Then, for any \(w\in H^{m}(E)\), with \(1 \le m\le \ell +1\), it holds

$$\begin{aligned}{||w- \varPi ^0_{\ell }w||}_{0,E} + h_{E} {|w- \varPi ^0_{\ell }w|}_{1,E} \le C_{{\text {proj}}}h_{E}^m{|w|}_{m,E}. \end{aligned}$$

The positive constant \(C_{{\text {proj}}}\) depends only on the polynomial degree \(\ell \) and the mesh regularity.

We shall make use of standard bubble functions on polygons/polyhedra below. A bubble function \(\psi _E\in H^1_0(E)\) for a polygon/polyhedron \(E\) can be constructed piecewise as the sum of the (polynomial) barycentric bubble functions (cf. [3, 41]) on each \(d\)-simplex of the shape-regular sub-triangulation of the mesh element \(E\) discussed in Remark 2.

Lemma 8

(Interior bubble functions) Let \(E\in {\mathcal {T}}_h\) and let \(\psi _{E}\) be the corresponding bubble function. There exists a constant \(C_{{\text {bub}}}\), independent of \(h_{E}\) such that for all \(q \in \mathcal {P}_{p}(E)\)

$$\begin{aligned}C_{{\text {bub}}}^{-1} {||q||}_{0,E}^2 \le \int _E\psi _{E}q^2 {\text {d}}\varvec{x}\le C_{{\text {bub}}}{||q||}_{0,E}^2, \end{aligned}$$


$$\begin{aligned}C_{{\text {bub}}}^{-1} {||q||}_{0,E} \le {||\psi _{E}q||}_{0,E} + h_{E} {||\nabla (\psi _{E}q)||}_{0,E} \le C_{{\text {bub}}}{||q||}_{0,E}. \end{aligned}$$

Lemma 9

(Edge bubble functions) For \(E\in {\mathcal {T}}_h\), let \(s\subset \partial {E}\) be a mesh interface and let \(\psi _{s}\) be the corresponding interface bubble function. There exists a constant \(C_{{\text {bub}}}\), independent of \(h_{E}\) such that for all \(q \in \mathcal {P}_{p}(s)\)

$$\begin{aligned}C_{{\text {bub}}}^{-1} {||q||}_{0,s}^2 \le \int _{s}\psi _{s}q^2 {\text {d}}s\le C_{{\text {bub}}}{||q||}_{0,s}^2, \end{aligned}$$


$$\begin{aligned}h_{E}^{-1/2}{||\psi _{s}q||}_{0,E} + h_{E}^{1/2} {||\nabla (\psi _{s}q)||}_{0,E} \le C_{{\text {bub}}}{||q||}_{0,s}. \end{aligned}$$

Here, with slight abuse of notation, the symbol q is also used to denote the constant prolongation of q in the direction normal to \(s\).

We shall first use the above two results to prove an inverse inequality for virtual element functions, made possible by the fact that functions in \(\mathcal {W}_h^{E}\) and \(V_h^E\) have polynomial Laplacians.

Lemma 10

(Inverse inequality) Suppose that Assumption 1 is satisfied. Let \(E\in {\mathcal {T}}_h\) and let \(w\in H^1(E)\) be such that \(\varDelta w\in \mathcal {P}_{p}(E)\). There exists a constant \(C_{{\text {inv}}}\), independent of \(w\), h and \(E\), such that

$$\begin{aligned}{||\varDelta w||}_{0,E} \le C_{{\text {inv}}}h_{E}^{-1} {|w|}_{1,E}. \end{aligned}$$


We first require an auxiliary polynomial inverse inequality \({||q||}_{0,E} \le C_{{\text {inv}}}h_{E}^{-1} {||q||}_{H^{-1}(E)}\), valid for all \(q \in \mathcal {P}_{p}(E)\). This may be proven by selecting \(v= q \psi _{E}\) in the definition of the dual norm, viz.

$$\begin{aligned} {||q||}_{H^{-1}(E)} := \sup _{0\ne v\in H^1_0(E)} \frac{ \int _Eq v{\text {d}}\varvec{x}}{{||\nabla v||}_{0,E}} \ge \frac{ \int _E\psi _{E}q^2 {\text {d}}\varvec{x}}{{||\nabla (\psi _{E}q)||}_{0,E}}, \end{aligned}$$

and using Lemma 8. Applying this to \(\varDelta w\in \mathcal {P}_{p}(E)\), we find that

$$\begin{aligned} {||\varDelta w||}_{0,E}&\le C_{{\text {inv}}}h_{E}^{-1} {||\varDelta w||}_{H^{-1}(E)}. \end{aligned}$$

Now, using (4.1), along with an integration by parts, we deduce

$$\begin{aligned} {||\varDelta w||}_{H^{-1}(E)}&= \sup _{0\ne v\in H^1_0(E)} \frac{ \int _E\varDelta wv{\text {d}}\varvec{x}}{{||\nabla v||}_{0,E}} = \sup _{0\ne v\in H^1_0(E) } \frac{ -\int _E\nabla w\cdot \nabla v{\text {d}}\varvec{x}}{{||\nabla v||}_{0,E}}. \end{aligned}$$

The result then follows by applying the Cauchy-Schwarz inequality. \(\square \)

The above inverse estimate will be used to prove an approximation theorem (Theorem 11 below) for the virtual element spaces considered in this work. The proof of Theorem 11 is inspired by [35, Prop. 4.2], where a related result is obtained in the much simpler setting of the original virtual element space of [7] for \(d=2\) only. As the construction in [35, Prop. 4.2] does not appear to generalize to \(d=3\), we use a different construction for the Clément-type interpolant below.

We begin by recalling some classical polynomial interpolation results on simplicial triangulations. Assumption 1 implies the existence of a globally shape-regular sub-triangulation \(\widehat{{\mathcal {T}}_h}\) of \({\mathcal {T}}_h\), cf. Remark 2. We use this to define \(v_{c}\) as the classical Clément interpolant [24] of \(v\) of degree \(p\) over the sub-triangulation \(\widehat{{\mathcal {T}}_h}\). Then, the following approximation estimates hold [24] for any \(v\in H^1(\varOmega )\):

$$\begin{aligned} {||v- v_{c}||}_{0,T}+ h{|v- v_{c}|}_{1,T} \le \hat{C}_{{\text {Clem}}}h {|v|}_{1,\widetilde{T}}, \end{aligned}$$

for all \(T\in \widehat{{\mathcal {T}}_h}\), with \(\hat{C}_{{\text {Clem}}}\) a positive constant depending only on the polynomial degree \(p\) and on the mesh regularity. Here, \(\widetilde{T}\) denotes the usual finite element patch relative to T.

Theorem 11

(Approximation using virtual element functions) Suppose that Assumption 1 is satisfied and let \(V_h\) denote the virtual element space (3.5). For \(v\in H^1(\varOmega )\), there exists a \({v_{{\text {I}}}}\in V_h\), such that, for all elements \(E\in {\mathcal {T}}_h\), we have

$$\begin{aligned}{||v- {v_{{\text {I}}}}||}_{0,E} +h_{E}{|v- {v_{{\text {I}}}}|}_{1,E} \le C_{{\text {Clem}}}h_{E} {|v|}_{1,\widetilde{E}}, \end{aligned}$$

\(C_{{\text {Clem}}}\) being a positive constant, depending only on the polynomial degree \(p\) and the mesh regularity.


We denote by \(v_{c}\) the Clément interpolant defined over a sub-triangulation \(\widehat{{\mathcal {T}}_h}\) and satisfying (4.2). It is assumed that all edges of the polygonal/polyhedral mesh \({\mathcal {T}}_h\) are also edges of the sub-triangulation \(\widehat{{\mathcal {T}}_h}\), cf. Remark 2.

Case \(d=2\). We start by interpolating \(v_{c}\) into the enlarged virtual element space \(\mathcal {W}_h\). More specifically, we define \({w_{{\text {I}}}}\) elementwise as the solution of the problem

$$\begin{aligned} {\left\{ \begin{array}{ll} -\varDelta {w_{{\text {I}}}}= -\varDelta \varPi ^0_{p}v_{c}&{}\text { in } E, \\ {w_{{\text {I}}}}= v_{c}&{}\text { on } \partial E. \end{array}\right. } \end{aligned}$$

Then, since \(\varDelta \varPi ^0_{p}v_{c}\in \mathcal {P}_{p-2}(E) \subset \mathcal {P}_{p}(E)\) and \(v_{c}\) is a polynomial of degree \(p\) on each edge of \(E\), we may conclude that \({w_{{\text {I}}}}|_{E} \in \mathcal {W}_h^{E}\). Moreover, since \(v_{c}\) is continuous on \(\varOmega \), it follows that \({w_{{\text {I}}}}\in \mathcal {W}_h\).

Arguing as in [35, Proposition 4.2], we may show that

$$\begin{aligned} {|{w_{{\text {I}}}}- \varPi ^0_{p}v_{c}|}_{1,E} \le {|v_{c}- \varPi ^0_{p}v_{c}|}_{1,E}, \end{aligned}$$

and, therefore,

$$\begin{aligned} {|v_{c}- {w_{{\text {I}}}}|}_{1,E} \le 2{|v_{c}- \varPi ^0_{p}v_{c}|}_{1,E}. \end{aligned}$$

Now, \({w_{{\text {I}}}}\) allows us to construct an interpolant \({v_{{\text {I}}}}\in V_h\) using the definition of \(V_h^E\) [given in (3.3)] on each \(E\in {\mathcal {T}}_h\). By definition, the two interpolants \({v_{{\text {I}}}}\) and \({w_{{\text {I}}}}\) are equal on the mesh skeleton \({\mathcal {S}}_h\) and for all \(E\in {\mathcal {T}}_h\), \({\mathcal {M}}^{E}_{\varvec{\alpha }}({v_{{\text {I}}}})={\mathcal {M}}^{E}_{\varvec{\alpha }}({w_{{\text {I}}}})\) if \({|\varvec{\alpha }|}\le p-2\), while \({\mathcal {M}}^{E}_{\varvec{\alpha }}({v_{{\text {I}}}})={\mathcal {M}}^{E}_{\varvec{\alpha }}(\varPi ^{*}_{p}{w_{{\text {I}}}})\) if \(p-1\le {|\varvec{\alpha }|}\le p\). Consider, now, \({|{w_{{\text {I}}}}-{v_{{\text {I}}}}|}_{1,E}\) on each \(E\in {\mathcal {T}}_h\). Integration by parts yields

$$\begin{aligned} {|{w_{{\text {I}}}}- {v_{{\text {I}}}}|}_{1,E}^2 = -(\varDelta ({w_{{\text {I}}}}- {v_{{\text {I}}}}), {w_{{\text {I}}}}- {v_{{\text {I}}}})_{E}, \end{aligned}$$

as \({w_{{\text {I}}}}\) and \({v_{{\text {I}}}}\) coincide on \(\partial {E}\). Since \({w_{{\text {I}}}}- {v_{{\text {I}}}}\in \mathcal {W}_h^{E}\), we have \(\varDelta ({w_{{\text {I}}}}- {v_{{\text {I}}}}) \in \mathcal {P}_{p}(E)\). Let \(q_{p, p-1} \in \mathcal {P}_{p}(E) / \mathcal {P}_{p-2}(E)\) be defined by \(q_{p, p-1}=\varDelta ({w_{{\text {I}}}}- {v_{{\text {I}}}}) - \varPi ^0_{p-2} \varDelta ({w_{{\text {I}}}}- {v_{{\text {I}}}})\). Identity (4.6) can then be rewritten as

$$\begin{aligned} {|{w_{{\text {I}}}}- {v_{{\text {I}}}}|}_{1,E}^2 = -(q_{p, p-1}, {w_{{\text {I}}}}- {v_{{\text {I}}}})_{E} = -(q_{p, p-1}, {w_{{\text {I}}}}- \varPi ^{*}_{p} {w_{{\text {I}}}})_{E}, \end{aligned}$$

since \({v_{{\text {I}}}}\) and \({w_{{\text {I}}}}\) have the same moments of up to degree \(p-2\), while \({v_{{\text {I}}}}\) and \(\varPi ^{*}_{p}{w_{{\text {I}}}}\) share the same moments of degree \(p\) and \(p-1\). The Cauchy-Schwarz inequality then implies that

$$\begin{aligned} {|{w_{{\text {I}}}}- {v_{{\text {I}}}}|}_{1,E}^2 \le {||q_{p, p-1}||}_{0,E} {||{w_{{\text {I}}}}- \varPi ^{*}_{p} {w_{{\text {I}}}}||}_{0,E}. \end{aligned}$$

Further, from the stability of the \(L^2\) projection we get

$$\begin{aligned} {||q_{p,p-1}||}_{0,E} = {||({\text {I}}- \varPi ^0_{p-2}) \varDelta ({w_{{\text {I}}}}- {v_{{\text {I}}}})||}_{0,E} \le {||\varDelta ({w_{{\text {I}}}}- {v_{{\text {I}}}})||}_{0,E}, \end{aligned}$$

where \({\text {I}}\) denotes the identity operator on the space \(\mathcal {P}_{p}(E)\). Thus,

$$\begin{aligned} {|{w_{{\text {I}}}}- {v_{{\text {I}}}}|}_{1,E}^2&\le {||\varDelta ({w_{{\text {I}}}}- {v_{{\text {I}}}})||}_{0,E} {||{w_{{\text {I}}}}- \varPi ^{*}_{p} {w_{{\text {I}}}}||}_{0,E}\\&\le C_{{\text {inv}}}h_{E}^{-1} {|{w_{{\text {I}}}}- {v_{{\text {I}}}}|}_{1,E} {||{w_{{\text {I}}}}- \varPi ^{*}_{p} {w_{{\text {I}}}}||}_{0,E}, \end{aligned}$$

by Lemma 10. Further, adding and subtracting \(\varPi ^0_{p}{w_{{\text {I}}}}\) and using the stability of \(\varPi ^{*}_{p}\) and then using the Poincaré inequality (either on each 2-simplex of the shape-regular sub-triangulation \(E\) or directly on \(E\), cf. [40]), we obtain

$$\begin{aligned} {|{w_{{\text {I}}}}- {v_{{\text {I}}}}|}_{1,E}&\le C_{{\text {inv}}}(1+C_0^*) h_{E}^{-1} {||{w_{{\text {I}}}}- \varPi ^0_{p} {w_{{\text {I}}}}||}_{0,E}\\&\le C_{{\text {P}}}C_{{\text {inv}}}(1+C_0^*) {|{w_{{\text {I}}}}- \varPi ^0_{p} {w_{{\text {I}}}}|}_{1,E}, \end{aligned}$$

for some uniform constants \(C_0^*\) and \(C_{{\text {P}}}>0\) which depend on the shape regularity constant. Then, the triangle inequality, the stability of \(\varPi ^0_{p}\) with constant, say, \(C_0\), and (4.5) imply that

$$\begin{aligned} {|{w_{{\text {I}}}}- {v_{{\text {I}}}}|}_{1,E}&\le C_{{\text {P}}}C_{{\text {inv}}}(1+C_0^*) ({|{w_{{\text {I}}}}- \varPi ^0_{p}v_{c}|}_{1,E} + {|\varPi ^0_{p}v_{c}- \varPi ^0_{p} {w_{{\text {I}}}}|}_{1,E}) \\&= C_{{\text {P}}}C_{{\text {inv}}}(1+C_0^*) ({|{w_{{\text {I}}}}- \varPi ^0_{p}v_{c}|}_{1,E} + C_0{|v_{c}- {w_{{\text {I}}}}|}_{1,E}) \\&\le C_{{\text {P}}}C_{{\text {inv}}}(1+C_0^*)(1+2C_0) {|v_{c}- \varPi ^0_{p}v_{c}|}_{1,E}. \end{aligned}$$

Finally, the triangle inequality, the above bound, and (4.5), imply

$$\begin{aligned} {|v_{c}- {v_{{\text {I}}}}|}_{1,E} \le {|v_{c}- {w_{{\text {I}}}}|}_{1,E}+{|{w_{{\text {I}}}}- {v_{{\text {I}}}}|}_{1,E} \le C_1{|v_{c}- \varPi ^0_{p}v_{c}|}_{1,E}, \end{aligned}$$

with \(C_1:= (2+C_{{\text {P}}}C_{{\text {inv}}}(1+C_0^*)(1+2C_0))\). Since \({v_{{\text {I}}}}\) and \(v_{c}\) are equal on \(\partial {E}\), we may apply the Poincaré inequality to this to obtain a bound on \({||v_{c}- {v_{{\text {I}}}}||}_{0,E}\), with an extra power of \(h_E\).

The required bounds of \({|v- {v_{{\text {I}}}}|}_{r,E}\), \(r=0,1\), now follow by the triangle inequality, adding and subtracting \(v\) and \(\varPi ^0_{p}v\) to the right-hand side of (4.7), using once again the triangle inequality, and applying the bounds (4.2) and Theorem 7.

Case \(d=3\). The proof in this case is based on using on each face \(s\in \partial {E}\) the construction just considered for \(d=2\) and then extending this inside \(E\).

Let \(\mathcal {W}_h^{s}\) and \(V_h^s\) be the interface spaces respectively defined by (3.1) and (3.3) applied to the interface. For each \(s\in \partial {E}\), we consider \({w_{{\text {I}}}^s}\in \mathcal {W}_h^{s}\) as the solution of the 2-dimensional boundary value problem (4.3) set on \(s\), with \(v_{c}\) representing the three-dimensional Clément interpolant of \(v\) with respect to the 3-dimensional sub-triangulaiton \(\widehat{{\mathcal {T}}_h}\).

Further, from \({w_{{\text {I}}}^s}\) we may use the 2-dimensional construction to obtain \({v_{{\text {I}}}^s}\in V_h^s\). This face interpolant satisfies (4.7) with \(s\) in place of \(E\), namely:

$$\begin{aligned} {|v_{c}- {v_{{\text {I}}}^s}|}_{1,s}\le C_1{|v_{c}- \varPi ^{0,s}_{p}v_{c}|}_{1,s}, \end{aligned}$$

with \(\varPi ^{0,s}_{p}v_{c}\) denoting the \(L^2\)-projection of the restriction of \(v_{c}\) to \(s\). Collecting the face-wise definitions we obtain a continuous interpolant \({v_{{\text {I}}}^{\partial {E}}}\) on \(\partial {E}\). With this, we first construct \({w_{{\text {I}}}}\) on \(E\) as the solution of the problem

$$\begin{aligned} {\left\{ \begin{array}{ll} -\varDelta {w_{{\text {I}}}}= -\varDelta \varPi ^0_{p}v_{c}\text { in } E, \\ {w_{{\text {I}}}}= {v_{{\text {I}}}^{\partial {E}}}\text { on } \partial E, \end{array}\right. } \end{aligned}$$

so that \({w_{{\text {I}}}}\in \mathcal {W}_h^{E}\) by definition, as in the case \(d= 2\) (cf. (3.1)).

In view of bounding \({|{w_{{\text {I}}}}- \varPi ^0_{p}v_{c}|}_{1,E}\), it is convenient to first split the trace \(({w_{{\text {I}}}}- \varPi ^0_{p}v_{c})|_{\partial {E}}=({v_{{\text {I}}}^{\partial {E}}}-v_{c}|_{\partial {E}})+(v_{c}-\varPi ^0_{p}v_{c})|_{\partial {E}}\). Recall that, for all \(s\in \partial {E}\), we have \(({v_{{\text {I}}}^s}-v_{c})|_{\partial s}=0\). Moreover, by Assumption 1, over s we may construct a shape-regular pyramid \(P_s\subset E\) with \(|P_s|\ge \rho |E|\). By the Trace Theorem applied to \(s\in \partial P_s\), there exists \({\varphi }_s\in H^1(P_s)\) with \({\varphi }_s |_{\partial P_s{\setminus } s} = 0\) and a constant \(C_{\mathrm{T}}>0\) such that

$$\begin{aligned} {|{\varphi }_s|}_{1,P_s}\le C_{\mathrm{T}}{||{v_{{\text {I}}}^s}-v_{c}||}_{1/2,s}. \end{aligned}$$

The constant \({C}_\mathrm{T}\) can be bounded uniformly over all \(s\) by a generalised scaling argument, cf. [21] and the references therein. Hence, defining \({\varphi }=\sum _{s\in \partial {E}}{\varphi }_s+v_{c}-\varPi ^0_{p}v_{c}\), where each \({\varphi }_s\) should be interpreted as its extension to zero on \(E\), we have by construction that \({\varphi }|_{\partial {E}}=({w_{{\text {I}}}}- \varPi ^0_{p}v_{c})|_{\partial {E}}\). Thus, as in the case \(d=2\), we have

$$\begin{aligned} {|{w_{{\text {I}}}}- \varPi ^0_{p}v_{c}|}_{1,E}&\le {|{\varphi }|}_{1,E}\nonumber \\&\le \sum _{s\in \partial {E}}{|{\varphi }^s|}_{1,P_s}+{|v_{c}-\varPi ^0_{p}v_{c}|}_{1,E}\nonumber \\&\le C_{\mathrm{T}}\sum _{s\in \partial {E}}{||{v_{{\text {I}}}^s}-v_{c}||}_{1/2,s}+{|v_{c}-\varPi ^0_{p}v_{c}|}_{1,E}. \end{aligned}$$

It just remains to bound the first term on the right-hand side. To this end, we use the Sobolev Interpolation Theorem and Poincaré inequality (facewise, cf. the case \(d=2\) above):

$$\begin{aligned} {||{v_{{\text {I}}}^s}- v_{c}||}_{1/2,s}^2&\le {||{v_{{\text {I}}}^s}- v_{c}||}_{0,s}^2+C_\mathrm{S}{||{v_{{\text {I}}}^s}- v_{c}||}_{0,s}{|{v_{{\text {I}}}^s}- v_{c}|}_{1,s}\nonumber \\&\le (1+C_\mathrm{S}h_E^{-1}){||{v_{{\text {I}}}^s}- v_{c}||}_{0,s}^2+C_\mathrm{S}h_E{|{v_{{\text {I}}}^s}- v_{c}|}_{1,s}\nonumber \\&\le (C_{{\text {P}}}^2 (h_E+C_\mathrm{S})+ C_\mathrm{S}) h_E{|{v_{{\text {I}}}^s}- v_{c}|}_{1,s}^2\nonumber \\&\le (C_{{\text {P}}}^2 (h_E+C_\mathrm{S})+ C_\mathrm{S}) C_1 h_E{|v_{c}- \varPi ^{0,s}_{p}v_{c}|}_{1,s}^2, \end{aligned}$$

for some constant \({C}_\mathrm{S}>0\) which, again, can be bounded uniformly over all s by a generalised scaling argument. To obtain the last bound above we used (4.8) applied to \(s\in \partial {E}\). The interface terms above are further bounded by applying Theorem 7, yielding

$$\begin{aligned} {|v_{c}- \varPi ^{0,s}_{p}v_{c}|}_{1,s} \le C_{{\text {proj}}}{|v_{c}|}_{1,s}\le C_{{\text {proj}}}h_{E}^{-1/2}{|v_{c}|}_{1,E}. \end{aligned}$$

Using this bound in (4.10) and the latter in (4.9), we finally obtain

$$\begin{aligned} {|{w_{{\text {I}}}}- \varPi ^0_{p}v_{c}|}_{1,E}&\le C_2 {|v_{c}|}_{1,E}, \end{aligned}$$

with \(C_2>0\) depending on the (uniformly bounded) number \(\nu _{E}\) of interfaces of \(E\) and on the constants \(C_{\mathrm{T}}\), \(C_{{\text {P}}}, C_{\mathrm{S}}, C_{{\text {inv}}}\), and \(C_{{\text {proj}}}\).

Now, given \({w_{{\text {I}}}}\), we can construct an interpolant \({v_{{\text {I}}}}\in V_h\) exactly as in the 2-dimensional case and following the same (dimension-independent) argument derive the bound (4.7). This latter bound, combined with (4.11), yields

$$\begin{aligned} {|v_{c}- {v_{{\text {I}}}}|}_{1,E} \le 2{|v_{c}-\varPi ^0_{p}v_{c}|}_{1,E}+{|{w_{{\text {I}}}}- \varPi ^0_{p}v_{c}|}_{1,E} \le C_3 {|v_{c}|}_{1,E}, \end{aligned}$$

for some \(C_3>0\) depending on \(C_1\) and \(C_2\).

From (4.12) we can derive the required bound in the \(L^2\)-norm by resorting to the scaled Poincaré-Friedrichs inequality [16] and recalling (4.12):

$$\begin{aligned} {||v_{c}- {v_{{\text {I}}}}||}_{0,E}&\le C_{{\text {P}}}\left( h_E{|v_{c}- {v_{{\text {I}}}}|}_{1,E}+h_E^{-1/2}{|\int _{\partial {E}} (v_{c}- {v_{{\text {I}}}})ds|} \right) \nonumber \\&\le C_{{\text {P}}}(C_3 h_E{|v_{c}|}_{1,E}+h_E^{1/2}\sum _{s\in \partial {E}}{||v_{c}- {v_{{\text {I}}}^s}||}_{0,s}) \end{aligned}$$

The interface terms on the right-hand side can be further bounded using the Poincaré inequality once more and (4.8):

$$\begin{aligned} {||v_{c}- {v_{{\text {I}}}^s}||}_{0,s}&\le C_{{\text {P}}}h_E{|v_{c}- {v_{{\text {I}}}^s}|}_{1,s} \le C_{{\text {P}}}C_1 h_E{|v_{c}- \varPi ^{0,s}_{p}v_{c}|}_{1,s}\nonumber \\&\le C_{{\text {P}}}C_1 C_{{\text {proj}}}h_E{|v_{c}|}_{1,s}\le C_{{\text {P}}}C_1 C_{{\text {proj}}}h_E^{1/2}{|v_{c}|}_{1,E}. \end{aligned}$$

Finally, combining this bound with (4.13) yields

$$\begin{aligned} {||v_{c}- {v_{{\text {I}}}}||}_{0,E}&\le C_4 h_E{|v_{c}|}_{1,E}, \end{aligned}$$

with \(C_4>0\) depending on \(C_{{\text {P}}}, C_1,\) \(C_3\), \(C_{{\text {proj}}}\), and \(\nu _{E}\).

The statement of the theorem now follows, as in the case \(d=2\). \(\square \)

Remark 12

For \(d=3\), the proof of the above VEM approximation result makes use of both the Trace Theorem and Sobolev Interpolation Theorem applied to each mesh interface. This was necessitated by the hierarchical construction of the local virtual element spaces with respect to spatial dimension. The associated constants are uniformly bounded but depend on the polygonal shape of the mesh interfaces, and as such are not easily accessible in general. However, if the mesh interfaces are triangular or the method is constructed on the sub-triangulation of each mesh interface, the proof does only depend on easily computable quantities.

5 A posteriori error analysis

We shall now derive a residual-type a posteriori error bound for the error in the standard energy norm:


for \(v\in H^1(\omega )\), for any \(\omega \subseteq \varOmega \). The coercivity and continuity of the bilinear form \(A\) in this norm follow from the assumptions on the coefficients \(\varvec{\kappa }\) and \(\mu \), which ensure that for \(v\in H^1_0(\varOmega )\),

where \(C_{{\text {equiv}}}:= \sqrt{(1 + C_{{\text {PF}}})/{\kappa _*}}\), with \(C_{{\text {PF}}}\) the Poincaré-Friedrichs constant, and \(\widehat{C}_{{\text {equiv}}}:=\sqrt{ \max \{\kappa ^*, {||\mu ||}_{\infty }\}}\), cf. [21]. The coercivity and continuity of \(A_h\) in this norm are then inherited from \(A\) through the virtual element stability property (3.9).

To account for the effects of data oscillation, we introduce the following piecewise-polynomial approximations of the PDE coefficients:

$$\begin{aligned} {{\varvec{\kappa }}_h}\approx \varvec{\kappa }, \qquad {{\varvec{\beta }}_h}\approx \varvec{\beta }, \qquad {{\gamma }_h}\approx {\gamma }. \end{aligned}$$

For quantities which may be discontinuous across the mesh skeleton, we define the jump operator across a mesh interface \(s\in {\mathcal {S}}_h\) as follows. If \(s\in {\mathcal {S}}_h^{\text {int}}\), then there exist \(E^+\) and \(E^-\) such that \(s\subset \partial E^+\cap \partial E^-\). Denote by \(\mathbf {v}^{\pm }\) the trace of the vector-valued function \(\mathbf {v}|_{E^{\pm }}\) on \(s\) from within \(E^{\pm }\) and by \(\varvec{n}_s^{\pm }\) the unit outward normal on \(s\) from \(E^{\pm }\). Then, . If, on the other hand, \(s\in {\mathcal {S}}_h^{\text {bdry}}\), then , with \(\mathbf {v}\) representing the trace of \(\mathbf {v}\) from within the element \(E\) having \(s\) as an interface and \(\varvec{n}_{s}\) is the unit outward normal on \(s\) from \(E\).

5.1 The residual equation

Define \(e:= u- u_h\in H^1_0(\varOmega )\), and let \(v\in H^1_0(\varOmega )\). Then, we have, respectively,

$$\begin{aligned} A(e, v)&=(f, v) - A(u_h, \chi )-A(u_h, v-\chi )\nonumber \\&=(f, v)- ( {{f}_h}, \chi ) + A_h(u_h, \chi )- A(u_h, \chi )-A(u_h, v-\chi ) \nonumber \\&=(f-{{f}_h}, \chi )+ (f, v-\chi ) +A_h(u_h, \chi )-A(u_h, \chi )- A(u_h, v-\chi ) \end{aligned}$$

for any \(\chi \in V_h\), since \(u\) satisfies the weak form of the PDE problem and \(u_h\) is the virtual element solution. Notice that, in contrast to a posteriori bounds for standard finite element approximations, additional terms appear in the virtual element residual equation. These terms represent the virtual inconsistency of the VEM.

5.2 A posteriori error bound

We shall estimate each term on the right-hand side of (5.3) separately, to arrive to a computable error bound. To this end, an integration by parts and straightforward manipulation yields the identity

for any \(w \in H^1_0(\varOmega )\). Using this and the data approximations introduced in (5.2), (5.3) may be rewritten as

$$\begin{aligned} A(e, v) =&\sum _{E\in {\mathcal {T}}_h} \left( \left( R_{E}, v-\chi \right) + \left( \theta _{E}, v-\chi \right) + B^{E}(u_h,v-\chi ) \right) \nonumber \\&- \sum _{s\in {\mathcal {S}}_h} \left( \left( J_{s}, v-\chi \right) _{0,s} + \left( \theta _{s}, v-\chi \right) _{0,s} \right) \nonumber \\&+ (f-{{f}_h}, \chi ) +A_h(u_h, \chi )-A(u_h, \chi ) \end{aligned}$$

for any \(v\in H^1_0(\varOmega )\), \(\chi \in V_h\), where

$$\begin{aligned} R_{E}:=({{f}_h}+ \nabla \cdot {{\varvec{\kappa }}_h}\varPi ^0_{p-1} \nabla u_h-{{\varvec{\beta }}_h}\cdot \varPi ^0_{p-1} \nabla u_h- {{\gamma }_h}\varPi ^0_{p} u_h)|_{E}, \end{aligned}$$
$$\begin{aligned} \theta _{E}&:= (f- {{f}_h}+ \nabla \cdot (\varvec{\kappa }- {{\varvec{\kappa }}_h}) \varPi ^0_{p-1} \nabla u_h\nonumber \\&\quad - (\varvec{\beta }- {{\varvec{\beta }}_h}) \cdot \varPi ^0_{p-1} \nabla u_h- ({\gamma }- {{\gamma }_h}) \varPi ^0_{p} u_h)|_{E}, \end{aligned}$$

are the element and edge residuals, and the element and edge data oscillation terms, respectively, and

$$\begin{aligned} B^{E}(w_h,v)&:= (\varvec{\kappa }({\text {I}}- \varPi ^0_{p-1}) \nabla w_h, \nabla v)_{E} + (\varvec{\beta }\cdot ({\text {I}}- \varPi ^0_{p-1}) \nabla w_h, v)_{E} \\&\quad \ + ({\gamma }({\text {I}}- \varPi ^0_{p}) w_h, v)_{E}, \end{aligned}$$

is the ‘virtual’ residual.

Theorem 13

(Upper bound) Let \(u_h\in V_h\) be the virtual element solution to problem (3.10). Then, there exists a constant C, independent of h, \(u\) and \(u_h\), such that


$$\begin{aligned} \eta ^{E}&:= h_{E}^2 {||R_{E}||}^2_{0,E}+ \sum _{s\subset \partial {E}} h_{s}{||J_{s}||}_{0,s}^2,\\ \varTheta ^{E}&:=h_{E}^2 {||\theta _{E}||}^2_{0,E} + h_{E}^2 {||f- {{f}_h}||}_{0,E}^2 + \sum _{s\subset \partial {E}} h_{s}{||\theta _{s}||}_{0,s}^2, \\ \mathfrak {S}^{E}&= S^{E}((\varPi ^0_{p} - {\text {I}}) u_h,(\varPi ^0_{p} - {\text {I}})u_h), \end{aligned}$$

and \(\varPsi ^{E}\) encompasses the virtual inconsistency terms, defined as the sum of

$$\begin{aligned} \varPsi _1^{E}&= {||(\varPi ^0_{p-1} - {\text {I}}) (\varvec{\kappa }\varPi ^0_{p-1} \nabla u_h)||}_{0,E}^2, \\ \varPsi _2^{E}&= h_{E}^2 {||(\varPi ^0_{p} - {\text {I}}) (\varvec{\beta }\cdot \varPi ^0_{p-1} \nabla u_h)||}_{0,E}^2, \\ \varPsi _3^{E}&= {||(\varPi ^0_{p-1} - {\text {I}}) (\varvec{\beta }\varPi ^0_{p} u_h)||}_{0,E}^2, \\ \varPsi _4^{E}&= h_{E}^2 {||(\varPi ^0_{p} - {\text {I}}) (\mu \varPi ^0_{p} u_h)||}_{0,E}^2. \end{aligned}$$


Let \(e_{{\text {I}}}\in V_h\) be the interpolant of \(e\) satisfying the bounds of Theorem 11. Then, upon setting \(v=e\) and \(\chi =e_{{\text {I}}}\) in (5.4), coercivity yields

For I and II, we use the Cauchy-Schwarz inequality and the bounds of Theorem 11 to find that

$$\begin{aligned} I= \left( R_{E}, e- e_{{\text {I}}}\right) _{E}&\le C_{{\text {Clem}}}h_{E}{||R_{E}||}_{0,E} {||e||}_{1,E}, \\ II= \left( \theta _{E}, e- e_{{\text {I}}}\right) _{E}&\le C_{{\text {Clem}}}h_{E}{||\theta _{E}||}_{0,E} {||e||}_{1,E}. \end{aligned}$$

For III, we use the properties of the \(L^2\)-projection to find that

$$\begin{aligned} III&= (f- {{f}_h}, e_{{\text {I}}}- \varPi ^0_{p-1} e_{{\text {I}}})_{E} \le C_{{\text {proj}}}h_{E} {||e||}_{1,E} {||f- {{f}_h}||}_{0,E}. \end{aligned}$$

Bounding the edge terms VI and VII requires the use of the scaled trace inequality \({||v||}_{0,s}^2 \le C_{{\text {tr}}}( h_{E}^{-1} {||v||}_{0,E}^2 + h_{E} {||\nabla v||}_{0,E}^2)\), for \(v\in H^1(E)\), along with the bounds of Theorem 11 and the mesh regularity assumption, yielding

$$\begin{aligned} VI&\le {||J_{s}||}_{0,s} {||e- e_{{\text {I}}}||}_{0,s} \le C_{{\text {tr}}}C_{{\text {Clem}}}\rho ^{-\frac{1}{2}} {||e||}_{1,\omega _{s}} h_{s}^{\frac{1}{2}} {||J_{s}||}_{0,s},\\ VII&\le {||\theta _{s}||}_{0,s} {||e- e_{{\text {I}}}||}_{0,s} \le C_{{\text {tr}}}C_{{\text {Clem}}}\rho ^{-\frac{1}{2}} {||e||}_{1,\omega _{s}} h_{s}^{\frac{1}{2}} {||\theta _{s}||}_{0,s}, \end{aligned}$$

where \(\omega _{s}= E^+ \cup E^-\) with \(E^+\) and \(E^-\) the elements meeting at the edge \(s\).

Noting that

$$\begin{aligned} {||({\text {I}}- \varPi ^0_{p-1}) \nabla u_h||}_{0,E}={||({\text {I}}- \varPi ^0_{p-1}) \nabla ({\text {I}}- \varPi ^0_{p})u_h||}_{0,E}\le {|| \nabla ({\text {I}}- \varPi ^0_{p})u_h||}_{0,E}, \end{aligned}$$

we can bound IV as

$$\begin{aligned} IV&\le C_{{\text {Clem}}}\left( {||\varvec{\kappa }({\text {I}}- \varPi ^0_{p-1}) \nabla u_h||}_{0,E} + h_{E}{||\varvec{\beta }\cdot ({\text {I}}- \varPi ^0_{p-1}) \nabla u_h||}_{0,E} \right. \nonumber \\ {}&\quad \left. + h_{E}{||{\gamma }({\text {I}}- \varPi ^0_{p}) u_h||}_{0,E} \right) {||e||}_{1,E} \nonumber \\&\le C_{{\text {Clem}}}\left( \left( \kappa ^*_{E}+ h_{E}{||\varvec{\beta }||}_{\infty ,E} \right) {||({\text {I}}- \varPi ^0_{p-1}) \nabla u_h||}_{0,E} \right. \nonumber \\ {}&\quad \left. +\, h_{E}{||{\gamma }||}_{\infty ,E} {||({\text {I}}- \varPi ^0_{p}) u_h||}_{0,E} \right) {||e||}_{1,E} \nonumber \\&\le C_{{\text {Clem}}}\left( \left( \kappa ^*_{E}+ h_{E}{||\varvec{\beta }||}_{\infty ,E} \right) {||\nabla ({\text {I}}- \varPi ^0_{p}) u_h||}_{0,E} \right. \nonumber \\ {}&\quad \left. +\, h_{E}{||{\gamma }||}_{\infty ,E} {||({\text {I}}- \varPi ^0_{p}) u_h||}_{0,E} \right) {||e||}_{1,E}. \end{aligned}$$

Using now (3.6), and introducing the mesh Peclét number by \({\text {Pe}}_{E}:= h_{E}{||\varvec{\beta }||}_{\infty ,E}/\kappa _*^{E}\), we arrive to

$$\begin{aligned} IV&\le \sqrt{2}C_{{\text {Clem}}}C_{{\text {stab}}}{||e||}_{1,E} \bigg ( \left( \frac{\kappa ^*_{E}}{\kappa _*^{E}} + {\text {Pe}}_{E}\right) ^2 \kappa _*^{E}S^{E}_{1}(({\text {I}}- \varPi ^0_{p})u_h,({\text {I}}- \varPi ^0_{p})u_h) \\&\quad + \left( \frac{{||{\gamma }||}_{\infty ,E}}{\mu _0^{E}} \right) ^2 h_{E}^2 \mu _0^{E}S^{E}_{0}(({\text {I}}- \varPi ^0_{p})u_h,({\text {I}}- \varPi ^0_{p})u_h) \bigg )^{1/2}. \end{aligned}$$

Focussing on V, we begin by observing the identity (due to the properties of the \(L^2\)-projection)

$$\begin{aligned}&a^E(u_h, e_{{\text {I}}}) - a_h^E(u_h, e_{{\text {I}}})\\&\quad = (\varvec{\kappa }({\text {I}}- \varPi ^0_{p-1}) \nabla u_h, \nabla e_{{\text {I}}})_{E} + \left( ({\text {I}}- \varPi ^0_{p})u_h, \mu e_{{\text {I}}}\right) _{E} \\&\qquad + (({\text {I}}- \varPi ^0_{p-1}) \varvec{\kappa }\varPi ^0_{p-1} \nabla u_h, \nabla e_{{\text {I}}})_{E} + \left( ({\text {I}}- \varPi ^0_{p}) \mu \varPi ^0_{p} u_h, ({\text {I}}- \varPi ^0_{p}) e_{{\text {I}}}\right) _{E} \\&\qquad - S^{E}(({\text {I}}- \varPi ^0_{p})u_h, ({\text {I}}- \varPi ^0_{p})e_{{\text {I}}})\\&\quad \le {||e_{{\text {I}}}||}_{1,E} \left( \kappa ^*_{E}{|| \nabla ({\text {I}}- \varPi ^0_{p})u_h||}_{0,E} + {||\mu ||}_{\infty ,E}{||({\text {I}}- \varPi ^0_{p})u_h||}_{0,E} \right. \\&\qquad \left. + {||({\text {I}}- \varPi ^0_{p-1}) \varvec{\kappa }\varPi ^0_{p-1} \nabla u_h||}_{0,E} + C_{{\text {proj}}}h_{E} {||({\text {I}}- \varPi ^0_{p}) \mu \varPi ^0_{p} u_h||}_{0,E} \right) \\&\qquad - S^{E}(({\text {I}}- \varPi ^0_{p})u_h, ({\text {I}}- \varPi ^0_{p})e_{{\text {I}}}), \end{aligned}$$

with the last bound resulting from the Cauchy-Schwarz inequality and Theorem 7. From Definition 5 and Theorem 7, we may bound the final term by

$$\begin{aligned}&S^{E}(({\text {I}}- \varPi ^0_{p})u_h, ({\text {I}}- \varPi ^0_{p})e_{{\text {I}}}) \\&\quad \le ( s_1^2 S^{E}_1(({\text {I}}- \varPi ^0_{p})u_h, ({\text {I}}- \varPi ^0_{p})u_h))^{1/2} \\&\qquad \times \Big ( C_{{\text {stab}}}\int _E\varvec{\kappa }\nabla (({\text {I}}- \varPi ^0_{p}) e_{{\text {I}}}) \cdot \nabla (({\text {I}}- \varPi ^0_{p}) e_{{\text {I}}}) {\text {d}}\varvec{x}\Big )^{1/2} \\&\qquad + ( s_0^2 S^{E}_0(({\text {I}}- \varPi ^0_{p})u_h, ({\text {I}}- \varPi ^0_{p})u_h))^{1/2} \Big (C_{{\text {stab}}}\int _E\mu (({\text {I}}- \varPi ^0_{p}) e_{{\text {I}}})^2 {\text {d}}\varvec{x}\Big )^{1/2} \\&\quad \le C_{{\text {stab}}}^{1/2} {||e_{{\text {I}}}||}_{1,E} \left( (s_1^2 \kappa ^*_{E}S^{E}_1(({\text {I}}- \varPi ^0_{p})u_h, ({\text {I}}- \varPi ^0_{p})u_h))^{1/2} \right. \\&\qquad \left. + (s_0^2 {||\mu ||}_{\infty ,E} h_{E}^2 S^{E}_0(({\text {I}}- \varPi ^0_{p})u_h, ({\text {I}}- \varPi ^0_{p})u_h))^{1/2} \right) . \end{aligned}$$

Combining the last two bounds, along with (3.6), we conclude that

$$\begin{aligned}&a^E(u_h, e_{{\text {I}}}) - a_h^E(u_h, e_{{\text {I}}})\\&\quad \le {||e_{{\text {I}}}||}_{1,E} \Big ({||({\text {I}}- \varPi ^0_{p-1}) \varvec{\kappa }\varPi ^0_{p-1} \nabla u_h||}_{0,E} + C_{{\text {proj}}}h_{E} {||({\text {I}}- \varPi ^0_{p}) \mu \varPi ^0_{p} u_h||}_{0,E} \\&\qquad + C_{{\text {stab}}}^{1/2} \bigg ( \Big ( \Big ( \frac{\kappa ^*_{E}}{\kappa _*^{E}} + s_1^2 \Big ) \kappa ^*_{E}S^{E}_1(({\text {I}}- \varPi ^0_{p})u_h, ({\text {I}}- \varPi ^0_{p})u_h) \Big )^{1/2} \\&\qquad + \Big ( \Big ( \frac{{||\mu ||}_{\infty ,E}}{\mu _0^{E}} + s_0^2 h_{E}^2 \Big ) {||\mu ||}_{\infty ,E} S^{E}_0(({\text {I}}- \varPi ^0_{p})u_h, ({\text {I}}- \varPi ^0_{p})u_h) \Big )^{1/2} \bigg )\bigg ). \end{aligned}$$

The skew-symmetric terms can be treated completely analogously, yielding

$$\begin{aligned}&b^E(u_h, e_{{\text {I}}}) - b_h^E(u_h, e_{{\text {I}}}) \\&\quad \le \frac{1}{2} \left( \left( C_{{\text {PF}}}h_{E} {||\varvec{\beta }||}_{\infty ,E} + C_{{\text {proj}}}h_{E} {||\varvec{\beta }||}_{1,\infty ,E} \right) {||\nabla ({\text {I}}- \varPi ^0_{p}) u_h||}_{0,E} \right. \\&\qquad + \left. C_{{\text {proj}}}h_{E} {||({\text {I}}- \varPi ^0_{p}) \varvec{\beta }\varPi ^0_{p-1} \nabla u_h||}_{0,E}+ {||({\text {I}}- \varPi ^0_{p-1}) \varvec{\beta }\varPi ^0_{p} u_h||}_{0,E} \right) {||\nabla e_{{\text {I}}}||}_{0,E}, \end{aligned}$$

as \( {||({\text {I}}- \varPi ^0_{p}) u_h||}_{0,E} \le C_{{\text {PF}}}h_{E} {||\nabla ({\text {I}}- \varPi ^0_{p}) u_h||}_{0,E}\) since \(({\text {I}}- \varPi ^0_{p}) u_h\) has zero average, and using (5.9). The stability bound (3.6), further implies

$$\begin{aligned}&b^E(u_h, e_{{\text {I}}}) - b_h^E(u_h, e_{{\text {I}}}) \\&\quad \le \frac{1}{2} \left( C_{{\text {stab}}}^{1/2} ( C_{{\text {PF}}}+ C_{{\text {proj}}}) h_{E} {||\varvec{\beta }||}_{1,\infty ,E} \left( S^{E}_1(({\text {I}}- \varPi ^0_{p}) u_h, ({\text {I}}- \varPi ^0_{p}) u_h) \right) ^{1/2} \right. \\&\qquad + \left. C_{{\text {proj}}}h_{E} {||({\text {I}}- \varPi ^0_{p}) \varvec{\beta }\varPi ^0_{p-1} \nabla u_h||}_{0,E}+ {||({\text {I}}- \varPi ^0_{p-1}) \varvec{\beta }\varPi ^0_{p} u_h||}_{0,E} \right) {||\nabla e_{{\text {I}}}||}_{0,E}. \end{aligned}$$

Combining the bounds for the symmetric and skew-symmetric terms above, we deduce

$$\begin{aligned}&A^E( u_h, e_{{\text {I}}}) - A_h^E(u_h, e_{{\text {I}}})\\&\quad \le {||e_{{\text {I}}}||}_{1,E} \bigg ({||({\text {I}}- \varPi ^0_{p-1}) \varvec{\kappa }\varPi ^0_{p-1} \nabla u_h||}_{0,E} + \frac{1}{2} {||({\text {I}}- \varPi ^0_{p-1}) \varvec{\beta }\varPi ^0_{p} u_h||}_{0,E} \\&\qquad + C_{{\text {proj}}}h_{E} \Big ( \frac{1}{2} {||({\text {I}}- \varPi ^0_{p}) \varvec{\beta }\varPi ^0_{p-1} \nabla u_h||}_{0,E} + {||({\text {I}}- \varPi ^0_{p}) \mu \varPi ^0_{p} u_h||}_{0,E} \Big ) \\&\qquad + C_{{\text {stab}}}^{1/2} \Big (\tilde{s}_1 S^{E}_1(({\text {I}}- \varPi ^0_{p})u_h, ({\text {I}}- \varPi ^0_{p})u_h) \Big )^{1/2} \\&\qquad + \Big ( \Big ( \frac{{||\mu ||}_{\infty ,E}}{\mu _0^{E}} + s_0^2 h_{E}^2 \Big ) {||\mu ||}_{\infty ,E} S^{E}_0(({\text {I}}- \varPi ^0_{p})u_h, ({\text {I}}- \varPi ^0_{p})u_h) \Big )^{1/2} \bigg ), \end{aligned}$$


$$\begin{aligned} \tilde{s}_1:= \Big ( \frac{\kappa ^*_{E}}{\kappa _*^{E}} \Big )^2 + s_1^2 \frac{\kappa ^*_{E}}{\kappa _*^{E}} + ( C_{{\text {PF}}}+ C_{{\text {proj}}})^2 h_{E}^2 {||\varvec{\beta }||}_{1,\infty ,E}^2 . \end{aligned}$$

The result then follows by combining the individual bounds above and using the equivalence of the energy- and \(H^1\)-norms. \(\square \)

The terms of \(\eta ^E\) echo the standard element and edge residual terms while \(\varTheta ^E\) are data oscillation terms, familiar from the residual a posteriori analysis of finite element methods [3, 41, 43]. In the present virtual context, however, these terms involve only the polynomial part of \(u_h\), as they would not be computable for \(u_h\) itself. As a result, remainder terms also appear in the estimator, collected in \(\varPsi ^E\). The term \(\mathfrak {S}^E\), on the other hand, provides a computable estimate for the quality of the approximation \(\varPi ^0_{p} u_h\) of \(u_h\).

Remark 14

We note that the term \(\varPsi _3^{E}\) does not vanish when the PDE coefficients are constant, as \(\varPsi _i^{E}\), \(i=1,2,4\) do. It is possible to circumvent this quite easily by modifying the skew-symmetric bilinear form \(b_h^E\) to use the degree \(p\) projection of the gradient. The resulting method and the respective (modified) estimators are still computable in just the same manner as the current method (cf. [21]), since the virtual element functions are polynomials of degree \(p\) on each edge.

The estimator of Theorem 13 is also an estimator for the error between \(u\) and the projection of \(u_h\), and we have the following result.

Corollary 15

(Bound for the projected solution) Let \(\eta ^{E}, \varPsi ^{E}, \mathfrak {S}^{E}\) and \(\varTheta ^{E}\) be the terms of the estimator in Theorem 13. Then,


Using the triangle inequality and the definition of the stabilising term, we have

The result follows by Theorem 13. \(\square \)

5.3 Lower bound

We now prove local lower bounds of the error in the energy norm by the a posteriori error estimate. To this end, we make use of element and edge bubble functions satisfying the bounds of Lemmas 8 and 9 respectively.

Theorem 16

(Local lower bound) Let \(\eta ^{E}, \mathfrak {S}^{E}\) and \(\varTheta ^{E}\) be as in Theorem 13. Then,

where \(\omega _{E}:= \{E' \in {\mathcal {T}}_h: \mu _{d-1}(\partial {E}' \cap \partial {E}) \ne 0\}\) is the patch consisting of the element \(E\) and its neighbours, and \(\mu _{d-1}\) denotes the \((d-1)\)-dimensional measure. The constant C depends on \(C_{{\text {stab}}}, \widehat{C}_{{\text {equiv}}}, C_{{\text {bub}}}\), \(\rho \), \(\varOmega \) and the PDE coefficients, but is independent of h, \(u\) and \(u_h\).


First observe that \(R_{E}\in \mathcal {P}_{p+q}(E)\) for some \(q\in \mathbb {N}\cup \{0\}\) representing the degree of the polynomials used for the data approximations in (5.2). From (5.4) with \(\chi =0\) and the fact that \(\psi _{E}|_{\partial {E}} = 0\), we deduce

$$\begin{aligned} A(e, \psi _{E}R_{E}) = ( R_{E}, \psi _{E}R_{E})_E+ ( \theta _{E}, \psi _{E}R_{E})_E+ B^{E}(u_h, \psi _{E}R_{E}). \end{aligned}$$

Arguing as in (5.10), with \(\psi _{E}R_{E}\) in place of \(e-e_{{\text {I}}}\), we find that

$$\begin{aligned} B^{E}(u_h, \psi _{E}R_{E}) \le C (\mathfrak {S}^{E})^{\frac{1}{2}}{||\psi _{E}R_{E}||}_{1, E}, \end{aligned}$$

and consequently, using the properties of the interior bubble functions given in Lemma 8,

Using Lemma 8 again this becomes

and therefore we arrive at

For the face residual, we start by extending \(J_{s}\) into \(\omega _{s}\) through a constant prolongation in the direction normal to the face s, yielding \(J_{s}\in \mathcal {P}_{p}(\omega _{s}) \subset V_h^{\omega _{s}} := V_h^{E^+} \cup V_h^{E^-}\) with \(E^+ \cap E^- = s\). Then, (5.4) gives

$$\begin{aligned} A(e, \psi _{s}J_{s}) =&\sum _{E'\in \omega _{s}} \left[ ( R_{E'}, \psi _{s}J_{s})_{E'} + ( \theta _{E'}, \psi _{s}J_{s})_{E'} + B^{E}(u_h, \psi _{s}J_{s}) \right] \\&\quad - \left( J_{s}, \psi _{s}J_{s}\right) _{0,s} - \left( \theta _{s}, \psi _{s}J_{s}\right) _{0,s}. \end{aligned}$$

Arguing as before and using Lemma 9, we find that

Applying Lemma 9 again, using the bound for the element residual, and multiplying by \(h_s^{1/2}\), we obtain

Using Assumption 1 and putting these bounds together completes the proof. \(\square \)

This local lower bound then immediately provides a corresponding global lower bound, by simply summing the local estimates over the whole of \({\mathcal {T}}_h\). Furthermore, Theorem 16 and triangle inequality also provide us with the following lower bound on the error between the solution \(u\) and the projected virtual element solution \(\varPi ^0_{p} u_h\).

Corollary 17

(Lower bound for the projected solution) Let \(\eta ^{E}, \mathfrak {S}^{E}\) and \(\varTheta ^{E}\) be defined as in Theorem 13. Then,

where \(\omega _{E}:= \{E' \in {\mathcal {T}}_h: \mu _{d-1}(\partial {E}' \cap \partial {E}) \ne 0\}\) is the patch consisting of the element \(E\) and its neighbours, and \(\mu _{d-1}\) denotes the \((d-1)\)-dimensional measure.

In addition to a lower bound for the residual part of the estimator, \(\eta ^E\), we have the following control on the virtual inconsistency terms \(\varPsi ^E\) indicating that these are also of optimal order up to data oscillation.

Lemma 18

(Lower bound for the inconsistency terms) We have


We have, respectively,

$$\begin{aligned} \varPsi _1^E\le & {} 2 \bigg ( {||\varPi ^0_{p-1} (\varvec{\kappa }(\varPi ^0_{p-1} \nabla u_h- \nabla u))||}_{0,E}^2 + {||\varPi ^0_{p-1} (\varvec{\kappa }\nabla u) - \varvec{\kappa }\nabla u||}_{0,E}^2\\&+ {||\varvec{\kappa }\nabla u - \varvec{\kappa }\varPi ^0_{p-1} \nabla u_h||}_{0,E}^2 \bigg ) \\\le & {} 2{||\varPi ^0_{p-1} (\varvec{\kappa }\nabla u) - \varvec{\kappa }\nabla u||}_{0,E}^2 + 4{||\varvec{\kappa }( \nabla u - \varPi ^0_{p-1} \nabla u_h)||}_{0,E}^2\\\le & {} 2{||\varPi ^0_{p-1} (\varvec{\kappa }\nabla u) - \varvec{\kappa }\nabla u||}_{0,E}^2 + 8{\kappa ^*}^2 {||\nabla u - \nabla u_h||}_{0,E}^2\\&+\, 8{\kappa ^*}^2 {||\nabla u_h- \varPi ^0_{p-1} \nabla u_h||}_{0,E}^2, \end{aligned}$$

using the stability of the \(L^2\) projection operator and (2.2). Using (5.9) and (3.6) the final term can be controlled by the stabilising term, resulting in the required bound. A completely analogous argument can be applied to each of the remaining terms of \(\varPsi ^{E}\). \(\square \)

6 Numerical results

We present a series of numerical experiments aimed at testing the practical behaviour of the estimator derived in Theorem 13. In addition, we propose an adaptive algorithm based on the estimator which is applied to a variety of test problems.

The above analysis is valid by only requiring a set of abstract assumptions for the stabilisation forms \(S^{E}_{1}\), \(S^{E}_{0}\) and for the projector \(\varPi ^{*}_{p}\) [which in turn defines the space \(V_h^E\) through (3.3)], giving rise to a number of possibilities. Here, we focus on a specific scheme by providing precise choices for \(\varPi ^{*}_{p}\), \(S^{E}_{1}\) and \(S^{E}_{0}\). Define the bilinear form \(\mathcal {I}^{E}: V_h^E\times V_h^E\rightarrow \mathbb {R}\) by

$$\begin{aligned}\mathcal {I}^{E}(v_h,w_h) := \sum _{r = 1}^{N_{E}} {\text {dof}}_r(v_h) {\text {dof}}_r(w_h), \end{aligned}$$

with \({\text {dof}}_r(w_h)\) denoting the value of the \(r^{\text {th}}\) local degree of freedom of \(w_h\) with respect to an arbitrary but fixed ordering of the degrees of freedom on the element \(E\). This bilinear form corresponds to the Euclidean inner product on the space \(\mathbb {R}^{N_{E}}\) consisting of vectors of degrees of freedom. Following [21, Section 4.1], we define \(\varPi ^{*}_{p}\) to be the orthogonal projection onto the polynomial space \(\mathcal {P}_{p}(E)\) with respect to \(\mathcal {I}^{E}(\cdot , \cdot )\), and we fix

$$\begin{aligned} \begin{aligned} S^{E}_{1}(u_h,v_h) := \overline{\varvec{\kappa }}_Eh_{E}^{d- 2} \mathcal {I}^{E}(u_h,v_h), \qquad S^{E}_{0}(u_h,v_h) := \overline{\mu }_Eh_{E}^{d} \mathcal {I}^{E}(u_h,v_h), \end{aligned} \end{aligned}$$

where \(\overline{\varvec{\kappa }}_E\), and \(\overline{\mu }_E\) are some constant approximations of \(\varvec{\kappa }\), and \(\mu \) over \(E\) (e.g., local averages), respectively, resulting in

$$\begin{aligned}S^{E}(u_h,v_h):=h^{d-2}_E\left( s_0 \overline{\varvec{\kappa }}_E+ h_{E}^2 s_1 \overline{\mu }_E\right) \, \mathcal {I}^{E}(u_h,v_h). \end{aligned}$$

Remark 19

Note that the internal degrees of freedom of \(({\text {I}}- \varPi ^0_{p}) v_h\) are equal to zero, and hence the above stablilising term reduces to a term active only on the mesh skeleton.

Fig. 3
figure 3

The first two non-convex, in general, meshes used in the uniform sequence described in Sect. 6.1. Vertices are marked with a dot, and may separate coplanar edges. a The first mesh, with 25 elements. b The second mesh with 100 elements

Fig. 4
figure 4

Convergence history of the \(H^1(\varOmega )\)-seminorm error and estimator for the example (6.2) on the meshes shown in Fig. 3. a The error \(\Vert \nabla (u - \varPi ^0_k u_h ) \Vert _{0,\varOmega }\). b Estimated error

6.1 Uniformly generated meshes

As a first test to verify the asymptotic behaviour of the estimator, we consider the test problem

$$\begin{aligned} -\varDelta u= f\text { in } \varOmega , \qquad u = 0 \text { on } \partial \varOmega , \end{aligned}$$

for \(\varOmega = (0,1)^2\), and fix \(f\) such that the exact solution is given by \( u(x,y) = \sin (\pi x) \sin (\pi y), \) on a uniformly generated sequence of meshes consisting of non-convex polygonal elements. The first two meshes in the uniform sequence are shown in Fig. 3. Figure 4 depicts the convergence history of the \(H^1(\varOmega )\)-seminorm error and of the estimator on this sequence of meshes, indicating that both converge at the optimal rate for polynomial degrees \(p=1,2\) and 3. The effectivity of the estimator is defined by

$$\begin{aligned} \text {effectivity}:= \frac{ \left( \sum _{E\in {\mathcal {T}}_h} \eta ^{E} + \varTheta ^{E} + \varPsi ^{E} + \mathfrak {S}^{E} \right) ^{\frac{1}{2}}}{{||\nabla (u- \varPi ^0_{p}u_h)||}_{0,\varOmega }}, \end{aligned}$$

with \(\eta ^E, \varTheta ^E, \mathfrak {S}^{E}\) and \(\varPsi ^E\) as in Theorem 13. Asymptotically the effectivity becomes constant throughout the mesh sequence, tending to approximately 5.7 for \(p=1\), 3 for \(p=2\), and 1.84 for \(p=3\).

6.2 Adaptive refinement

We shall use a typical adaptive algorithm for elliptic problems reading: solve \(\rightarrow \) estimate \(\rightarrow \) mark \(\rightarrow \) refine. In this context, given a polygonal subdivision of \(\varOmega \), one solves the VEM problem, estimates the error using the a posteriori error bound (Theorem 13), marks a subset of elements for refinement and, subsequently, refines. The Dörfler/bulk marking strategy is used below for the mark step, marking the subset of mesh elements \({\mathcal {M}} \subset {\mathcal {T}}_h\) with the largest estimated errors such that

$$\begin{aligned} \left( \sum _{E\in {\mathcal {M}}} \eta ^E+ \varTheta ^E+ \varPsi ^{E} + \mathfrak {S}^{E} \right) ^{\frac{1}{2}} \le \theta \left( \sum _{E\in {\mathcal {T}}_h} \eta ^E+ \varTheta ^E+ \varPsi ^{E} + \mathfrak {S}^{E} \right) ^{\frac{1}{2}}, \qquad \end{aligned}$$

for some \(\theta \in (0,1)\). Here, we pick \(\theta = 0.4\).

Fig. 5
figure 5

An illustration of the refinement strategy used for polygonal elements. Edges and vertices of the original element are shown by solid lines and points. To refine the element, the midpoint of each planar face of its boundary is connected to its barycentre. Note that the two edges PQ and QR are not bisected as the refinement treats PR as a single planar face, adding only a new edge from Q to the barycentre. Consequently, the result of refining two neighbours in the mesh is independent of the order in which they are refined

To refine a polygonal element we divide elements by connecting the midpoint of each planar element face to its barycentre; see Fig. 5 for an illustration for a hexagon. Note that this strategy simply reduces to the standard refinement strategy for a square element. By refining in this fashion, hanging nodes may be introduced. Nevertheless, this is trivially accounted for in the VEM setting as the method is able to handle polygonal elements with an arbitrary number of faces. This is a flexibility which we take advantage of in these examples by imposing no restriction on the number of hanging nodes allowed on each face. In this extreme mesh flexibility, more exotic refinement strategies are certainly possible, but we leave the development of these for future work.

Remark 20

(On the mesh assumptions) By imposing no restriction on the number of hanging nodes per face, we are at risk of violating Assumption 1 by producing meshes which contain very small faces. However, this requirement does not seem to be necessary for the virtual element method to remain accurate and stable in practice. This is demonstrated in Sect. 6.3, where the effect of limiting the number of hanging nodes allowed per edge is also studied, and the results in either case are found to be very similar.

We consider the general convection-reaction-diffusion problem

$$\begin{aligned}-\nabla \cdot (\varvec{\kappa }\nabla u) + \varvec{\beta }\cdot \nabla u+ {\gamma }u= f, \end{aligned}$$

with coefficients

$$\begin{aligned} \varvec{\kappa }&= \begin{bmatrix} 1&0 \\ 0&1 \end{bmatrix}, \quad \varvec{\beta }= \begin{bmatrix}\cos (x) \exp (y) \\ \exp (x) \sin (y) \end{bmatrix}, \quad {\gamma }= \sin (2\pi x) \sin (2\pi y), \end{aligned}$$

and forcing function \(f\) chosen in accordance with two different benchmark solutions:

  1. Problem 1:

    posed over an L-shaped domain contained within \([-1,1]\times [-1,1]\) (depicted in Fig. 7a) and exhibiting low regularity at the reentrant corner located at the origin, along with a sharp Gaussian at the point (0.5, 0.5) which initially is not resolved by the mesh. This problem has the solution

    $$\begin{aligned} u(x, y) = r^{2/3} \sin (2\theta /3) + \exp (-(1000(x-0.5)^2+1000(y-0.5)^2)),\nonumber \\ \end{aligned}$$

    where \((r,\theta )\) are the usual polar coordinates centred around the point \((x,y) = (0,0)\), depicted in Fig. 6e;

  2. Problem 2:

    posed over \(\varOmega = (0,1)^2\) with a sharp layer in the interior of the domain and solution

    $$\begin{aligned} u(x, y) = 16x(1-x)y(1-y)\arctan (25x-100y+50), \end{aligned}$$

    depicted in Fig. 8e.

Fig. 6
figure 6

The behaviour of the error and estimator when applied to Problem 1 with solution (6.5) under adaptive refinement, each plotted against the number of degrees of freedom. a The error \(\Vert \nabla (u - \varPi ^0_k u_h ) \Vert _{0,\varOmega }\). b Estimated error. c Effectivity of the estimator. d The estimator components for \(p=1\). e Adaptive approximation

Fig. 7
figure 7

Some representative meshes from the adaptive refinement sequence for \(p= 1\) when solving Problem 1 with solution (6.5) together with the adaptive approximation on the final mesh. a The initial mesh. b After 28 refinement steps. c After 40 refinement steps

Fig. 8
figure 8

The behaviour of the error and estimator when applied to Problem 2 with solution (6.6) under adaptive refinement. a The error \(\Vert \nabla (u - \varPi ^0_k u_h ) \Vert _{0,\varOmega }\). b Estimated error. c Effectivity of the estimator. d The estimator components for \(p=1\). e Adaptive approximation after 40 refinement steps

Fig. 9
figure 9

Some representative meshes from the adaptive refinement sequence for \(p= 1\) when solving Problem 2 with solution (6.6). a The initial mesh. b After 25 refinement steps. c After 40 refinement steps

The behaviour of the error and estimator under adaptive refinement for Problem 1 and a representative set of the meshes obtained are shown in Figs. 6 and 7, respectively. The same results are shown for Problem 2 in Figs. 8 and 9. We first observe that, once the asymptotic regime is reached in each case, the error measured in the \(H^1(\varOmega )\)-seminorm (shown in Fig. 6a for Problem 1 and in Fig. 8a for Problem 2) converges with the theoretical optimal rate of \(N^{-p/2}\), despite the low regularity of the true solution around the reentrant corner for Problem 1.

The initial rapid drop-off in error for Problem 1 is explained by examining the magnitudes of the various components of the estimator for \(p=1\), given in Fig. 6d. In particular, it is clear that the data oscillation term initially dominates the estimator and, comparing with the mesh after 28 iterations, shown in Fig. 7b, it appears to be driving the refinement around the Gaussian centred at (0.5, 0.5). Once this is sufficiently resolved, the element and face residual terms begin to dominate, resulting in the expected refinement around the singularity at the reentrant corner. This is shown in Fig. 7c, after 40 iterations.

The key difficulty of Problem 2 is the presence of an interior sharp layer which is completely unresolved by the initial mesh. To test the resilience of the estimator in this challenging context, the initial mesh is chosen to consist of warped hexagons which are not aligned with the interior layer; see Fig. 9a for an illustration. As with Problem 1, the data oscillation terms initially dominate the estimator until the the mesh starts to resolve the layer. After this point, the element and edge residuals become the dominant terms of the estimator.

For both problems, the effectivity plots in Figs. 6c and 8c, calculated as in (6.3), indicate a good level of agreement between the estimated and calculated error.

6.3 Jumping diffusion coefficient

We now consider the Kellogg problem [32], in which the diffusion coefficient \(\varvec{\kappa }\) is piecewise constant across the domain \(\varOmega = (0,1)^2\), such that

$$\begin{aligned}\varvec{\kappa }(x,y) = {\left\{ \begin{array}{ll} b \qquad \text { for } (x - a)(y - a) \ge 0, \\ 1 \qquad \text { otherwise,} \end{array}\right. } \end{aligned}$$

for fixed \(0< a < 1\) and \(b > 0\), and no reaction or convection terms. This problem has weak solution \(u(r, \theta ) = r^{\alpha } g(\theta )\), where \((r,\theta )\) denote the polar coordinates centred at the point (aa), and

$$\begin{aligned} g(\theta ) := {\left\{ \begin{array}{ll} \cos \left( \left( \frac{\pi }{2} - \sigma \right) \alpha \right) \cos \left( \left( \theta - \frac{\pi }{4}\right) \alpha \right) \qquad &{}\text { for } 0 \le \theta< \frac{\pi }{2}, \\ \cos \left( \frac{\pi }{4}\alpha \right) \cos ((\theta - \pi + \sigma ) \alpha ) \qquad &{}\text { for } \frac{\pi }{2} \le \theta< \pi , \\ \cos (\sigma \alpha ) \cos \left( \left( \theta - \frac{5\pi }{4}\right) \alpha \right) \qquad &{}\text { for } \pi \le \theta< \frac{3\pi }{2}, \\ \cos \left( \frac{\pi }{4}\alpha \right) \cos \left( \left( \theta - \frac{3\pi }{2} - \sigma \right) \alpha \right) \qquad &{}\text { for } \frac{3\pi }{2} \le \theta < 2\pi . \end{array}\right. } \end{aligned}$$

The parameters \(\sigma , \alpha \) and b are required to satisfy a certain set of nonlinear relations [32], and following [15] we take the approximate values \(\sigma = 5.49778714378214\), \(\alpha = 0.25\), \(b = 25.27414236908818\).

The Kellogg problem is a common example used to test a posteriori estimators on a problem with pathological coefficients and a known weak solution. Typically, this problem is studied in the case when \(\kappa \) is piecewise constant with respect to the initial mesh, see, e.g. [14, 23, 34, 36]. Recently, the case in which the diffusion jumps are not aligned with the initial mesh has been studied in [15] in the context of adaptive FEM. To really test the applicability of our estimator, we consider both cases here on a variety of different meshes.

Fig. 10
figure 10

The behaviour of the estimator for the Kellogg problem (Sect. 6.3) when the discontinuities of \(\varvec{\kappa }\) are matched by the initial mesh, with \(p=1\) and either one (‘limited’) or an unlimited number of hanging nodes per edge (‘unlimited’). a The error \(\Vert \nabla (u - \varPi ^0_k u_h ) \Vert _{0,\varOmega }\). b Estimated error. c Effectivity of the estimator. d The non-zero components of the estimator. e Adaptive approximation

Whether the mesh is aligned with the problem or not is dictated by the parameter a. We first consider \(a=\frac{2}{5}\) on a square mesh, so the discontinuities of \(\varvec{\kappa }\) are matched by the initial mesh. The behaviour of the error and estimator for this problem are shown in Fig. 10. Moreover, for this problem, we also compare the effect of limiting the mesh to have just one hanging node per edge, or allowing an unlimited number of hanging nodes to be produced. In both cases, we use the Dörfler strategy from (6.4) with \(\theta = 0.6\) to select the subset of elements to be refined. See Fig. 11 for an illustration of the initial mesh and the final adapted meshes for both limited and unlimited hanging nodes per edge. In either case, the error under adaptive refinement eventually decays at the theoretical optimal rate of \(N^{-1/2}\), where N is the number of degrees of freedom. It may also be seen that the \(H^1\)-seminorm error is slightly lower for the case of a limited number of hanging nodes, although the estimated error is approximately the same for both cases. Consequently, the effectivity of the estimator is slightly better for the method with no limit on the number of hanging nodes.

Fig. 11
figure 11

The initial and final adapted meshes when solving the Kellogg problem (see Sect. 6.3) when the discontinuity in \(\varvec{\kappa }\) is aligned with the initial mesh. a The initial mesh. b The final mesh with at most one hanging node per edge. c The final mesh with no limit on the number of hanging nodes.

Fig. 12
figure 12

The behaviour of the estimator for the Kellogg problem (see Sect. 6.3) when the discontinuities in \(\varvec{\kappa }\) cannot align with any mesh in the sequence, using \(p=1\) and three different types of mesh, depicted in Fig. 13. a The error \(\Vert \nabla (u - \varPi ^0_k u_h ) \Vert _{0,\varOmega }\). b Convergence of the estimator. c Effectivity of the estimator. d The non-zero components of the estimator on the square mesh

Fig. 13
figure 13

The initial and final adapted meshes when solving the Kellogg problem (see Sect. 6.3) when the discontinuity in \(\varvec{\kappa }\) cannot align with any mesh in the sequence. a The initial square mesh. b The initial Voronoi mesh. c The initial randomised quadrilateral mesh. d The final square mesh with no limit on the number of hanging nodes. e The final Voronoi mesh with no limit on the number of hanging nodes. f The final randomised quadrilateral mesh with no limit on the number of hanging nodes. g The final square mesh with one hanging node per edge. h The final Voronoi mesh with one hanging node per edge. i The final randomised quadrilateral mesh with one hanging node per edge

Fig. 14
figure 14

The element of the mesh containing the intersection of the lines along which \(\varvec{\kappa }\) is discontinuous, before and after refinement. Solid lines indicate edges in the mesh, dotted lines indicate the lines along which \(\varvec{\kappa }\) jumps. a The lines along which \(\varvec{\kappa }\) is discontinuous pass close to the centre of the element. b After refinement, the discontinuities of \(\varvec{\kappa }\) are suddenly very close to mesh edges

Next, we consider \(a = \frac{2 \sqrt{2}}{5}\). In this case, it is not possible for the discontinuities of \(\varvec{\kappa }\) to align with any mesh in the sequence. In the spirit of keeping the mesh fully unfitted from the discontinuities in \(\varvec{\kappa }\), we also test the method on a Voronoi mesh and a randomised quadrilateral mesh alongside a more standard square mesh. For brevity, we only report here the results when an unlimited number of hanging nodes were allowed in the mesh, as limiting the number of hanging nodes leads to almost identical results in terms of convergence. There are, however, differences in the final meshes obtained in each case: illustrations of the initial meshes and the final meshes for both limited and unlimited hanging nodes are given in Fig. 13. Figure 12 shows the behaviour of the error and estimator under adaptive refinement on the three sequences of meshes: the error in the \(H^1(\varOmega )\)-seminorm (Fig. 12a) appears to reach the theoretical optimal convergence rate of \(N^{-1/2}\) on the square and randomised quadrilateral meshes, and maintains a near-optimal rate of approximately \(N^{-0.35}\) on the Voronoi mesh. These rates are also reflected by the convergence of the estimator, shown in Fig. 12b, resulting in good effectivities (Fig. 12c) which remain roughly constant on the Voronoi and randomised quadrilateral meshes.

We note the sudden jump in the magnitude of the estimated error after 7 iterations of adaptive refinement starting from the square mesh in Fig. 13a. Comparing with Fig. 12d, which shows the relative magnitudes of the various terms comprising the estimator on the square mesh, it is apparent that this jump is caused by a jump in the value of the data oscillation term \(\varTheta \). Noting that for \(p=1\) the coefficient approximation \({{\varvec{\kappa }}_h}\) is piecewise constant, we conclude that it is in fact only the edge data oscillation term which is non-zero and thus responsible for this effect. Further investigation indicates that this jump occurs in situations such as that illustrated in Fig. 14, and is due to the fact that although the mesh cannot exactly align with the discontinuities of \(\varvec{\kappa }\), it is possible for it to get arbitrarily close. This is a highly desirable trait from the point of view of generating a well-adapted mesh. Nonetheless, the standard (isotropic) refinement strategy used on squares produces a mesh with edges close to the diffusion discontinuity, such as the ones depicted in Fig. 14b, only if the previous iteration contains elements as in Fig. 14a with the lines of discontinuity of \(\varvec{\kappa }\) passing close to its centre. This is problematic because the roughly equal distribution of the central element in Fig. 14a and its four neighbours among the different zones of \(\varvec{\kappa }\) mean that the approximation \({{\varvec{\kappa }}_h}\) will be very similar on each of the five elements, and thus the edge term of the data oscillation indicator will be very small. However, once this parent is refined, each child is almost entirely in a single zone of \(\varvec{\kappa }\), so the approximations \({{\varvec{\kappa }}_h}\) will be very different on each of the children. This will, then, cause the reported error to dramatically increase. Moreover, since the discontinuities of \(\varvec{\kappa }\) lie along lines with irrational coordinates, it is clear that this situation could occur an arbitrary number of times in the refinement sequence, causing problems with the effectivity of the estimator. Clearly the real culprit here is the symmetry of the situation and, consequently, a way to prevent such problems occurring is to use unstructured meshes. This claim can be substantiated by the fact that the same difficulty does not occur with the randomised quadrilateral or Voronoi meshes.

7 Conclusions and extensions

We have derived and analysed a residual a posteriori error estimate for the \(C^0\)-conforming virtual element method of [21] applied to general second order elliptic problems with nonconstant coefficients. This analysis has given rise to a fully computable a posteriori error estimator which we have shown to be equivalent to the error between the true solution and the virtual element approximation, measured in the energy norm. The analysis rests crucially on a new Clément-type interpolation result. We have also presented an extensive set of numerical results to demonstrate the behaviour of this estimator when used to drive an adaptive algorithm on a variety of problems using several families of meshes, consisting of general polygonal elements.

We stress that the analysis above can also be applied to other related virtual element formulations of the same problem subject to only minor modifications. For instance, the same a posteriori analysis can be applied to the corresponding VEM obtained by discretising problem (2.4) directly, without splitting the differential operator into its symmetric and skew-symmetric parts. The resulting local discrete bilinear form would take the form

$$\begin{aligned} A_h^E(u_h, v_h)&:= (\varvec{\kappa }\varPi ^0_{p-1} \nabla u_h, \varPi ^0_{p-1} \nabla v_h)_{E} + (\varvec{\beta }\varPi ^0_{p-1} \nabla u_h, \varPi ^0_{p} v_h)_{E}\\&\quad \ + ({\gamma }\varPi ^0_{p} u_h, \varPi ^0_{p} v_h)_{E} + S^{E}(({\text {I}}- \varPi ^0_{p}) u_h, ({\text {I}}- \varPi ^0_{p}) v_h), \end{aligned}$$

(see [9] for a similar approach). The analysis would provide the same a posteriori error estimator presented in Theorem 13, but without the term \(\varPsi ^{E}_3\).