Exponential convergence of the hp virtual element method in presence of corner singularities

In the present work, we analyze the hp version of virtual element methods for the 2D Poisson problem. We prove exponential convergence of the energy error employing sequences of polygonal meshes geometrically refined, thus extending the classical choices for the decomposition in the hp finite element framework to very general decomposition of the domain. A new stabilization for the discrete bilinear form with explicit bounds in h and p\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p$$\end{document} is introduced. Numerical experiments validate the theoretical results. We also exhibit a numerical comparison between hp virtual elements and hp finite elements.

classical triangular/tetrahedral and quadrilateral/hexaedral meshes. This automatically includes nonconvex elements, hanging nodes (enabling natural handling of interface problems with nonmatching grids), easy construction of adaptive meshes and efficient approximations of geometric data features.
Among the properties of VEM, in addition to the employment of polytopal meshes, we recall the possibility of handling approximation spaces of arbitrary C k global regularity [19,25] and approximation spaces that satisfy exactly the divergence-free constraint [3,18].
The main idea of VEM consists in enriching the classical polynomial space with other functions, whose explicit knowledge is not needed for the construction of the method (this explains the name virtual) . We point out that the literature concerning methods based on polytopal meshes is not restricted to the VEM. A (very short and incomplete) list of other polytopalbased methods follows: hybrid high order methods [31], mimetic finite difference [16,24], hybrid discontinuous Galerkin methods [30], polygonal finite element method [36,43,48], polygonal discontinuous Galerkin methods [27], BEM-based finite element methods [46].
In all these works, the convergence of VEM approximations has been achieved by increasing the number of mesh elements while keeping the degree of local approximation fixed. In other words and according to the existing terminology, these methods utilize the h-version of VEM. An alternative avenue to construct convergent approximations is the p-version of VEM which is based on increasing the degree of local approximations while keeping the underlying mesh fixed. A combination of both methodologies is termed the hp-version of VEM.
The recent work [14] provides a mathematical ground for the p-version of VEM for the two-dimensional Poisson problem. In particular, it includes the a priori convergence theory for the p-and hp-version of VEM on quasiuniform polygonal meshes and for uniform distributions of local degrees of approximation. An exponential convergence has been established for analytic solutions, see [14,Theorem 5.2], and convergence at algebraic rates for solutions having finite Sobolev regularity, see [14,Theorems 4.1,5.1].
The objectives of the present paper are the following. First, we generalize the results in [14] and in particular the definition of H 1 conforming VEM to the case with varying local degree of accuracy from element to element. Such construction fits very naturally in the framework of VEM without additional complications. Furthermore, we extend the hp-VEM approach to nonquasiuniform approximations and prove their exponential convergence for nonsmooth solutions having typical corner singularities, see [5,6].
Similarly to the hp-version of FEM (see [47] and the references therein) the approximation is based on geometrically refined meshes with appropriate (linearly varying) local degree of accuracy. In order to derive the proofs, we introduce a new stabilization for the method, which turns out to be more suitable for the p and hp version of VEM. Importantly, explicit bounds of the new stabilizing term with respect to the H 1 seminorm in terms of the local degree of accuracy are shown. This proof requires a particular inverse estimate on polynomials, presented in the "Appendix". To the best of the authors knowledge, such an inverse estimate has been never published before.
We highlight that new tools for the approximation by means of functions in the VE space are presented. Such tools permits to avoid additional assumptions on the polygonal decomposition of the computational domain.
The structure of the paper is the following. After presenting the model problem and the VEM in Sects. 2 and 3 respectively, we deal with explicit bounds in terms of the degree of accuracy of a new stabilization term in Sect. 4. In Sect. 5 we show the approximation results and the main theorem of the paper, namely the exponential convergence of the energy error in terms of the dimension of the virtual space, while in Sect. 6 we validate the theoretical results with numerical tests, including a set of experiments on the comparison between hp FEM and hp VEM. Finally, in the "Appendix", we discuss a hp polynomial inverse estimates on polygons needed for proving the stability bounds of Sect. 4.

Model problem
In this section, we discuss the functional space setting and the model problem under consideration.
Firstly, we introduce the functional spaces that will be used throughout the paper. Let ∈ N and let D ⊆ R 2 be a given domain whose closure contains the origin, i.e. 0 ∈ D. We denote with L (D) the Lebesgue space of -summable functions and we denote with H (D) the Sobolev space of order on the domain D, respectively; let · ,D and | · | ,D be the Sobolev norm and seminorm, see [1]. Let where: We define the weighted Sobolev spaces where the corresponding weighted Sobolev norm reads Having this, we recall the countably normed spaces (also known as Babuška spaces), see [47] and the references therein: where c ≥ 0 and d ≥ 1 are two constants depending on u and D. We point out that space (4) is not empty since it contains functions of the form u = |x| α , for some α > 0. Definition (4) can be generalized to the case of functions with "multiple" singularities at the vertices of a polygonal domain i.e. adding in definition (1) weights of the form |x − x 0 |, x 0 being a vertex of the polygon different from 0; see [47]. In particular, defining N V (D) and the number and set of vertices of D respectively, we denote the general space with H m, ) are the vectors associated with the singularities at the vertices of ω. The associated weight function reads: Secondly, we introduce the model problem. Let Ω be a open simply connected polygonal domain. Let f ∈ L 2 (Ω) be given. We consider the two dimensional Poisson problem: and its weak formulation: where (·, ·) 0,Ω is the L 2 scalar product on Ω and a(·, ·) = (∇·, ∇·) 0,Ω . The Lax-Milgram lemma guarantees the existence of a unique weak solution u ∈ V . We recall a regularity result for the solution of problem (6).
be the number and the set of vertices of Ω respectively; let ω i be the (internal) angle associated with vertex A i , i = 1, . . . , N V . To each A i , we associate the set of the so-called singular exponents for Poisson problem with Dirichlet condition (see also [47, formula (4.2.2)]): Then, the following holds: Theorem 1 Following the notation of problem (6), let f ∈ H 0,0 β (Ω), β ∈ [0, 1); assume that the singular exponents α i defined in (7) satisfy: Then the solution of (6)belongs to H 2,2 β (Ω) and the a propri estimate: Proof See [5,6].
For the sake of simplicity, we assume in the rest of the paper that: 0 ∈ ∂Ω is the only vertex at which the solution u of problem (6) can be singular.
Finally, we point out that throughout the paper we write a ≈ b and a b meaning that there exist c 1 , c 2 and c 3 positive constants independent of the discretization parameters, such that c 1 a ≤ b ≤ c 2 a and a ≤ c 3 b respectively; we also denote by P (K ) and P (E) the spaces of polynomials of degree over a polygon K and an edge E respectively.

Virtual element spaces with non uniformly distributed degree of accuracy
In this section, we introduce a VEM for problem (6) with nonuniform local degree of accuracy. Let {T n } be a sequence of polygonal decompositions of the domain Ω. The approximation have a "geometric layer" structure; hence, in the sequel, the integer n represents the number of layers used for the corner singularity refinement as in [47]; see Sect. 5.1 for the precise definition of layers.
Let V n be the set of vertices of T n , V b n = {ν ∈ V n | ν ∈ ∂Ω} be the subset of boundary vertices, E n be the set of edges E of T n , E b n = {e ∈ E n | e ⊆ ∂Ω} be the subset of boundary edges. To each K ∈ T n , we associate h K = diam(K ), We require the two following basic assumptions on the regularity of the decomposition: (D1) ∀K ∈ T n , K is star-shaped with respect to a ball of radius greater than or equal to h K γ , where γ is a positive constant independent of the decompositions; see [23] for the definition of star-shapedness. We note that this condition can be satisfied by possibly many balls. Henceforth, we fix for each K ∈ T n a unique ball B(K ). (D2) ∀K ∈ T n , ∀E ∈ E K , |E| ≥ h K γ , where γ is a positive constant independent of the decompositions. Moreover, ∀K ∈ T n , card(E K ) is uniformly bounded.
More technical assumptions on the mesh will be introduced in Sect. 5 for the construction of proper geometric meshes.

Remark 1
Assuming that (D1) and (D2) hold true, then the following is also valid. The subtriangulation T n (K ) of K obtained by joining the vertices of K to the center of the ball B(K ) introduced in assumption (D1) is made of triangles that are star-shaped with respect to a ball of radius greater than or equal to γ 1 h T , h T being the diameter of T , ∀T ∈ T n (K ), and γ 1 being a positive constant independent of the decompositions.
Given K ∈ T n , let i K be the position of polygon K in the ordered sequence T n . Let p ∈ N card(T n ) . We associate to each K ∈ T n the local degree of accuracy p i K = (p) i K . In order to simplify the notation, we write p K := p i K .
Henceforth, we assume that T n is a conforming decomposition into polygons of Ω, i.e. for all edges E ∈ E, either E belongs to two polygons if it is an internal edge or it belongs to a single polygon if it is a boundary edge.
In the former case, it must hold that there exist K 1 , K 2 ∈ T n such that E ∈ E K 1 ∩E K 2 ; we associate to E the degree p E = max{ p K 1 , p K 2 }, that is we adopt the so-called maximum rule; see Remark 2 for further comments. In the latter case, let K ∈ T n be the unique polygon in the decomposition such that E ∈ E K ; we associate to E the degree p E = p K .
Let K ∈ T n . We firstly define the space of piecewise continuous polynomials on the boundary of K : The local virtual space on K reads: with the convention P −1 (K ) = 0 and where B(∂ K ) is defined in (9). Definition (10) and the maximum rule immediately imply that P p K (K ) ⊆ V (K ).
We associate with the local space the following set of degrees of freedom: -the values at the vertices of K ; -the values at p E − 1 internal nodes (e.g. Gauß-Lobatto nodes) for all E ∈ E K ; -the scaled internal moments of the form: where |α|=0 is a properly chosen basis of P p K −2 (K ); see Remark 5 for possible explicit choices of the polynomial basis. This is in fact a set of degrees of freedom for the local space (10); see [10]. If we set dof i the i-th degree of freedom, i = 1, . . . , dim(V (K )), then we can define the local virtual canonical basis {ϕ j , j = 1, . . . , dim(V (K ))} by: The global virtual space is obtained by matching the boundary degrees of freedom on each edge, i.e.: We note that we can split the global continuous bilinear form a(·, ·), introduced with the continuous problem (6), into a sum of local contributions as follows: We observe that we cannot compute the bilinear form a(·, ·) applied on virtual functions since it is not possible in principle to know the values of such functions at any internal points of each polygon. The same argument applies to the computation of the righthand side. For this reason, we must approximate both the stiffness matrix and the right-hand side. Thus, the structure of VEM approximation is based on the two following ingredients which are defined in what follows: -a symmetric bilinear form a n : V n × V n → R, which we decompose into a sum of local symmetric bilinear forms a K n : V (K ) × V (K ) → R as follows: -a piecewise discontinuous polynomial f n ,which is piecewise of degree p K , and the associated linear functional ( f n , ·) 0,Ω .
The discrete bilinear form a n (·, ·) and the discrete right-hand side f n are chosen in such a way that the discrete counterpart of (6) find u n ∈ V n such that a n (u n , v n ) is well-posed and it is possible to recover local hp-estimates analogous to those proved in [14]. We start by discussing the construction of the discrete bilinear form. We require that a K n in (15) satisfy the two following assumptions: (A1) polynomial consistency: ∀K ∈ T n , it must hold: (A2) local stability: ∀K ∈ T n , it must hold where 0 < α * ( p K ) ≤ α * ( p K ) < +∞ are two constants which may depend only on the local degree of accuracy p K .
On each K ∈ T n , we can introduce a local energy projector Π ∇,K When no confusion occurs, we write Π ∇ p K in lieu of Π ∇,K p K . Note that condition (19b) only fixes an additive constant of the projection and can be modified if necessary, see [2,10]. Importantly, this local energy projector can be computed by means of the degrees of freedom of space (10), see [10,12], without the need of knowing explicitly functions in the virtual space.
In [10,12], it was also shown that a computable candidate for a K i n may have the following form: where S K is any computable symmetric bilinear form on V (K ) such that: where 0 < c * ( p K ) ≤ c * ( p K ) < +∞ are two constants, which may depend on the local degree of accuracy p. In [10] it was shown that (21) implies (18) with: Now, we introduce a computable discrete loading term f n . Let S p,−1 (Ω, T n ) be the set of piecewise discontinuous polynomials over the decomposition T n of degree p K on each K ∈ T n . For ∈ N, let Π 0,K := Π 0 be the L 2 (K ) projector from the local space (10) to P (K ), the space of polynomials of degree over K ; such a projector can easily be computed whenever ≤ p K − 2 by means of the internal degrees of freedom of the space (10), see [12].
We define the discrete loading term as follows: f n ∈ S p,−1 (Ω, T n ) is such that A deeper analysis on the discrete loading term can be found in [2] and [11]. We remark that in this paper we do not consider the case of approximation with p K = 1 in order to avoid technical discussions on the right-hand side.

Remark 2
We point out that in the definition of the local virtual space (10), we fixed the degree of the edge to be the maximum of the degree of the two adjacent polygons (maximum-rule). One could also fix such an edge degree to be the minimum of the degree of the neighbouring polygons (minimum-rule). The first choice leads to P p K (K ) ⊆ V (K ); therefore, it is possible to recover local (i.e. on each polygon) classical hp-estimates, see [14]. On the other hand, in view of Sect. 5 also the choice of the minimum would yield the same convergence result.
Let F K n , K ∈ T n , be the smallest positive constants such that: and let where α * ( p K ) and α * ( p K ) are introduced in (18). We show how the energy error |u − u n | 1,K can be bounded, u and u n being respectively the solutions of (6) and (16). We carry out, in particular, an abstract error analysis which is similar to the one presented in [10]; nevertheless, we decide to show the details, since assumption (A2) is here weaker than its h counterpart in [10], where the stability constants α * ( p K ) and α * ( p K ) are assumed to be independent of the local degree of accuracy. (6) and (16) respectively. Then, for all u I ∈ V n and for all u π ∈ S p,−1 (Ω, T n ), it holds that

Lemma 1 Assume that (A1) and (A2) hold. Let u and u n be the solutions of problems
where F K n and α(K ) are defined in (24) and (25) respectively.
Proof Given any u π ∈ S p,−1 (Ω, T n ) and u I ∈ V n : (6), (16) ≤ max where the Cauchy-Schwarz inequality has been used in the last step. Applying a triangular inequality, we get: This finishes the proof.

Stability
In this section, we present an explicit choice for the stabilizing bilinear form S K introduced in (21) and we discuss the associated stability bounds (21) in terms of the local degree of accuracy. Our choice for the stabilization is the following: We note that this local stabilization term is explicitly computable by means of the local degrees of freedom, since on the boundary virtual functions are known polynomials and the L 2 projections are computable using only the internal degrees of freedom, see [12].
Following the guidelines of [47, formula (4.5.61)], that is the p-version of the Aubin-Nitsche duality argument, it holds for a convex K : Note that, in order to apply the Aubin-Nitsche argument, we use the fact that v n − Π ∇ p K v n ∈ ker(Π ∇ p K ), which guarantees that v n − Π ∇ p K v n has zero average on K ; for a detailed proof see [29,Lemma 3.2].
Assume now that K is nonconvex. Let π < ω K < 2π the largest angle of K . Then, the Aubin-Nitsche analysis in addition to interpolation theory [49,50] and regularity of solutions of elliptic problems on polygonal domains [5,6] can be refined giving: for all ε > 0. We now prove the following result.
Theorem 2 Assume that p K , the degree of accuracy of the method on the element K , coincides with the polynomial degrees p E , for all edges E ∈ E K of polygon K . Then, using definition (27), the bounds in (21) hold with: for all ε > 0, where ω K denotes the largest angle of K .
Proof We assume without loss of generality that the size of polygon K is 1. The general result follows from a scaling argument. We start by proving the estimate on c * ( p K ). Integrating by parts, we obtain for v n ∈ ker(Π ∇ p K ): We split our analysis into two parts. We firstly investigate the integral over K in (31). For this purpose, we need a technical result, namely the following hp polynomial inverse estimate in two dimensions, see Theorem 5 (which can be applied thanks to Remark 1): where we denote with · −1,K the dual norm associated with H 1 0 (K ), i.e.
Subsequently, we note that, owing to (32), we have: As a consequence: Next, we turn our attention to the integral over ∂ K in (31). Applying a Neumann trace inequality (see e.g. [47, Theorem A33]): Then, we use (33) on the second term in the first factor and a one dimensional hp inverse estimate in addition to interpolation theory on the second factor (see [49,50]), thus obtaining: Plugging (34) and (35) in (36), we deduce: Next, we estimate c * ( p K ). Let v n ∈ ker(Π ∇ p K ), then: We estimate the three terms separately. We begin with the first one. Applying the multiplicative trace inequality (see e.g. [23]), the p version of the Aubin-Nitsche duality argument (28) for convex K and (29) for nonconvex K : where we recall ω K is the largest angle in K for all ε. We now deal with the second term; using [14, Lemma 4.1]: where in the last inequality we used that v n − Π ∇ p K v n has zero average on ∂ K . Finally, we treat the third term; using Aubin-Nitsche argument (28) and its modified version for nonconvex polygon (29): Collecting the three bounds, we obtain the claim.

Remark 3
In order to keep the notation simpler, we proved Theorem 2 assuming that the polynomial degrees p E on each edge E ∈ E K coincide with the degree of accuracy In view of the forthcoming definition (59), the case of interest in the following satisfies such condition and therefore, for the proof of the main result of this work, namely Theorem 3, we do not use directly Theorem 2, but its nonuniform degree version.
As a consequence of Theorem 2, the quantity α(K ), defined in (25), can be bounded in terms of p K as follows: Remark 4 Owing to [22, formula (2.14)], we could replace the boundary term of S K , defined in (27) where c is a positive universal constant. We could replace in (27) the L 2 integral on the boundary with a piecewise Gauß-Lobatto combination, mapping each edge on the reference interval I and using (40); the advantage of such a choice is that we can automatically use the nodal degrees of freedom on the skeleton, assuming that they have a Gauß-Lobatto distribution on each edge. The boundary term of the new stabilization is now very close to the classical stabilization choice (see e.g. [10] and [14]) and its implementation is much easier than the implementation of (27), where one should reconstruct polynomials on each edge; in fact, it suffices to take instead of the Euclidean inner product of all the degrees of freedom only the boundary one with some Gauß-Lobatto weights.
For additional issues concerning the stabilization (only for the h version of VEM) see [17], while for more details concerning the implementation of the method we refer to [12].
It is worth to mention that stabilization (27) is not the only one available in the context of hp VEM, but it has the merit of having explicit bounds in terms of p in the stability constants c * ( p K ) and c * ( p K ) introduced in (21). Such dependence is algebraic in p K and this will allow us to prove the exponential convergence of the energy error in Theorem 3.
As a consequence, every stabilization satisfying stability bounds of the form: where r 1 and r 2 are two positive universal constants, works fine for proving exponential convergence of the method. Finally, we note that at the practical level, as investigated in [41], choosing different stabilizing forms has little effect on the results.

Numerical tests for the stability bounds
In Theorem 2, we proved the stability bounds (21) for a possible choice of S K . Such bounds, which also reflect on α * ( p K ) and α * ( p K ) introduced in (18), are rigorously proven but have a quite stray dependence on p. In the following, we check numerically whether the dependence on p of the above-mentioned constants is sharp.
In order to do that, we note that finding α * ( p K ) and α * ( p K ) in (18) is equivalent to find the minimum and maximum eigenvalues λ min and λ max of the generalized eigenvalue problem: Here, denotes the virtual canonical basis of V (K ), see (12). We are adopting the usual notation, by calling v n ∈ R dim(V (K )) the vector of the degrees of freedom associated with v n ∈ V (K ).
We note that we restrict our analysis on functions having zero average on K , since both A K n and A K have constant functions in their kernel; this strategy allows to avoid the problems related to solving the generalized eigenvalue problem for singular matrices. Moreover, the entries of matrix A K are not computable exactly, since virtual functions are not known explicitly; therefore, we approximate them by solving numerically the associated diffusion problem, by means of a fine and high-order finite element approximation.
In Table 1, we present the results on three different types of polygon (namely, those which we employ for the tests in the forthcoming Sect. 6): a square, a nonconvex decagon (like any of the polygons in the outer layer of Fig. 1, right), a nonconvex hexagon (like any of the polygons in the outer layer of Fig. 1, center).
The maximum generalized eigenvalue always scales like 1. On the contrary, the minimum eigenvalue behaves in all the three cases like p −1 . This means that in fact the bounds of Theorem 2 are abundant, whereas the actual behaviour of the stability bounds may be much milder. Unfortunately, currently we are not able to improve the stability bounds of Theorem 2. It is worth mentioning that this has no impact on the asymptotic exponential convergence results in the next section.
The nonmonotonicity of the eigenvalues in Table 1 is due to the fact that the matrices A K n are associated with bilinear forms which vary in n, see (20), since their definition also depend on the choice of the stabilization.  We also observe that a similar numerical investigation was also performed in [14,Section 6.4] for the hp version of VEM on quasi-uniform meshes, giving analogous results; moreover, always in [14], numerical evidence shows that the actual effect of the stabilization on the error slopes of the p version of VEM when approximating finite Sobolev regularity solutions is extremely mild.

Exponential convergence for corner singularity on geometric meshes
In this section, we want to show that exponential convergence is achieved if geometric mesh refinement and degree of accuracy distribution are chosen appropriately.
In order to achieve such a convergence we employ geometrically graded polygonal meshes, which are discussed in Sect. 5.1. Then, we show in Sect. 5.2 estimates for the first and the second terms in the error decomposition (26), in particular proving bounds for the local right-hand side approximation and for the local best approximation by means of polynomials. In Sect. 5.3, we obtain estimates for the third term in (26), in particular illustrating bounds for the best approximation by means of functions belonging to the virtual space defined in (13). Finally, in Sect. 5.4, under a proper choice for the polynomial degree vector p introduced in Sect. 3 and the sequence {T n } n of polygonal decompositions, we combine together the above error bounds; as a consequence, we guarantee exponential convergence for the error in the energy norm in terms of the number of degrees of freedom of the global virtual space V n defined in (13).

Geometric meshes
Here, we describe a class of sequences of nested geometric meshes which we employ in order to show error convergence. We recall we are assuming that the only "singular" corner is the origin 0 ∈ ∂Ω, see (8). Let σ ∈ (0, 1) be the grading parameter of the mesh.
The decomposition T n consists of n + 1 layers defined as follows. We set L 0 the 0-th layer as the set of all polygons K in decomposition T n such that 0 ∈ V K n ; next, we define by induction: We set T 0 = {Ω}. Given T n , the decomposition T n+1 is obtained by refining T n only in the layer around the singularity (i.e. L 0 ). We require that at level n the decomposition satisfies the following grading condition.
(D3) The diameter h K of K satisfies: Furthermore, the number of elements in each layer is uniformly bounded with respect to the discretization parameters. We also assume that p K ≥ 2. A more precise choice is discussed in the forthcoming definition (59).
Assumption (D3) justifies the name geometric for the sequence; more specifically, the closer a polygon is to 0 the smaller its diameter is. Moreover, it is possible to check that the ratio between the size of two neighbouring layers is proportional to 1−σ σ . As a consequence of assumption (D3), we also have, for K ∈ L j , h K ≈ σ n− j .
Example 1 A possible sequence satisfying (D1)-(D3) is the graded mesh of squares elements with hanging nodes on the L-shaped domain, that is used in [47, Definition 4.30], see Fig. 1 (left). We note that in the VEM context, this mesh contains pentagons and squares, whereas in the finite element counterpart the very same mesh is "afflicted" by the presence of squares with hanging nodes.
Example 2 As a second example, see Fig. 1 (center), we consider a mesh which is obtained by merging all the elements that correspond to one layer in the mesh from Example 1 in a single large element and by adding an oblique cut on the "central" diagonal. This mesh still cannot be used for conforming FEM approximations. Moreover, it satisfies assumptions (D1) and (D4). Fig. 1 (right). This mesh is obtained by merging all the elements that correspond to one layer in the mesh from Example 1 in a single large element. We observe that this mesh is made of n decagons and one hexagon around 0. Moreover, we want to stress the fact that this mesh, that cannot be used in the conforming FEM environment, needs many less degrees of freedom than the meshes in Examples 1 and 2. Finally, we point out that such a mesh does not satify the star-shapedness assumption (D1).

Example 3 Another choice is depicted in
We require the following additional assumption on the geometry of the decomposition, which we need in order to state the approximation results in Sects. 5.2 and 5.3.
(D4) Let T n be a geometric polygonal decomposition; write T n = T 0 n ∪ T 1 n , where T 0 n = L 0 and T 1 n = ∪ n j=1 L j . Then, there exists a collection C 1 n of squares, with edges not necessarily aligned with the coordinate axes, such that: -card(C 1 n ) = card(T 1 n ); for each K ∈ T 1 n , there exists Q = Q(K ) ∈ C 1 n such that Q ⊇ K and h K ≈ h Q ; in addition, it must hold dist(0, Q(K ))≈ h K ; -every x ∈ Ω belong at most to a fixed number of squares Q, independently on all the discretization parameters; -∀K ∈ T 0 n , K is star-shaped with respect to 0; moreover, the subtriangulation of K obtained by joining 0 with the other vertices is uniformly shape regular (γ being the shape-regularity constant).
We set Ω ext n = (∪ Q∈C 1 n Q) ∪ (∪ K ∈T 0 n K ). We point out that (D4) is a (rather) technical requirement, which has the scope to import some tools of hp FEM into hp VEM. Indeed, we will numerically prove in Sect. 6 that also meshes not satisfying (D4) may produce the expected convergence behaviour shown in Theorem 3, hinting that a potential improvement upon (D4) would immediately generalize the presented theory.
Assumption (D4) is in the spirit of the strategy of the overlapping square technique used in [14,27]. We here additionally require that squares covering polygons far from the singularity cannot cover also such a singularity (since in this case p approximation results would not hold, thus invalidating Theorem 3). We also stress that the decomposition in Example 3 does not satisfy neither (D1) nor (D4). Finally, we point out that instead of considering a decomposition of squares C n , it is possible to consider in (D4) a decomposition in sufficiently regular quadrilaterals (e.g. parallelograms), since the same analysis by means of Legendre polynomials that follows (for instance in Lemmas 2 and 4) could be performed.

Local approximation by polynomials
Here, we deal with the approximation of the first and the second term in the right-hand side (26). What we are going to prove are hp approximation properties by means of local polynomials on polygons. In hp-FEM literature, classical approximation of this type is not effectuated on general polygons but only on squares and triangles, see [7,8,37,40,47] and the references therein.
The basic tool behind this approach is the employment of orthogonal bases, namely tensor product of Legendre polynomials on the square, see [47], and Koornwinder polynomials (that is collapsed tensor product of Jacobi polynomials) on triangles, see [32,39]; with such basis, explicit computations can be performed, owing to properties of Legendre and Jacobi polynomials. On a generic polygon an explicit basis with good approximation properties is not available.
The error analysis follows the lines of [14,47] and is summarized below. Let p be the vector of the local degree of accuracy on each polygon. We recall that we denote with S p,−1 (T n , Ω) the space of piecewise discontinuous polynomials over the decomposition T n of degree p K on each polygon p K .
The first result is a polynomial approximation estimate regarding regular functions on polygons far from the singularity. This result is used for the approximation of the local second term in (26) for the elements K separated from the singularity.
Proof The result follows from classical scaling arguments and [47,Lemma 4.53].
Here, we only give the idea of the proof. Firstly one encapsulates polygon K into the corresponding square Q(K ). It is possible to bound the left hand side of inequality (44) with the same (semi)norm on the square. After that, the square is mapped into the reference square Q = [−1, 1] 2 and a p analysis by means of tensor product of Legendre polynomials is developed (see [47,Theorem 4.46]). Subsequently, the reference square is pushed forward to square Q. Using the property of the geometric mesh stated in assumption (D3) and [47,Lemma 4.50], the result follows.
Estimate on polygons around the singularity are discussed in the following lemma. We point out that for the error control in layer L 0 we can work directly on the element without the need of employing covering squares, as effectuated for the analysis on the polygons of the other layers, see Lemma 2. The proof is an extension to polygonal domains of that in Theorem [47,Lemma 4.16].
Proof We start by proving the following Hardy inequality on polygons with a vertex at 0. Let α > 0, let be given a function u such that K |x| α |D 1 u| 2 < +∞ and u ∈ C 0 (K ). Then: We consider the regular subtriangulation by joining 0 with the other vertices of K ; the existence of such a decomposition is guaranteed by assumption (D4). Thanks to [47,Lemma 4.18], the "triangular" counterpart of (46) holds: It suffices then to split the integral over K into a sum of integrals over the triangles of the subtriangulation, apply (47) and collect all the terms. Using (46) and applying the argument of [47,Lemma 4.19] to the polygon K , we observe that H 2,2 β (K ) is compactly embedded in H 1 (K ). Using such a compact embedding and proceeding as in [47,Lemma 4.16], the following inequality holds true for a polygon K star-shaped with respect to 0: where is a set of three arbitrary nonaligned vertices of K . Let Φ be the linear interpolant of u at A i , i = 1, . . . , 3. Then, plugging U = u − Φ in (48), noting that U (A i ) = 0, i = 1, 2, 3, and using the geometric assumption (D3), we get the claim.
We note that (45) does not rely on p approximation results, but only on scaling argument. This is enough in order to prove the main result of this work, that is Theorem 3, and it is in accordance with the choice of the vector of local degrees of accuracy that is effectuated in the forthcoming definition (59). We emphasize that this is in the spirit of classical hp refinement, see [47].
We now turn our attention to the approximation of the first local term in (26), i.e. to the local approximation of the loading term. Since we are approximating it with piecewise polynomials of local degree p K − 2, we set p = p − 2, i.e. ∀K ∈ T n , p K = p K − 2. We have, for all v n ∈ V n : where we recall we are assuming for the sake of simplicity p K ≥ 2 for all K ∈ T n , see Sect. 3.
As above, we develop a different analysis for polygons near and far from the singularity. We start with the "far" case.
Proof It suffices to use a Cauchy-Schwarz inequality in (49), standard bounds for the projection errors and analogous estimate to those in Lemma 2.
Assume now that K is an element in the finest level L 0 . We work here a bit differently from what we did in Lemma 3. In particular we get the following.
Proof Using a Cauchy-Schwarz inequality and Bramble Hilbert lemma (see [23]), we obtain: We point out that for the proof of Lemmas 3 and 5 we work directly on the polygon without the need of using the covering squares technique of assumption (D4), like in Lemmas 2 and 4. This justifies the fact that in assumption (D4) we did not require the existence of a collection of squares C 0 n associated with the finest layer L 0 but only the existence of collection C n 1 associated with all the other layers.

Approximation by functions in the virtual space
Here, we treat the approximation of the third term in the right-hand side of (26). We observe that this term has two main differences with respect to the other two. The first difference is that we need an approximant u I which is globally continuous; the second one is that u I is not a piecewise polynomial but a function belonging to the virtual space V n . As done in Sect. 5.2, we split the analysis into two parts. Firstly, we work on polygons not abutting the singularity, see Lemma 6; secondly, we work on elements K in the first layer L 0 , see Lemma 7.
Proof Before starting the proof, we observe that the boundary norm in the right-hand side of (50) exists, since u ∈ B 2 β (Ω) implies that u ∈ H t (K ) for all t ∈ N and polygons K / ∈ L 0 . We define u I as the weak solution of the following problem: where π u ∈ B(∂ K ), see (9), is defined in the following way. Assume for the time being that K / ∈ L 1 . Let I = [−1, 1]. Given an edge E ⊆ ∂ K , π u is defined as the push-forward of a function π u ∈ P p E ( I ) which we fix as follows. Let u be the pull-back of u| E on I . Then, π u is the Legendre expansion of u up to order p E − 1. In particular, we write: Here {L i (ξ )} ∞ i=0 is the L 2 ( I ) orthogonal basis of Legendre polynomials, with L i (−1) = (−1) i and L i (1) = 1. Next, we define π u as: It is possible to prove that π u interpolates u at the endpoints of I using the definition of π u and the fundamental theorem of calculus. Recalling [47, Theorem 3.14] and using simple algebra, the following holds true: Applying a scaling argument, interpolation theory (see [49,50]) and summing on all the edges, we get: If now K ∈ L 1 , we define π u| E as above if E does not belong to the interface between L 0 and L 1 , otherwise u I is defined as the linear interpolant of u at the two endpoints of E. We point out that (54) remains valid also if K ∈ L 1 paying an additional constant c 2s K +1 , since p K ≈ 2 whenever K ∈ L 1 . We also note that (54) implies, recalling that p E ≈ p K if E ⊆ ∂ K and following the ideas in [47,Lemma 3.39]: where we recall that j denotes the number of the layer to which K belongs.
We are now ready to prove the error estimate. For arbitrary constants c 1 and c 2 , there holds (also recalling that , see Theorem 1 and definition (4). Assume that p K = 2 if K ∈ L 0 and p K ≈ 2 if K ∈ L 1 . Then there exists u I ∈ V (K ) such that: where we recall that Π 0 p K −2 is the L 2 (K ) orthogonal projection from V (K ) into P p K −2 (K ), σ is the grading factor discussed in assumption (D3) and n + 1 is the number of layers.
Proof We consider u I defined as in (51); in particular, we fix π u, the trace of u I on ∂ K to be the piecewise affine interpolant of u at the vertices of K . From Lemma 6, we have: where c 1 is the average of u − u I on ∂ K . In order to get the claim, it suffices to bound the second term. As in Lemma 3, we consider the subtriangulation T n = T n (K ) of K obtained by connecting all the vertices of K to 0, see assumption (D4). In particular, every triangle T ∈ T n is star-shaped with respect to a ball of radius ≥ γ h T , where γ is a positive universal constant. We define u K as the piecewise linear interpolant polynomials over the triangular subtriangulation, interpolating u at the vertices of T , for every T ∈ T n . Using [47,Lemma 4.16] and applying a Poincaré inequality, yield to: We stress that the third inequality in (57) holds since u K | T is a linear polynomial and therefore D 2 u K = 0 on all T ∈ T n .
We note that in Lemmas 6 and 7 the error between f and its L 2 projection can be bounded using Lemmas 4 and 5. We also point out that the hypothesis concerning the distribution of the local degrees of accuracy, i.e. the fact that are in accordance with the forthcoming definition (59) that we introduce for the proof of Theorem 3. Finally, we point out in Lemmas 6 and 7 we introduced a function u I which is locally in V (K ) and globally continuous; thus, u I is a function in the global VE space V n introduced in (13).

Exponential convergence
We set Ω ext = ∪ n∈N Ω ext n = Ω ext 1 , where the Ω ext n are introduced in (D4). We recall that we are assuming that 0 ∈ ∂Ω ext .
We observe that our error analysis needs regularity on f and subsequently on u, the right-hand side and the solution of problem (6), respectively. In particular, we require: f can be extended to a function in B 0 β Ω ext , u can be extended to a function in B 2 β Ω ext .
With a little abuse of notation we call this two functions f and u. Assuming f ∈ B 0 β (Ω) automatically implies that u is in B 2 β (Ω); this follows from classical elliptic regularity theory, see Theorem 1. In the classical hp finite element method, this regularity leads to exponential convergence of the energy error, see [47]. In order to prove the same exponential convergence with hp VEM, we need (58) since the approximation by means of polynomials on the polygons not abutting the singularity needs regularity of the target function on a square containing the polygon, see Lemmas 2 and 4.
We recall the inflated domain Ω ext has been built in such a way that the singularity is never at the interior of Ω ext , see assumption (D4). We highlight also the fact that (58) can be easily generalized to the case of multiple singularities, see e.g. [47].
In order to obtain exponential convergence of the energy error in terms of the number of degrees of freedom, we henceforth assume that the vector p of the degrees of accuracy associated with T n is given by: where μ is a positive constant which to be determined in the proof of Theorem 3 and where · is the ceiling function. Note that choice (59) could be modified asking for p K = 1 if K ∈ L 0 ; under this requirement in fact Lemmas 3, 5 and 7 are still valid. Nonetheless, we prefer to use (59) in order to avoid technical discussions on the construction of the right-hand side of the method and keep the simple representation (23). It is clear from (59) that if K 1 and K 2 belong to the j-th and the ( j + 1)-th layers respectively, for some j = 1, . . . , n − 1, then p K 1 ≈ p K 2 , independently on all the other discretization parameters. Thus, owing to Sect. 4, we also have α(K 1 ) ≈ α(K 2 ), independently on all the other discretization parameters. Besides, p E ≈ p K whenever E ⊆ ∂ K . (6) and (16) respectively; let f be the right-hand side of problem (6). Let N = N (n) = dim(V n ). Assume that u and f satisfy (58). Then, there exists μ > 0 such that p, defined in (59), guarantees the following exponential convergence of the H 1 error in terms of the number of degrees of freedom:

Theorem 3 Let {T n } n be a sequence of polygonal decomposition satisfying (D1)-(D4). Let u and u n be the solutions of problems
with b a constant independent of the discretization parameters.
Proof It suffices to combine Lemma 1, the results of Sect. 4, Lemmas from 2 to 7 and to use the same arguments of [47,Theorem 4.51], properly choosing the parameter μ.
The basic idea behind the proof is that around the singularity, geometric mesh refinement are employed, since p approximation leads only to an algebraic decay of the error; on the other hand, on polygons far from the singularity, it suffices to increase the degree of accuracy, since on such polygons both the loading term and the exact solution of (6) are assumed to belong to the Babuska space B 2 β (Ω ext ) defined in (4) and therefore p approximation leads to exponential convergence of the local errors (see [14,Theorem 5.6]).
Following [47,Theorem 4.51] and using Lemma 1 yield: where c is a constant independent of both the discretization parameters and the number of layers. Applying (39), we obtain: where we recall n + 1 denotes the number of layers. Plugging (62) in (61), we get: We infer: Now, we prove that N (n + 1) 3 . In order to see this, we proceed as follows. In each layer there exists a fixed maximum number of elements; this follows from the geometric assumptions (D1) and (D3), applying for instance the arguments in [38,Section 4]. Using geometric assumption (D2) (which guarantees a maximum number of edges per each element), the definition of the local virtual space (10) and the distribution of the local degrees of accuracy (59), it is straightforward to note that for all K ∈ T n the dimension of each local space V (K ) is proportional to p 2 K , with Recalling again (59), we can now compute a bound for the dimension of the local space, viz. the number of the degrees of freedom: where we stress that we are using that in each layer there is a fixed maximum number of elements. The result follows from Poincaré inequality.

Extension to more general problems
The very same analysis performed in the foregoing sections can be generalized and applied to general elliptic problems. The VE approximation of such problems was firstly introduced in [13] and it bases on the existence of local L 2 projectors Π 0 p K on spaces of polynomials of degree p K and not p K − 2 as required in our framework.
A straightforward way for enabling the computation of these projectors consists in replacing the definition of local spaces V (K ) in (10) with the following: where we recall that the space B(∂ K ) is defined in (9). Doing so, the present theoretical analysis would extend easily, but the dimension of the local and global spaces would increase without implying any gain in the rate of convergence of the method. A possible way to overcome this increase of the dimension is to follow the approach in [13] and, more precisely, using the so-called enhancing technique introduced in [2] which allows to remove the additional degrees of freedom introduced in definition (63) and keeping the computability of projectors Π ∇ p K on each K ∈ T n . The resulting VE spaces go under the name of enhanced spaces.
We highlight that we performed some numerical tests employing such enhanced spaces and computing the local discrete right-hand sides as: The numerical results obtained using (64) are comparable to those presented in Sect. 6 where we considered the standard definition (23). Nonetheless, we stress that a theoretical analysis of VEM when employing enhanced spaces has not been investigated yet.

Numerical results
We show here numerical experiments validating Theorem 3. Let u, the solution of (6), given by the classical benchmark on the L-shaped domain:

Tests on different meshes
We consider sequences of the meshes depicted in Fig. 1 and we consider two different choices for the degree of accuracy distribution p. As a first selection, we pick on all the elements a constant local degree of accuracy which is equal to the number of layers, i.e. p = (n + 1, n + 1, . . . , n + 1). As a second selection, we pick p K as in (59), with μ = 1, μ being the parameter introduced for the construction of the vector of the degrees of accuracy. In Figs. 2, 3 and 4, the numerical results are shown. On the y-axis, we plot a log scale of the relative energy error between u, defined in (65), and the energy projection Π ∇ p K , defined in (19a), (19b), of the solution u n of the discrete problem (16), i.e.
where we recall that Π ∇ p K is defined in (19a) and (19b). On the other hand, in the x-axis we plot the cubic root of the number of the degrees of freedom of the relative virtual space. The reason for choice (67) is that it is not possible to compute the true energy error since virtual functions are not known explicitly.
We consider the behaviour of the error with three different σ , grading parameter, namely σ = 1 2 , 2 and we compare the three types of meshes. In particular, we denote by mesh a), mesh b) and mesh c) the meshes depicted in Fig. 1 (left), (center) and (right) respectively.
As mentioned previously, the sequence of meshes in Fig. 1 (right) does not satisfy assumptions (D1) and (D4). Nevertheless, the expected exponential convergence rate is attained in all cases and for all geometric parameters σ .

A comparison between hp FEM and hp VEM
We want now to show a comparison between the performances of hp (quadrilateral) FEM and hp VEM. We stress that an analogous of Theorem 3 holds for hp FEM, see e.g. [47]. We consider again the benchmark with known solution (65) and we consider the quadrilateral mesh in Fig. 5. In the following we denote such mesh with d) whereas we recall that we denote by (a), (b) and (c) the meshes depicted in Fig. 1 (left), (center) and (right) respectively.
In particular, we pick in both cases p K as in (59) for all K ∈ T n , with μ = 1. We discuss the case of sequences of meshes with grading parameter σ equal to 1 2 , Since we cannot compute the true energy error with the VEM (it is not computable since functions in the virtual space are not known explicitly), in order to compare the two methods, we investigate the L 2 error on the skeleton E n (it is computable in all cases a),. . . ,d), since also the virtual functions are polynomials on E n ), i.e.  The results are shown in Fig. 6, where the hp version of VEM is applied to meshes (a), (b) and (c), while the hp version of FEM is applied to mesh (d).
It is possible to see that there is not a preferential choice; for instance, hp VEM performs better than hp FEM when σ = 1 2 , they perform almost the same when σ = √ 2 − 1, performs much worse when σ = ( √ 2 − 1) 2 . In this sense, we can say that the two methods are comparable; nonetheless, the VEM leads to a huge flexibility in the choice of the domain meshing which is not available in standard H 1 conforming FEM.
We believe that, in order to really see a marked advantage of hp-VEM over hp-FEM, more complex situations need to be addressed. This may involve, for instance, complex geometries (where polyhedral meshes can do a better job), hp-adaptivity (where again there is more refinement freedom) or more involved problems (Discrete Fracture Network, crack propagation, Fluid Structure Interaction, etc.). At the present stage, on the Laplace problem on academic examples, what we can display is the flexibility in refining near corners. Note that hp-adaptivity is currently under investigation.
Next, in Fig. 7, we compare the H 1 error of VEM defined in (67) with the standard H 1 error of hp FEM employing the same meshes and discretization parameters discussed for the comparison of L 2 errors on the skeleton.
The results are comparable to those related to the L 2 error on the skeleton and more precisely the two method display similar behaviours.

Remark 5
We have not discussed yet the choice that we perform for the polynomial basis {q α } p K −2 |α|=0 which is dual to the definition of the internal degrees of freedom defined in (11). In all the numerical experiments so far we employed the monomial basis: Such a choice is typical in VEM literature and was in fact introduced in the pioneering works [10,12] since the implementation of the method under (68) turns out to be simple. Nonetheless, as firstly observed in [4], employing (68) entails a bad effect on the condition number of the VEM stiffness matrix for high values of p. In order to avoid such a ill-conditioning one may define local polynomial bases which are orthonormal in L 2 on each element, for instance obtained by employing a stable Gram-Schimdt algorithm, e.g. as that presented in [9]. A deep investigation of this aspect can be found in [41].

Lemma 8 Let T be a triangle and let b T be the associated cubic bubble function.
Let q ∈ P p (T ), p ∈ N. Then: where c is a positive constant independent on h T and p, h T = diam(T ).
Proof The idea of the proof is presented in [42,Theorem D2]. For a complete proof, see [15,Lemma A.4].
We are now ready for the inverse estimate involving the H −1 norm of polynomials.
We highlight that such result, namely Theorem 5, has been firstly proven in [35,Theorem 3.9]. It is worth to mention that the work [35] is devoted to prove hp polynomial inverse estimates on (possibly) bad-shaped domains; this keeps the topic at a very general level. On the other hand, this work provides rather technical proofs that can be in fact eased when employing regular triangles/polygons. In particular, the proof of Theorem 5 provided here turns out to be simpler than the one in [35,Theorem 3.9].
Theorem 5 Let K ⊆ R 2 be a polygon. Assume that there exists T n (K ) subtriangulation of K such that h K ≈ h T for all T ∈ T n (K ), where h ω = diam(ω), ω ⊆ R 2 . Let q ∈ P p (K ) with p ∈ N ∪ {0}. Then: where q −1,K := q (H 1 0 (K )) * . Proof Let b K be the piecewise bubble function, defined on each T ∈ T n (K ) as the local cubic bubble function b T introduced in Lemma 8. Then: Using now Lemma 8, (70) and the hypothesis that h K ≈ h T for all T in the subtriangulation of K , we obtain: Finally, we apply Theorem 4 with α = 0 and β = 1 and we get: