1 Introduction

In recent years, the mathematical and numerical analysis of initial-boundary value problems for fractional differential operators has received substantial attention. Their numerical treatment has to overcome several challenges. The first challenge arises from their nonlocal nature as integral operators. A direct Galerkin discretization leads to fully populated system matrices, and compression techniques (see, e.g., [31] and the references there) have to be brought to bear to make the discretization computationally tractable. An alternative to a direct Galerkin discretization is to realize the nonlocal operator numerically as a Dirichlet-to-Neumann operator for a local (but degenerate) elliptic problem in higher dimension. While this approach, sometimes referred to as “Caffarelli–Silvestre” extension (“CS-extension” for short) [16, 57] increases the spatial dimension by 1, it permits to use the mathematical and numerical tools that were developed for local, integer order differential operators. In the present paper, we study several hp-FE discretizations of the resulting local (but degenerate) elliptic problem. We refer to these as Extended hp-FEM.

One alternative to the extension approach is the representation of fractional powers of elliptic operators as Dunford-Taylor integrals proposed in [10, 11]. Discretizing such an integral leads to a sum of solution operators for local, second order elliptic problems, which turn out to be singularly perturbed, but are amenable to established numerical techniques. In the present paper, we also study this approach under the name sinc Balakrishnan FEM (sinc-BK FEM for short).

A second challenge arises from the fact that the solutions of problems involving fractional operators are typically not smooth, even for smooth input data (cf. the examples and discussion in [9, Sec. 8.4]). Indeed, for the spectral fractional Laplacian, the behavior near a smooth boundary \(\partial \varOmega \) is \(u \sim u_0 + O({\textrm{dist}}\,(\cdot ,\partial \varOmega )^\beta )\) for some more regular \(u_0\) and a \(\beta \not \in {{\mathbb {N}}}\) [17, Thm. 1.3]. Points of non-smoothness of \(\partial \varOmega \) introduce further singularities into the solution. The numerical resolution of both types of singularities requires suitably designed approximation spaces. For the spectral fractional Laplacian (SFL) in two-dimensional polygonal domains, we present a class of meshes in \(\varOmega \) with anisotropic, geometric refinement towards \(\partial \varOmega \) and with isotropic geometric refinement towards the corners of \(\varOmega \). We prove that spaces of piecewise polynomial functions on such meshes allow for exponential convergence.

1.1 Existing results

The use of FEM techniques for the SFL based on CS-extension was initiated in [46], based on first-order discretizations on graded meshes in the extended direction. Improvements in error vs. number of degrees of freedom were obtained [9] for such first order FEM via sparse grid approximations. Exploiting the analytic dependence of the extension on the extruded variable allows one to use, as in the present work, high order methods with exponential convergence rates for the discretization in the extended direction (independent of the regularity in \(\varOmega \)) to reach nearly optimal complexity algorithms for low order methods in \(\varOmega \) under certain regularity assumptions, [9, 36]. For domains \(\varOmega \subset {\mathbb {R}}^2\) with analytic boundary, [9] analyzed the presently considered Extended \(hp\)-FEM in \(\varOmega \) and proved, for analytic coefficients and source term, the first result on exponential convergence. The extension-based methods are Galerkin methods based on piecewise polynomials. The functional form of the CS-extension is in principle computationally accessible in terms of the eigenpairs \(\{\varphi _i,\lambda _i\}_{i \in {{\mathbb {N}}}}\) of the Dirichlet Laplacian in \(\varOmega \) and functions \(\psi _k(y) = \psi (\sqrt{\lambda _k} y)\), where \(\psi \) is explicitly known in terms of Bessel functions. This motivated [1] to approximate the extended problem from a tensor product space of a standard FEM space with \(\{\psi ({{\widetilde{\lambda }}}_k \cdot ) \}_{k \in \{1,\ldots ,M\}}\), with suitable approximations \({\widetilde{\lambda }}_k\) to the eigenvalues and truncation parameter M.

A second route to numerically solving problems involving the SFL operator is via the Dunford-Taylor calculus as mentioned above for low-order FEM in \(\varOmega \) in [10, 11] and studied in the present paper in a high-order context.

Related to the representation of the SFL via the Dunford-Taylor calculus are representations in terms of the heat semigroup. Availability of parabolic solvers then opens up a third route to the numerical treatment of the SFL, see, e.g., [18, 59, 60].

Since the SFL is defined in terms of the eigenfunctions of the Laplacian, numerical methods can be based on approximating these numerically. One representative of a fourth route to treating the SFL operator is [55], where the use of the spectral element method is proposed.

A further large class of discretization methods is based on discretizing the (local, second order) Laplacian and on the subsequent computation of fractional powers of the resulting stiffness matrix. In this fifth route, various approaches to computing approximations to these powers have been proposed in the literature. Integral representations of the univariate, model singular function \(x^\alpha \) motivate the use of quadrature-based approximations as in [25, 26]. More generally, discretization of the SFL can be based on approximating \(x^\alpha \) by rational functions resulting in techniques proposed in [27, 28]. Related to such rational approximations are the techniques in [19,20,21]. We also refer to the discussion in [30] and the survey [33].

Next we fix notation and assumptions for the class of SFD equations which are covered by our analysis.

Fig. 1
figure 1

Example of a curvilinear polygon

1.2 Geometric preliminaries

As in [8], we consider a bounded Lipschitz domain \(\varOmega \subset {{\mathbb {R}}}^2\) that is a curvilinear polygon as depicted in Fig. 1. The boundary \(\partial \varOmega \) is assumed to consist of \(J \in {{\mathbb {N}}}\) closed curves \(\varGamma ^{(i)}\). Each curve \(\varGamma ^{(i)}\) in turn is assumed to comprise \(J_i \in {{\mathbb {N}}}\) many open, disjoint, analytic arcs \(\varGamma ^{(i)}_j\), \(j=1,\ldots ,J_i\), with \( \overline{\varGamma ^{(i)}} = \bigcup _{j=1}^{J_i} \overline{\varGamma ^{(i)}_j}\;, \quad i=1,\ldots ,J\;. \) The arcs \(\varGamma ^{(i)}_j\) are assumed further to admit nondegenerate, analytic parametrizations,

$$\begin{aligned} \varGamma ^{(i)}_j = \left\{ {{\textbf{x}}}^{(i)}_j (\theta )\, | \, \theta \in (0,1) \right\} \;, \quad i=1,\ldots ,J, \quad j=1,\ldots ,J_i \;. \end{aligned}$$

The coordinate functions \(x_j^{(i)}\), \(y_j^{(i)}\) of \({\textbf{x}}^{(i)}_j(\theta ) = (x_j^{(i)}(\theta ),y_j^{(i)}(\theta ))\) are assumed to be (real) analytic functions of \(\theta \in [0,1]\) and such that \( \min _{\theta \in [0,1]} |\dot{\textbf{x}}_j^{(i)}(\theta )|^2 > 0\) for \( j=1,\ldots ,J_i\), \(i=1,\ldots ,J\). The end points of the arcs \(\varGamma ^{(i)}_j\) are denoted as \({{\varvec{A}}}^{(i)}_{j-1} = {\textbf{x}}_j^{(i)}(0)\) and \({{\varvec{A}}}^{(i)}_j = {{\textbf{x}}}_j^{(i)}(1)\). We enumerate these points counterclockwise by indexing cyclically with j modulo \(J_i\), thereby identifying in particular \({{\varvec{A}}}^{(i)}_{0} := {{\varvec{A}}}^{(i)}_{J_i}\). The interior angle at \({{\varvec{A}}}_j^{(i)}\) is denoted \(\omega _j^{(i)} \in (0,2\pi )\). For notational simplicity, we assume henceforth that \(J=1\), i.e., \(\partial \varOmega \) consists of a single component of connectedness. We write \({{\varvec{A}}}_j = {{\varvec{A}}}_j^{(1)}\) for the corners and \(\varGamma _j\) for \(\varGamma _j^{(1)}\).

1.3 Spectral fractional diffusion

When dealing with fractional operators, care must be exercised in stating the definition of the fractional powers. Here, we consider the so-called spectral fractional diffusion operators as investigated in [16]. We refer to the surveys [12, 33, 50] and the references there for a comparison of the different definitions of fractional powers of the Dirichlet Laplacian.

We consider the linear, elliptic, self-adjoint, second order differential operator \(w \mapsto \mathcal {L} w = - \text {div}( A \nabla w ) \) in a bounded, curvilinear polygon \(\varOmega \subset {\mathbb {R}}^2\) as described in Sect. 1.2. The diffusion coefficient \(A\in L^\infty (\varOmega ,{{\textsf {GL}}}({\mathbb {R}}^2))\) is assumed symmetric, uniformly positive definite. The data A and f are assumed analytic in \(\overline{\varOmega }\). We quantify analyticity of A and f by assuming that there are \(C_A\), \(C_f > 0\) such that

$$\begin{aligned} \forall p \in {{\mathbb {N}}}_0:\;\; \Vert |D^p A| \Vert _{L^\infty (\varOmega )} \le C_A^{p+1} p!\;, \quad \Vert |D^p f| \Vert _{L^\infty (\varOmega )} \le C_f^{p+1} p!\;. \end{aligned}$$

Here, the notation \(|D^p A|\) signifies \(\sum _{|\alpha | = p} | D^\alpha A |\), with the usual multi-index convention \(D^\alpha \) denoting mixed weak derivatives of order \(\alpha \in {{\mathbb {N}}}_0^2\) whose total order \(|\alpha | = \alpha _1 + \alpha _2\). Further, we employ standard notation for (fractional) Sobolev spaces \(H^t(\varOmega )\), consistent with the notation and definitions in [35].

We introduce the “energy” inner product \( a_{\varOmega }(\cdot ,\cdot )\) on \(H^1_0(\varOmega )\) associated with the differential operator \(\mathcal {L}\) by

$$\begin{aligned} a_{\varOmega }(w,v) = \int _\varOmega \left( A \nabla w \cdot \nabla v \right) \, \text{ d }x' \;. \end{aligned}$$

The operator \(\mathcal {L}: H^1_0(\varOmega ) \rightarrow H^{-1}(\varOmega )\) induced by this bilinear form is an isomorphism, due to the (assumed) positive definiteness of A. Let \(\{\lambda _k, \varphi _k \}_{k \in {\mathbb {N}}} \subset {\mathbb {R}}^+ \times H_0^1(\varOmega )\) be a sequence of eigenpairs of \(\mathcal {L}\), normalized such that \(\{\varphi _k\}_{k \in {\mathbb {N}}}\) is an orthonormal basis of \(L^2(\varOmega )\) and an orthogonal basis of \((H_0^1(\varOmega ), a_{\varOmega }(\cdot ,\cdot ))\). We introduce, for \(\sigma \ge 0\), the domains of fractional powers of \(\mathcal {L}\) as

$$\begin{aligned} {{\mathbb {H}}}^\sigma (\varOmega ) = \left\{ v = \sum _{k=1}^\infty v_k \varphi _k: \Vert v \Vert _{{{\mathbb {H}}}^\sigma (\varOmega )}^2 = \sum _{k=1}^{\infty } \lambda _k^\sigma v_k^2 < \infty \right\} . \end{aligned}$$

We denote by \({{\mathbb {H}}}^{-\sigma }(\varOmega )\) the dual space of \({{\mathbb {H}}}^{\sigma }(\varOmega )\). Denoting by \(\langle \cdot ,\cdot \rangle \) the \({{\mathbb {H}}}^{-\sigma }(\varOmega )\times {\mathbb H}^{\sigma }(\varOmega )\) duality pairing that extends the standard \(L^2(\varOmega )\) inner product, we can identify elements \(f \in {{\mathbb {H}}}^{-\sigma }(\varOmega )\) with sequences \(\{f_k\}_k\) (written formally as \(\sum _k f_k \varphi _k\)) such that \(\Vert f\Vert ^2_{{\mathbb H}^{-\sigma }(\varOmega )} = \sum _k |f_k|^2 \lambda _k^{-\sigma } <\infty \). With this identification, we can extend the definition of the norm in (1.3) to \(\sigma < 0\). Furthermore, the linear operator \(\mathcal {L}^s:{\mathbb {H}}^s(\varOmega )\rightarrow {\mathbb {H}}^{-s}(\varOmega ): v \mapsto \sum _{k=1}^\infty v_k \lambda _k^s \varphi _k\) is bounded and the Dirichlet problem for the fractional diffusion in \(\varOmega \) may be stated as: given a fractional order \(s \in (0,1]\) and \(f \in {\mathbb {H}}^{-s}(\varOmega )\), find \(u\in {\mathbb H}^{s}(\varOmega )\) such that

$$\begin{aligned} \mathcal {L}^s u = f \quad \text {in } \varOmega \;. \end{aligned}$$

The ellipticity estimate \(\langle w, \mathcal {L}^s w \rangle \ge \lambda _1^s \Vert w \Vert ^2_{{{\mathbb {H}}}^{s}(\varOmega )}\) valid for every \(w\in {{\mathbb {H}}}^{s}(\varOmega )\) implies the unique solvability of (1.4) for every \(f\in {{\mathbb {H}}}^{-s}(\varOmega )\). The hp-FEM approximations of (1.4) developed and analyzed in the present work are not based on explicit or approximated eigenfunctions but instead on the localization of the operator \(\mathcal {L}^s\) in terms of the extension discussed in Sect. 1.5 and on the Dunford-Taylor integral discussed in Sect. 1.6, the so-called Balakrishnan formula.

Remark 1.1

For \(s \in (0,1)\), the spaces \({{\mathbb {H}}}^{s}(\varOmega )\) are alternatively characterized as interpolation spaces \({{\mathbb {H}}}^s(\varOmega ) = (L^2(\varOmega ),H^1_0(\varOmega ))_{s,2}\) (using the so-called “real” method of interpolation with the K-functional) with equivalent norms, e.g., [46, (2.13)]. In particular, properties of interpolation spaces (see, e.g., [58, Thm. 1.3.3(g)]) and the Poincaré inequality give the interpolation inequality: for every \(0<s<1\) there exists \(C_s > 0\) such that

$$\begin{aligned} \forall w \in H^1_0(\varOmega ) \quad :\quad \Vert w\Vert _{{{\mathbb {H}}}^s(\varOmega )} \le C_s \Vert w\Vert ^{1-s}_{L^2(\varOmega )} \Vert \nabla w\Vert ^s_{L^2(\varOmega )}. \end{aligned}$$

\(\square \)

Remark 1.2

(compatibility conditions) As discussed in [9, Lem. 1, Rem. 1] the spectral fractional Laplacian has the mapping property \(\mathcal {L}^s: {\mathbb H}^{s+\sigma }(\varOmega ) \rightarrow {{\mathbb {H}}}^{-s+\sigma }(\varOmega )\), \(\sigma \ge 0\). For smooth coefficients A and smooth \(\partial \varOmega \), the spaces \({{\mathbb {H}}}^{s+\sigma }(\varOmega )\), \(\sigma \ge 0\), are subspaces of the Sobolev spaces \(H^{s+\sigma }(\varOmega )\). In fact, for \(-s+\sigma > 1/2\), the spaces \({{\mathbb {H}}}^{-s+\sigma }(\varOmega )\) are proper subspaces of \(H^{-s+\sigma }(\varOmega )\) as they encode some boundary conditions on \(\partial \varOmega \). E.g., for \(f \in {{\mathbb {H}}}^{-s+\sigma }(\varOmega )\) with \(-s+\sigma \ge 1/2\) one has \(f|_{\partial \varOmega } = 0\). That is, \(f \in H^{-s+\sigma }(\varOmega )\) must satisfy additionally compatibility conditions on \(\partial \varOmega \) to ensure \(u \in H^{s+\sigma }(\varOmega )\). \(\square \)

1.4 Contributions

We briefly highlight the principal contributions of this work. For the nonlocal, spectral fractional diffusion problem (1.4) in bounded, curvilinear polygonal domains \(\varOmega \) as described in Sect. 1.2 and with analytic data A and f as in (1.1), and without any boundary compatibility, we develop two hp-FEMs for (1.4) that converge exponentially in terms of the number of degrees of freedom \(N_{DOF}\) in \(\varOmega \). The setting covers in particular also boundary value problems for fractional surface diffusion on analytic surface pieces as in the setting of Sect. 8.1. Key insight in our error analysis is that either method, based on the extension of (1.4) combined with a diagonalization procedure as in [9, Sec. 6] or on a contour-integral representation of \(\mathcal {L}^s\) combined with an exponentially converging sinc quadrature, reduce the numerical solution of (1.4) to the numerical solution of local, singularly perturbed second order reaction-diffusion problems in \(\varOmega \). Drawing on analytic regularity and corresponding hp-FEM in \(\varOmega \) for these reaction-diffusion problems with robust, exponential convergence as developed in [8, 41,42,43], we establish here exponential convergence rate bounds for solutions of (1.4). As we showed in [8, 41], the singular perturbation character of the reaction-diffusion problems in \(\varOmega \) mandates both, geometric corner mesh refinement and anisotropic geometric boundary mesh refinement to resolve the algebraic corner and boundary singularities that occur in solutions to (1.4).

Before proceeding to the main part of this paper, we briefly recall the localization due to Caffarelli–Silvestre and the contour integral representation of Balakrishnan [7].

1.5 Caffarelli–Silvestre extension

In [16] the (full space) fractional Laplacian \(\mathcal {L}^s\) was localized via a singular elliptic PDE depending on one extra variable and thus represented as a Dirichlet-to-Neumann operator for an elliptic problem in a half-space. Cabré and Tan [15] and Stinga and Torrea [57] extended this to bounded domains \(\varOmega \) and more general operators, thereby obtaining an extension posed on the semi–infinite cylinder \(\mathcal {C} := \varOmega \times (0,\infty )\). Their extension is given by the local boundary value problem

$$\begin{aligned} {\left\{ \begin{array}{ll} {\mathfrak {L}} {\mathscr {U}}= -\text {div}\left( y^\alpha {\varvec{{\mathfrak {A}}}} \nabla {\mathscr {U}}\right) = 0 &{} \text { in } \mathcal {C}=\varOmega \times (0,\infty ), \\ {\mathscr {U}}= 0 &{}\text { on } \partial _L \mathcal {C}, \\ \partial _{\nu ^\alpha } {\mathscr {U}}= d_s f &{}\text { on } \varOmega \times \{0\}, \end{array}\right. } \end{aligned}$$

where \({\varvec{{\mathfrak {A}}}} = \textrm{diag} \{A,1\} \in L^\infty (\mathcal {C},{{\textsf {GL}}}({{\mathbb {R}}}^{d+1}))\), \(\partial _L \mathcal {C}:= \partial \varOmega \times (0,\infty )\), \(d_s: = 2^{1-2s} \varGamma (1-s)/\varGamma (s)>0\), and \(\alpha = 1-2s \in (-1,1)\) [16, 57]. The so–called conormal exterior derivative of \({\mathscr {U}}\) at \(\varOmega \times \{ 0 \}\) is

$$\begin{aligned} \partial _{\nu ^\alpha } {\mathscr {U}}= -\lim _{y \rightarrow 0^+} y^\alpha \partial _y{\mathscr {U}}. \end{aligned}$$

The limit in (1.7) is in the distributional sense [15, 16, 57]. Fractional powers of \(\mathcal {L}^s\) in (1.4) and the Dirichlet-to-Neumann operator of problem (1.6) are related by [16, 17]

$$\begin{aligned} d_s {\mathcal {L}}^s u = \partial _{\nu ^\alpha } {\mathscr {U}}\quad \text {in } \varOmega \;, \end{aligned}$$

and the solution u of (1.4) is given by trace \({\textrm{tr}}_{\varOmega } {\mathscr {U}}\) of \({\mathscr {U}}\) to \(\varOmega \).

We write \(x = (x',y)\in \mathcal {C}\) with \(x' \in \varOmega \) and \(y > 0\). For \(D \subset {\mathbb {R}}^{d} \times {\mathbb {R}}^+\), we define \(L^2(y^\alpha ,D)\) as the Lebesgue space with the measure \(y^\alpha \, \text{ d }x\) and \(H^1(y^{\alpha },D)\) as the weighted Sobolev space

$$\begin{aligned} H^1(y^{\alpha },D) := \left\{ w \in L^2(y^\alpha ,D)\;:\; |\nabla w | \in L^2(y^\alpha ,D) \right\} \end{aligned}$$

equipped with the norm

$$\begin{aligned} \Vert w\Vert _{H^1(y^\alpha ,D)} = \left( \Vert w\Vert ^2_{L^2(y^\alpha ,D)}+\Vert \nabla w\Vert ^2_{L^2(y^\alpha ,D)}\right) ^{1/2}. \end{aligned}$$

To investigate (1.6) we include the homogeneous boundary condition on the lateral boundary \(\partial _L \mathcal {C}\) by setting

$$\begin{aligned} {\mathring{H}^1} (y^{\alpha },\mathcal {C}) := \left\{ w \in H^1(y^\alpha ,\mathcal {C})\;:\; w = 0 \text { on } \partial _L \mathcal {C} \right\} . \end{aligned}$$

The bilinear form \( a_{\mathcal {C}}: {\mathring{H}^1} (y^{\alpha },\mathcal {C}) \times {\mathring{H}^1} (y^{\alpha },\mathcal {C}) \rightarrow {\mathbb {R}}\) defined by

$$\begin{aligned} a_{\mathcal {C}}(v,w) = \int _{\mathcal {C}} y^\alpha ({\varvec{{\mathfrak {A}}}} \nabla v \cdot \nabla w ) \, \text{ d }x' \, \text{ d }y, \end{aligned}$$

is continuous and coercive on \({\mathring{H}^1} (y^{\alpha }, {\mathcal {C}})\). The energy norm \(\Vert \cdot \Vert _{\mathcal {C}}\) on \({\mathring{H}^1} (y^\alpha ,\mathcal {C})\) induced by the inner product \(a_{\mathcal {C}}(\cdot ,\cdot )\) is given by

$$\begin{aligned} \Vert {v}\Vert _{\mathcal {C}}^2:= a_{\mathcal {C}}(v,v) \sim \Vert \nabla v\Vert ^2_{L^2(y^\alpha ,\mathcal {C})}\;. \end{aligned}$$

For \(w \in H^1(y^\alpha ,\mathcal {C})\) we denote by \({{\,\mathrm{tr_\varOmega }\,}}w\) its trace on \(\varOmega \times \{0\}\), which connects the spaces \({\mathring{H}^1} (y^\alpha ,\mathcal {C})\) and \({\mathbb {H}}^s(\varOmega )\) (cf. [46, Prop. 2.5]) via

$$\begin{aligned} {{\,\mathrm{tr_\varOmega }\,}}{\mathring{H}^1} (y^\alpha ,\mathcal {C}) = {\mathbb {H}}^s(\varOmega ), \qquad \Vert {{\,\mathrm{tr_\varOmega }\,}}w\Vert _{{\mathbb {H}}^s(\varOmega )} \le C_{{{\,\mathrm{tr_\varOmega }\,}}} \Vert w \Vert _{{\mathring{H}^1} (y^\alpha ,\mathcal {C})}. \end{aligned}$$

With these definitions at hand, the weak formulation of (1.6) is to find

$$\begin{aligned} {\mathscr {U}}\in {\mathring{H}^1} (y^{\alpha },\mathcal {C}): \; \forall v \in {\mathring{H}^1} (y^\alpha ,\mathcal {C}) \ :\ a_{\mathcal {C}}({\mathscr {U}},v) = d_s \langle f,{{\,\mathrm{tr_\varOmega }\,}}v \rangle . \end{aligned}$$

Remark 1.3

(regularity of \({\mathscr {U}}\) for \(s = 1/2\)) In the special case \(s = 1/2\) and \(A = {\textrm{Id}}\) in (1.4) the operator \( {\mathfrak {L}}\) in the CS extension (1.6) in \(\mathcal {C}\) coincides with the Laplacian in \(\mathcal {C}\). Therefore, the solution \({\mathscr {U}}\) will, in general, exhibit algebraic singularities on \(\partial \varOmega \), even if \(\partial \varOmega \) is smooth. \(\square \)

1.6 Balakrishnan formula

The weak formulation (1.15) can be the starting point for a Galerkin method and indeed underlies the Extended \(hp\)-FEM analyzed here. The second approach to numerically solving (1.4) which we discuss here is via discretization of a spectral integral representation of fractional powers of elliptic operators. We develop the hp-error analysis based on the representation [7] (hp-discretizations of other spectral integration based approximations are discussed in Sect. 8.3). For \(0<s<1\) and \(\mathcal {L} = -\text {div}(A\nabla )\) with homogeneous Dirichlet boundary conditions the bounded linear operator \(\mathcal {L}^{-s}:{\mathbb {H}}^{-s}(\varOmega ) \rightarrow {\mathbb {H}}^s(\varOmega )\) admits the following representation with \(c_B = \pi ^{-1} \sin (\pi s)\):

$$\begin{aligned} \mathcal {L}^{-s}= & {} \displaystyle c_B \int _0^\infty \lambda ^{-s} (\lambda I + \mathcal {L})^{-1} \, \text{ d }\lambda = c_B \int _{-\infty }^\infty e^{(1-s)y}(e^yI + \mathcal {L})^{-1} \, \text{ d }y \nonumber \\= & {} \displaystyle c_B \int _{-\infty }^\infty e^{-sy} (I + e^{-y}\mathcal {L})^{-1} \, \text{ d }y \;. \end{aligned}$$

The representations (1.16) were used in [10, 11] in conjunction with an exponentially convergent, so-called sinc quadrature approximation of (1.16) (see, e.g., [56] for details) and an h-version Finite Element projection in \(\varOmega \) to obtain numerical approximations of the fractional diffusion equation (1.4) in \(\varOmega \). Here, we generalize the results in [10, 11] to the hp-FEM, establishing exponential convergence rates in polygonal domains \(\varOmega \) for the resulting sinc-BK FEM for data A and f that are analytic in \(\overline{\varOmega }\) (cf. (1.1)) without boundary compatibility of f.

1.7 Outline

The outline of the remainder of the paper is as follows. The following Sect. 2 describes the hp-FE spaces and Galerkin methods for (1.6) based on tensor products of discretizations in the x and the y variable.

Section 3 develops the diagonalization of the hp-FE semi-discretization in the extended variable. In particular, in Sect. 3.1 we prove exponential convergence of an hp-FE semidiscretization in \((0,\infty )\). The diagonalization reduces the semidiscrete approximation of the CS-extended, localized problem to a collection of decoupled, linear second order reaction-diffusion problems in \(\varOmega \).

Section 4 presents the exponential convergence results from [8] of hp-FE discretizations of linear, second order singularly perturbed reaction-diffusion equations in \(\varOmega \) and establishes robust (with respect to the perturbation parameter \(\varepsilon \)) exponential convergence results for these.

Section 5 completes the proof of exponential convergence for the Extended \(hp\)-FEM by applying the hp-FEM from Sect. 4 in \(\varOmega \) for the reaction-diffusion problems obtained from the diagonalization process in Sect. 3. Section 5 presents in fact two distinct hp-FE discretizations: a pure Galerkin method (Case B) and a method based on discretizing after diagonalization each decoupled problem separately (Case A). The latter approach features slightly better complexity estimates.

Section 6 is devoted to the analysis of the sinc-BK FEM. There once more the numerical approximation of the fractional Laplacian is reduced to the numerical solution of a sequence of local linear, second order reaction-diffusion problems in \(\varOmega \). Applying exponential convergence bounds for sinc approximation and for hp-FEM for reaction-diffusion problems in \(\varOmega \) in Sect. 4 from [8], once again exponential convergence for the resulting sinc-BK FEM for the spectral version of the fractional diffusion operator is established. As in Sect. 5, we separately discuss the possibilities of approximating the solutions of the decoupled problems from the same space (Case B) or from different spaces (Case A). Section 7 has numerical experiments verifying the theoretical convergence results. Section 8 has a summary and outlines several generalizations and directions for further research. In particular, we address in Sect. 8.1 the extension to fractional diffusion on manifolds. In Sect. 8.2, we discuss several exponential bounds on Kolmogorov n-widths of solution sets for spectral diffusion in polygons that follow from our results. Finally, Sect. 8.3 comments on the corresponding error analysis for a so-called double-exponential quadrature approximation of the BK formula, again combined with hp-FEM for the resulting (now with complex-valued perturbation parameter) elliptic singular perturbation problems.

1.8 Notation

Constants C, \(\gamma \), b may be different in each occurence, but are independent of critical parameters. We denote by \({\widehat{S}}:=(0,1)^2\) the reference square and by \({\widehat{T}}:=\{(\xi _1,\xi _2)\,|, 0< \xi _1< 1, \ 0< \xi _2 < \xi _1\}\) the reference triangle. Sets of the form \(\{x = y\}\), \(\{x = 0 \}\), \(\{x = y\}\) etc. refer to edges and diagonals of \({\widehat{S}}\) and analogously \(\{y \le x\} = \{(x,y) \in {\widehat{S}}\,|\, y \le x\}\). \({{\mathbb {P}}}_q = {\textrm{span}}\, \{x^i y^j\,|\, i,j \ge 0, i+j \le q\}\) denotes the space of polynomials of total degree q and \({{\mathbb {Q}}}_q = {\textrm{span}}\,\{x^i y^j\,|\, 0 \le i,j \le q\}\) the tensor product space of polynomial of degree q in each variable separately. Throughout, \(A\lesssim B\) signifies the inequality \(A\le C B\) with constant \(C>0\) independent of critical parameters (in particular of discretization parameters) that A and B may depend on. In the same vein, \(A\sim B\) means that there exists some \(C>0\) such that \(C^{-1} B \le A \le C B\).

2 hp-FEM discretization

In this section, we introduce hp-FEM spaces in both the x and the y-variable on which the Extended \(hp\)-FEM will be based. In particular, we introduce the geometric meshes \(G^M_{geo,\sigma }\) that are used for the discretization in the y-variable.

2.1 Notation and FE spaces

2.1.1 Meshes and FE spaces on

Given and a mesh \({{\mathcal {G}}}^M = \{ I_m \}_{m=1}^M\) on consisting of M intervals \(I_m = [y_{m-1},y_m]\), with , we associate to \({{\mathcal {G}}}^M\) a polynomial degree distribution \({{\varvec{r}}}= (r_1,r_2,\dots ,r_M) \in {{\mathbb {N}}}^M\). We introduce the hp-FE space

where \({\mathbb {P}}_r(I_m)\) denotes the space of polynomials of degree r on \(I_m\). We will primarily work with the following piecewise polynomial space of functions that vanish on :


here, the embedding is understood as implicitly extending functions from to functions on \((0,\infty )\) by zero.

For constant polynomial degree \(r_i=r \ge 1\), \(i=1,\ldots ,M\), we set . Henceforth, we abbreviate


Of particular interest will be geometric meshes \({{\mathcal {G}}}^M_{geo,\sigma } = \{I_i\,|\,i=1,\ldots ,M\}\) on with M elements and grading factor \(\sigma \in (0,1)\) with elements given by and for \(i=2,\ldots ,M\). On geometric meshes \({{\mathcal {G}}}^M_{geo,\sigma }\) on , we consider a linear polynomial degree vector \({{\varvec{r}}}= \{ r_i \}_{i=1}^M\) with slope \({\mathfrak {s}}>0\) which is defined by

$$\begin{aligned} r_i := 1 + \lceil {\mathfrak {s}}(i-1) \rceil \} \;,\quad i=1,2,\ldots ,M. \end{aligned}$$

For geometric meshes and linear degree vectors we set


with constants implied in \(\sim \) depending on \({\mathfrak {s}}>0\).

2.1.2 hp-FEM in \(\varOmega \)

In the polygon \(\varOmega \), we consider Lagrangian FEM of uniformFootnote 1 polynomial degree \(q\ge 1\) based on regular triangulations of \(\varOmega \) denoted by \({{\mathcal {T}}}\). We admit both triangular and quadrilateral elements \(K\in {{\mathcal {T}}}\), but do not assume shape regularity. In fact, as we shall explain in Sect. 4 ahead, anisotropic mesh refinement towards \(\partial \varOmega \) will be required to resolve singularities at the singular support \(\partial \varOmega \) that are generically present in solutions of fractional PDEs (cf. Remark 1.3). We introduce, for a regular (in the sense of [41, Def. 2.4.1]) triangulation \({{\mathcal {T}}}\) of \(\varOmega \) comprising curvilinear triangular or quadrilateral elements \(K\in {{\mathcal {T}}}\) with associated analytic element maps \(F_K:{\widehat{K}} \rightarrow K\) (where \({{\widehat{K}}} \in \{{{\widehat{T}}}, {\widehat{S}}\}\) is either the reference triangle or square depending on whether K is a curvilinear triangle or quadrilateral) the FE space

$$\begin{aligned} S^q_0(\varOmega ,{{\mathcal {T}}}\,\,)= \left\{ v_h \in C({\overline{\varOmega }}): v_h|_{K}\circ F_K \in V_{q}({\widehat{K}}) \quad \forall K \in {{\mathcal {T}}}, \ v_h|_{\partial \varOmega } = 0 \right\} . \end{aligned}$$

Here, for \(q \ge 1\), the local polynomial space \(V_q({{\widehat{K}}}) = {{\mathbb {P}}}_q\) if \({{\widehat{K}}} = {{\widehat{T}}}\) and \(V_q({{\widehat{K}}}) = {{\mathbb {Q}}}_q\) if \({{\widehat{K}}} = {{\widehat{S}}}\).

2.1.3 Tensor product hp-FE approximation

One hp-FE approximation of the extended problem (1.6) will be based on the finite–dimensional tensor product spaces of the form


where \({{\mathcal {T}}}\)  is a regular triangulation of \(\varOmega \). To analyze this method, we consider semidiscretizations based on the following (infinite–dimensional, closed) Hilbertian tensor product space:


Here, the argument indicates that spaces of functions supported in are considered. Galerkin projections onto the spaces \({\mathbb {V}}^{q,{{\varvec{r}}}}_{h,M}({{\mathcal {T}}}, {{\mathcal {G}}}^M)\) and with respect to the inner product \( a_{\mathcal {C}}(\cdot ,\cdot )\) are denoted by \(G^{q,{{\varvec{r}}}}_{h,M} \) and \(G_M^{{\varvec{r}}}\), respectively. For the CS-extension \({\mathscr {U}}\), i.e., the solution of (1.15), the Galerkin projections \(G^{q,{{\varvec{r}}}}_{h,M} {\mathscr {U}}\in {\mathbb {V}}^{q,{{\varvec{r}}}}_{h,M}({{\mathcal {T}}},{{\mathcal {G}}}^M)\) and are characterized by

$$\begin{aligned} \forall v \in {\mathbb {V}}^{q,{{\varvec{r}}}}_{h,M}({{\mathcal {T}}},{{\mathcal {G}}}^M) :\quad a_{\mathcal {C}}(G^{q,{{\varvec{r}}}}_{h,M} {\mathscr {U}}, v) = d_s \langle f, {{\,\mathrm{tr_\varOmega }\,}}v\rangle , \end{aligned}$$

3 Approximation by hp-semidiscretization in y

A key step in the hp-FE discretization in is, as in [9], the diagonalization of the semidiscretized, truncated extension problem with solution \(G^{{{\varvec{r}}}}_{M} {\mathscr {U}}\) given by (2.9).

3.1 Exponential convergence of hp-FEM in \((0,\infty )\)

As in [9, 36], we exploit the analytic regularity of the extended solution \({\mathscr {U}}\) with respect to the extended variable y. It results in exponential convergence of the hp-semidiscretization error \({\mathscr {U}}- G^{{{\varvec{r}}}}_M {\mathscr {U}}\) in , if geometric meshes \({{\mathcal {G}}}^M_{geo,\sigma }\) and a truncation parameter are used.

Lemma 3.1

(exponential convergence, [9, Lem. 13]) Fix \(c_1 < c_2\). Let \(f \in {{\mathbb {H}}}^{-s+\nu }(\varOmega )\) for some \(\nu \in (0,s)\). Assume that satisfies , and consider the geometric mesh \({{\mathcal {G}}}^M_{geo,\sigma }\) on and the linear degree vector \({{\varvec{r}}}\) with slope \({\mathfrak {s}}>0\). Let \({\mathscr {U}}\) be given by (1.15) and \(G^{{{\varvec{r}}}}_M {\mathscr {U}}\) be the Galerkin projection onto given by (2.9). Then there exist C, \(b > 0\) (depending solely on s, \({\mathcal {L}}\), \(c_1\), \(c_2\), \(\sigma \), \(\nu \), \({\mathfrak {s}}\)) such that

$$\begin{aligned} \Vert \nabla ({\mathscr {U}}- G^{{{\varvec{r}}}}_M {\mathscr {U}})\Vert _{L^2(y^\alpha ,\mathcal {C})} \le C e^{-b M} \Vert f\Vert _{{{\mathbb {H}}}^{-s+\nu }(\varOmega )} \;. \end{aligned}$$

Furthermore, (3.1) also holds for constant polynomial degree \({{\varvec{r}}}= (r,\ldots ,r)\) if \(c_3 M \le r \le c_4 M\) for some fixed \(c_3\), \(c_4>0\). The constant \(b> 0\) then depends additionally on \(c_3\), \(c_4\).


The statement is a slight generalization of [9, Lem. 13], which is the present lemma except that [9, Lem. 13] requires the slope \({\mathfrak {s}}\) to satisfy \({\mathfrak {s}}\ge {\mathfrak {s}}_{min}\) for some suitable \({\mathfrak {s}}_{min}>0\). The key step of the proof of [9, Lem. 13] is the use of [9, Lem. 11], which is effectively a 1D hp-approximation result on geometric meshes. Inspection of that proof shows that the condition \({\mathfrak {s}}\ge {\mathfrak {s}}_{min}\) can be removed. In fact, this is typical of high order approximation on geometric meshes and is worked out in more detail, e.g., in [3, Thm. 8, Eqn. (78), Rem. 16]. (See, alternatively, the extended preprint [3, Thm. 3.13, Eqn. (3.21), Rem. 3.14].) We remark that the constant \(b = O({\mathfrak {s}})\) as \({\mathfrak {s}}\downarrow 0\). The statement about the constant polynomial degree follows from the case of the linear degree vector since a) \(G^{{{\varvec{r}}}}_M {\mathscr {U}}\) is the Galerkin projection of \({\mathscr {U}}\), b) the minimization property of Galerkin projections, and c) the fact that the space is a subspace of provided \({{\varvec{r}}}\) is a linear degree vector with suitably chosen slope. \(\square \)

The error bound (3.1) shows that up to an exponentially small (with respect to ) error introduced by truncation of \((0,\infty )\) at , the solution \({\mathscr {U}}\) can be approximated by the solution of a local problem on the finite cylinder .

3.2 Diagonalization

Diagonalization, as introduced in [9], refers to the observation that the solution \(G^{{{\varvec{r}}}}_M {\mathscr {U}}\) of the semidiscrete problem (2.9) can be expressed in terms of \(\mathcal {M}\) solutions \(U_i\in H^1_0(\varOmega )\), of \(\mathcal {M}\) decoupled, linear local 2nd order reaction–diffusion problems in \(\varOmega \). As the eigenvalues \(\mu _i\) in the corresponding eigenvalue problem (3.2) ahead govern the length scales in the local reaction-diffusion problems in \(\varOmega \) (3.5) (which, in turn, will be crucial in the mesh-design for the hp-FEM in \(\varOmega \)), it is of interest to know their asymptotic behavior. We investigate this in Lemma 3.2 below.

Diagonalization is based on the explicit representation for the semidiscrete solution \({\mathscr {U}}_M\) obtained from the following generalized eigenvalue problem, introduced in [9, Sec. 6], and proposed earlier in [34], which reads: find such that


All eigenvalues \((\mu _i)_{i=1}^{\mathcal {M}}\) of (3.2) are positive and has an orthonormal eigenbasis \((v_i)_{i=1}^\mathcal {M}\) satisfying


We may expand the semidiscrete approximation \(G^{{{\varvec{r}}}}_M {\mathscr {U}}\) as

$$\begin{aligned} G^{{{\varvec{r}}}}_M {\mathscr {U}}(x',y)=: \sum _{i=1}^\mathcal {M} U_i(x') v_i(y). \end{aligned}$$

The coefficient functions \(U_i\in H^1_0(\varOmega )\) satisfy a system of \(\mathcal {M}\) decoupled linear reaction-diffusion equations in \(\varOmega \): for \(i=1,\ldots ,\mathcal {M}\), find \(U_i \in H^1_0(\varOmega )\) such that

$$\begin{aligned} \forall V \in H^1_0(\varOmega )\;:\quad a_{\mu _i,\varOmega }(U_i,V) = d_s v_i(0) \langle f,V\rangle \;. \end{aligned}$$

Here \(v_i\) denotes the i-th eigenfunction of the eigenvalue problem (3.2), (3.3) and

$$\begin{aligned} a_{\mu _i,\varOmega }(U,V):= \mu _i a_{\varOmega }(U,V) + \int _\varOmega U V \, \text{ d }x', \end{aligned}$$

with \( a_{\varOmega }\) as introduced in (1.2). Due to the biorthogonality (3.3) of the discrete eigenfunctions , any \(Z(x',y)= \sum _{i=1}^\mathcal {M} V_i(x') v_i(y)\) with arbitrary \(V_i\in H^1_0(\varOmega )\) satisfies the energy (“Pythagoras”) identities


The following bounds on the \(\mu _i\) were shown in [9, Lem. 14] for the special case of geometric meshes \({{\mathcal {G}}}^M_{geo,\sigma }\) and linear degree vectors:

Lemma 3.2

(properties of the eigenpairs, [9, Lem. 14]) Let \(\{{{\mathcal {G}}}^M_{geo,\sigma }\}_{M\ge 1}\) be a sequence of geometric meshes on , and let \({{\varvec{r}}}\) be a linear polynomial degree vector with slope \({\mathfrak {s}}>0\). Assume that the truncation parameter is chosen so that for some constants \(0<c_1, c_2 < \infty \) that are independent of M.

Then, there exists \(C>1\) (depending on \(c_1\), \(c_2\), s, and on \(\sigma \in (0,1)\)) such that there holds for every \(M \in {{\mathbb {N}}}\) for the eigenpairs \((\mu _i,v_i)_{i=1}^{{{\textsf{M}}}_{geo}}\) given by (3.2), (3.3)


3.3 Fully discrete approximation

The full discretization is obtained by approximating the functions \(U_i\) of (3.5) from finite-dimensional spaces. Let \({{\mathcal {T}}}_i\), \(i=1,\ldots ,{{\textsf{M}}}\), be regular triangulations in \(\varOmega \) and \(q \in {{\mathbb {N}}}\). Let \(\varPi ^q_i:H^1_0(\varOmega ) \rightarrow S^q_0(\varOmega ,{{\mathcal {T}}}_i)\) denote the Ritz projectors for the bilinear forms \( a_{\mu _i,\varOmega }\), which are characterized by

$$\begin{aligned} \forall v \in S^q_0(\varOmega ,{{\mathcal {T}}}_i):\quad a_{\mu _i,\varOmega }(u - \varPi ^q_i u,v) = 0. \end{aligned}$$

In terms of the projections \(\varPi ^q_i\) we can define the fully discrete approximation

$$\begin{aligned} {\mathscr {U}}_{h,M}(x,y):= \sum _{i=1}^{\mathcal {M}} v_i(y) \varPi ^q_i U_i(x). \end{aligned}$$

By combining (3.5) and (3.9), the functions \(\varPi ^q_i U_i \in S^q_0(\varOmega , {{\mathcal {T}}}_i)\) are explicitly and computably given as the solutions of

$$\begin{aligned} \forall V \in S^q_0(\varOmega ,{{\mathcal {T}}}_i)\ :\ a_{\mu _i,\varOmega } (\varPi ^q_i U_i, V) = d_s v_i(0) \langle f, V\rangle . \end{aligned}$$

In view of (3.7), we have the following representation of the difference between the semidiscrete approximation \(G^{{\varvec{r}}}_{M} {\mathscr {U}}\) and the fully discrete approximation \({\mathscr {U}}_{h,M}\):

Lemma 3.3

Let \({\mathscr {U}}_{h,M}\) be given by (3.10). Then:

$$\begin{aligned} a_{\mathcal {C}}(G^{{\varvec{r}}}_M {\mathscr {U}}- {\mathscr {U}}_{h,M}, G^{{\varvec{r}}}_M {\mathscr {U}}- {\mathscr {U}}_{h,M}) = \sum _{i=1}^{\mathcal {M}} \Vert U_i - \varPi ^q_i U_i\Vert ^2_{\mu _i,\varOmega }. \end{aligned}$$

Concerning the meshes \({{\mathcal {T}}}_i\), we distinguish two cases in this work:

  • Case A: The meshes \({{\mathcal {T}}}_i\), \(i=1,\ldots ,\mathcal {M}\), possibly differ from each other.

  • Case B: The meshes \({{\mathcal {T}}}_i\), \(i=1,\ldots ,\mathcal {M}\), coincide. That is, all coefficient functions \(U_i\) in the semidiscrete solution (3.4) are approximated from one common hp-FE space \(S^q_0(\varOmega ,{{\mathcal {T}}}\,\,)\).

In Case B the approximation \({\mathscr {U}}_{h,M}\) actually coincides with the Galerkin projection :

Lemma 3.4

(error representation, [9, Lem. 12]) Let \((\mu _i,v_i)_{i=1}^{\mathcal {M}}\) be the eigenpairs given by (3.2), (3.3). For \(i=1,\ldots ,\mathcal {M}\), let \(U_i \in H^1_0(\varOmega )\) be the solutions to (3.5). Consider Case B and let \(\varPi ^q_i:H^1_0(\varOmega ) \rightarrow S^q_0(\varOmega ,{{\mathcal {T}}}\,\,)\) be the Galerkin projections given as in (3.9), with one common, regular triangulation \({{\mathcal {T}}}\) of \(\varOmega \) for \(i=1,\ldots ,\mathcal {M}\). Let \(G^{{{\varvec{r}}}}_M {\mathscr {U}}\) denote the solution to the semidiscrete problem (2.9). Then the tensor product Galerkin approximation satisfies

$$\begin{aligned} G^{q,{{\varvec{r}}}}_{h,M}{\mathscr {U}}= {\mathscr {U}}_{h,M}(x',y)&= \sum _{i=1}^{\mathcal {M}} v_i(y) \varPi ^q_i U_i(x') , \end{aligned}$$
$$\begin{aligned} a_{{\mathcal {C}}}(G^{{{\varvec{r}}}}_{M}{\mathscr {U}}- G^{q,{{\varvec{r}}}}_{h,M} {\mathscr {U}}, G^{{{\varvec{r}}}}_{M} {\mathscr {U}}- G^{q,{{\varvec{r}}}}_{h,M} {\mathscr {U}})&= \sum _{i=1}^{\mathcal {M}} \Vert U_i - \varPi ^q_i U_i\Vert ^2_{\mu _i,\varOmega }. \end{aligned}$$

Lemma 3.4 shows that in Case B, the Galerkin projection of \({\mathscr {U}}\) into the tensor product space \({\mathbb {V}}^{q,{{\varvec{r}}}}_{h,M}({{\mathcal {T}}},{{\mathcal {G}}}^M)\) coincides with the approximation \({\mathscr {U}}_{h,M}\) defined in (3.10) in terms of the decoupling procedure. Hence, the decoupling procedure is not essential for numerical purposes in Case B, although it has algorithmic advantages. In contrast, Case A relies on the decoupling in an essential way. In both cases, the exponential convergence result below will make use of the error estimates of Lemmas 3.3, 3.4 obtained by the diagonalization process.

It is advisable to choose the meshes \({{\mathcal {T}}}_i\) in case Case A such that the functions \(U_i\) can be approximated well from \(S^q_0(\varOmega , {{\mathcal {T}}}_i)\) in the norm \(\Vert \cdot \Vert _{\mu _i,\varOmega }\). Correspondingly in Case B, the common mesh \({{\mathcal {T}}}\) should be chosen such that each \(U_i\) can be approximated well from \(S^q_0(\varOmega ,{{\mathcal {T}}}\,\,)\). The bounds (3.8) indicate that, for large M, most of the reaction-diffusion problems (3.5) are singularly perturbed. Hence we design in the following Sect. 4hp-FE approximation spaces in \(\varOmega \) which afford exponential convergence rates that are robust with respect to the singular perturbation parameter.

4 hp-FE approximation of singular perturbation problems

In the exponential convergence rate analysis of tensorized hp-FEM for the CS extension (Extended \(hp\)-FEM) as well as for the ensuing (see Sect. 6 ahead) sinc-BK FEM approximation, a crucial role is played by robust exponential convergence rate bounds for hp-FEM for singularly perturbed, reaction-diffusion problems in curvilinear polygonal domains \(\varOmega \). Specifically, we consider the hp-FE approximation of the local reaction-diffusion problem in \(\varOmega \),

$$\begin{aligned} -\varepsilon ^2 {\textrm{div}}\,\left( A(x') \nabla u^\varepsilon \right) + c(x') u^\varepsilon = f \quad \text{ in } \varOmega , \qquad u^\varepsilon = 0 \quad \text{ on } \partial \varOmega , \end{aligned}$$

where we assume \({\mathrm{ess\,inf}}_{x'\in \varOmega } c(x') \ge c_0 > 0\) and

$$\begin{aligned} \begin{aligned}&A, c,\text { and }f\text { are analytic on }\overline{\varOmega }\text { and}\\&A\text { is symmetric, uniformly positive definite.} \end{aligned} \end{aligned}$$

We note again that (4.1) does not imply any kind of boundary compatibility of f at \(\partial \varOmega \) (cf. Remark 1.2). We assume \(\varOmega \) to be scaled so that \(\textrm{diam}(\varOmega ) = O(1)\). Then, for small \(\varepsilon >0\), the boundary value problem (4.1) is a so-called “elliptic-elliptic” singular perturbation problem. Under the assumptions (4.2), for every \(\varepsilon >0\) problem (4.1) admits a unique solution \(u^\varepsilon \in H^1_0(\varOmega )\). In general, \(u^\varepsilon \) exhibits, for small \(\varepsilon > 0\), boundary layers near \(\partial \varOmega \) whose robust numerical resolution (i.e., with error bounds whose constants are independent of \(\varepsilon \)) requires anisotropically refined meshes aligned with \(\partial \varOmega \) (see [24, 42, 49] and the references there). In addition, the corners of \(\varOmega \) induce point singularities in the (analytic in \(\varOmega \)) solution \(u^\varepsilon \). In the context of hp-FEM under consideration here, their efficient numerical approximation mandates geometric mesh refinement near the corners.

In the present section, we consider the hp-FEM approximation of \(u^\varepsilon \) that features exponential convergence for two different types of meshes: (a) geometric boundary-refined meshes in Sect. 4.2 and (b) admissible boundary layer meshes in Sect. 4.3. In both cases the error estimates are of the form \(O(e^{-b q} + e^{-b' L})\), with the constant hidden in \(O(\cdot )\) independent of \(\varepsilon \), L, and q and where q is the polynomial degree employed and L measures the number of layers of geometric refinement towards the vertices or edges of \(\varOmega \). The difference in these two types of meshes is that “admissible boundary layer meshes” are strongly \(\varepsilon \)-dependent with geometric refinement towards the vertices and only a single layer of thin elements of width \(O(q\varepsilon )\) near \(\partial \varOmega \) to resolve the boundary layer. The number of elements is then O(L) leading to a number of degrees of freedom \(N = O(L q^2)\). In contrast, geometric boundary-refined meshes are based on geometric, anisotropic refinement towards the edges and corners of \(\varOmega \). As we show in [8], hp-FEM on such meshes afford exponential convergence for boundary layers with multiple scales. The total number of elements in geometric boundary-refined meshes with L layers is \(O(L^2)\). Combined with local FE spaces of polynomial degree q, this results in a number of degrees of freedom \(N = O(L^2 q^2)\). Whereas admissible boundary layer meshes are designed to approximate boundary layers of a single, given length scale \(\varepsilon \), geometric boundary-refined meshes afford concurrent, robust and exponentially convergent approximations of boundary layers with multiple length scales in \(\varOmega \). These arise, e.g., upon semidiscretization in the extended variable as is evident from (3.5).

4.1 Macro triangulation: geometric boundary-refined mesh

We do not consider the most general meshes with anisotropic refinement, but confine the hp-FE approximation theory to meshes generated as push-forwards of a small number of so-called mesh patches. This concept was used in the error analysis of hp-FEM for singular perturbations in [41, Sec. 3.3.3] and in [24]. Specifically, we assume given a fixed macro-triangulation \({{\mathcal {T}}}^{\,\,{\mathcal {M}}} = \{K^{{\mathcal {M}}} \,|\, K^{{\mathcal {M}}} \in {{\mathcal {T}}}^{\,\,{\mathcal {M}}}\}\) of \(\varOmega \) consisting of curvilinear quadrilaterals \(K^{{\mathcal {M}}}\) with analytic patch maps (to be distinguished from the actual element maps) \(F_{K^{{\mathcal {M}}}}:{{\widehat{S}}}=(0,1)^2 \rightarrow K^{{\mathcal {M}}}\) that satisfy the usual compatibility conditions. I.e., \({{\mathcal {T}}}^{\,\,{\mathcal {M}}}\) does not have hanging nodes and, for any two distinct elements \(K_1^{{\mathcal {M}}}, K_2^{{\mathcal {M}}} \in {{\mathcal {T}}}^{\,\,{\mathcal {M}}}\) that share an edge e, their respective element maps induce compatible parametrizations of e (cf., e.g., [41, Def. 2.4.1] for the precise conditions). Each element of the fixed macro-triangulation \({{\mathcal {T}}}^{\,\,{\mathcal {M}}}\) is further subdivided according to one of the refinement patterns in Definition 4.1 (see also [41, Sec. 3.3.3] or [24]). The actual triangulation is then obtained by transplanting refinement patterns on the square reference patch into the physical domain \(\varOmega \) by means of the element maps \(F_{K^{{\mathcal {M}}}}\) of the macro-triangulation. That is, for any element \(K\in {{\mathcal {T}}}\), the element map \(F_K\) is the concatenation of an affine map—which realizes the mapping from the reference square or triangle to the elements in the patch refinement pattern and will be denoted by \(A_K\)— and the patch map (which will be denoted by \(F_{K^{\mathcal {M}}}\)), i.e., \(F_K = F_{K^{\mathcal {M}}} \circ A_K : {{\hat{K}}} \rightarrow K\).

The following refinement patterns were introduced in [8, Def. 2.1, 2.3]. They are based on geometric refinement towards a vertex and/or an edge; the integer L controls the number of layers of refinement towards an edge whereas \( n \in {{\mathbb {N}}}\) measures the refinement towards a vertex.

Fig. 2
figure 2

Catalog \({{\mathfrak {P}}}\) of reference refinement patterns from [8]. Top row: reference edge patch \({\check{{{\mathcal {T}}}}}^{{\,\,\textsf{E}},L}_{geo,\sigma }\) with L layers of geometric refinement towards \(\{{{\widehat{y}}}=0\}\); reference corner patch \({\check{{{\mathcal {T}}}}}^{{\,\,\textsf{C}},n}_{geo,\sigma }\) with n layers of geometric refinement towards (0, 0); trivial patch. Bottom row: reference tensor patch \({\check{{{\mathcal {T}}}}}^{\,\,\textsf{T},L,n}_{geo,\sigma }\) with n layers of refinement towards (0, 0) and L layers of refinement towards \(\{{{\widehat{x}}}= 0\}\) and \(\{{{\widehat{y}}}=0\}\); reference mixed patch \({\check{{{\mathcal {T}}}}}^{{\,\,\textsf{M}},L,n}_{geo,\sigma }\) with L layers of refinement towards \(\{{{\widehat{y}}}=0\}\) and n layers of refinement towards (0, 0). Geometric entities shown in boldface indicate parts of \(\partial {{\widehat{S}}}\) that are mapped to \(\partial \varOmega \). These patch meshes are transported into the curvilinear polygon \(\varOmega \) shown in Fig. 1 via analytic patch maps \(F_{K^{{\mathcal {M}}}}\)

Definition 4.1

(Catalog \({{\mathfrak {P}}}\) of refinement patterns, [8, Def. 2.1]) Given \(\sigma \in (0,1)\), L, \(n \in {{\mathbb {N}}}_0\) with \(n \ge L\) the catalog \({{\mathfrak {P}}}\) consists of the following patterns:

  1. 1.

    The trivial patch: The reference square \({{\widehat{S}}} = (0,1)^2\) is not further refined. The corresponding triangulation of \({{\widehat{S}}}\) consists of the single element: \(\check{{\mathcal {T}}}^{\,\, trivial} = \{{{\widehat{S}}}\}\).

  2. 2.

    The geometric edge patch \({\check{{{\mathcal {T}}}}}^{\,\,\textsf{E},L}_{geo,\sigma }\): \({{\widehat{S}}}\) is refined anisotropically towards \(\{{{\widehat{y}}}=0\}\) into \(L+1\) elements as depicted in Fig. 2 (top left). The mesh \({\check{{{\mathcal {T}}}}}^{\,\,\textsf{E},L}_{geo,\sigma }\) is characterized by the nodes (0, 0), \((0,\sigma ^i)\), \((1,\sigma ^i)\), \(i=0,\ldots ,L\) and the corresponding rectangular elements generated by these nodes.

  3. 3.

    The geometric corner patch \({\check{{{\mathcal {T}}}}}^{\,\,\textsf{C},n}_{geo,\sigma }\): \({{\widehat{S}}} \) is refined isotropically towards (0, 0) as depicted in Fig. 2 (top middle). Specifically, the reference geometric corner patch mesh \({\check{{{\mathcal {T}}}}}^{\,\,\textsf{C},n}_{geo,\sigma }\) in \({{\widehat{S}}}\) with geometric refinement towards (0, 0) and n layers is given by triangles determined by the nodes (0, 0), and \((0,\sigma ^i)\), \((\sigma ^i,0)\), \((\sigma ^i,\sigma ^i)\), \(i=0,1,\ldots ,n\).

  4. 4.

    The tensor product patch \({\check{{{\mathcal {T}}}}}^{\,\, \,\textsf{T},L,n}_{geo,\sigma }\): \({{\widehat{S}}}\) is triangulated in \({{\widehat{S}}}_1:= (0,\sigma ^L)^2\) and \({{\widehat{S}}}_2:= {{\widehat{S}}} \setminus {{\widehat{S}}}_1\) separately as depicted in Fig. 2 (bottom left). The triangulation of \({{\widehat{S}}}_1\) is a scaled version of \({\check{{{\mathcal {T}}}}}^{\,\,\textsf{C},n-L}_{geo,\sigma }\) characterized by the nodes (0, 0), \((0,\sigma ^i)\), \((\sigma ^i,0)\), \((\sigma ^i,\sigma ^i)\), \(i=L,\ldots ,n\). The triangulation of \({{\widehat{S}}}_2\) is characterized by the nodes \((\sigma ^i,\sigma ^j)\), i, \(j=0,\ldots ,L\), and consists of rectangles and triangles, and only the triangles abutt on the diagonal \(\{{{\widehat{x}}}= {{\widehat{y}}}\}\).

  5. 5.

    The mixed patches \({\check{{{\mathcal {T}}}}}^{\,\,\textsf{M},L,n}_{geo,\sigma }\): The triangulation consists of both anisotropic elements and isotropic elements as depicted in Fig. 2 (bottom right) and is obtained by triangulating the regions \({{\widehat{S}}}_1:= (0,\sigma ^L)^2\), \({{\widehat{S}}}_2:= \bigl ( {{\widehat{S}}} \setminus {{\widehat{S}}}_1\bigr ) \cap \{{{\widehat{y}}}\le {{\widehat{x}}}\}\), \({{\widehat{S}}}_3:= {{\widehat{S}}} \setminus \bigl ({{\widehat{S}}}_1 \cup {{\widehat{S}}}_2\bigr )\) separately. \({{\widehat{S}}}_1\) is a scaled version of \({\check{{{\mathcal {T}}}}}^{\,\,\textsf{C},n-L}_{geo,\sigma }\) characterized by the nodes (0, 0), \((0,\sigma ^i)\), \((\sigma ^i,0)\), \((\sigma ^i, \sigma ^i)\), \(i=L,\ldots ,n\). The triangulation of \({{\widehat{S}}}_2\) is given by the nodes \((\sigma ^i,0)\), \((\sigma ^i,\sigma ^{j})\), \(0 \le i \le L\), \(i \le j \le L\) and consists of rectangles and triangles, and only the triangles abutt on the diagonal \(\{{{\widehat{x}}}={{\widehat{y}}}\}\). The triangulation of \({{\widehat{S}}}_3\) consists of triangles only given by the nodes \((0,\sigma ^i)\), \((\sigma ^i,\sigma ^i)\), \(i=0,\ldots ,L\).

Remark 4.2

We kept the list of possible patch refinement patterns in Definition 4.1 small in order to reduce the number of cases to be discussed for the hp-FE error bounds. A larger number of refinement patterns could facilitate greater flexibility in mesh generation. In particular, the reference patch meshes do not contain general quadrilaterals but only (axiparallel) rectangles; this restriction is not essential but leads to some simplifications in the hp-FE error analysis in [8].

The addition of the diagonal line in the reference corner, tensor, and mixed patches is done to be able to apply the regularity theory of [41] and is probably not necessary in actual computations. We also mention that with additional constraints on the macro triangulation \({{\mathcal {T}}}^{\,\,{\mathcal {M}}}\) the diagonal line could be dispensed with, [8]. \(\square \)

The following definition of the geometric boundary layer mesh \({\mathcal {T}}^{\,\,L,n}_{geo,\sigma }\) formalizes the requirement on the meshes that anisotropic refinement towards \(\partial \varOmega \) is needed as well as geometric refinement towards the corners.

Definition 4.3

(geometric boundary-refined mesh, [8, Def. 2.3]) Let \({{\mathcal {T}}}^{\,\,{\mathcal {M}}}\) be a fixed macro-triangulation consisting of quadrilaterals with analytic element maps that satisfies [41, Def. 2.4.1].

Given \(\sigma \in (0,1)\), L, \(n \in {{\mathbb {N}}}_0\) with \(n \ge L\), a mesh \({\mathcal {T}}^{\,\,L,n}_{geo,\sigma }\) is called a geometric boundary-refined mesh if the following conditions hold:

  1. 1.

    \({\mathcal {T}}^{\,\,L,n}_{geo,\sigma }\) is obtained by refining each element \(K^{{\mathcal {M}}} \in {{\mathcal {T}}}^{\,\,{\mathcal {M}}}\) according to the finite catalog \({{\mathfrak {P}}}\) of structured patch-refinement patterns specified in Definition 4.1, governed by the parameters \(\sigma \), L, and n.

  2. 2.

    \({\mathcal {T}}^{\,\,L,n}_{geo,\sigma }\) is a regular triangulation of \(\varOmega \), i.e., it does not have hanging nodes. Since the element maps for the refinement patterns are assumed to be affine, this requirement ensures that the resulting triangulation satisfies [41, Def. 2.4.1].

For each macro-patch \(K^{\mathcal {M}}\in {{\mathcal {T}}}^{\,\,{\mathcal {M}}}\), exactly one of the following cases is possible:

  1. 3.

    \(\overline{K^{\mathcal {M}}} \cap \partial \varOmega = \emptyset \). Then the trivial patch is selected as the reference patch.

  2. 4.

    \(\overline{K^{\mathcal {M}}} \cap \partial \varOmega \) is a single point. Then two cases can occur:

    1. (a)

      \(\overline{K^{\mathcal {M}}} \cap \partial \varOmega = \{{{\varvec{A}}}_j\}\) for a vertex \({{\varvec{A}}}_j\) of \(\varOmega \). Then the corresponding reference patch is the corner patch \({\check{{{\mathcal {T}}}}}^{\,\,\textsf{C},n}_{geo,\sigma }\) with n layers of refinement towards the origin \({{\textbf{O}}}\). Additionally, \(F_{K^{\mathcal {M}}}({{\textbf{O}}}) = {{\varvec{A}}}_j\).

    2. (b)

      \(\overline{K^{\mathcal {M}}} \cap \partial \varOmega = \{{{\textbf{P}}}\}\), where the boundary point \({{\textbf{P}}}\) is not a vertex of \(\varOmega \). Then the refinement pattern is the corner patch \({\check{{{\mathcal {T}}}}}^{\,\,\textsf{C},L}_{geo,\sigma }\) with L layers of geometric mesh refinement towards \({{\textbf{O}}}\). Additionally, it is assumed that \(F_{K^{\mathcal {M}}}({{\textbf{O}}}) = {{\textbf{P}}}\in \partial \varOmega \).

  3. 5.

    \(\overline{K^{\mathcal {M}}} \cap \partial \varOmega = {\overline{e}}\) for an edge e of \(K^{\mathcal {M}}\) and neither endpoint of e is a vertex of \(\varOmega \). Then the refinement pattern is the edge patch \({\check{{{\mathcal {T}}}}}^{\,\,\textsf{E},L}_{geo,\sigma }\) and additionally \(F_{K^{\mathcal {M}}}(\{{{\widehat{y}}} = 0\}) \subset \partial \varOmega \).

  4. 6.

    \(\overline{K^{\mathcal {M}}} \cap \partial \varOmega = {\overline{e}}\) for an edge e of \(K^{\mathcal {M}}\) and exactly one endpoint of e is a vertex \({{\varvec{A}}}_j\) of \(\varOmega \). Then the refinement pattern is the mixed patch \({\check{{{\mathcal {T}}}}}^{\,\,\textsf{M},L,n}_{geo,\sigma }\) and additionally \(F_{K^{\mathcal {M}}}(\{{{\widehat{y}}} = 0\}) \subset \partial \varOmega \) as well as \(F_{K^{\mathcal {M}}}({{\textbf{O}}}) = {{\varvec{A}}}_j\).

  5. 7.

    Exactly two edges of a macro-element \(K^{\mathcal {M}}\) are situated on \(\partial \varOmega \). Then the refinement pattern is the tensor patch \({\check{{{\mathcal {T}}}}}^{\,\,\textsf{T},L,n}_{geo,\sigma }\). Additionally, it is assumed that \(F_{K^{\mathcal {M}}}(\{{{\widehat{y}}} = 0\}) \subset \partial \varOmega \), \(F_{K^{\mathcal {M}}}(\{{{\widehat{x}}} = 0\}) \subset \partial \varOmega \), and \(F_{K^{\mathcal {M}}}({{\textbf{O}}}) = {{\varvec{A}}}_j\) for a vertex \({{\varvec{A}}}_j\) of \(\varOmega \).

Finally, the following technical condition ensures the existence of certain meshlines:

  1. 8.

    For each vertex \({{\varvec{A}}}_j\) of \(\varOmega \), introduce a set of lines

    $$\begin{aligned} \ell = \bigcup _{K^{\mathcal {M}}:{{\varvec{A}}}_j \in \overline{K^{\mathcal {M}}}} \{\, F_{K^{\mathcal {M}}}(\{{{\widehat{y}}} = 0\}), F_{K^{\mathcal {M}}}(\{{{\widehat{x}}} = 0\}), F_{K^{\mathcal {M}}}(\{{{\widehat{x}}} = {{\widehat{y}}}\})\, \}. \end{aligned}$$

    Let \(\varGamma _j\), \(\varGamma _{j+1}\) be the two boundary arcs of \(\varOmega \) that meet at \({{\varvec{A}}}_j\). Then there exists a line \(e \in \ell \) such that the interior angles \(\angle (e,\varGamma _j)\) and \(\angle (e,\varGamma _{j+1})\) are both less than \(\pi \).

Fig. 3
figure 3

Patch arrangement in \(\varOmega \) [8]. Left panel: example of L-shaped domain decomposed into 27 patches (T, E, M, C indicate Tensor, Edge, Mixed, Corner patches, empty squares stand for trivial patches). Right panel: Zoom-in near the reentrant corner \({{\varvec{A}}}_j\). Solid lines indicate patch boundaries, dashed lines mesh lines

Example 4.4

Fig. 3 shows an example of an L-shaped domain with macro triangulation and suitable refinement patterns. \(\square \)

Remark 4.5

For fixed L and increasing n, the meshes \({\mathcal {T}}^{\,\,L,n}_{geo,\sigma }\) are geometrically refined towards the vertices of \(\varOmega \). These meshes are classical geometric meshes for elliptic problems in corner domains as introduced in [5, 6] and discussed in [52, Sec. 4.4.1]. \(\square \)

4.2 hp-FE approximation of singularly perturbed problems on geometric boundary-refined meshes

The principal result [8, Thm. 4.1] on robust exponential convergence of hp-FEM for (4.1) reads as follows:

Proposition 4.6

([8, Thm. 4.1]) Let \(\varOmega \subset {{\mathbb {R}}}^2\) be a curvilinear polygon with J vertices as described in Sect. 1.2. Let A, \(c\ge c_0>0\), f satisfy (4.2). Denote by \(\{{\mathcal {T}}^{\,\,L,n}_{geo,\sigma }\}_{L\ge 0, n \ge L}\) a family of geometric boundary-refined meshes in the sense of Definition 4.3. Fix \(c_1 > 0\).

Then there are constants C, \(b >0 \), \(\beta \in [0,1)\) (depending solely on the data A, c, f, \(\varOmega \), on the parameter \(c_1\), and on the analyticity properties of the patch-maps of the macro-triangulation \({{\mathcal {T}}}^{\,\,{\mathcal {M}}}\)) such that the following holds: If \(\varepsilon \in (0,1]\) and L satisfy the (boundary layer) scale resolution condition

$$\begin{aligned} {\sigma ^L} \le c_1 \varepsilon \end{aligned}$$

then, for any q, \(n \in {{\mathbb {N}}}\), the solution \(u^\varepsilon \in H^1_0(\varOmega )\) of (4.1) can be approximated from \(S^q_0(\varOmega ,{\mathcal {T}}^{\,\,L,n}_{geo,\sigma })\) such that

$$\begin{aligned} \inf _{v \in S^{q}_0(\varOmega ,{\mathcal {T}}^{\,\,L,n}_{geo,\sigma })}&\!\! \! \left( \Vert u^\varepsilon - v\Vert _{L^2(\varOmega )} \! + \varepsilon \Vert \nabla ( u^\varepsilon - v)\Vert _{L^2(\varOmega )} \right) \le C q^{9} \left[ \varepsilon ^\beta \sigma ^{(1-\beta )n} + e^{-b q}\right] , \end{aligned}$$
$$\begin{aligned}&N:={\textrm{dim}}\, S^{q}_0(\varOmega ,{\mathcal {T}}^{\,\,L,n}_{geo,\sigma }) \le C \left( L^2 q^2 {\textrm{card}}\, {{\mathcal {T}}}^{\,\,{\mathcal {M}}} + n q^2 J\right) . \end{aligned}$$

Proposition 4.6 is restricted to \(\varepsilon \in (0,1]\). For \(\varepsilon \ge 1\), the solution \(u^\varepsilon \) of (4.1) does not have boundary layers but merely corner singularities. Hence, by Remark 4.5, meshes with fixed L are appropriate. In particular, the boundary layer scale resolution condition (4.3) is not required:

Proposition 4.7

Assume the hypotheses on \(\varOmega \) and the data A, c, f as in Proposition 4.6. Let \(\{{\mathcal {T}}^{\,\,L,n}_{geo,\sigma }\}_{L\ge 0, n \ge L}\) be a sequence of geometric boundary-refined meshesFootnote 2.

There are constants C, \(b>0\), \(\beta \in [0,1)\) (depending solely on A, c, f, \(\sigma \in (0,1)\), and the analyticity properties of the macro-triangulation) such that the solution \(u^\varepsilon \) of (4.1) satisfies for all \(\varepsilon \ge 1\) and for all L, \(n \in {{\mathbb {N}}}_0\), \(q\in {{\mathbb {N}}}\)

$$\begin{aligned} \inf _{v \in S^{q}_0(\varOmega ,{\mathcal {T}}^{\,\,L,n}_{geo,\sigma })} \Vert u^\varepsilon - v\Vert _{H^1(\varOmega )} \le C \varepsilon ^{-2} q^9 \left( \sigma ^{(1-\beta )n} + e^{-b q}\right) . \end{aligned}$$

Furthermore, \({\textrm{dim}}\, S^q_0(\varOmega ,{\mathcal {T}}^{\,\,L,n}_{geo,\sigma })\) satisfies (4.5).


(sketch) Rewriting problem (4.1) in the form

$$\begin{aligned} -{\textrm{div}}\,(A(x') \nabla u^\varepsilon ) + \varepsilon ^{-2} c(x') u^\varepsilon = \varepsilon ^{-2} f, \quad \text{ in } \varOmega , \qquad u^\varepsilon = 0 \quad \text{ on } \partial \varOmega , \end{aligned}$$

we recognize it to be, for \(\varepsilon \ge 1\), a regularly perturbed problem. Its approximation by hp-FEM on meshes that are refined geometrically towards the corners is well-understood, [5, 6, 52]. The factor \(\varepsilon ^{-2}\) in (4.6) results from the scaling of the right-hand side and the linearity of problem, the term \(\sigma ^{(1-\beta )n}\) reflects the approximation on the elements abutting the corner singularities and the term \(e^{-bq}\) the approximation on the remaining elements. \(\square \)

4.3 hp-FE approximation of singularly perturbed problems on admissible meshes \({\mathcal {T}}^{\,\,L,q}_{min,\lambda }(\varepsilon )\) in \(\varOmega \)

In Proposition 4.6, the solution \(u^\varepsilon \) is approximated on patchwise geometric meshes. These meshes are able to capture boundary layers (and corner layers) on a whole range of singular perturbation parameters \(\varepsilon \): as long as a lower bound for \(\varepsilon \) is known and provided that geometric mesh refinement resolves all scales, robust exponential convergence is assured.

On the other hand, if there is a single, explicitly known scale \(\varepsilon \) then the “minimal, admissible boundary layer meshes” \({\mathcal {T}}^{\,\,L,q}_{min,\lambda }(\varepsilon ):= {{\mathcal {T}}}\,\,(\min \{\kappa _0,\lambda q \varepsilon \},L)\) of [41, Def. 2.4.4] (see also [54, Fig. 11] or [44, Fig. 2]), which are designed to resolve a single, explicitly known length scale with hp-FEM may be employed. In contrast to the geometric boundary-refined meshes of Def. 4.3, these “minimal” boundary-fitted meshes are \(\varepsilon \)-dependent.

Proposition 4.8

([41, Thm. 2.4.8 in conjunction with Thm. 3.4.8]) Let \(\varOmega \subset {{\mathbb {R}}}^2\) be a curvilinear polygon with J vertices as described in Sect. 1.2. Let A, \(c\ge c_0>0\), f satisfy (4.2). Let \(u^\varepsilon \) be the solution of (4.1).

Consider, for \(\kappa _0>0\) determined by \(\varOmega \), the two-parameter family \({{\mathcal {T}}}\,\,(\kappa ,L)\), \((\kappa ,L) \in (0,\kappa _0] \times {{\mathbb {N}}}\), of admissible meshes in the sense of [41, Def. 2.4.4], [24, Def. 3.1, Figs. 1, 2].

Then there are constants b, \(\lambda _0\) independent of \(\varepsilon \in (0,1]\) such that for every \(\lambda \in (0,\lambda _0]\) there is \(C > 0\) such that for every \(q\ge 1\), \(L \ge 0\) there holds the error bounds

$$\begin{aligned}&\inf _{v \in S^q_0(\varOmega ,{\mathcal {T}}^{\,\,L,q}_{min,\lambda }(\varepsilon ))} \Vert u^\varepsilon - v\Vert _{L^2(\varOmega )} + \varepsilon \Vert \nabla ( u^\varepsilon - v)\Vert _{L^2(\varOmega )} \le C q^6 \left[ e^{-b \lambda q} + \varepsilon e^{-b L}\right] , \end{aligned}$$
$$\begin{aligned}&\quad {\mathcal {T}}^{\,\,L,q}_{min,\lambda }(\varepsilon ): = {{\mathcal {T}}}\,\,(\min \{\kappa _0,\lambda q \varepsilon \},L), \end{aligned}$$
$$\begin{aligned}&\quad N := {\textrm{dim}}\, S^q_0(\varOmega , {\mathcal {T}}^{\,\,L,q}_{min,\lambda }(\varepsilon )) \le C L q^2. \end{aligned}$$

In particular, for \(L \sim q\), one has with C, \(b'\) independent of \(\varepsilon \)

$$\begin{aligned} \inf _{v \in S^q_0(\varOmega ,{\mathcal {T}}^{\,\,L,q}_{min,\lambda }(\varepsilon ))} \Vert u^\varepsilon - v\Vert _{L^2(\varOmega )} + \varepsilon \Vert \nabla ( u^\varepsilon - v)\Vert _{L^2(\varOmega )} \le C \exp (-b' N^{1/3}). \end{aligned}$$

For \(\varepsilon \ge 1\) these admissible boundary layer meshes are the well-known geometric meshes with L layers of geometric refinement as introduced in [5, 6] and discussed in [52, Sec. 4.4.1]. These geometric, corner-refined meshes are similar to the meshes \({\mathcal {T}}^{\,\,L,n}_{geo,\sigma }\) with fixed \(L = 0\) discussed in Remark 4.5. In particular, the minimal boundary layer meshes \({\mathcal {T}}^{\,\,L,q}_{min,\lambda }(\varepsilon )\) for \(\varepsilon \ge 1\) do not really depend on \(\varepsilon \), \(\lambda \), and q. However, for consistency of notation, we keep the notation \({\mathcal {T}}^{\,\,L,q}_{min,\lambda }(\varepsilon )\) in the following result, which covers the case \(\varepsilon \ge 1\). We need this result since the range (3.8) of eigenvalues \(\mu _i\) involves also eigenvalues \(\mu _i \ge 1\).

Proposition 4.9

Under the assumptions of Proposition 4.8, there exist constants b, \(C>0\) such that

$$\begin{aligned}&\forall \varepsilon \ge 1, \forall q,L\in {{\mathbb {N}}}: \quad \inf _{v \in S^q_0({\mathcal {T}}^{\,\,L,q}_{min,\lambda }(\varepsilon ))} \Vert u^\varepsilon - v\Vert _{H^1(\varOmega )} \le C \varepsilon ^{-2} q^6 \left[ e^{-b \lambda q} + e^{-b L}\right] , \\&\quad \text{ where }\quad N := {\textrm{dim}}\, S^q_0(\varOmega ,{\mathcal {T}}^{\,\,L,q}_{min,\lambda }(\varepsilon )) \le C L q^2. \end{aligned}$$

In particular, for \(L \sim q\), there are constants \(b'\), \(C>0\) such that

$$\begin{aligned} \forall \varepsilon \ge 1, \forall q\sim L\in {{\mathbb {N}}}: \quad \inf _{v \in S^q_0(\varOmega ,{\mathcal {T}}^{\,\,L,q}_{min,\lambda }(\varepsilon ))} \Vert u^\varepsilon - v\Vert _{H^1(\varOmega )} \le C \varepsilon ^{-2} \exp (-b' N^{1/3}). \end{aligned}$$


For \(\varepsilon \ge 1\), the minimal boundary layer meshes \({\mathcal {T}}^{\,\,L,q}_{min,\lambda }(\varepsilon )\) are simply meshes that are geometrially refined towards the corners of \(\varOmega \). That is, structurally, the meshes \({\mathcal {T}}^{\,\,L,q}_{min,\lambda }(\varepsilon )\) are similar to the meshes \({\mathcal T}^{\,\,0,L}_{geo,\sigma }\). The approximation result is then a consequence of Proposition 4.7. \(\square \)

It is worth pointing out the following differences between the approximation on geometric boundary-refined meshes \({\mathcal {T}}^{\,\,L,n}_{geo,\sigma }\) and on the minimal admissible boundary layer meshes \({\mathcal {T}}^{\,\,L,q}_{min,\lambda }\): (a) the use of the mesh \({\mathcal {T}}^{\,\,L,n}_{geo,\sigma }\) requires the scale resolution condition (4.3). It requires \(L \gtrsim |\ln \varepsilon |\) so that the approximation result Proposition 4.6 depends (weakly) on \(\varepsilon \). (b) Selecting \(n \sim L \sim q\) in Proposition 4.6 yields convergence \(O(\exp (-b \root 4 \of {N}))\) whereas the choice \(L \sim q\) in Proposition 4.8 yields the better convergence behavior \(O(\exp (-b' \root 3 \of {N}))\). (c) The meshes \({\mathcal {T}}^{\,\,L,q}_{min,\lambda }\) are designed to approximate a single scale well whereas the meshes \({\mathcal {T}}^{\,\,L,n}_{geo,\sigma }\) are capable to resolve a range of scales. (d) The meshes \({\mathcal {T}}^{\,\,L,q}_{min,\lambda }\) rely on a suitable choice of the parameter \(\lambda \) whereas the geometric boundary layer meshes \({\mathcal {T}}^{\,\,L,n}_{geo,\sigma }\) do not have parameters that need to be suitably chosen.

5 Exponential convergence of Extended \(hp\)-FEM

Based on the hp semidiscretization in the extended variable combined with the diagonalization in Sect. 3, we use the hp-approximation results from Sect. 4 to prove exponential convergence of hp-FEM for the CS-extended problem (1.15).

As is revealed by the diagonalization (3.5), the y-semidiscrete solution \(G^{{{\varvec{r}}}}_{M} {\mathscr {U}}\) contains \(\mathcal {M}\) separate length scales associated with the eigenvalues \(\mu _i\), \(i=1,\ldots ,\mathcal {M}\). The solutions \(U_i \in H^1_0(\varOmega )\) of the resulting \({{\textsf{M}}}\) many independent, linear second-order reaction-diffusion problems in \(\varOmega \) exhibit both, boundary layers and corner singularities.

In Case A, which we discuss in Sect. 5.1, we employ for each i a “minimal” hp-FE space in \(\varOmega \) that resolves boundary- and corner layers appearing in the \(U_i\) due to possibly large/small values of \(\mu _i\). Mesh design principles for such “minimial” FE spaces that may resolve a single scale of a singularly perturbed problem have already been presented in, e.g., [39, 41, 42, 53, 54]; the specific choice \({\mathcal {T}}^{\,\,L,q}_{min,\lambda }(\varepsilon )\) has been discussed in Propositions 4.84.9 and will be used in our analysis.

In Case B, which we discuss in Sect. 5.2, we discretize these decoupled, reaction-diffusion problems by one common hp-FEM in the bounded polygon \(\varOmega \), which employs both, geometric corner refinement as well as geometric boundary refinement, as in [41, 42]. Due to the need to obtain FE solutions for all \(\mu _i\) in one common FE space in \(\varOmega \), however (in order that the sum (3.13) belong to a tensor product hp-FE space), our analysis will provide one hp-FE space in \(\varOmega \) which will resolve all boundary and corner layers due to small parameters \(\mu _i\) near \(\partial \varOmega \). As we shall show, in Case B the total number of DOFs is larger than in Case A.

5.1 Exponential convergence I: diagonalization and minimal meshes

The robust exponential convergence result Proposition 4.8 allows us to establish, in conjunction with the diagonalization (3.2)–(3.4), a first exponential convergence result in Case A of Sect. 3. We consider the following numerical scheme, which relies on the “minimal boundary layer meshes” \({\mathcal {T}}^{\,\,L,q}_{min,\lambda }(\varepsilon )\) from [41, Sec. 2.4.2] already discussed in Proposition 4.8:

  1. (1)

    Select with and consider the space for the geometric mesh \({{\mathcal {G}}}^M_{geo,\sigma }\) on with M elements and a linear degree vector \({{\varvec{r}}}\) with slope \({\mathfrak {s}}>0\).

  2. (2)

    Solve the eigenvalue problem (3.2), (3.3).

  3. (3)

    Select \(\lambda > 0\). Define \(U^{q,L}_i \in S^{q}_0({\mathcal {T}}^{\,\,L,q}_{min,\lambda }(\sqrt{\mu _i}))\) as the solution of

    $$\begin{aligned} \forall v \in S^{q}_0({\mathcal {T}}^{\,\,L,q}_{min,\lambda }(\sqrt{\mu _i}))\ :\ a_{\mu _i,\varOmega } (U^{q,L}_i,v) = d_s v_i(0) \langle f,v\rangle . \end{aligned}$$
  4. (4)

    Define the approximation \({\mathscr {U}}^{q,L}(x,y):= \sum _{i=1}^{\mathcal {M}_{geo}} v_i(y) U^{q,L}_i(x).\)

For the approximation error \({\mathscr {U}}- {\mathscr {U}}^{q,L}\) we have:

Theorem 5.1

Let \(\varOmega \subset {{\mathbb {R}}}^2\) be a curvilinear polygon with J vertices as described in Sect. 1.2. Let A, f satisfy (1.1) and let A be uniformly symmetric positive definite on \(\varOmega \). Fix positive constants \(c_1\), \(c_2\), and the slope \({\mathfrak {s}}\).

Then there are constants C, b, \(b'\), \(b''\), \(\lambda _0>0\) (depending on \(\varOmega \), A, f s, \(\sigma \), \(c_1\), \(c_2\), \({\mathfrak {s}}\), and the parameters characterizing the mesh family \({\mathcal {T}}^{\,\,L,q}_{min,\lambda }\)) such that for any \(\lambda \in (0,\lambda _0]\) there holds for all q, \(M\ge 1\), \(L \ge 0\)

$$\begin{aligned} \Vert u - {{\,\mathrm{tr_\varOmega }\,}}{\mathscr {U}}^{q,L} \Vert _{{\mathbb {H}}^s(\varOmega )}&\lesssim \Vert \nabla ({\mathscr {U}}- {\mathscr {U}}^{q,L}) \Vert _{L^2(y^\alpha ,\mathcal {C})} \nonumber \\&\le C M^{2-\alpha } q^6 \left[ \exp (-b \lambda q) + \exp (-b' L)\right] + \exp (-b^{\prime \prime } M). \end{aligned}$$

In particular, for \(q \sim L \sim M \sim p\), denoting \({\mathscr {U}}^p:= {\mathscr {U}}^{q,L}\) with this choice of q and L, and the total number of degrees of freedom \(N = \sum \limits _i {\textrm{dim}}\,S^q_0(\varOmega , {\mathcal {T}}^{\,\,L,q}_{min,\lambda }(\sqrt{\mu _i}))\),

$$\begin{aligned} \Vert u - {{\,\mathrm{tr_\varOmega }\,}}{\mathscr {U}}^{p} \Vert _{{\mathbb {H}}^s(\varOmega )} \lesssim \Vert \nabla ({\mathscr {U}}- {\mathscr {U}}^p) \Vert _{L^2(y^\alpha ,{\mathcal {C}})} \lesssim \exp (-bp) \sim \exp (-b''' \root 5 \of {N}) \;, \end{aligned}$$

where the constant \(b'''\) depends additionally on the implied constants in \(q \sim L \sim M\).

Remark 5.2

The approximation result (5.3) still holds if the linear degree vector \({{\varvec{r}}}\) in the definition of \({\mathscr {U}}^{q,L}\) is replaced with a constant polynomial degree \(r \sim M\). \(\square \)


Step 1 (semidiscretization error): The analyticity of f on \(\overline{\varOmega }\) implies \(f \in {{\mathbb {H}}}^{-s+\nu }(\varOmega )\) for any \(\nu \in (0,1/2+s)\). Hence, by (3.1), the semidiscretization error \({\mathscr {U}}- G^{{{\varvec{r}}}}_M {\mathscr {U}}\) satisfies for suitable \(b > 0\) independent of M

$$\begin{aligned} \Vert {\mathscr {U}}- G^{{{\varvec{r}}}}_M {\mathscr {U}}\Vert _{\mathcal {C}} \lesssim e^{-b M}. \end{aligned}$$

Step 2 (representation of \(G^{{{\varvec{r}}}}_M {\mathscr {U}}\)): The semidiscrete approximation \(G^{{\varvec{r}}}_M{\mathscr {U}}\) may be expressed in terms of the eigenbasis \(\{ v_j \}_{j=1}^M\) in (3.2), (3.3) as

$$\begin{aligned} (G^{{\varvec{r}}}_M {\mathscr {U}})(x',y) = \sum _{i=1}^{\mathcal {M}_{geo}} v_i(y) U_i(x') \;, \end{aligned}$$

where the function \(U_i\) solve by (3.5)

$$\begin{aligned} \forall V \in H^1_0(\varOmega )\; :\; a_{\mu _i,\varOmega }(U_i,V) = d_s v_i(0) \langle f,V\rangle . \end{aligned}$$

Step 3: For every \(i=1,\ldots ,\mathcal {M}_{geo}\), and for every \(q\in {{\mathbb {N}}}\), approximate \(U_i\in H^1_0(\varOmega )\) by its Galerkin approximation \(U^{q,L}_i \in S^q_0(\varOmega ,{\mathcal {T}}^{\,\,L,q}_{min,\lambda }(\sqrt{\mu _i}))\subset H^1_0(\varOmega )\). That is, \(U^{q,L}_i = \varPi _i^q U_i \) is the \(a_{\mu _i,\varOmega }(\cdot ,\cdot )\)-projection of \(U_i\) given by (3.9). It is the best approximation to \(U_i\) in the corresponding energy norm and satisfies

$$\begin{aligned} \Vert U_i - \varPi _i^q U_i \Vert _{\mu _i,\varOmega } = \min _{V\in S^q_0(\varOmega ,{\mathcal {T}}^{\,\,L,q}_{min,\lambda }(\sqrt{\mu _i}))} \Vert U_i - V \Vert _{\mu _i,\varOmega } \;. \end{aligned}$$

By linearity of \(\varPi _i^q\), the analyticity of f, Propositions 4.84.9 (depending on whether \(\mu _i\le 1\) or \(\mu _i > 1\)), and Lemma 3.2 we get

$$\begin{aligned} \Vert U_i - \varPi _i^q U_i \Vert _{\mu _i,\varOmega }&\lesssim |v_i(0)| q^6 \left( e^{-b \lambda q}+ \sqrt{\mu _i} e^{-b' L} \right) \\&\lesssim \mathcal {M}_{geo}^{(1-\alpha )/2} q^6 \left( e^{-b \lambda q}+ \sqrt{\mu _i} e^{-b' L} \right) . \end{aligned}$$

Step 4 (Proof of (5.2)): With the approximations \(U^{q,L}_i\) the approximation \({\mathscr {U}}^{q,L}\) of \({\mathscr {U}}\) is given by

$$\begin{aligned} {\mathscr {U}}^{q,L}(x',y) := \sum _{i=1}^{\mathcal {M}_{geo}} v_i(y) U^{q,L}_i(x') \in {\mathring{H}^1} (y^\alpha ,\mathcal {C}). \end{aligned}$$

From (3.7) we get for \(Z = {\mathscr {U}}_M - {\mathscr {U}}^{q,L} = \sum _{i=1}^{\mathcal {M}_{geo}} v_i(y) (U_i(x) - U^{q,L}_i(x))\)

$$\begin{aligned} \Vert Z \Vert _{\mathcal {C}}^2 = a_{\mathcal {C}}(Z,Z) = \sum _{i=1}^{\mathcal {M}_{geo}} \Vert U_i - U^q_i \Vert ^2_{\mu _i,\varOmega } \lesssim {\mathcal {M}}_{geo}^{2-\alpha } q^{12} \left[ \exp (-2b\lambda q) + \exp (-2 b' L) \right] \;. \end{aligned}$$

We note that \(\mathcal {M}_{geo} \sim M^2\). Combining this last estimate with (5.4) yields the second estimate in (5.2). The first estimate in (5.2) expresses the continuity of the trace operator at \(y = 0\).

Step 5 (complexity estimate): Using that \(\mathcal {M}_{geo} = O(M^2) = O(q^2)\) and the fact that \({\textrm{dim}} S^q_0(\varOmega , {\mathcal {T}}^{\,\,L,q}_{min,\lambda }(\sqrt{\mu _i})) \le C L q^2 = O(q^3)\) as well as the assumption \(q \sim L \sim M \sim p\), we arrive at a total problem size \(N = O(q^5)\). Absorbing algebraic factors in the exponentially decaying one in (5.2) then yields (5.3). \(\square \)

5.2 Exponential convergence II: geometric boundary-refined meshes

In this section, we show that exponential convergence of a Galerkin method for (1.15) can be achieved by a suitable choice of meshes \({{\mathcal {T}}}\) and \({{\mathcal {G}}}^M\) in the tensor product space \({\mathbb {V}}^{q,{{\varvec{r}}}}_{h,M}({{\mathcal {T}}},{{\mathcal {G}}}^M)\) of (2.6). That is, we place ourselves in Case B in Sect. 3.2. For the discretization in y, we select again the spaces with and the linear degree vector \({{\varvec{r}}}\) with slope \({\mathfrak {s}}\). The hp-FE discretization in \(\varOmega \) is based on the space \(S^q_0(\varOmega ,{\mathcal {T}}^{\,\,L,n}_{geo,\sigma })\) with the geometric boundary-refined mesh \({\mathcal {T}}^{\,\,L,n}_{geo,\sigma }\) in Definition 4.3. Recall that \(G^{q,{{\varvec{r}}}}_{h,M}{\mathscr {U}}\) denotes the Galerkin projection of the solution \({\mathscr {U}}\) onto . In Theorem 5.4 below, we will focus on the case \(q \sim L \sim M\) and the corresponding Galerkin projection is denoted \({\mathscr {U}}^p_{TP}\).

Remark 5.3

  1. (i)

    In contrast to the procedure of Case A in the preceeding Sect. 5.1, precise knowledge of the length scales \(\sqrt{\mu _i}\) is not necessary.

  2. (ii)

    The diagonalization procedure may be carried out numerically and results in decoupled reaction-diffusion problems, affording parallel numerical solution.

  3. (iii)

    The linear degree vector \({{\varvec{r}}}\) could be replaced with a constant degree \(r \sim M\), and Theorem 5.4 will still hold. \(\square \)

The tensor-product hp-FEM in converges at exponential rate.

Theorem 5.4

Let \(\varOmega \subset {{\mathbb {R}}}^2\) be a curvilinear polygon with J vertices as described in Sect. 1.2. Let A, f satisfy (1.1) and let A be uniformly symmetric positive definite on \(\varOmega \). Fix a slope \({\mathfrak {s}}> 0\). Set


With these choices, denote by \({\mathscr {U}}^p_{TP}\) the Galerkin projection of \({\mathscr {U}}\) onto the tensor product hp-FE space .

Then \(N:= {\textrm{dim}}\, {\mathbb {V}}^{q,{{\varvec{r}}}}_{h,M}({\mathcal {T}}^{\,\,L,n}_{geo,\sigma },{{\mathcal {G}}}^M_{geo,\sigma }) = O(p^6)\), and there are constants C, b, \(b' >0\) depending only on \(\varOmega \), A, f, s, the macro triangulation \({{\mathcal {T}}}^{\,\,{\mathcal {M}}}\) underlying the geometric boundary-refined meshes \({\mathcal {T}}^{\,\,L,n}_{geo,\sigma }\), the parameter \(\sigma \), the slope parameter \({\mathfrak {s}}\), and the implied constants in (5.5) such that

$$\begin{aligned} \Vert u - {{\,\mathrm{tr_\varOmega }\,}}{\mathscr {U}}^p_{TP} \Vert _{{\mathbb {H}}^s(\varOmega )} \lesssim \Vert \nabla ({\mathscr {U}}- {\mathscr {U}}^p_{TP})\Vert _{L^2(y^\alpha ,{\mathcal {C}})} \lesssim \exp (-bp) \sim \exp (-b' \root 6 \of {N}) \;. \end{aligned}$$


The proof of this result is structurally along the lines of the proof of Theorem 5.1. We omit details and merely indicate how the scale resolution condition (4.3) is now accounted for. We note that for fixed and \(M' \le M\), we have that the spaces and satisfy (if the same slope \({\mathfrak {s}}\) for the linear degree vector is chosen). Hence, the Galerkin error for the approximation from the space is smaller than that from , and we therefore focus on bounding the approximation error for . That is, we will not necessarily exploit the full approximation power of the space ; instead we select \(M'\) so that the smallest length scale induced by the discretization on the extended direction is resolved by the spatial mesh \({\mathcal {T}}^{\,\,L,n}_{geo,\sigma }\). We select \(M'\) of the form \(M' = \lfloor \eta M\rfloor \) for some \(\eta \) to be chosen below. For ease of notation, we simply set \(M' = \eta M\).

By Lemma 3.2 we have that the smallest length scale of the singularly perturbed problems for the space \(S^q_0({\mathcal {T}}^{\,\,L,n}_{geo,\sigma }) \otimes \) \(({{\mathcal {G}}}^{M'}_{geo,\sigma }) \) is \(\min _i \mu _i \gtrsim {M'}^{-2} \sigma ^{2M'}\) and that the scale resolution condition (4.3) is certainly met if we enforce

$$\begin{aligned} c_1 \min _{i} \sqrt{\mu _i} \gtrsim c_1 {M'}^{-1} \sigma ^{M'} = c_1 \frac{1}{\eta M} \sigma ^{\eta M} {\mathop {\ge }\limits ^{!}} \sigma ^L . \end{aligned}$$

Since \(L \sim M\), we see that (5.6) can be satisfied for some fixed \(c_1\) provided \(\eta \) is chosen sufficiently small (depending on the implied constant in the relation \(L \sim M\)). The approximation of \({\mathscr {U}}\) from the subspace then follows by arguments very similar to those of the proof of Theorem 5.1. \(\square \)

6 Exponential convergence of hp-sinc-BK FEM

The hp-sinc BK FEM is based on exponentially convergent, so-called “sinc” quadratures, to the Balakrishnan formula

$$\begin{aligned} \mathcal {L}^{-s} = c_B \int _{-\infty }^\infty e^{-sy} (I + e^{-y}\mathcal {L})^{-1} \, \text{ d }y \;. \end{aligned}$$

as described in Sects. 1.6 and (1.16) (see [10,11,12] and the references there). We briefly review the corresponding exponential convergence results in Sect. 6.1. The numerical realization of the sinc quadrature approximation of (6.1) leads again to the numerical solution of decoupled, local linear reaction-diffusion problems in \(\varOmega \). The resulting local boundary value problems in \(\varOmega \) are again singularly perturbed. Accordingly, we discuss two classes of hp-FE approximations for their numerical solution: In Sect. 6.2.1, we discuss Case A, which is based on the minimal boundary layer meshes \({\mathcal {T}}^{\,\,L,q}_{min,\lambda }\) in \(\varOmega \). In Sect. 6.2.2, we detail Case B, where geometric boundary-refined meshes in \(\varOmega \) are employed. The latter permit the use of one common hp-FEM for all values of parameters arising from the sinc quadrature approximation of (6.1).

6.1 Sinc quadrature approximation

The functional integral (6.1) can be discretized by so-called “sinc” quadratures (see, e.g., [10, 56]). To that end, we define for \(K\in {{\mathbb {N}}}\)

$$\begin{aligned} y_j := jK^{-1/2} = jk, \;\; |j| \le K, \;\; k:=1/\sqrt{K}\;. \end{aligned}$$

For \(f\in L^2(\varOmega )\subset {\mathbb {H}}^{-s}(\varOmega )\) for every \(0<s<1\), the (semidiscrete) sinc quadrature approximation \(u_M\) of \(u = \mathcal {L}^{-s}f \in {\mathbb {H}}^s(\varOmega )\) as represented in (6.1) reads with \(\varepsilon _j := e^{-y_j/2} = e^{j/(2\sqrt{K})}\), \(|j|\le K\):

$$\begin{aligned} u_K = Q^{-s}_k(\mathcal {L}) f := c_B k \sum _{|j|\le K} \varepsilon _j^{2s} \left( I + \varepsilon _j^2 \mathcal {L}\right) ^{-1} f \;. \end{aligned}$$

We note that for any k, we have that \(Q^{-s}_k(\mathcal {L}): {\mathbb {H}}^{-1}(\varOmega )\rightarrow {\mathbb {H}}^1(\varOmega )\) is a bounded linear map. By the continuous embeddings \({\mathbb {H}}^1(\varOmega ) \subseteq {\mathbb {H}}^s(\varOmega ) \subseteq {\mathbb {H}}^{-s}(\varOmega ) \subseteq {\mathbb {H}}^{-1}(\varOmega )\), also \(Q^{-s}_k(\mathcal {L}): {\mathbb {H}}^{-s}(\varOmega ) \rightarrow {\mathbb {H}}^s(\varOmega )\) is a bounded linear map for any \(0<s<1\). The semidiscretization error \(\mathcal {L}^{-s} - Q^{-s}_k(\mathcal {L})\) is bounded in [10]:

Proposition 6.1

([10, Thm. 3.2]) For \(f\in L^2(\varOmega )\) and for every \(0 \le \beta < s\) with \(0<s<1\) denoting the exponent of the fractional diffusion operator in (1.4), there exist constants b, \(C>0\) (depending on \(\beta \), s, \(\varOmega \), and \({\mathcal {L}}\)) such that for every \(k>0\) as in (6.2) holds, with \(D(\mathcal {L}^\beta ) = {\mathbb {H}}^{2 \beta }(\varOmega )\) where \({\mathbb {H}}^{\sigma }(\varOmega )\) is as in (1.3),

$$\begin{aligned} \Vert (\mathcal {L}^{-s} - Q^{-s}_k(\mathcal {L}))f \Vert _{D({\mathcal {L}}^{\beta })} \le C\exp (-b/k) \Vert f \Vert _{L^2(\varOmega )} \;. \end{aligned}$$

Remark 6.2

Sinc approximation formulas such as (6.3) have a number of parameters which can be optimized in various ways. The error bound in Proposition 6.1 is merely one particular choice (the so-called “balanced” choice of parameters), which is sufficient for the exponential sinc error bound (6.4). Other choices yield analogous (exponential) sinc error bounds, with possibly better numerical values for the constants b, \(C>0\) in (6.4). We point out that we make such a choice in our numerical examples in (7.3) and refer to [10, Rem. 3.1] for details. \(\square \)

6.2 hp-FE approximation in \(\varOmega \)

The sinc approximation error bound (6.4) implies exponential convergence of the sinc quadrature sum (6.3), which we write as

$$\begin{aligned} Q^{-s}_k(\mathcal {L}) f = c_Bk \sum _{|j| \le K} \varepsilon _j^{2s} w_j\;. \end{aligned}$$

Here, the \(w_j \in H^1_0(\varOmega )\) are solutions of the \(2K+1\) reaction-diffusion problems

$$\begin{aligned} \varepsilon _j^2 \mathcal {L} w_j + w_j = f\quad \text{ in }\quad \varOmega , \quad w_j|_{\partial \varOmega } = 0\;, \quad |j| \le K\;. \end{aligned}$$

With the bilinear form \(a_{\varepsilon _j^2,\varOmega }(\cdot ,\cdot )\) from (3.6), their variational formulations reads: find \(w_j \in H^1_0(\varOmega )\) such that

$$\begin{aligned} \forall v\in H^1_0(\varOmega )\;:\quad a_{\varepsilon _j^2,\varOmega }(w_j,v) = (f,v)\;. \end{aligned}$$

The reaction diffusion problems (6.6) are again of the type (3.5) for which exponentially convergent hp-FE approximations were presented in Sect. 5, from [41] and [8]. A fully discrete sinc-BK FEM approximation is constructed by replacing \(w_j\) in (6.5), (6.6) by one of the hp-FE approximations discussed in Sect. 5. As in the case of the Extended \(hp\)-FEM, also for the sinc-BK FEM one can distinguish between Case A, in which each problem (6.7) is discretized using a different hp-FE space, and Case B, where all problems (6.7) are discretized by the same hp-FE space in \(\varOmega \).

6.2.1 Case A

We discretize the singularly perturbed problems (6.7) with length scales \(\varepsilon _j\) using the spaces \({\mathcal {T}}^{\,\,L,q}_{min,\lambda }(\varepsilon _j)\). That is, denoting the resulting approximations generically by \(w_j^{hp}\), defined by: for \(|j|\le K\), find \(w_j^{hp} \in S^q_0(\varOmega ,{\mathcal {T}}^{\,\,L,q}_{min,\lambda }(\varepsilon _j))\) such that

$$\begin{aligned} \forall v \in S^q_0(\varOmega , {\mathcal {T}}^{\,\,L,q}_{min,\lambda }(\varepsilon _j))\, :\, a_{\varepsilon _j^2,\varOmega }(w_j^{hp},v) = \langle f,v\rangle \;. \end{aligned}$$

The hp-FE approximations \(w_j^{hp}\) are well-defined. Replacing in (6.5) the \(w_j\) by their hp-FE approximations, we obtain the sinc-BK FEM approximation of the (inverse of) the fractional diffusion operator \(\mathcal {L}^s\):

$$\begin{aligned} Q^{-s}_k(\mathcal {L}_{hp}) f := c_Bk \sum _{|j| \le K} \varepsilon _j^{2s} w^{hp}_j\;. \end{aligned}$$

To bound the error \(\Vert u - Q^{-s}_k(\mathcal {L}_{hp}) f \Vert _{{\mathbb {H}}^s(\varOmega )}\), we write

$$\begin{aligned} \Vert u - Q^{-s}_k(\mathcal {L}_{hp}) f \Vert _{{\mathbb {H}}^s(\varOmega )}&\le \displaystyle \Vert (\mathcal {L}^{-s} - Q^{-s}_k(\mathcal {L})) f \Vert _{{\mathbb {H}}^s(\varOmega )} + \Vert Q^{-s}_k(\mathcal {L} - \mathcal {L}_{hp}) f \Vert _{{\mathbb {H}}^s(\varOmega )} \;. \end{aligned}$$

For the first term, the sinc approximation error, we use the error bound (6.4) with \(\beta = s/2\). Using \(D(\mathcal {L}^{s/2}) = {\mathbb {H}}^s(\varOmega )\) for \(0<s<1\), we obtain from (6.4) and from \(k \sim K^{-1/2}\) (see (6.2)) the bound

$$\begin{aligned} \Vert (\mathcal {L}^{-s} - Q^{-s}_k(\mathcal {L})) f \Vert _{{\mathbb {H}}^s(\varOmega )} \le C\exp (-b\sqrt{K}) \Vert f\Vert _{L^2(\varOmega )} \;. \end{aligned}$$

To bound the second term, definition (6.5) and the triangle inequality imply

$$\begin{aligned} \Vert Q^{-s}_k(\mathcal {L} - \mathcal {L}_{hp}) f \Vert _{{\mathbb {H}}^s(\varOmega )} \lesssim k \sum _{|j| \le K } \varepsilon _j^{2s} \Vert w_{j} - w_{j}^{hp} \Vert _{{\mathbb {H}}^s(\varOmega )} \;. \end{aligned}$$

To invoke the hp-error bound (4.4) with the norm (3.7), we apply the interpolation inequality (1.5) to each term in (6.11) and, using the definition (3.7) of the norm \(\Vert w \Vert ^2_{\varepsilon ^2,\varOmega } \sim \Vert w \Vert _{\varepsilon ^2}^2 := \varepsilon ^2 \Vert \nabla w \Vert ^2_{L^2(\varOmega )} + \Vert w \Vert ^2_{L^2(\varOmega )} \sim \left( \varepsilon \Vert \nabla w \Vert _{L^2(\varOmega )} + \Vert w \Vert _{L^2(\varOmega )} \right) ^2\) for all \(\varepsilon \ge 0\), we arrive at

$$\begin{aligned} \Vert Q^{-s}_k(\mathcal {L} - \mathcal {L}_{hp}) f \Vert _{{\mathbb {H}}^s(\varOmega )}&\lesssim \displaystyle k \sum _{|j|\le K} \varepsilon _j^s \Vert w_{j} - w_{j}^{hp} \Vert ^{1-s}_{L^2(\varOmega )} \left( \varepsilon _j \Vert \nabla (w_{j} - w_{j}^{hp}) \Vert _{L^2(\varOmega )} \right) ^s \nonumber \\&\lesssim \displaystyle k \sum _{|j|\le K} \varepsilon _j^{s} \Vert w_{j} - w_{j}^{hp} \Vert _{\varepsilon _j^2} \;. \end{aligned}$$

From \(\varepsilon _j = \exp (j/(2\sqrt{K}))\), we estimate with a geometric series argument and then a Taylor expansion for large K

$$\begin{aligned} \sum _{j \ge 0} \varepsilon _j^{s-1}&= \sum _{j \ge 0} e^{j(s-1)/(2\sqrt{K})} = \left( 1 - e^{(s-1)/(2 \sqrt{K})}\right) ^{-1} \le C \sqrt{K}, \end{aligned}$$
$$\begin{aligned} \sum _{j \le 0} \varepsilon _j^{s}&= \sum _{j \le 0} e^{js/(2\sqrt{K})} = \left( 1 - e^{-s/(2 \sqrt{K})}\right) ^{-1} \le C \sqrt{K}. \end{aligned}$$

We split the summation indices in the sum in (6.12) as \(I_+ \cup I_-\) with \(I_+ := \{|j|\le K\} \cap \{j>0\}\) and \(I_- := \{|j|\le K\} \cap \{ j\le 0\}\). We note \(0 < \varepsilon _j \le 1\) for \(j\in I_-\) and \(1<\varepsilon _j \le \exp (\sqrt{K}/2)\) for \(j \in I_+\). With Proposition 4.9, we estimate the sum over \(j\in I_+\) in (6.12) according to

$$\begin{aligned} \sum _{j\in I_+} \varepsilon _j^{s} \Vert w_{j} - w_{j}^{hp} \Vert _{\varepsilon _j^2}&\lesssim \displaystyle q^6 \sum _{j\in I_+} \varepsilon _j^{s-1} \left( \exp (-bq)+\exp (-b'L)\right) \nonumber \\&{\mathop {\lesssim }\limits ^{6.13)}} \displaystyle \sqrt{K} q^6 \left( \exp (-bq)+\exp (-b'L) \right) \;. \end{aligned}$$

Next, we bound in the same manner the sum over \(j\in I_-\) (i.e., \(0 < \varepsilon _j \le 1\)) in (6.12). With Proposition 4.8 we get

$$\begin{aligned} \displaystyle \sum _{j\in I_-} \varepsilon _j^{s} \Vert w_{j} - w_{j}^{hp} \Vert _{\varepsilon _j^2}&\lesssim \displaystyle q^6 \sum _{j\in I_-} \varepsilon _j^{s} \left[ \exp (-b\lambda q ) + \varepsilon _j \exp (-b'L) \right] \nonumber \\&{\mathop {\lesssim }\limits ^{(6.14)}} \displaystyle \sqrt{K} q^6 \left[ \exp (-b\lambda q ) + \exp (-b'L) \right] \;. \end{aligned}$$

We select \(L\sim q\) (i.e., the number L of mesh-layers proportional to the polynomial degree \(q\ge 1\)) with proportionality constant independent of \(\varepsilon _j\). Furthermore, we note \(k = 1/\sqrt{K}\) and select \(K \sim q^2\) so that

$$\begin{aligned} \Vert Q^{-s}_k(\mathcal {L} - \mathcal {L}_{hp}) f \Vert _{{\mathbb {H}}^s(\varOmega )} \lesssim q^7 \exp (-b\lambda q). \end{aligned}$$

Combining the error bounds (6.10) and (6.17), and suitably adjusting the constant \(b>0\) in the exponential bounds, we arrive at

$$\begin{aligned} \Vert u - Q^{-s}_k(\mathcal {L}_{hp}) f \Vert _{{\mathbb {H}}^s(\varOmega )} \lesssim \exp (-bq) \;. \end{aligned}$$

Given that the approximation \(u_{K,hp}\) involves the solution of \(O(K) = O(q^2)\) reaction-diffusion problems, each of which requires \(O(q^3)\) DOF, the error bound (6.18) in terms of the total number of degrees of freedom \(N_{DOF}\) reads

$$\begin{aligned} \Vert u - Q^{-s}_k(\mathcal {L}_{hp}) f \Vert _{{\mathbb {H}}^s(\varOmega )} \le C \exp (-b\root 5 \of {N_{DOF}}) \end{aligned}$$

with constants b, \(C>0\) that are independent of \(N_{DOF}\). We have thus shown:

Theorem 6.3

Let \(\varOmega \) be a curvilinear polygon as defined in Sect. 1.2, let A, f satisfy (1.1), and let A be uniformly symmetric positive definite on \(\varOmega \). Let u be the solution to (1.4), and let its fully discrete approximation be given by the sinc-BK FEM approximation (6.5) in conjunction with the hp-FE approximation \(w_j^{hp}\) in (6.8) with the hp-FE spaces \(S^q_0(\varOmega , {\mathcal {T}}^{\,\,L,q}_{min,\lambda }(\varepsilon _j))\) on the minimal boundary layer meshes \({\mathcal {T}}^{\,\,L,q}_{min,\lambda }(\varepsilon _j)\). Choose further the parameters \(q \sim L \sim \sqrt{K}\), and let \(N = \sum \limits _{|j| \le K}{\textrm{dim}}\,S^q_0(\varOmega , {\mathcal {T}}^{\,\,L,q}_{min,\lambda }(\varepsilon _j))\) denote the total number of degrees of freedom.

Then there exists a \(\lambda _0\) (depending on \(\varOmega \), A, f, and the parameters characterizing the mesh family \({\mathcal {T}}^{\,\,L,q}_{min,\lambda }\)) such that for any \(\lambda \in (0,\lambda _0]\) there are constants C, \(b > 0\) (depending additionally on the implied constants in \(q \sim L \sim \sqrt{K}\) and s) such that

$$\begin{aligned} \Vert {\mathcal {L}}^{-s} f - Q_k^{-s}(\mathcal {L}_{hp}) f\Vert _{{\mathbb {H}}^s(\varOmega )} \le C \exp (-b \root 5 \of {N}). \end{aligned}$$

6.2.2 Case B

Instead of approximating the problems (6.7) from individual spaces, one may approximate them from the same hp-FE space in \(\varOmega \). Specifically, we define the approximations \(w_j^{hp}\) by: Find \(w_j^{hp} \in S^q_0(\varOmega ,{\mathcal {T}}^{\,\,L,n}_{geo,\sigma })\) such that

$$\begin{aligned} \forall v \in S^q_0(\varOmega , {\mathcal {T}}^{\,\,L,n}_{geo,\sigma })\, :\, a_{\varepsilon _j^2,\varOmega }(w_j^{hp},v) = \langle f,v\rangle \;. \end{aligned}$$

Theorem 6.4

Let \(\varOmega \) be a curvilinear polygon as defined in Sect. 1.2. Assume that A, f satisfy (1.1) and that A is uniformly symmetric positive definite. Let u be solution to (1.4), and let its discrete approximation \(Q_k^{-s}(\mathcal {L}_{hp}) f\) be given by (6.5) in conjunction with (6.20). Let \(\sqrt{K} \sim q \sim L = n\) and set \(N = (2K+1){\textrm{dim}}\, S^q_0(\varOmega , {\mathcal {T}}^{\,\,L,n}_{geo,\sigma }) \sim q^6 \), the total number of degrees of freedom.

Then there are constants C, \(b >0\) (depending on \(\varOmega \), A, f, \(\sigma \), s, and the analyticity properties of the macro triangulation) such that

$$\begin{aligned} \Vert {\mathcal {L}}^{-s} f - Q_k^{-s}({\mathcal {L}}_{hp} f)\Vert _{{\mathbb {H}}^s(\varOmega )} \le C \exp (-b \root 6 \of {N}). \end{aligned}$$


We start with the observation that, similar to the geometric series arguments in (6.13), (6.14), there is \(C > 0\) such that for any \(\delta > 0\) there holds

$$\begin{aligned} \sum _{j :\varepsilon _j \le \delta } \varepsilon _j^s \lesssim C \sqrt{K} \delta ^s. \end{aligned}$$

The proof of Theorem 6.4 follows the arguments of Theorem 6.3: The semidiscretization error is given by (6.10), and the error \(\Vert Q^{-s}_k (\mathcal {L} - \mathcal {L}_{hp}) f\Vert _{{\mathbb {H}}^s(\varOmega )}\) is bounded by (6.12). To estimate (6.12) the summation indices \(I = \{|j| \le K\}\) are split as \(I = I_{{\textrm{nonres}}} \cup I_{{\textrm{res}}} \cup I_+ := \{j \in I :\varepsilon _j < \sigma ^L\} \cup \{j \in I:\sigma ^L \le \varepsilon _j \le 1\} \cup \{j \in I:j > 0\}\). For the sums over \(I_{{\textrm{res}}}\) and \(I_{+}\), we proceed as in the proof Theorem 6.3 relying on Props. 4.6 and 4.7 (note that the scale resolution condition (4.3) in Prop. 4.6 is satisfied with \(c_1 = 1\)) to estimate

$$\begin{aligned} \sum _{j \in I_{{\textrm{res}}}} \varepsilon _j^s \Vert w_j - w^{hp}_j\Vert _{\varepsilon _j^2} \lesssim \sqrt{K} q^9 \left[ \sigma ^{(1-\beta ) n} + e^{-b q}\right] , \end{aligned}$$
$$\begin{aligned} \sum _{j \in I_{+}} \varepsilon _j^s \Vert w_j - w^{hp}_j\Vert _{\varepsilon _j^2} \lesssim \sqrt{K} q^9 \left[ \sigma ^{(1-\beta ) n} + e^{-b q}\right] . \end{aligned}$$

In the sum over \(I_{\textrm{nonres}}\), the scale resolution condition (4.3) is not satisfied so that the exponential convergence result of Prop. 4.6 cannot be used. Instead, the trivial estimate \(\Vert w_j - w^{hp}_j\Vert _{\varepsilon _j^2} \lesssim \Vert w_j\Vert _{\varepsilon _j^2} \lesssim \Vert f\Vert _{L^2(\varOmega )}\) provided by Céa’s Lemma leads to

$$\begin{aligned} \sum _{j \in I_{\textrm{nonres}}} \varepsilon _j^s \Vert w_j - w^{hp}_j\Vert _{\varepsilon _j^2} \lesssim \Vert f\Vert _{L^2(\varOmega )} \sum _{j \in I_{\textrm{nonres}}} \varepsilon _j^s {\mathop {\lesssim }\limits ^{(6.21); \delta = \sigma ^L}} \sigma ^{Ls} \Vert f\Vert _{L^2(\varOmega )}. \end{aligned}$$

Combining (6.22)– (6.24) and recalling \(\sqrt{K} \sim q \sim L = n\) concludes the proof. \(\square \)

7 Numerical experiments

We consider the problem (1.4) with diffusion coefficient \(A = I\), i.e., \({\mathcal {L}}^s = (-\varDelta )^s\). The domain \(\varOmega \) is chosen as either the unit square \(\varOmega _1 = (0,1)^2\), the so-called L-shaped polygonal domain \(\varOmega _2 \subset {\mathbb {R}}^2\) determined by the vertices \(\{ (0,0), (1,0), (1,1), (-1,1), (-1,-1), (0,-1)\}\), the square domain with a slit \(\varOmega _3 = (-1,1)^2 \setminus (-1,0] \times \{0\}\), or the quadrilateral \(\varOmega _4\) with corners the same as \(\varOmega _1\) but with curved sides, which are elliptic arcs. One of these elliptic arcs is characterized by the condition to pass through the two points (0, 0) and (1, 0) and to be tangential to the two lines passing through these two points and (0.5, 0.25); the remaining 3 curved sides are obtained by appropriate rotation of this arc. As we are in particular interested in smooth, but possibly non-compatible data f in all the numerical examples we take

$$\begin{aligned} f(x_1,x_2) \equiv 1 \qquad \text{ in } \varOmega . \end{aligned}$$

Notice that, in this case, f is analytic on \(\overline{\varOmega }\) but \(f\in {\mathbb {H}}^{1-s}(\varOmega )\) only for \( s > 1/2 \) due to boundary incompatibility (cf. Remark 1.2). The exact solution is not known, so that the error is estimated numerically with reference to an accurate numerical solution. The error measure is always the functional

$$\begin{aligned} e({{\tilde{u}}}) = \left| d_s\int _\varOmega f (u^{\text {fine}} - {{\tilde{u}}} )\, \text{ d }x'\right| ^{1/2}, \end{aligned}$$

where \(u^{\text {fine}}\) is the numerical solution obtained on a fine mesh. Note that for the Galerkin method on the cylinder \(\mathcal {C}\) (i.e., Extended \(hp\)-FEM in Case B) this error measure is equivalent to the energy norm if \(u^{\text {fine}}\) is replaced by the exact solution u:

$$\begin{aligned} \Vert u-{{\,\mathrm{tr_\varOmega }\,}}{{\mathscr {U}}^p}\Vert ^2_{{\mathbb {H}}^s(\varOmega )} \lesssim \Vert \nabla ({\mathscr {U}}- {\mathscr {U}}^p)\Vert ^2_{L^2(y^\alpha ,\mathcal {C})} = d_s\int _\varOmega f (u - {{\,\mathrm{tr_\varOmega }\,}}{{\mathscr {U}}^p} )\, \text{ d }x', \end{aligned}$$

where \({\mathscr {U}}^p\) denotes the discrete solution in .

In Fig. 4 we show examples of the meshes used for the four domains. These are constructed using the Netgen/NGSolve package [51]. For the square domain \(\varOmega = \varOmega _1\) the resulting mesh is the geometric boundary-refined mesh \({{\mathcal {T}}}^{\,\,L,L}_{geo, \sigma _x}\) with \(L = 4\) and \(\sigma _x = 1/4\). The same parameters are used in Netgen/NGSolve to construct the meshes for the other three domains, with the resulting meshes diverging from the strict definition of \({{\mathcal {T}}}^{\,\,L,L}_{geo, \sigma _2}\) near the re-entrant corners since these meshes are not constructed using mesh patches but instead by applying directly geometric refinement towards edges and vertices. Nevertheless we denote these meshes also by \({{\mathcal {T}}}^{\,\,L,L}_{geo, \sigma _x}\) and make use of the finite element spaces \(S^q_0(\varOmega ,{{\mathcal {T}}}^{\,\,L,L}_{geo,\sigma _x})\).

Fig. 4
figure 4

Examples of geometric boundary-refined meshes generated by Netgen/NGSolve [51] that are used for the four domains \(\varOmega = \varOmega _j\), \(j = 1,2,3,4\), ordered from left to right and top to bottom

Given a polynomial order \(p \ge 1\), in both approaches the finite element space in \(\varOmega \) is \(S^q_0(\varOmega ,{{\mathcal {T}}}^{\,\,L,L}_{geo,\sigma _x})\) with uniform polynomial degree \(q = p\), number of levels \(L = p\) and \(\sigma _x = 1/4\). Next, we describe the parameters used in the hp-FEM on and the quadrature in the Balakrishnan formula.

For the extended problem, on the geometric mesh \({{\mathcal {G}}}^M_{geo,\sigma _y}\) in as defined in Sect. 2.1.1 we use FE-spaces . Given a polynomial degree \(p \ge 1\), in the definition of these spaces we use , \(\sigma _y = 1/4\) and \(M = {\textrm{round}}(0.79\, p/s)\)Footnote 3, and a uniform degree vector \({{\varvec{r}}}= (p,\dots ,p)\).

For simplicity in the analysis of the sinc quadrature, we used a symmetric approximation (6.3). For the numerical experiments in order to obtain a more efficient scheme we have followed [10] to define the quadrature as

$$\begin{aligned} Q_k^{-s}({\mathcal {L}}) f := \frac{k \sin (\pi s)}{\pi }\sum _{\ell = -K_1}^{K_2} e^{-sy_\ell } ( I + e^{-y_\ell }{\mathcal {L}})^{-1} f, \end{aligned}$$

with \(y_\ell =\ell k\) and the number of quadrature points chosen as

$$\begin{aligned} K_1 =\left\lceil \frac{\pi ^2}{2(1-s)k^2}\right\rceil \;, \qquad K_2 = \left\lceil \frac{\pi ^2}{s k^2}\right\rceil \;. \end{aligned}$$

For the given polynomial order \(p \ge 1\), we set \(k = \frac{4}{3}p^{-1}\).

We now compare the convergence of the two schemes. We plot the error against the polynomial degree p and against \(N_{\text {ls}}^{1/2}\), where \(N_{\text {ls}}\) is the number of linear systems that need to be solved. The convergence curves for the square domain \(\varOmega _1\) are shown in Fig. 5, for the L-shaped domain \(\varOmega _2\) in Fig. 6, for the slit domain \(\varOmega _3\) in Fig. 7, and for the curved domain in Fig. 8. In the case of the curved domain, the curved sides are approximated by so-called “projection-based polynomial interpolation” in the sense of [22, 23] of order p. For all four domains we clearly see exponential convergence as the polynomial order is increased. Also, the Extended \(hp\)-FEM requires significantly fewer linear systems to be solved to achieve the same accuracy as the sinc-BK FEM . We should, however, also note that the eigenvalue problem (3.2) becomes ill-conditioned for increasing p and much higher accuracy than the one shown in the above figures cannot be obtained using our approach for the extension problem. No such accuracy limitations could be seen for the sinc approach.

Fig. 5
figure 5

Energy norm convergence for Extended \(hp\)-FEM (“hp”, solid) and the sinc-BK FEM (“BK”, dashed) for the domain \(\varOmega _1\), versus the polynomial degree p (left) and \(N_{\text {ls}}^{1/2}\), where \(N_{\text {ls}}\) denote the number of linear systems that need to be solved (right). Results for \(s = 0.2\), \(s = 0.4\) and \(s = 0.8\) are shown

Fig. 6
figure 6

Energy norm convergence of the Extended \(hp\)-FEM (“hp”, solid) and the sinc-BK FEM (“BK”, dashed) for L-shaped domain \(\varOmega _2\) versus the polynomial degree p (left) and \(N_{\text {ls}}^{1/2}\), the square root of the number of linear systems to be solved (right)

Fig. 7
figure 7

Energy norm convergence of the Extended \(hp\)-FEM (“hp”, solid) and the sinc-BK FEM (“BK”, dashed) for the slit domain \(\varOmega _3\) versus the polynomial degree p (left) and \(N_{\text {ls}}^{1/2}\), the square root of the number of linear systems to be solved (right)

Fig. 8
figure 8

Energy norm convergence of the Extended \(hp\)-FEM (“hp”, solid) and the sinc-BK FEM (“BK”, dashed) for the domain \(\varOmega _4\) with curved boundaries versus the polynomial degree p (left) and \(N_{\text {ls}}^{1/2}\), the square root of the number of linear systems to be solved (right)

8 Extensions and conclusions

We briefly summarize the principal results and indicate several generalizations which are direct consequences.

8.1 Fractional diffusion on manifolds

Diffusion problems on manifolds are an established tool in applied mathematics. Fractional diffusion on manifolds, being a generalization of classical diffusion processes, has therefore attracted attention, see, e.g., [14, 29]. The present setting of possibly non-constant coefficient A in the diffusion operator \({\mathcal {L}}\) in the setting in Sect. 1.3 allows us to extend our numerical schemes and their exponential convergence analysis to fractional diffusion on manifolds as we very briefly outline.

Let \(\varGamma \subset {\mathbb {R}}^3\) denote a compact, orientable analytic manifold (e.g. [4]). We think of bounded, analytic surfaces such as the unit sphere \({{\mathbb {S}}}^2 \subset {\mathbb {R}}^3\). Let \(\varGamma \) be covered by a finite atlas of analytic charts \(\{\chi _j\}_{j=1}^J\). In a generic analytic chart \(\chi \) of \(\varGamma \), consider the polygonal domain \({\widetilde{\varGamma }} = \chi (\varOmega )\subset \varGamma \) where the parameter domain \(\varOmega \subset {\mathbb {R}}^2\) of the chart \(\chi \) is a curvilinear polygon in the sense of Sect. 1.2. On \(\varGamma \), introduce the surface (Lebesgue)measure \(\mu \). On \({\widetilde{\varGamma }}\), for given \({\tilde{f}} \in L^2(\varGamma ,\mu ;{\mathbb {R}})\), consider the Dirichlet problem for the surface diffusion operator \({\widetilde{\mathcal {L}}}\): find \(u_\varGamma \) such that

$$\begin{aligned} {\widetilde{\mathcal {L}}} u_\varGamma := -\text {div}_\varGamma ({\tilde{A}} \nabla _\varGamma u_\varGamma ) = {\tilde{f}} \quad \text{ on }\quad {\widetilde{\varGamma }}\;, \qquad {u_\varGamma }|_{\partial {\widetilde{\varGamma }} } = 0 \;. \end{aligned}$$

Here, the “diffusion coefficient” \({\tilde{A}}\) in (8.1) is a symmetric, uniformly in \(\varGamma \) positive definite linear map acting on the tangent bundle of \(\varGamma \), and \(\nabla _\varGamma \) and \(\text {div}_\varGamma \) denote the surface gradient and divergence differential operators on \(\varGamma \), respectively (see [4]). With Sobolev spaces on \(\varGamma \) invariantly defined in the usual fashion (e.g., [4]), the surface diffusion operator \({\widetilde{\mathcal {L}}}\) in (8.1) extends to a boundedly invertible, self-adjoint operator \({\widetilde{\mathcal {L}}}: H^1_0({\widetilde{\varGamma }};\mu ) \rightarrow H^{-1}({\widetilde{\varGamma }};\mu ) = ( H^1_0({\widetilde{\varGamma }};\mu ) )^*\) (duality with respect to \(L^2({\widetilde{\varGamma }};\mu ) \sim (L^2({\widetilde{\varGamma }};\mu ))^*\)) whose inverse \({\widetilde{\mathcal {L}}}^{-1}\) is a compact, self-adjoint operator on \(L^2({\widetilde{\varGamma }};\mu )\). The spectral theorem implies that \({\widetilde{\mathcal {L}}}^{-1}\) admits a countable sequence of eigenpairs \(({\tilde{\lambda }}_k,{\tilde{\varphi }}_k)_{k\ge 1}\) whose eigenvectors \({\tilde{\varphi }}_k\) can be normalized so that they constitute an ONB of \(L^2({\widetilde{\varGamma }};\mu )\). With the ONB \(\{ {\tilde{\varphi }}_k \}_{k\ge 1}\), fractional Sobolev spaces on \({\widetilde{\varGamma }}\) can be defined as in (1.3), i.e. for \(0<s<1\),

$$\begin{aligned} {{\mathbb {H}}}^s({\tilde{\varGamma }}) := \left\{ w = \sum _{k=1}^\infty w_k {\tilde{\varphi }}_k: \Vert w \Vert _{{{\mathbb {H}}}^s({\tilde{\varGamma }})}^2 = \sum _{k=1}^{\infty } {\tilde{\lambda }}_k^s w_k^2 < \infty \right\} \;. \end{aligned}$$

The space \({{\mathbb {H}}}^s({\tilde{\varGamma }})\) can be characterized by (real) interpolation: There holds \({{\mathbb {H}}}^s({\tilde{\varGamma }}) = (H^1_0({\widetilde{\varGamma }};\mu ), L^2({\widetilde{\varGamma }};\mu ))_{s,2}\) for \(0<s<1\). As in (1.4), with the family \(({\tilde{\lambda }}_k,{\tilde{\varphi }}_k)_{k\ge 1}\) we may define the spectral fractional Laplacian \({\widetilde{\mathcal {L}}}^s = (\mathcal {L},I)_{s,2}\) by interpolation of linear operators (e.g. [32]). The arguments in [16] extend verbatim the localization (1.6) to the present setting. In particular, the spectral fractional diffusion operator on \({\widetilde{\varGamma }}\) with homogeneous Dirichlet boundary conditions on \(\partial {\widetilde{\mathcal {L}}}\) admits a localization on the cylinder \({\widetilde{\mathcal {C}}} = {\widetilde{\varGamma }} \times (0,\infty )\).

Pulling back the problem (8.1) via \(\chi \) into the (Euclidean) chart domain \(\varOmega \subset {\mathbb {R}}^2\), the Dirichlet problem for the fractional power \(s\in (0,1)\) of the surface diffusion (8.1) in \({\widetilde{\varGamma }} = \chi (\varOmega )\) reduces to (1.6) where the bilinear form (1.2) and diffusion coefficient A are given by

$$\begin{aligned} A(x') = G(x')^{\top }({\tilde{A}}\circ \chi )(x') G(x') \;,\quad x'\in \varOmega , \end{aligned}$$

with \(G(x') = D\chi (x'):\varOmega \rightarrow {\mathbb {R}}^{3\times 2}\) denoting the (assumed analytic in \(\overline{\varOmega }\)) metric of \({\mathcal {M}}\) in chart \(\chi \). The real-analyticity of compositions, sums and product of real-analytic functions implies that \(x' \mapsto A(x')\) satisfies (1.1) in \(\varOmega \), so that the preceding mathematical results also apply to (1.4) with (8.1).

8.2 N-widths of solution sets

The hp-approximation rate bounds for either the Extended \(hp\)-FEM (Theorems 5.15.4) and the sinc-BK FEM (Theorems 6.36.4) imply exponential bounds on N-widths of solution sets of (1.4) in a curvilinear polygon \(\varOmega \) as defined in Sect. 1.2, with the data A and f satisfying the conditions in Sect. 1.3. Such bounds are well-known to determine the rate of convergence of so-called reduced basis methods (see [47] and the references there).

We recall that, for a normed linear space X (with norm \(\Vert \circ \Vert _X\)) and for a compact subset \({{\mathcal {K}}}\subset X\), the N-width of \({{\mathcal {K}}}\) in X is given by

$$\begin{aligned} d_N({{\mathcal {K}}},X) = \inf _{E_N} \sup _{f\in {{\mathcal {K}}}} \inf _{g\in E_N} \Vert f - g \Vert _X. \end{aligned}$$

Here, the first infimum is taken over all subspaces \(E_N\) of X of dimension \(N\in {{\mathbb {N}}}\). Subspace sequences \(\{ E_N \}_{N\ge 1}\) that attain the rates of \(d_N({{\mathcal {K}}},X)\) in (8.3) as \(N\rightarrow \infty \) can be realized numerically by (generally non-polynomial) reduced bases ( [47] and the references there). We fix a set \(G \subset {{\mathbb {C}}}^2\) containing \(\overline{\varOmega }\) and choose \({{\mathcal {A}}}\subset L^2(\varOmega )\) as the set of functions \(f:\varOmega \rightarrow {{\mathbb {R}}}\) that admit a holomorphic extension to G with \(\Vert f \Vert _{L^\infty (G)} \le 1\). Then \({{\mathcal {K}}}:= \mathcal {L}^{-s}{{\mathcal {A}}}\subset {\mathbb {H}}^s(\varOmega )\) is a compact subset by the continuity of \( \mathcal {L}^{-s}: {{\mathbb {H}}}^{-s}(\varOmega ) \rightarrow {\mathbb {H}}^s(\varOmega )\) and the compact embedding \(L^2(\varOmega ) \subset {{\mathbb {H}}}^{-s}(\varOmega )\). We choose \(X={\mathbb {H}}^s(\varOmega )\) in (8.3).

From Theorem 6.4 and the fact that \(Q_k^{-s}(\mathcal {L}_{hp}) f \in S^q_0(\varOmega , {\mathcal {T}}^{\,\,L,n}_{geo,\sigma })\), with the choices of parameters in Theorem 6.4, \(N = \textrm{dim}S^q_0(\varOmega , {\mathcal {T}}^{\,\,L,n}_{geo,\sigma }) = O(q^4)\), for \({{\mathcal {K}}}\) as above and \(E_N = S^q_0(\varOmega , {\mathcal {T}}^{\,\,L,n}_{geo,\sigma })\) follows the (constructive) bound

$$\begin{aligned} d_N({{\mathcal {K}}},X) \lesssim \exp (-b\root 4 \of {N}) \end{aligned}$$

for some constant \(b>0\) independent of N.

We also mention that the argument in [40] can be adapted to the setting of (1.4) in Sect. 1.2, resulting in the (sharp) nonconstructive bound

$$\begin{aligned} d_N({{\mathcal {K}}},X) \lesssim \exp (-b\sqrt{N})\;. \end{aligned}$$

We refer to [2, 13, 20, 21] for numerical approximation of (1.4) using reduced basis methods.

8.3 Contour-integral methods: \(hp-{\textrm{DE}}\) Approximation

An alternative to the sinc-BK approximation in Theorems 6.3 and 6.4 is provided by quadrature approximations of contour-integral representations of the solution operator for the spectral fractional equation \({\mathcal {L}}^s u = f\). When combined with hp-FE discretizations from Sect. 4 in \(\varOmega \), they lead to similar algorithms and convergence rate bounds. We illustrate this for the so-called double exponential quadrature (“\({\textrm{DE}}\) approximation” for short) analyzed by M. Mori and coworkers (see [45] and references there).

In the presently considered PDE context, double exponential approximations \({\textrm{DE}}_k^{-s}\) to \(\mathcal {L}^{-s}\) were studied recently in [48]; there, the focus was on bounds for the \({\textrm{DE}}\) quadrature approximation error. The results of [48] can be combined with hp-approximation error bounds from Sect. 4 to yield novel error bounds for fully discrete “\(hp-{\textrm{DE}}\)” approximations.

The \({\textrm{DE}}\) approximation in [48] is based on the Riesz-Dunford representation with a contour \(\mathcal {C}_{\sigma , \kappa ,\theta } := \{ \kappa (\cosh (\sigma w) + i \theta \sinh (w)) \,:\, w\in {{\mathbb {R}}}\} \subset {{\mathbb {C}}}_>\) that is contained in the right half-plane \({{\mathbb {C}}}_> := \{ z\in {{\mathbb {C}}}\,:\, {\text {Re}}(z) > 0\} \), i.e.,

$$\begin{aligned} \mathcal {L}^{-s} = \frac{1}{2\pi i} \int _{{{\textsf{C}}}_{\sigma , \kappa ,\theta }} z^{-s} ({\mathcal {L}} - z)^{-1} dz \;. \end{aligned}$$

Here, \(\sigma , \kappa ,\theta \) are chosen in order for the contour \(\mathcal {C}_{\sigma , \kappa ,\theta }\subset {{\mathbb {C}}}_>\) to enclose \({\textrm{spec}}\,(\mathcal {L})\). For \(\mathcal {L}^{-s}\) with \({\mathcal {L}} = -\text {div}(A\nabla )\) and homogeneous Dirichlet boundary conditions on \(\partial \varOmega \), the choice \(\sigma =1\) in the parametrization [48, Eqn. (2.1)] of the contour is possible. The semidiscrete \({\textrm{DE}}\) approximations are, for \( K_{q}\in {{\mathbb {N}}}\), then given by [48, Def. 2.1]

$$\begin{aligned} {\textrm{DE}}^{-s}_k(\mathcal {L})f := \frac{k}{2\pi i} \sum _{|j|\le K_q} ((\psi _{1,\theta }(jk))^{-s-1} \psi '_{1,\theta }(jk) (\varepsilon _j^2 \mathcal {L} - 1)^{-1} f \;, \end{aligned}$$

where \(f\in L^2(\varOmega )\), the \({{\textrm{DE}}}\)-stepsize \(k := \ln (K_q)/K_q\), and [48, Eqn. (2.1)]

$$\begin{aligned} \varepsilon _j^2 := 1 / \psi _{1,\theta }(jk), \;\; \psi _{1,\theta }(y) := \kappa \left[ \cosh \left( \frac{\pi }{2}\sinh (y)\right) + i\theta \sinh \left( \frac{\pi }{2}\sinh (y)\right) \right] \;. \end{aligned}$$

With this choice of k, and with the choices \(\varepsilon = 0\), \(\rho = 0\) and \(r=s/2\), in [48, Cor. 3.6, Rem. 3.7] the \({\textrm{DE}}\) approximation error is bounded as

$$\begin{aligned} \Vert (\mathcal {L}^{-s}-{\textrm{DE}}^{-s}_k(\mathcal {L}))f \Vert _{{\mathbb {H}}^s(\varOmega )} \le C\exp (-b [K_q/\ln (K_q)]^{1/2}) \Vert f \Vert _{L^2(\varOmega )} \;, \end{aligned}$$

for some constants b, \(C>0\) that are independent of f and of \(K_q\).

As in the sinc-BK approach, the reaction-diffusion problems \((\varepsilon _j^2 \mathcal {L} - 1)w_j = f\) in \(\varOmega \) that appear in (8.6) are singularly perturbed: for large values of |j|, \(|\varepsilon _j|\downarrow 0\) with complex singular perturbation parameters \(\varepsilon _j \in {{\mathcal {S}}}_{\delta '}:=\{z \in {\mathbb {C}}\setminus \{0\} \,|\, |{\textrm{arg}}\, z| < \pi /2-\delta '\}\) for some \(\delta '>0\) depending on \(\kappa , \theta \), i.e., \(|{\text {Im}}\varepsilon _j| \lesssim {\text {Re}}\varepsilon _j \le |\varepsilon _j| \lesssim {\text {Re}}\varepsilon _j \) for all j.

The fully discrete hp-\({{\textrm{DE}}}\) approximation of (8.6) can be written as

$$\begin{aligned} {\textrm{DE}}^{-s}_k(\mathcal {L}_{hp})f := \frac{k}{2\pi i} \sum _{|j|\le K_q} ((\psi _{1,\theta }(jk))^{-s-1} \psi '_{1,\theta }(jk) w_j^{hp}, \end{aligned}$$

where \(w_j^{hp}=(\varepsilon _j^2 \mathcal {L}_{hp} - 1)^{-1} f \in H^1_0(\varOmega ;{{\mathbb {C}}})\) denotes either one of the hp-FE approximations (6.8) or (6.20) corresponding to the minimal mesh \({\mathcal {T}}^{\,\,L,q}_{min,\lambda }({\text {Re}}\varepsilon _j)\) (Case A) or the geometric boundary-refined meshes \({\mathcal {T}}^{\,\,L,n}_{geo,\sigma }\) (Case B). In both cases, the hp-FEM approach leads to quasi-optimality for the errors \(w_j - w_j^{hp}\) in the \(|\varepsilon _j|\)-weighted norm \(\Vert \cdot \Vert _{|\varepsilon _j|^2,\varOmega }\) given by (3.7); this is worked out for the case of small \(|\varepsilon _j|\) in [38, App. A] and uses in an essential way \({\text {Re}}\varepsilon _j > 0\).

As for the sinc-BK approach, the singular perturbation nature of the operators \(\varepsilon _j^2 \mathcal {L} - 1\) appearing in (8.6) leads, for analytic in \(\overline{\varOmega }\) source term f, to solutions \(w_j\) with boundary layers and corner layers. As a precise characterization of the regularity of solutions \(w_j\) in the presence of complex perturbation parameters \(\varepsilon _j\) does not seem available for polygons \(\varOmega \), we formulate the approximability of the \(w_j\) by hp-approximants \(w^{hp}_j\) as an assumption:

Assumption 8.1

Assume that (1.1) holds, and assume that, given choices \(q \sim L \le n\) and \(c_1>0\), there are constants C, \(b>0\), and \(\lambda _0 > 0\) such that the following holds:

  1. (i)

    For any \(\varepsilon \in {{\mathcal {S}}}_{\delta '}\) there holds for any \(\lambda \in (0,\lambda _0]\) for \(w = (\varepsilon ^2 \mathcal{L} -1)^{-1} f\):

    $$\begin{aligned} \inf _{v \in S^q_0(\varOmega ,{\mathcal {T}}^{\,\,L,q}_{min,\lambda }({\text {Re}}\varepsilon ))} \Vert w - v\Vert _{|\varepsilon |^2,\varOmega }&\le C e^{-b \lambda q}, \end{aligned}$$
  2. (ii)

    For any \(\varepsilon \in {{\mathcal {S}}}_{\delta '}\) satisfying the scale resolution condition \(\sigma ^L \le c_1 |\varepsilon |\) there holds for \(w = (\varepsilon ^2 \mathcal{L} -1)^{-1} f\):

    $$\begin{aligned} \inf _{v \in S^q_0(\varOmega ,{\mathcal {T}}^{\,\,L,n}_{geo,\sigma })} \Vert w - v\Vert _{|\varepsilon |^2,\varOmega }&\le C e^{-b q}, \end{aligned}$$

Remark 8.2

Assumption 8.1 has been verified in [38, App. A] for the case of \(\varOmega \subset {{\mathbb {R}}}^2\) having an analytic boundary. Then the solution \(w = (\varepsilon ^2 \mathcal{L} - 1)^{-1} f\) has only boundary layers (and no corner singularities or corner layers) and [38, App. A] ascertains that the regularity theory developed in [41] for real singular perturbation parameters extends to the case of \(\varepsilon \in {{\mathcal {S}}}_{\delta '} \subset {{\mathbb {C}}}_>\). While it should, in principle, be possible to extend the regularity theory of [41] to polygonal domains \(\varOmega \) and to complex-valued \(\varepsilon \in {\mathcal S}_{\delta '}\), a citable result does not seem available in the literature. \(\square \)

Remark 8.3

Distinct from the case of the sinc-BK FEM discussed in Sect. 6 where the solutions \(w_j\) to singularly perturbed problems are real-valued due to the parameters \(\varepsilon _j\) being real, the \({\textrm{DE}}\) approach generally leads to complex-valued \(w_j\), even for real-valued data f. \(\square \)

The \({\textrm{DE}}\) approximations (8.8) being structurally similar to sinc-BK (6.9), an error analysis analogous to Sect. 6 is possible, with error bounds essentially equal to those of Theorems 6.36.4.

Proposition 8.4

Let \(\varOmega \) be a curvilinear polygon as defined in Sect. 1.2. Let A, f satisfy (1.1), and let A be uniformly symmetric positive definite on \(\varOmega \). Let u be the solution to (1.4). Let the fully discrete approximation be given by the \(hp-{\textrm{DE}}\) approximation (8.8), where the hp-FE approximations \(w_j^{hp}\) in Case A are based on the j-dependent hp-FE spaces \(S^q_0(\varOmega , {\mathcal {T}}^{\,\,L,q}_{min,\lambda }({\text {Re}}\varepsilon _j))\) and in Case B on the spaces \(S^q_0(\varOmega ,{\mathcal {T}}^{\,\,L,n}_{geo,\sigma })\). Let Assumption 8.1 be valid, and choose \(q \sim L = n\) and \(K_q \sim q^2\). Let N denote the total number of degrees of freedom (i.e., \(N = \sum \nolimits _{|j| \le K_q}{\textrm{dim}}\, S^q_0(\varOmega , {\mathcal {T}}^{\,\,L,q}_{min,\lambda }({\text {Re}}\varepsilon _j))\) in Case A and \(N = (2K_q+1){\textrm{dim}}\,S^q_0(\varOmega , {\mathcal {T}}^{\,\,L,n}_{geo,\sigma })\) in Case B).

Then the following holds for the two cases Case A and Case B:

In Case A, there exists a constant \(\lambda _0>0\) (depending on \(\varOmega \), A, f, and the parameters characterizing the mesh family \({\mathcal {T}}^{\,\,L,q}_{min,\lambda }\)) such that for any \(\lambda \in (0,\lambda _0]\) there are constants C, \(b' > 0\) (depending additionally on s and the implied constants in \(q \sim L\) and \(K_q \sim q^2\)) such that

$$\begin{aligned} \Vert \mathcal {L}^{-s} f - {\textrm{DE}}_k^{-s}(\mathcal {L}_{hp}) f\Vert _{{\mathbb {H}}^s(\varOmega )} \le C \exp (-b'{N}^{1/5}/\sqrt{\ln N}). \end{aligned}$$

In Case B, there exist constants C, \(b' > 0\) depending on \(\varOmega \), A, f, \(\sigma \), s, the analyticity properties of the macro triangulation, and the implied constants in \(q \sim L\) and \(K_q \sim q^2\) such that

$$\begin{aligned} \Vert \mathcal {L}^{-s} f - {\textrm{DE}}_k^{-s}(\mathcal {L}_{hp}) f\Vert _{{\mathbb {H}}^s(\varOmega )} \le C \exp (-b'{N}^{1/6}/\sqrt{\ln N}). \end{aligned}$$


We start by noting that

$$\begin{aligned} N \sim K_q L q^2 \sim q^5 \quad \text { for }\mathbf{Case ~A}, \qquad N \sim K_q L^2 q^2 \sim q^6 \quad \text { for }\mathbf{Case ~B}. \end{aligned}$$

The \({\textrm{DE}}\) semidiscretization error (8.7) therefore has the desired form. As in the case of sinc-BK FEM in addition to the semidiscretization error one has to estimate

$$\begin{aligned} S:= k\sum _{|j| \le K_q} |\psi _{1,\theta }(jk)|^{-s-1} |\psi _{1,\theta }(jk)| \Vert w_j - w^{hp}_j\Vert _{{\mathbb {H}}^s(\varOmega )}. \end{aligned}$$

As in the case of the sinc-BK FEM (cf. the proof of Thms. 6.36.4), we estimate with the interpolation inequality (1.5) and the norm \(\Vert \cdot \Vert _{|\varepsilon _j|^2,\varOmega } \sim \Vert \cdot \Vert _{L^2(\varOmega )} + |\varepsilon _j| \Vert \nabla \cdot \Vert _{L^2(\varOmega )}\) similar to the procedure in deriving (6.12)

$$\begin{aligned} |\varepsilon _j|^{2s} \Vert w_j - w^{hp}_j\Vert _{{\mathbb {H}}^s(\varOmega )}&\lesssim |\varepsilon _j|^{2s} \Vert w_j - w^{hp}_j\Vert ^{1-s}_{L^2(\varOmega )} \Vert \nabla (w_j - w^{hp}_j)\Vert ^{s}_{L^2(\varOmega )} \nonumber \\&\lesssim |\varepsilon _j|^{s} \Vert w_j - w^{hp}_j\Vert _{|\varepsilon _j|^2,\varOmega }. \end{aligned}$$

From the explicit form of \(\psi _{1,\theta }\) we get

$$\begin{aligned} |\psi _{1,\theta }(jk)|^{-s-1} |\psi ^\prime _{1,\theta }(jk)| \lesssim |\varepsilon _j|^{2s} e^{|j|k}. \end{aligned}$$

Inserting (8.12), (8.13) in (8.11), we obtain

$$\begin{aligned} S \lesssim k \sum _{|j| \le K_q} e^{|j|k} |\varepsilon _j|^s \Vert w_j - w^{hp}_j\Vert _{|\varepsilon _j|^2,\varOmega }. \end{aligned}$$

At this point, we differentiate between Case A and Case B, which follow the proofs of Theorems 6.3 and 6.4, respectively.

We consider Case A first. The quasi-optimality of the Galerkin method in the norm \(\Vert \cdot \Vert _{|\varepsilon _j|^2,\varOmega }\), Assumption 8.1, and the trivial bound \(|\varepsilon _j| \lesssim 1\) give

$$\begin{aligned} S&\lesssim k \sum _{|j| \le K_q} e^{|j|k} |\varepsilon _j|^s \Vert w_j - w^{hp}_j\Vert _{|\varepsilon _j|^2,\varOmega } {\mathop {\lesssim }\limits ^{\text {Ass.}\,8.1; |\varepsilon _j| \lesssim 1}} k(2K_q+1) e^{k K_q } e^{-b \lambda q}. \end{aligned}$$

With the \({\textrm{DE}}\) stepsize \(k = \ln (K_q)/K_q\), we have \(k(2K_q+1) e^{k K_q} \lesssim K_q \ln K_q\). As \(K_q \sim q^2\) the algebraic term can be absorbed in the exponentially decaying one leading to \(S \lesssim e^{-b' q}\) for suitable \(b'>0\).

We turn to Case B. The assumption \(L \sim q\) implies that the spatial mesh \({\mathcal {T}}^{\,\,L,n}_{geo,\sigma }\) does not necessarily resolve all scales \(|\varepsilon _j|\). We therefore split the index set \(I = \{|j| \le K_q\}\) as \(I = I_{\textrm{res}} \cup I_{\textrm{nonres}} := \{j\,:\, \sigma ^L \le |\varepsilon _j|\} \cup \{j\,:\, \sigma ^L > |\varepsilon _j|\} \) into those \(j \in I_{\textrm{res}}\) whose corresponding scale \(|\varepsilon _j|\) can be resolved by \({\mathcal {T}}^{\,\,L,n}_{geo,\sigma }\) and those \(j \in I_{\textrm{nonres}}\) whose scale \(|\varepsilon _j|\) is not resolved by \({\mathcal {T}}^{\,\,L,n}_{geo,\sigma }\). We proceed as in the proof of Theorem 6.4 exploiting that for \(j \in I_{\textrm{res}}\), we will be able to use Assumption 8.1 whereas for \(j \in I_{\textrm{nonres}}\) we will have to use \(|\varepsilon _j|^s \le \sigma ^{sL} \) and the trivial estimate \(\Vert w_j - w^{hp}_j \Vert _{|\varepsilon _j|^2,\varOmega } \lesssim \Vert w_j\Vert _{|\varepsilon _j|^2,\varOmega } \lesssim \Vert f\Vert _{L^2(\varOmega )}\) that follows by Céa’s Lemma and stability:

$$\begin{aligned} S&\lesssim k \sum _{j \in I_{\textrm{res}}} e^{|j|k} |\varepsilon _j|^s \Vert w_j - w^{hp}_j\Vert _{|\varepsilon _j|^2,\varOmega } + k \sum _{j \in I_{\textrm{nonres}}} e^{|j|k} |\varepsilon _j|^s \Vert w_j - w^{hp}_j\Vert _{|\varepsilon _j|^2,\varOmega } \\&{\mathop {\lesssim }\limits ^{\text {Ass.}\,8.1; |\varepsilon _j| \lesssim 1}} k(2K_q+1) e^{k K_q } e^{-b q} + k \sum _{j \in I_{\textrm{nonres}}} e^{|j|k} |\varepsilon _j|^s \Vert w_j \Vert _{|\varepsilon _j|^2,\varOmega } \\&\lesssim k(2K_q+1) e^{k K_q } e^{-b q} +k (2 K_q+1)e^{k K_{q}} \sigma ^{Ls}. \end{aligned}$$

From \(L \sim q\), \(K_q \sim q^2\), and \(k = \ln (K_q) /K_q\) we get an exponential decay in q. \(\square \)

Remark 8.5

The bound (8.7) on the semidiscrete \({\textrm{DE}}\) approximation error assumed \(f\in L^2(\varOmega )\). In [48, Cor. 3.9], it is shown that under stronger, Gevrey-type regularity assumptions on f, again with the \({\textrm{DE}}\) stepsize \(k=\ln (K_q)/K_q\), the upper bound (8.7) can be strengthened to \(\lesssim \exp (-b/(k|\ln k |))\). Repeating the arguments in Proposition 8.4 with the choices \(q \sim L \sim K_q\) will imply error bounds of the form

$$\begin{aligned} \Vert \mathcal {L}^{-s} f - {\textrm{DE}}_k^{-s}(\mathcal {L}_{hp}) f\Vert _{{\mathbb {H}}^s(\varOmega )} \le C \exp (-b'{N}^{r}/\ln ^2 N) \end{aligned}$$

with \(r = \frac{1}{4}\) in Case A and \(r = \frac{1}{5}\) in Case B. \(\square \)

We illustrate the double exponential approximation by repeating the numerical experiment related to the fractional Laplacian on the slit domain \(\varOmega _3\). We again make use of geometric boundary-refined meshes \({{\mathcal {T}}}^{\,\,L,L}_{geo, \sigma _x}\) and the corresponding finite element spaces \(S^q_0(\varOmega ,{{\mathcal {T}}}^{\,\,L,L}_{geo,\sigma _x})\) with uniform polynomial degree \(q = p\), number of levels \(L = p\), and \(\sigma _x = 1/4\). In order to obtain an error of size \(O(e^{-c p})\) for some constant \(c > 0\), we choose the parameter \(K_q\) in the double exponential approximation asymptotically proportional to \(p^2\). Since, according to [48, Cor. 3.6], the constant b in (8.7) is proportional to \(\sqrt{s}\), after some experimentation, we set \(K_q = {\textrm{round}}((\frac{1}{2}p^2+10)/s)\) in all the experiments. Furthermore, following the advice from [48], we set \(k = 0.9 \ln (K_q) /K_q\) and \(\theta = 4\). This particular choice of k, instead of \(k = \ln (K_q) /K_q\), is made due to better stability properties and increases the error due to cut-off to \(O(e^{-K_q^{0.9}})\), which, however, does not impact the error estimates nor the numerical results visibly. As the solutions \(w_j^{hp}\) for \(j > 0\) are the complex conjugates of \(w_{-j}^{hp}\) with the negative index, it is sufficient to solve \(N_{\text {ls}} = K_q+1\) singularly perturbed problems instead of \(2K_q+1\). The results of the numerical experiments are shown in Fig. 9. The error, as a function of \(N_{\text {ls}}\), decays at the fastest rate for the DE approximation. However, it needs to be said that the use of complex arithmetic brings with it a large overhead in terms of computational time.

Fig. 9
figure 9

Energy norm convergence of the Extended \(hp\)-FEM (“hp”, solid), the sinc-BK FEM (“BK”, dashed), and the DE (“DE”, dotted) approximation for the slit domain \(\varOmega _3\) versus the polynomial degree p (left) and \(N_{\text {ls}}^{1/2}\), the square root of the number of linear systems to be solved (right)

8.4 Conclusions

For the Dirichlet problem of the spectral, fractional diffusion operator \({\mathcal {L}}^s\) with \(0<s<1\) in a bounded, polygonal domain \(\varOmega \subset {\mathbb {R}}^2\), we proposed three hp-FE discretizations: the Extended \(hp\)-FEM , the sinc-BK FEM and, somewhat briefly, the \(hp-{\textrm{DE}}\)-approximation. The first discretization, already considered in [9, 36], is based on the CS-extension upon hp-FE semi-discretization in the extended variable. Subsequent diagonalization leads to a decoupled system (3.5) of \({\mathcal {M}}\) linear and local, singularly perturbed second order reaction-diffusion problems in \(\varOmega \). Invoking analytic regularity results for these problems from [41, 43], and robust exponential convergence of hp-FEM for reaction-diffusion problems in polygons from [8, 41, 42], an exponential convergence rate bound \(C\exp (-b\root 6 \of {N_{DOF}})\) with respect to the total number of degrees of freedom, \(N_{DOF}\), which are used in the tensor-product hp-FE discretization, is established in the fractional Sobolev norm \({\mathbb {H}}^s(\varOmega )\). We add that the variational semi-discretization in Sect. 3 with respect to the extruded variable y offers the possibility for residual a posteriori error estimation.

The second discretization is based on the spectral integral representation of \(\mathcal {L}^{-s}\) due to Balakrishnan [7]. A sinc quadrature discretization [56] approximates the spectral integral by an (exponentially convergent [11, 56]) finite linear combination of solutions of decoupled elliptic reaction diffusion problems in \(\varOmega \) with analytic input data. Drawing once more on analytic regularity and robust exponential convergence of hp-FEM [8, 41,42,43], we prove exponential convergence also for this approach. A computable a posteriori bound for the semidiscretization error incurred for the sinc-BK FEM approach does not seem to be available currently.

The theoretical convergence rate bounds are verified in a series of numerical experiments. These show, in particular, for all schemes considered that exponential convergence is realized in the practical range of discretization parameters. They also indicate a number of practical issues, such as conditioning or algorithmic steering parameter selection, which are beyond the scope of the mathematical convergence analysis. We point out that the proposed algorithms and the exponential convergence results extend in several directions: besides homogeneous Dirichlet boundary conditions, also mixed, Dirichlet-Neumann boundary conditions, and operators with a nonzero first order term could be considered. In either case, the proposed algorithms extend readily. The main result is the construction of hp-FE discretizations with robust exponential convergence rates for spectral fractional diffusion in polygonal domains \(\varOmega \subset {{\mathbb {R}}}^2\). Similar results hold in bounded intervals \(\varOmega \subset {{\mathbb {R}}}^1\) (we refer to [9] for details). In polyhedral \(\varOmega \subset {{\mathbb {R}}}^3\), the present line of analysis is also applicable; however, exponential convergence and analytic regularity of hp-FEM for reaction-diffusion problems in space dimension \(d=3\) does not appear to be available to date. We considered fractional powers only for self-adjoint, second-order elliptic divergence-form differential operators \(\mathcal {L} w = - \text {div}( A \nabla w )\) in \(\varOmega \). The arguments for the sinc-BK FEM extend to non-selfadjoint operators which include first-order terms via [10], provided suitable hp-FEM for advection-reaction-diffusion problems in \(\varOmega \) are available (e.g. [37]). The proposed hp-sinc-BK approach could be replaced by the \(hp-{\textrm{DE}}\)-FEM contour-integration scheme based on [45, 48]. Distinct from the sinc-BK FEM, it requires complex arithmetic, even for real-valued data.

We also point to the numerical equivalence of the hp-FE “semidiscretization and diagonalization” approach presented in Sect. 3 with certain schemes based on rational approximation of the map \(z\mapsto z^{-s}\) on the spectrum of \(\mathcal {L}\) in \({{\mathbb {C}}}\), see, e.g., [30, Thm. 2]. By this result, the presently developed hp-error analysis in Sect. 5 directly implies an exponential convergence rate from Theorem 5.4 for a “rational approximation and hp-FEM in \(\varOmega \)” algorithm in curvilinear polygons.

The present analysis is indicative for achieving high, algebraic rate \(p+1-s\) of convergence in \({\mathbb {H}}^{s}(\varOmega )\) by h-version FEM of fixed order \(p\ge 1\) in \(\varOmega \). As in hp-FEM, this will require anisotropic mesh refinement aligned with \(\partial \varOmega \), ie., so-called “boundary-layer” meshes. Several constructions are available (see, e.g., [54] for so-called “exponential boundary layer meshes” and [49] for so-called “Shishkin meshes”). We refrain from developing details for this approach, which can be analyzed along the lines of the present paper.