1 Introduction

We investigate an interpolation problem for variational solutions obtained from finite element approximations. Its distinctive feature is that the discrete solution should be close to the exact solution at a finite number of trial points, which are selected for the purpose of topology optimization. For typical examples of such topology optimization problems see [7, 18, 19, 21].

It suffices to approximate solutions more accurately at the selected trial points (which are associated with mesh nodes) rather than in the whole computational domain. To reduce the discretization error at the nodes it is necessary to address the interpolate of the solution. Motivated by the need for highly accurate and stable interpolation for a class of variational solutions of scattering problems, we develop a concept of Petrov–Galerkin enrichment by the FEM.

Our motivation for FEM interpolation comes from numerous applications in the engineering sciences where topology optimization problems arise. For typical formulations of topology optimization as well as related inverse and ill-posed problems we refer to [2, 10, 17, 20], and to [8, 12, 16, 23, 31] for common numerical approaches used in the field.

As the reference model we consider the Helmholtz equation. It is well known that generalized finite element methods (GFEM) are well suited in this case since they reduce the so-called pollution error of discretization. We refer to [13, 15, 26, 29, 33] for the theoretical background of GFEM, and to [34] for their review. Major developments for GFEM were carried out by Babuška with coauthors.

To reduce the discretization error it is possible to enrich either the trial or the test function space. The former approach, enriching the trial space, is commonly used, but it has high computational costs and primarily serves the task of approximation rather than interpolation. The latter approach, enriching the test space, is based on a Petrov–Galerkin setting, see e.g. [4, 11, 28]. The Petrov–Galerkin concept is also the basis for the variational multiscale method (VMS) by Hughes et al., e.g. [14].

We shall utilize necessary optimality conditions for interpolation properties which are deduced when enriching the space of test functions. Using a Petrov–Galerkin approach, we suggest low order interpolation polynomials for the trial space, and we enrich the test space with high order shape functions. The resulting Petrov–Galerkin enrichment (PGE) significantly improves the accuracy of interpolation at a low cost, because low order finite elements are used for the discrete solution.

In the present paper a theoretical justification of PGE is provided by local wavelets with vanishing moments based on Gegenbauer polynomial approximation. We also derive practical formulas for calculating the system matrix for the reference Helmholtz equation on uniform meshes in 2d and 3d.

For a possible extension to nonuniform meshes and isoparametric elements we refer to [25]. In that paper, a global FE basis is enriched with polynomial bubbles based on preprocessing, where the weights multiplying these bubbles have to be found by solving local problems minimizing the dispersion. In the present context, the shape functions are to be weighted within the local optimization.

2 The concept of Petrov–Galerkin enrichment

We consider a class of variational problems formulated in the following form. Let \(X^0\) be a closed subspace of a complex Hilbert space X. For a given \(u_0\in X\) find \(u-u_0\in X^0\) such that

$$\begin{aligned} \begin{aligned}&b(u,v)=0\quad \hbox { for all}\ v\in X^0, \end{aligned} \end{aligned}$$
(2.1)

where the map \(b: X\times X\mapsto {\mathbb {C}}\) is continuous and sesquilinear over \({\mathbb {C}}\) (respectively, bilinear over \({\mathbb {R}}\)).

Assuming the existence of a variational solution to (2.1) we look for its finite dimensional approximation in a trial space \(X_h\subset X\). Let \(X^0_h\) be a closed subspace of \(X_h\) and \(u_0^h\in X_h\). For finite elements, the index \(h\in {\mathbb {R}}_+\) refers to the mesh size. Following the Petrov–Galerkin approach we set a discrete counterpart of (2.1) in the form: Find \(u^h -u_0^h\in X^0_h\) such that

$$\begin{aligned} \begin{aligned}&b_h(u^h,v^h) =0\quad \hbox { for all}\ v^h\in Y_h \end{aligned} \end{aligned}$$
(2.2)

over a test space \(Y_h\subset X\), where the continuous and sesquilinear form \(b_h: X\times X\mapsto {\mathbb {C}}\) refers to a discrete version of b. It may result from approximation of a computational domain, or may relate to the modified forms resulting from stabilization methods. In particular, \(b_h=b\) is a possible option. We remark that the solution \(u^h\) depends on the choice of the test space \(Y_h\).

The main idea resides in defining the test space \(Y_h\) in (2.2). The standard Galerkin method relies on setting \(Y_h=X^0_h\) which is not the best choice. In fact, we consider the following. Let \({\mathcal {B}}_X :=\{\phi _i\in X\vert i=1,\dots ,\infty \}\) form a FE basis of X. In this basis, we denote by \(X^\perp _h\subset X\) the orthogonal complement to \(X_h\) with respect to (2.2), namely

$$\begin{aligned} \begin{aligned}&X^\perp _h :=\mathrm{span}\{\phi _i\in {\mathcal {B}}_X\vert \; b_h(w^h, \phi _i) =0\quad \hbox { for all}\ w^h\in X_h\}. \end{aligned} \end{aligned}$$
(2.3)

Noting that in general \(X \not =X_h\bigoplus X^\perp _h\) we define the space

$$\begin{aligned} Z_h :=\mathrm{span}\{\phi _i\in {\mathcal {B}}_X\vert \;\phi _i\not \in X^\perp _h\}. \end{aligned}$$

We exclude the trivial case \(Z_h =\emptyset \). Indeed, in the subsequent considerations we have the direct sum \(X =Z_h\bigoplus X^\perp _h\) in a piecewise polynomial basis \({\mathcal {B}}_X\), and \(X_h\subset Z_h\). Since \(X^\perp _h\) is contained in the kernel of the linear operator \(b_h(u^h,\,\cdot \,): X\mapsto {\mathbb {C}}\) in (2.2), this suggests looking for a test space \(Y_h\) in \(Z_h\).

The following discussion will rely on an interpolation operator \(I_{X_h}: {\widetilde{X}}\mapsto X_h\) defined on a suitable subspace \({\widetilde{X}}\subseteq X\) where exact interpolation conditions at a fixed number of points are well-defined. This property is important, for instance, in topology optimization. This requires that the solution u of (2.1) belongs to \({\widetilde{X}}\) such that \(I_{X_h}u\) is well defined. In our case \(I_{X_h}\) is the classical interpolation operator and thus it suffices if \({\widetilde{X}}\) is contained in the space of continuous functions.

For any such interpolation operator, the approximation error between the solutions u of (2.1) and \(u^h\) of (2.2) can be estimated as

$$\begin{aligned} \begin{aligned}&\Vert u-u^h\Vert _X\le \Vert u-I_{X_h}u\Vert _X +\Vert I_{X_h}u-u^h\Vert _X, \end{aligned} \end{aligned}$$
(2.4)

where \(\Vert u-I_{X_h}u\Vert _X\) is the interpolation error, and \(\Vert I_{X_h}u-u^h\Vert _X\) is the error of the FE solution with respect to exact interpolation. We note that the former does not depend on the test space \(Y_h\), whereas the latter one is dependent on \(Y_h\).

Our goal is to construct the test space in such a manner that the latter error is minimized:

$$\begin{aligned} \begin{aligned}&\mathrm{minimize}\; \Vert I_{X_h}u-u^h\Vert _X\quad \hbox {over} \ Y_h \ \hbox {in} \ Z_h. \end{aligned} \end{aligned}$$
(2.5)

If the minimum in (2.5) is zero, this implies that

$$\begin{aligned} \begin{aligned}&u^h =I_{X_h}u. \end{aligned} \end{aligned}$$
(2.6)

If the exact solution u and, hence, its interpolate \(I_{X_h}u\) are known, then the necessary optimality condition for (2.5) can be deduced directly from (2.6). For this task, we substitute (2.6) in (2.2) to determine the test space \(Y_h\) in \(Z_h\) in Algorithm 2.1 below.

We assume that the union of all finite element spaces with respect to the family of meshes under consideration is dense in X. Since the FE basis is given in X we suggest the following conceptual algorithm for construction of the interpolation by means of the discrete problem (2.2).

Algorithm 2.1

Fix the form \(b_h\) in (2.2) and set the trial space \(X_h\).

Step 1. Find the orthogonal complement \(X^\perp _h\) in (2.3).

Step 2. Determine the test space \(Y_h\) by weighting the basis functions in \(Z_h\) to satisfy the necessary optimality condition

$$\begin{aligned} \begin{aligned}&b_h(I_{X_h}u,v^h) =0\quad \hbox { for all}\ v^h\in Y_h. \end{aligned} \end{aligned}$$
(2.7)

If the test space \(Y_h\) is constructed such that (2.7) in Step 2 is satisfied, then the discrete solution \(u^h\) of (2.2) coincides with the exact interpolation solution \(I_{X_h}u\). This construction needs knowledge of the exact solution \(u\in {\widetilde{X}}\) of (2.1), or, at least, particular solutions for specific data \(u_0\). Otherwise, we suggest relaxing (2.7) to the following approximate condition:

$$\begin{aligned} \begin{aligned}&b_h(I_{X_h}u,v^h) =\Phi _h(v^h)\quad \text {for all }v^h\in Y_h,\quad \text {with}\; \frac{|\Phi _h(v^h)|}{\Vert v^h\Vert _X}\le \mathrm{Er}(h), \end{aligned} \end{aligned}$$
(2.8)

where \(\Phi _h: X\mapsto {\mathbb {C}}\) is linear, and \(\mathrm{Er}:{\mathbb {R}}_+\mapsto {\mathbb {R}}_+\) expresses an error function.

Proposition 2.2

Let the test space \(Y_h\) be such that (2.8) holds, and let the form \(b_h\) in (2.2) satisfy the following inf-sup condition:

$$\begin{aligned} \begin{aligned}&\inf _{w^h\in X^0_h} \sup _{v^h\in Y_h} \frac{|b_h(w^h,v^h)|}{\Vert w^h\Vert _X \Vert v^h\Vert _X} \ge \gamma >0. \end{aligned} \end{aligned}$$
(2.9)

Then the error of the FE solution of (2.2) can be estimated by

$$\begin{aligned} \begin{aligned}&\Vert I_{X_h}u-u^h\Vert _X\le {\textstyle \frac{1}{\gamma }} \mathrm{Er}(h). \end{aligned} \end{aligned}$$
(2.10)

Proof

Due to (2.9), estimate (2.10) follows by subtracting (2.2) from (2.8). \(\square \)

We note that from (2.9) the stability estimate \(\Vert u^h -u_0^h\Vert _X \le {\textstyle \frac{1}{\gamma }} \Vert b_h(u_0^h,\,\cdot \,)\Vert _{X^{-1}}\) follows, where \(X^{-1}\) denotes the dual space of X.

In Sect. 3 we shall present our approach for the specific case of the Helmholtz equation. We confine ourselves to the case when \(X_h\) is the space of continuous piecewise linear trial functions. For this case, it is known that the approximation error is of order \(\mathrm{O}(h)\). We realize Algorithm 2.1 by substituting particular solutions in the form of plane waves into (2.8) and by dispersion analysis. In this way we determine the space \(Y_h\) consisting of weighted continuous piecewise quadratic test functions in \(Z_h\). As a result of asymptotic analysis for \(\kappa :=\frac{kh}{2}\rightarrow 0\), where k stands for the wave number, we obtain the error function \(\mathrm{Er}(h)=\mathrm{o}(\kappa ^7)\) in (2.8). Moreover, we prove that the system matrix associated to \(b_h\) is positive definite and hence the inf-sup condition (2.9) holds. Therefore Proposition 2.2 guarantees interpolation of order \(\mathrm{o}(\kappa ^7)\).

3 The Helmholtz equation

In this section we first formulate the Helmholtz problem under consideration and then show how to use Algorithm 2.1.

3.1 The Helmholtz problem formulation and discretization

Let \(\Omega \subset {\mathbb {R}}^d\), \(d\in \{2,3\}\), be a bounded domain with the Lipschitz boundary \(\partial \Omega \). We consider the following model problem: Find \(u\in H^1(\Omega ;{\mathbb {C}}) =:X\) such that

$$\begin{aligned} \begin{aligned}&u=u_0\quad \hbox { on}\ \partial \Omega ,\\&\int _{\Omega } (\nabla u\cdot \nabla {\overline{v}} -k^2 u{\overline{v}})\,dx =0\quad \text {for all }v\in H^1_0(\Omega ;{\mathbb {C}}) =:X^0, \end{aligned} \end{aligned}$$
(3.1)

where the wave number \(k\in {\mathbb {R}}\) and the Dirichlet data \(u_0\in H^1(\Omega ;{\mathbb {C}})\) are given. Except for eigenvalues, the unique solvability of the variational equation (3.1) follows from Fredholm's theorem, see e.g. [6, Section 5.3].

The solution of (3.1) fulfills the homogeneous Helmholtz equation: \(-\Delta u -k^2 u =0\) in \(\Omega \). Therefore, the interior \(C^\infty \)-regularity of u follows from the general theory of linear elliptic PDEs, see [22, Chapter 3]. Moreover, we assume \(\partial \Omega \) of class \(C^1\) and \(u_0\in W^{1/p',p}(\partial \Omega ;{\mathbb {C}})\) for \(p'\) the conjugate of p and \(p>d\). Then there exists \(U_0\in W^{1,p}(\Omega )\) such that \(U_0 =u_0\) on \(\partial \Omega \), see [35, Lemma 1.49]. The difference \(w:=u-U_0\) has zero trace and solves the inhomogeneous Helmholtz equation: \(-\Delta w -k^2 w =k^2 U_0 -\Delta U_0\) in \(\Omega \). Therefore, the embedding \(u\in W^{1,p}(\Omega )\subset C({\overline{\Omega }})\) holds, see [35, Theorem 3.16]. This provides well-posedness of a pointwise evaluation of u in \({\overline{\Omega }}\) for the interpolation \(I_{X_h}u\) (see (3.2) below).

While we consider the Dirichlet problem (3.1), our subsequent considerations can be extended to the cases of Neumann, Robin, and mixed boundary conditions as well as inhomogeneous equations.

We assume that the computational domain \(\Omega _h\subset {\mathbb {R}}^d\) is polyhedral endowed with a mesh \(M_h\) and coincides with the physical domain \(\Omega \). Its boundary is denoted by \(\partial \Omega _h\). In particular, we assume that \(\Omega _h\) consists of uniform elements of size \(h>0\), quadrilaterals in \({\mathbb {R}}^2\) and polyhedra in \({\mathbb {R}}^3\). Let \(N_h\) denote the set of all mesh nodes \(x^j\in M_h\subset {\overline{\Omega }}_h\), \(j=1,\dots ,N\). We set \(X_h\subset H^1(\Omega _h;{\mathbb {C}})\) as continuous piecewise d-linear (bilinear in \({\mathbb {R}}^2\), trilinear in \({\mathbb {R}}^3\)) finite element functions with respect to the mesh \(M_h\) over \(\Omega _h\). The respective FE functions from \(X^0_h\) have zero traces at \(\partial \Omega _h\).

In the following by \(I_{X_h}\) we denote the classical, piecewise d-linear interpolation operator in \(C({\overline{\Omega }};{\mathbb {C}}) =:{\widetilde{X}}\) fulfilling

$$\begin{aligned} I_{X_h}u(x^j) =u(x^j)\quad \text {at the mesh points }x^j, j=1,\dots ,N. \end{aligned}$$
(3.2)

This operator is stable, see e.g. [5, Lemma 2.1].

The discrete counterpart of (3.1) thus reads: Find \(u^h\in X_h\) such that

$$\begin{aligned} \begin{aligned}&u^h=I_{X_h}u_0\quad \hbox { on}\ \partial \Omega _h,\\&\int _{\Omega _h} (\nabla u^h\cdot \nabla {\overline{v}}^h -k^2 u^h{\overline{v}}^h)\,dx =0,\quad \hbox { for all}\ v^h\in Y_h. \end{aligned} \end{aligned}$$
(3.3)

The unknown test space \(Y_h\subset H^1(\Omega _h;{\mathbb {C}})\) in (3.3) is to be determined according to Algorithm 2.1. For this purpose we construct in the next section a continuous piecewise polynomial basis in \(H^1(\Omega _h;{\mathbb {C}})\) based on Gegenbauer polynomials.

3.2 The continuous piecewise polynomial basis

To construct a continuous piecewise polynomial basis we suggest to use Gegenbauer polynomials \(G_n\in {\mathbb {P}}_n([-1,1])\), \(n=0,1,\dots \), which are defined by the recursion

$$\begin{aligned} \begin{aligned}&G_0(\xi ):=1,\quad G_1(\xi ):=-\xi , \quad G_2(\xi ) ={\textstyle \frac{1-\xi ^2}{2}}, \quad G_3(\xi ) ={\textstyle \frac{\xi (1-\xi ^2)}{2}},\dots \\&G_n(\xi )={\textstyle \frac{1}{n}} \bigl ( (2n-3)\xi G_{n-1}(\xi ) -(n-3) G_{n-2}(\xi )\bigr )\quad \hbox { for}\ n=2,3,\dots \end{aligned} \end{aligned}$$
(3.4)

More of their properties are given in Appendix A.
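The recursion (3.4) is easy to check numerically. The following is a minimal sketch, assuming NumPy is available; the helper name `gegenbauer` is hypothetical and introduced only for this illustration. It builds \(G_0,\dots ,G_5\) from the three-term recursion, compares \(G_2\) and \(G_3\) with the closed forms listed in (3.4), and confirms that the polynomials of degree two and more vanish at \(\xi =\pm 1\).

```python
import numpy as np
from numpy.polynomial import Polynomial


def gegenbauer(n_max):
    """Return [G_0, ..., G_{n_max}] built with the three-term recursion (3.4).

    Hypothetical helper for illustration; not part of the paper.
    """
    xi = Polynomial([0.0, 1.0])
    G = [Polynomial([1.0]), -xi]          # G_0 = 1, G_1 = -xi
    for n in range(2, n_max + 1):
        G.append(((2 * n - 3) * xi * G[n - 1] - (n - 3) * G[n - 2]) / n)
    return G[: n_max + 1]


G = gegenbauer(5)
pts = np.linspace(-1.0, 1.0, 7)

# The recursion reproduces the closed forms listed in (3.4).
assert np.allclose(G[2](pts), (1 - pts**2) / 2)
assert np.allclose(G[3](pts), pts * (1 - pts**2) / 2)

# Polynomials of degree two and more vanish at the end points xi = -1 and xi = 1.
for n in range(2, 6):
    assert abs(G[n](-1.0)) < 1e-12 and abs(G[n](1.0)) < 1e-12
```

The vanishing end-point values are what allows the higher modes to be added to the linear shape functions without affecting the nodal values.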

In the numerical framework, the Gegenbauer polynomials (3.4) are well suited as hierarchical shape functions for hp-FEM, see e.g. [32]. In our theoretical optimization context, they will be used for mother wavelets with vanishing moments in Appendix A.

Based on (3.4) the hierarchical shape functions are given in the local coordinates \(\xi \in [-1,1]\) with respect to the binary parameter \(t\in \{-1,1\}\) as

$$\begin{aligned} \begin{aligned}&{\widetilde{G}}^t_1(\xi ) :={\textstyle \frac{1+t\xi }{2}},\quad {\widetilde{G}}^t_n(\xi ) :=t^nG_n(\xi )\quad \hbox { for}\ n=2,3,\dots . \end{aligned} \end{aligned}$$
(3.5)

For vector coordinates \(\xi =(\xi _1,\dots ,\xi _d)\in [-1,1]^d\) and binary multi-indices \((t_1,\dots ,t_d)\in \{-1,1\}^d\), we define the d-dimensional shape functions as

$$\begin{aligned}&S^{(t_1,\dots ,t_d)}_{n_1\dots n_d}(\xi ) :=\prod _{i=1}^d {\widetilde{G}}^{t_i}_{n_i}(\xi _i) \quad \text {with }n_i\in {\mathbb {N}}, t_i\in \{-1,1\}\text { for }i=1,\dots ,d, d\in {\mathbb {N}},\nonumber \\ \end{aligned}$$
(3.6)

which are d-dimensional polynomials \({\mathbb {P}}_n([-1,1]^d)\) of degree at most \(n={\displaystyle \max _{i=1,\dots ,d}} n_i\).

Fig. 1
Patch \(\Pi ^j\) of four quadrilaterals containing \(x^j=(x^j_1,x^j_2)\) in a 2d computational domain

In the computational domain, we apply the following affine coordinate transformation given for every fixed mesh point \(x^j=(x^j_1,\dots ,x^j_d)\in {\mathbb {R}}^d\) and \((t_1,\dots ,t_d)\in \{-1,1\}^d\) by

$$\begin{aligned} \begin{aligned}&\{x_i =x^j_i +{\textstyle \frac{h}{2}} (\xi _i-t_i)\}_{i=1}^d: \quad \{\xi _i\}_{i=1}^d \mapsto \{x_i\}_{i=1}^d, \quad [-1,1]^d \mapsto Q^j_{(t_1,\dots ,t_d)},\quad \end{aligned} \end{aligned}$$
(3.7)

which transforms the parent domain \([-1,1]^d\) to the polyhedra

$$\begin{aligned} \begin{aligned}&Q^j_{(t_1,\dots ,t_d)} :=\{x_i\in {\mathbb {R}}:\; x^j_i -{\textstyle \frac{h}{2}} (1+t_i)\le x_i\le x^j_i +{\textstyle \frac{h}{2}} (1-t_i)\}_{i=1}^d . \end{aligned} \end{aligned}$$
(3.8)

The \(2^d\) adjacent polyhedra in (3.8) forming the patch centered at the point \(x^j\) are

$$\begin{aligned} \begin{aligned}&\Pi ^j :=\bigcup _{t_1,\dots ,t_d\in \{-1,1\}}Q^j_{(t_1,\dots ,t_d)} =\{x_i\in {\mathbb {R}}:\; x^j_i -h\le x_i\le x^j_i +h\}_{i=1}^d, \end{aligned} \end{aligned}$$

as illustrated in 2d in Fig. 1. Applying (3.7) to (3.6) we get the transformed shape functions

$$\begin{aligned} \begin{aligned}&S^{(t_1,\dots ,t_d)}_{n_1\dots n_d}(x) :=S^{(t_1,\dots ,t_d)}_{n_1\dots n_d} \bigl (t_1+{\textstyle \frac{2}{h}} (x_1-x^j_1),\dots , t_d+{\textstyle \frac{2}{h}} (x_d-x^j_d) \bigr ) \end{aligned} \end{aligned}$$
(3.9)

as polynomials \({\mathbb {P}}_n(Q^j_{(t_1,\dots ,t_d)})\) of degree at most \(n={\displaystyle \max _{i=1,\dots ,d}} n_i\) on the polyhedra (3.8) in the reference patch \(\Pi ^j\). We have the following lemma.

Lemma 3.1

The continuous piecewise polynomials of degree n on the uniform polyhedral mesh \(M_h\) over \(\Omega _h\subset {\mathbb {R}}^d\) can be spanned with the shape functions (3.9) of degree at most n on the polyhedra.

Proof

First we note that the piecewise d-linear functions in (3.9), namely

$$\begin{aligned} \begin{aligned}&S^{(t_1,\dots ,t_d)}_{1\dots 1}(x) =\prod _{i=1}^d \bigl (1+{\textstyle \frac{t_i}{h}} (x_i-x^j_i) \bigr ) \quad \text {for }x\in Q^j_{(t_1,\dots ,t_d)},\quad (t_1,\dots ,t_d)\in \{-1,1\}^d, \end{aligned} \end{aligned}$$

form the usual “hat” function supported on the patch \(\Pi ^j\). Second, we have

$$\begin{aligned} \begin{aligned}&S^{(t_1,\dots ,t_d)}_{n_1\dots n_d}(x) =0 \quad \text {for }x\in \partial Q^j_{(t_1,\dots ,t_d)}\text { if }\min _{i=1,\dots ,d} n_i =2,\quad (t_1,\dots ,t_d)\in \{-1,1\}^d, \end{aligned} \end{aligned}$$

because the Gegenbauer polynomials of degree two and more vanish at the boundary (see property (A.3) in Appendix  A). Therefore, the shape functions (3.9) of degree at most n form a basis for the continuous piecewise polynomials of degree n supported on the patch \(\Pi ^j\). We associate the center-point \(x^j\) of \(\Pi ^j\) to the nodes \(N_h=\{x^j\}_{j=1}^N\) of a uniform polyhedral mesh \(M_h\) over \(\Omega _h\). Since every continuous piecewise polynomial with respect to the mesh can be partitioned continuously over the set of overlapping patches \(\{\Pi ^j\}_{j=1}^N\) which cover \({\overline{\Omega }}_h\), this proves the assertion. \(\square \)

For the underlying Helmholtz problem (3.3), from Lemma 3.1 and Theorem A.6 which is proved in Appendix  A, the main result of this section will follow.

Theorem 3.2

Let \(X_h\) be the space of continuous piecewise d-linear polynomials over \({\mathbb {C}}\) on a uniform polyhedral mesh \(M_h\) over \(\Omega _h\subset {\mathbb {R}}^d\). The orthogonal complement with respect to (3.3) (see (A.16))

$$\begin{aligned} \begin{aligned}&X^\perp _h =\{v\in H^1(\Omega _h;{\mathbb {C}}): \int _{\Omega _h} (\nabla w^h\cdot \nabla {\overline{v}} -k^2 w^h{\overline{v}})\,dx =0\quad \hbox { for all}\ w^h\in X_h\} \end{aligned} \end{aligned}$$

can be spanned by continuous piecewise polynomials of degree \(n\ge 3\) on the mesh \(M_h\) over \(\Omega _h\). Moreover, continuous piecewise polynomials of degree at most \(n=2\) form a basis for the finite-dimensional space \(Z_h\) for fixed h.

By Theorem 3.2 we next construct the test space \(Y_h\) in \(Z_h\) according to Step 2 in Algorithm 2.1 as follows.

Let \(u\in H^1(\Omega ;{\mathbb {C}})\) be the exact solution of the variational problem (3.1). As introduced above, let \(I_{X_h}u\in X_h\) denote the interpolation over \(M_h\) satisfying the interpolation condition (3.2). Following (2.7) we look for a test space \(Y_h\) in \( Z_h\) satisfying the following equation:

$$\begin{aligned} \begin{aligned}&\int _{\Omega _h} \bigl ( \nabla (I_{X_h}u)\cdot \nabla {\overline{v}}^h -k^2 (I_{X_h}u){\overline{v}}^h \bigr )\,dx =0\quad \text {for all }v^h\in Y_h. \end{aligned} \end{aligned}$$
(3.10)

Due to Theorem 3.2, \(Y_h\) should be sought within continuous piecewise quadratic polynomials on the mesh \(M_h\) over \(\Omega _h\). In fact, nothing is gained by expanding the basis of \(Y_h\) beyond quadratic polynomials, since their contribution to the system matrix of (3.10) is zero. Since the dimension of \(X_h\) is less than that of \(Z_h\), substituting the span of \(X_h\) and the span of \(Z_h\) into (3.10) would result in an underdetermined problem for the unknown coefficients of the respective basis functions. We reduce the number of quadratic test functions using appropriate simplifications and accounting for particular solutions u for specific data \(u_0\). As a natural simplification, we suggest relying on symmetric test functions in a finite element basis. Further, we provide a dispersion analysis of (3.10) with respect to the particular solutions u specified by plane waves.

3.3 The dispersion analysis in 2d

In this section we describe in detail the dispersion analysis in 2d. The 3d results will be presented in Appendix B.

To choose an appropriate finite element basis for the test space \(Y_h\) we assemble the linear and quadratic shape functions from (3.9) in a globally continuous and symmetric way as follows. In \({\mathbb {R}}^2\), at every patch \(\Pi ^j =Q^j_{(1,-1)}\cup Q^j_{(-1,-1)}\cup Q^j_{(1,1)}\cup Q^j_{(-1,1)}\) consisting of four adjacent quadrilaterals which are defined for \(t_1,t_2\in \{-1,1\}\) by

$$\begin{aligned} \begin{aligned}&Q^j_{(t_1,t_2)} =\{x_i\in {\mathbb {R}}:\; x^j_i -{\textstyle \frac{h}{2}} (1+t_i)\le x_i\le x^j_i +{\textstyle \frac{h}{2}} (1-t_i)\}_{i=1}^2, \end{aligned} \end{aligned}$$

see Fig. 1, we define the node (linear-linear), edge (linear-quadratic), and bubble (quadratic-quadratic) modes with respect to two spatial dimensions, respectively, by

$$\begin{aligned}&\text {linear-linear:}\quad S^{(t_1,t_2)}_{11}(x) =\bigl (1+{\textstyle \frac{t_1}{h}} (x_1-x^j_1)\bigr ) \bigl (1+{\textstyle \frac{t_2}{h}} (x_2-x^j_2)\bigr ), \end{aligned}$$
(3.11a)
$$\begin{aligned}&\text {linear-quadratic:}\quad S^{(t_1,t_2)}_{21}(x) +S^{(t_1,t_2)}_{12}(x)\nonumber \\&\quad =-2\bigl ({\textstyle \frac{t_1}{h}} (x_1-x^j_1) +{\textstyle \frac{t_2}{h}} (x_2-x^j_2)\bigr ) S^{(t_1,t_2)}_{11}(x), \end{aligned}$$
(3.11b)
$$\begin{aligned}&\text {quadratic-quadratic:}\quad S^{(t_1,t_2)}_{22}(x) =4{\textstyle \frac{t_1}{h}} (x_1-x^j_1) {\textstyle \frac{t_2}{h}} (x_2-x^j_2) S^{(t_1,t_2)}_{11}(x).\nonumber \\ \end{aligned}$$
(3.11c)

These shape functions are illustrated in Fig. 2. In comparison, the trial space \(X_h\) consists of bilinear shape functions corresponding to the node mode (3.11a) only. Thus, the test space \(Y_h\) is enriched compared to \(X_h\).

Fig. 2
Test space: continuous and symmetric shape functions on the patch in \({\mathbb {R}}^2\)
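The closed forms (3.11a)–(3.11c) can be cross-checked against the definitions (3.5), (3.7) and (3.9). The sketch below assumes NumPy; the function names `G_tilde` and `S` are hypothetical stand-ins for \({\widetilde{G}}^t_n\) and \(S^{(t_1,t_2)}_{n_1n_2}\), and the mesh size and node position are arbitrary illustrative values.

```python
import itertools
import numpy as np


def G_tilde(n, t, xi):
    """1d hierarchical shape functions (3.5), only n = 1, 2 are needed here."""
    if n == 1:
        return (1 + t * xi) / 2
    if n == 2:
        return t**2 * (1 - xi**2) / 2     # t^2 * G_2(xi)
    raise ValueError("only n = 1, 2 are implemented in this sketch")


def S(n1, n2, t, x, xj, h):
    """Transformed 2d shape function (3.9) on the quadrilateral Q^j_(t1,t2)."""
    xi = [t[i] + 2.0 / h * (x[i] - xj[i]) for i in range(2)]
    return G_tilde(n1, t[0], xi[0]) * G_tilde(n2, t[1], xi[1])


h, xj = 0.3, np.array([1.0, -2.0])        # arbitrary mesh size and mesh node
rng = np.random.default_rng(0)

for t in itertools.product((-1, 1), repeat=2):
    # Random point of Q^j_t: local coordinates in [-1, 1]^2 mapped by (3.7).
    xi = rng.uniform(-1, 1, size=2)
    x = xj + h / 2 * (xi - np.array(t))
    y = [t[i] / h * (x[i] - xj[i]) for i in range(2)]
    s11 = S(1, 1, t, x, xj, h)
    # (3.11a): node mode
    assert np.isclose(s11, (1 + y[0]) * (1 + y[1]))
    # (3.11b): edge mode
    assert np.isclose(S(2, 1, t, x, xj, h) + S(1, 2, t, x, xj, h),
                      -2 * (y[0] + y[1]) * s11)
    # (3.11c): bubble mode
    assert np.isclose(S(2, 2, t, x, xj, h), 4 * y[0] * y[1] * s11)
```

All three identities follow from \(\xi _i =t_i\bigl (1+2{\textstyle \frac{t_i}{h}}(x_i-x^j_i)\bigr )\), which gives \(G_2(\xi _i) =-2{\textstyle \frac{t_i}{h}}(x_i-x^j_i)\bigl (1+{\textstyle \frac{t_i}{h}}(x_i-x^j_i)\bigr )\).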

Next we apply the finite element ansatz function as

$$\begin{aligned} \begin{aligned}&I_{X_h}u(x) =\sum _{j=1}^N U^j \sum _{(\tau _1,\tau _2)\in \{-1,1\}^2:\, Q^j_{(\tau _1,\tau _2)}\cap \Omega _h\not =\emptyset } S^{(\tau _1,\tau _2)}_{11}(x) \end{aligned} \end{aligned}$$
(3.12)

with unknown coefficients \(\{U^j\}_{j=1}^N\in {\mathbb {C}}^N\), and

$$\begin{aligned} \begin{aligned} v^h_i(x)&=\sum _{(t_1,t_2)\in \{-1,1\}^2:\, Q^i_{(t_1,t_2)}\cap \Omega _h\not =\emptyset } \Bigl ( \alpha ^NS^{(t_1,t_2)}_{11}(x) +\alpha ^E \bigl (S^{(t_1,t_2)}_{21}(x)\\&\quad +S^{(t_1,t_2)}_{12}(x) \bigl ) +\alpha ^BS^{(t_1,t_2)}_{22}(x)\Bigr ) \quad \hbox { for}\ i=1,\dots ,N \end{aligned} \end{aligned}$$
(3.13)

with unknown weights \(\alpha ^N, \alpha ^E, \alpha ^B \in {\mathbb {R}}\).

Substituting (3.12) and (3.13) into (3.10) we obtain the system matrix, which we denote by \(A(\kappa ^2) =\{a_{ij}(\kappa ^2)\}_{i,j=1}^N \in {\mathbb {R}}^{N\times N}\). Its entries are given by

$$\begin{aligned} \begin{aligned}&a_{ij}(\kappa ^2) =\sum _{Q^i_{(t_1,t_2)}=Q^j_{(\tau _1,\tau _2)}} \int _{Q^i_{(t_1,t_2)}} \Bigl \{ \nabla S^{(\tau _1,\tau _2)}_{11} \cdot \nabla \Bigl ( \alpha ^NS^{(t_1,t_2)}_{11}(x)\\&+\alpha ^E \bigl (S^{(t_1,t_2)}_{21}(x) +S^{(t_1,t_2)}_{12}(x) \bigl ) +\alpha ^BS^{(t_1,t_2)}_{22}(x)\Bigr )-k^2 S^{(\tau _1,\tau _2)}_{11}\\&\times \Bigl ( \alpha ^NS^{(t_1,t_2)}_{11}(x) +\alpha ^E \bigl (S^{(t_1,t_2)}_{21}(x) +S^{(t_1,t_2)}_{12}(x) \bigl ) +\alpha ^BS^{(t_1,t_2)}_{22}(x)\Bigr ) \Bigr \}\,dx, \end{aligned} \end{aligned}$$
(3.14)

for \(i,j=1,\dots ,N\). Henceforth the notation \(\kappa :=\frac{kh}{2}\) is used. These coefficients can be calculated explicitly and we find

$$\begin{aligned} \begin{aligned}&a_{ij}(\kappa ^2) =a_{ji}(\kappa ^2)\\&=\left\{ \begin{array}{ll} 4A_0(\kappa ^2), &{} \text {if }\Pi ^i=\Pi ^j \text { with 4 overlapping quadrilaterals of }\Pi ^i\text { and }\Pi ^j\\ 2A_1(\kappa ^2), &{} \text {if }\Pi ^i\cap \Pi ^j\not =\emptyset \text { with 2 overlapping quadrilaterals of }\Pi ^i\text { and }\Pi ^j\\ A_2(\kappa ^2), &{} \text {if }\Pi ^i\cap \Pi ^j\not =\emptyset \text { with 1 overlapping quadrilateral of }\Pi ^i\text { and }\Pi ^j\\ 0, &{} \text {if }\Pi ^i\cap \Pi ^j=\emptyset , \end{array}\right. \end{aligned} \end{aligned}$$

according to the nine-point interior stencil

$$\begin{aligned} \begin{aligned}&A_{\mathrm{stencil}}^{\mathrm{interior}}(\kappa ^2) =\left( \begin{array}{l@{\qquad }c@{\qquad }r} A_2(\kappa ^2) &{} 2A_1(\kappa ^2) &{} A_2(\kappa ^2)\\ 2A_1(\kappa ^2) &{} 4A_0(\kappa ^2) &{} 2A_1(\kappa ^2)\\ A_2(\kappa ^2) &{} 2A_1(\kappa ^2) &{} A_2(\kappa ^2) \end{array}\right) \end{aligned} \end{aligned}$$

with stencil coefficients

$$\begin{aligned} \begin{aligned}&A_0(\kappa ^2) =\alpha ^N \bigl ( {\textstyle \frac{2}{3} -\frac{4}{9}} \kappa ^2 \bigr ) +\alpha ^E \bigl ( {\textstyle \frac{1}{3} -\frac{4}{9}} \kappa ^2 \bigr ) -\alpha ^B {\textstyle \frac{1}{9}} \kappa ^2,\\&A_1(\kappa ^2) =\alpha ^N \bigl ( {\textstyle -\frac{1}{6} -\frac{2}{9}} \kappa ^2 \bigr ) -\alpha ^E {\textstyle \frac{1}{3}} \kappa ^2 -\alpha ^B {\textstyle \frac{1}{9}} \kappa ^2,\\&A_2(\kappa ^2) =\alpha ^N \bigl ( {\textstyle -\frac{1}{3} -\frac{1}{9}} \kappa ^2 \bigr ) +\alpha ^E \bigl ( {\textstyle -\frac{1}{3} -\frac{2}{9}} \kappa ^2 \bigr ) -\alpha ^B {\textstyle \frac{1}{9}} \kappa ^2 \end{aligned} \end{aligned}$$
(3.15)

and unknown weights \(\alpha ^N, \alpha ^E, \alpha ^B\).
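The stencil coefficients (3.15) can be reproduced by numerical quadrature of (3.14) on a single element. The following sketch assumes NumPy and works in the local coordinates \(\xi \in [-1,1]^2\): in 2d the gradient term is invariant under the scaling (3.7), while the mass term picks up the factor \(h^2/4\), so that \(k^2h^2/4=\kappa ^2\). The names `lin`, `quad` and `element_terms` are hypothetical helpers, and the value of \(\kappa ^2\) is arbitrary.

```python
import numpy as np

# Gauss-Legendre rule on [-1, 1]; 4 points are exact up to degree 7 per variable.
gx, gw = np.polynomial.legendre.leggauss(4)
X1, X2 = np.meshgrid(gx, gx, indexing="ij")
W = np.outer(gw, gw)


def lin(t, xi):
    """Linear shape function (equal to 1 at xi = t) and its derivative."""
    return (1 + t * xi) / 2, np.full_like(xi, t / 2)


def quad(xi):
    """Quadratic mode G_2 from (3.4) and its derivative."""
    return (1 - xi**2) / 2, -xi


def element_terms(t, tau, kappa2):
    """One-element contribution to (3.14), split into node, edge, bubble modes.

    t is the local corner of the test node, tau the corner of the trial node.
    """
    Lt1, dLt1 = lin(t[0], X1)
    Lt2, dLt2 = lin(t[1], X2)
    Ls1, dLs1 = lin(tau[0], X1)
    Ls2, dLs2 = lin(tau[1], X2)
    Q1, dQ1 = quad(X1)
    Q2, dQ2 = quad(X2)
    trial = (Ls1 * Ls2, dLs1 * Ls2, Ls1 * dLs2)      # value, d/dxi_1, d/dxi_2
    modes = {
        "N": (Lt1 * Lt2, dLt1 * Lt2, Lt1 * dLt2),
        "E": (Q1 * Lt2 + Lt1 * Q2, dQ1 * Lt2 + dLt1 * Q2, Q1 * dLt2 + Lt1 * dQ2),
        "B": (Q1 * Q2, dQ1 * Q2, Q1 * dQ2),
    }
    out = {}
    for name, (v, v1, v2) in modes.items():
        integrand = trial[1] * v1 + trial[2] * v2 - kappa2 * trial[0] * v
        out[name] = float(np.sum(W * integrand))
    return out


k2 = 0.37                                             # arbitrary value of kappa^2
cases = [
    ((1, 1),   (2/3 - 4/9 * k2, 1/3 - 4/9 * k2, -k2 / 9)),    # A_0: same node
    ((-1, 1),  (-1/6 - 2/9 * k2, -k2 / 3, -k2 / 9)),          # A_1: edge neighbour
    ((-1, -1), (-1/3 - 1/9 * k2, -1/3 - 2/9 * k2, -k2 / 9)),  # A_2: diagonal neighbour
]
for tau, expected in cases:
    got = element_terms((1, 1), tau, k2)
    assert np.allclose([got["N"], got["E"], got["B"]], expected)
```

Each tuple in `cases` lists the factors multiplying \(\alpha ^N\), \(\alpha ^E\) and \(\alpha ^B\) in \(A_0\), \(A_1\) and \(A_2\), respectively; the multiplicities 4, 2 and 1 in the stencil count the elements shared by the two patches.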

To determine these unknown weights \(\alpha ^N\), \(\alpha ^E\), and \(\alpha ^B\) we employ particular solutions of the Helmholtz equation. In fact, utilizing interior regularity the solution u of (3.1) satisfies the following equation point-wise

$$\begin{aligned} \begin{aligned}&-\Delta u(x) -k^2 u(x)=0\quad \hbox { for}\ x\in \Omega . \end{aligned} \end{aligned}$$
(3.16)

The particular solutions of (3.16) are plane waves of the complex form

$$\begin{aligned} \begin{aligned}&u(x) =e^{\imath k(x_1\cos \theta +x_2\sin \theta )},\quad \imath ^2=-1, \end{aligned} \end{aligned}$$
(3.17)

with an arbitrary incident angle \(\theta \in (-\pi ,\pi ]\). The piecewise d-linear interpolation \(I_{X_h}u\in X_h\) of (3.17) on every polyhedron in the patch \(\Pi ^j\), \(j=1,\dots ,N\), reads

$$\begin{aligned} \begin{aligned}&I_{X_h}u(x) =\sum _{\tau _1,\tau _2\in \{-1,1\}} e^{\imath \kappa \bigl ((\tau _1-t_1)\cos \theta +(\tau _2-t_2)\sin \theta \bigr )} S^{(\tau _1,\tau _2)}_{11}(x)\\&\times e^{\imath k(x^j_1\cos \theta +x^j_2\sin \theta )} \quad \text {for} \ x\in Q^j_{(t_1,t_2)}, t_1,t_2\in \{-1,1\}. \end{aligned} \end{aligned}$$
(3.18)

Substituting expressions (3.18) and (3.13) into (3.10), for the interior stencil we derive the usual dispersion equation with respect to the incident angle \(\theta \):

$$\begin{aligned} \begin{aligned}&A_0(\kappa ^2) +A_1(\kappa ^2) \bigl ( \cos (2\kappa C) +\cos (2\kappa S) \bigr ) +A_2(\kappa ^2) \cos (2\kappa C) \cos (2\kappa S) =0 \end{aligned}\nonumber \\ \end{aligned}$$
(3.19)

where the notation \(C:=\cos \theta \) and \(S:=\sin \theta \) is used. Inserting (3.15) into (3.19), we obtain one linear equation \({\mathcal {D}}(\alpha ^N, \alpha ^E, \alpha ^B; \kappa ^2, \theta )=0\) for the three unknowns \(\alpha ^N\), \(\alpha ^E\), and \(\alpha ^B\), depending on \(\kappa ^2\) and \(\theta \). For each \(\kappa =\frac{kh}{2}\) and for each \(\theta \), equality (3.19) implies a linear condition which can be solved for the variables

$$\begin{aligned} \{\alpha ^N(\kappa ^2,\theta ), \alpha ^E(\kappa ^2,\theta ), \alpha ^B(\kappa ^2,\theta )\} \end{aligned}$$

depending on the parameters \((\kappa ^2,\theta )\).

Our consideration results in the following.

Proposition 3.3

If the incident direction \((\cos \theta ,\sin \theta )\) in \({\mathbb {R}}^2\) is fixed a priori, then there exist nontrivial weights \(\alpha ^N(\kappa ^2,\theta ), \alpha ^E(\kappa ^2,\theta ), \alpha ^B(\kappa ^2,\theta )\in {\mathbb {R}}\) which solve the dispersion equation (3.19). These weights determine the finite element basis (3.13) for the test space \(Y_h\). The exact interpolation \(I_{X_h}u\) from (3.18) solves problem (3.10) stated over \(Y_h\). If the solution to (3.10) is unique, then it coincides with \(I_{X_h}u\).

We note that Proposition 3.3 justifies the optimality condition (2.7) given in Step 2 of Algorithm 2.1 for the underlying Helmholtz problem.
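For a fixed incident angle, the statement of Proposition 3.3 can be illustrated numerically. The sketch below assumes NumPy and evaluates the left-hand side of (3.19) with the coefficients (3.15). Since one linear equation leaves two degrees of freedom, it prescribes \(\alpha ^N=1\) and \(\alpha ^B=0\) and solves for \(\alpha ^E\). This normalization, like the function names, is an assumption made only for the illustration and is not taken from the paper.

```python
import numpy as np


def stencil(aN, aE, aB, k2):
    """Stencil coefficients (3.15) for weights (aN, aE, aB) and k2 = kappa^2."""
    A0 = aN * (2/3 - 4/9 * k2) + aE * (1/3 - 4/9 * k2) - aB * k2 / 9
    A1 = aN * (-1/6 - 2/9 * k2) - aE * k2 / 3 - aB * k2 / 9
    A2 = aN * (-1/3 - 1/9 * k2) + aE * (-1/3 - 2/9 * k2) - aB * k2 / 9
    return A0, A1, A2


def dispersion(aN, aE, aB, kappa, theta):
    """Left-hand side of the dispersion equation (3.19)."""
    A0, A1, A2 = stencil(aN, aE, aB, kappa**2)
    c1 = np.cos(2 * kappa * np.cos(theta))
    c2 = np.cos(2 * kappa * np.sin(theta))
    return A0 + A1 * (c1 + c2) + A2 * c1 * c2


def edge_weight_for_angle(kappa, theta, aN=1.0, aB=0.0):
    """Solve D = 0 for aE with aN and aB prescribed (illustrative normalization)."""
    # D is linear in the weights, so its coefficients are D at the unit vectors.
    dN = dispersion(1.0, 0.0, 0.0, kappa, theta)
    dE = dispersion(0.0, 1.0, 0.0, kappa, theta)
    dB = dispersion(0.0, 0.0, 1.0, kappa, theta)
    return -(aN * dN + aB * dB) / dE


kappa, theta = 0.5, np.pi / 6
aE = edge_weight_for_angle(kappa, theta)

# With this weight the dispersion equation holds up to rounding for the chosen angle.
assert abs(dispersion(1.0, aE, 0.0, kappa, theta)) < 1e-12

# For another incident angle the same weights leave a residual in general.
print(aE, dispersion(1.0, aE, 0.0, kappa, 0.0))
```

The weight obtained in this way depends on both \(\kappa \) and \(\theta \), which is why it is of limited use when the incident angle is not known in advance.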

In realistic situations, the incident angle \(\theta \) is unknown a priori, and the dispersion equation (3.19) cannot be solved for an arbitrary angle. Therefore, in the following we rely on the asymptotic model of (3.19) when \(\kappa \rightarrow 0\). In view of the considerations of Sect. 2 this will give us an approximate optimality condition (2.8) instead of (2.7) (respectively, instead of (3.10) for the underlying Helmholtz problem).

We look for the weights in (3.13) given in asymptotic form with respect to \(\kappa ^2\) as

$$\begin{aligned} \alpha ^N(\kappa ^2) =1 +\alpha ^N_1\kappa ^2,\quad \alpha ^E(\kappa ^2) =\alpha ^E_0 +\alpha ^E_1\kappa ^2,\quad \alpha ^B(\kappa ^2) =\alpha ^B_0 +\alpha ^B_1\kappa ^2,\nonumber \\ \end{aligned}$$
(3.20)

with five unknown coefficients \(\alpha ^N_1\in {\mathbb {R}}\), and \(\alpha ^E_i, \alpha ^B_i\in {\mathbb {R}}\) for \(i=0,1\). The choice of five unknowns is justified below (3.21). We substitute (3.15) and (3.20) into (3.19) and apply the asymptotic relations:

$$\begin{aligned} \begin{aligned}&\cos (2\kappa C) +\cos (2\kappa S) =2 -2\kappa ^2 +\bigl (1-2(CS)^2\bigr ) {\textstyle \frac{2}{3}} \kappa ^4 -\bigl (1-3(CS)^2\bigr ) {\textstyle \frac{4}{45}} \kappa ^6 +\mathrm{o}(\kappa ^7),\\&\cos (2\kappa C) \cdot \cos (2\kappa S) =1 -2\kappa ^2 +\bigl (1+4(CS)^2\bigr ) {\textstyle \frac{2}{3}} \kappa ^4 -\bigl (1+12(CS)^2\bigr ) {\textstyle \frac{4}{45}} \kappa ^6 +\mathrm{o}(\kappa ^7). \end{aligned} \end{aligned}$$

This substitution reduces the dispersion equation to the following asymptotic equality

$$\begin{aligned} \begin{aligned}&{\mathcal {D}}(\alpha ^N(\kappa ^2), \alpha ^E(\kappa ^2), \alpha ^B(\kappa ^2); \kappa ^2, \theta ) = {\textstyle 2\kappa ^2 \bigl (-\frac{1}{3}\alpha ^E_0 -\frac{2}{9} \alpha ^B_0 \bigr )}\\&+{\textstyle \frac{2}{3}\kappa ^4 \bigl (\frac{1}{2} +\frac{4}{3}\alpha ^E_0 -2\alpha ^E_1 +\frac{2}{3} \alpha ^B_0 -\frac{4}{3}\alpha ^B_1\bigr )} +{\textstyle (CS)^2\frac{2}{3}\kappa ^4 \bigl (-1 -\frac{4}{3}\alpha ^E_0\bigr )}\\&+{\textstyle \frac{4}{45}\kappa ^6 \bigl (-2 +\frac{15}{2}\alpha ^N_1 -\frac{23}{6}\alpha ^E_0 +20\alpha ^E_1 -\frac{5}{3} \alpha ^B_0 +10\alpha ^B_1\bigr )}\\&+{\textstyle (CS)^2 \frac{4}{45}\kappa ^6 \bigl (\frac{7}{2} -15\alpha ^N_1 +\frac{7}{3}\alpha ^E_0 -20\alpha ^E_1 -\frac{5}{3} \alpha ^B_0 \bigr )} +\mathrm{Er}(\kappa ,\theta ), \end{aligned} \end{aligned}$$
(3.21)

where the residual \(\mathrm{Er}(\,\cdot \,,\theta ) =\mathrm{o}(\kappa ^7)\) for \(\theta \in (-\pi ,\pi ]\). With the following five coefficients

$$\begin{aligned} \begin{aligned}&\alpha ^N_1 =-{\textstyle \frac{9}{20}},\quad \alpha ^E_0 =-{\textstyle \frac{3}{4}},\quad \alpha ^E_1 ={\textstyle \frac{13}{40}},\quad \alpha ^B_0 ={\textstyle \frac{9}{8}},\quad \alpha ^B_1 =-{\textstyle \frac{9}{80}} \end{aligned} \end{aligned}$$
(3.22)

we eliminate the five linear combinations multiplying the terms of order \(\mathrm{O}(\kappa ^2)\), \(\mathrm{O}(\kappa ^4)\), \(\mathrm{O}(\kappa ^6)\), \(\mathrm{O}((CS)^2\kappa ^4)\), and \(\mathrm{O}((CS)^2\kappa ^6)\) in (3.21), which provides the asymptotic equality

$$\begin{aligned} \begin{aligned}&{\mathcal {D}}\bigl ( {\textstyle 1 -\frac{9}{20}\kappa ^2, -\frac{3}{4} +\frac{13}{40}\kappa ^2, \frac{9}{8} -\frac{9}{80}\kappa ^2}; \kappa ^2, \theta \bigr ) =\mathrm{Er}(\kappa ,\theta ),\\&\mathrm{Er}(\kappa ,\theta ) =\mathrm{o}(\kappa ^7) \quad \text {for all }\theta \in (-\pi ,\pi ]. \end{aligned} \end{aligned}$$
(3.23)

Inserting (3.22) into (3.15) we get finally the following stencil coefficients

$$\begin{aligned} \begin{aligned}&A_0(\kappa ^2) ={\textstyle \frac{5}{12} -\frac{77}{180} \kappa ^2 +\frac{49}{720} \kappa ^4},\quad A_1(\kappa ^2) ={\textstyle -\frac{1}{6} -\frac{1}{45} \kappa ^2 +\frac{1}{240} \kappa ^4},\\&A_2(\kappa ^2) ={\textstyle -\frac{1}{12} -\frac{1}{36} \kappa ^2 -\frac{7}{720} \kappa ^4}. \end{aligned} \end{aligned}$$
(3.24)

The corresponding biquadratic finite element basis functions

$$\begin{aligned} \begin{aligned}&\alpha ^N(\kappa ^2) S^{(t_1,t_2)}_{11}(x) +\alpha ^E(\kappa ^2) \bigl (S^{(t_1,t_2)}_{21}(x) +S^{(t_1,t_2)}_{12}(x) \bigr ) +\alpha ^B(\kappa ^2) S^{(t_1,t_2)}_{22}(x) \end{aligned} \end{aligned}$$

for \(x\in Q^j_{(t_1,t_2)}\), \(t_1,t_2\in \{-1,1\}\), are depicted in Fig. 3 in dependence of \(\kappa \).

Fig. 3 Test space: biquadratic finite elements on the patch in \({\mathbb {R}}^2\)

These basis functions correspond to the center point in the patch shown in Fig. 2. They are a specific combination, with the coefficients \((\alpha ^N(\kappa ^2), \alpha ^E(\kappa ^2), \alpha ^B(\kappa ^2))\), of the three shape modes therein. When varying \(\kappa \) we observe a difference between the basis functions shown in the plots (a) and (b) for \(\kappa =\sqrt{\frac{2}{7}(11-\sqrt{46})}\approx 1.0977\) (see the remark after Theorem 3.4) and \(\kappa =0.25\), respectively, while the basis functions depicted in the plots (b) and (c) for \(\kappa =0.25\) and \(\kappa =10^{-10}\) are visually indistinguishable. This indicates that \(\kappa \) need not be chosen very small for computations, in spite of the asymptotic arguments used for the analysis as \(\kappa \rightarrow 0\). We shall discuss the choice of \(\kappa \) for the computational realization in the next section.

We finish this section with a few remarks. It is argued in [3] that, generally, no further reduction of degree in the residual error \(\mathrm{o}(\kappa ^7)\) can be attained in the context of the dispersion equation (3.19). From (3.23) we can also determine the discrete wave number \(k_\kappa \) such that \(|k_\kappa -k| =\mathrm{o}(\kappa ^7)\).

In the following section we show well-posedness of (3.3) for the chosen trial and test spaces. Subsequently, we estimate the respective error of discretization.

3.4 Well posedness and a-priori error analysis

Now we consider the discrete variational problem (3.3) with the test space \(Y_h =Y_h^\kappa \) spanned by the biquadratic finite element functions weighted according to (3.20) and (3.22). The test space is enriched in comparison with the bilinear finite elements in the trial space \(X_h\).

To guarantee well posedness of (3.3) in the test space \(Y_h^\kappa \), positive definiteness of the system matrix \(A(\kappa ^2)\in \mathrm{Sym}(N^2)\) with coefficients \(A_0(\kappa ^2), A_1(\kappa ^2), A_2(\kappa ^2)\) from (3.24) is needed. The coefficients enter the system matrix \(A(\kappa ^2)\) in the following way. The nonzero entries in each row, and likewise in each column, are nine in number: \(4A_0(\kappa ^2)\) (once), \(2A_1(\kappa ^2)\) (four times), and \(A_2(\kappa ^2)\) (four times). We start our consideration with the Laplace operator, which corresponds to \(\kappa =0\) in (3.24), and then extend it by continuity to \(\kappa >0\). We note that the relations \(A_0(0)>0\) and \(A_0(0)+2A_1(0)+A_2(0)=0\) hold, which imply the usual consistency conditions for the Laplace operator. In fact, we can derive the following properties of the matrix \(A(0)\in \mathrm{Sym}(N^2)\):

  (i) A(0) has positive diagonal entries.

  (ii) A(0) is an L-matrix: \(a_{ij}(0)\le 0\) for \(j\not =i\), \(a_{ii}(0)>0\).

  (iii) A(0) is diagonally dominant: \(\sum _{j\not =i} |a_{ij}(0)|\le |a_{ii}(0)|\).

  (iv) A(0) is irreducible: no permutation matrix P exists such that \(PA(0)P^\top \) can be reduced.

From (i), (iii), and (iv) it follows that A(0) is irreducibly diagonally dominant, hence nonsingular, see [36]. Since the determinant of \(A(\kappa ^2)\) is a continuous function of its entries, the following existence theorem holds.
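Properties (i) to (iv) can be checked on a small grid. The sketch below is illustrative only and rests on assumptions not stated in the text: the matrix is assembled on an \(n\times n\) grid of interior nodes, neighbours outside the grid are dropped (mimicking homogeneous Dirichlet data), \(2A_1\) is assigned to the four edge neighbours and \(A_2\) to the four corner neighbours, and all function names are hypothetical. The checked properties do not depend on which group of four neighbours receives which value. Note that the standard notion of irreducible diagonal dominance also asks for strict inequality in at least one row; in this sketch that holds at the rows adjacent to the boundary.

```python
from collections import deque
from fractions import Fraction

# Stencil coefficients (3.24) evaluated at kappa = 0.
A0, A1, A2 = Fraction(5, 12), Fraction(-1, 6), Fraction(-1, 12)


def assemble(n):
    """Dense matrix of the nine-point stencil on an n-by-n grid of nodes."""
    size = n * n
    a = [[Fraction(0)] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    p, q = i + di, j + dj
                    if 0 <= p < n and 0 <= q < n:
                        if di == 0 and dj == 0:
                            value = 4 * A0   # centre node
                        elif di == 0 or dj == 0:
                            value = 2 * A1   # edge neighbours (assumed)
                        else:
                            value = A2       # corner neighbours (assumed)
                        a[i * n + j][p * n + q] = value
    return a


def off_diagonal_sum(a, i):
    return sum(abs(v) for j, v in enumerate(a[i]) if j != i)


def is_l_matrix(a):
    return all(a[i][i] > 0
               and all(v <= 0 for j, v in enumerate(a[i]) if j != i)
               for i in range(len(a)))


def is_diagonally_dominant(a):
    return all(off_diagonal_sum(a, i) <= abs(a[i][i]) for i in range(len(a)))


def has_strictly_dominant_row(a):
    return any(off_diagonal_sum(a, i) < abs(a[i][i]) for i in range(len(a)))


def is_irreducible(a):
    """For a symmetric pattern, irreducibility means the graph is connected."""
    seen, queue = {0}, deque([0])
    while queue:
        i = queue.popleft()
        for j, v in enumerate(a[i]):
            if v != 0 and j not in seen:
                seen.add(j)
                queue.append(j)
    return len(seen) == len(a)


matrix = assemble(4)
print(is_l_matrix(matrix), is_diagonally_dominant(matrix),
      has_strictly_dominant_row(matrix), is_irreducible(matrix))
```

Exact rational arithmetic is used so that the equality case of (iii) at interior rows is not blurred by rounding.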

Theorem 3.4

There exists \(\kappa _0^2>0\) (which may be small) such that the discrete variational problem (3.3) stated in the trial space \(X_h\) and in the test space \(Y_h^\kappa \) is well posed for all \(\kappa ^2\le \kappa _0^2\), \(\kappa :=\frac{kh}{2}\).

We can evaluate an upper bound for the constant \(\kappa _0^2\) in Theorem 3.4 from the necessary condition (i) by requiring: \(\kappa _0<\sqrt{\frac{2}{7}(11-\sqrt{46})}\approx 1.0977\). For larger values of \(\kappa \) we obtain negative diagonal entries. In diverse tests the choice \(\kappa \le 0.25\) was successful (see Fig. 4).
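The stated bound follows from the smallest positive root of \(A_0(\kappa ^2)\) in (3.24). As a short sketch (variable names are hypothetical), multiplying \(A_0\) by 720 gives the quadratic \(49t^2-308t+300=0\) in \(t=\kappa ^2\), whose smaller root can be evaluated directly:

```python
import math


def a0(kappa2):
    """Diagonal stencil coefficient A_0 from (3.24) as a function of kappa^2."""
    return 5 / 12 - 77 / 180 * kappa2 + 49 / 720 * kappa2**2


# 720 * A_0(t) = 49 t^2 - 308 t + 300 with t = kappa^2.
discriminant = 308**2 - 4 * 49 * 300
t_small = (308 - math.sqrt(discriminant)) / (2 * 49)
kappa_bound = math.sqrt(t_small)

print(t_small, 2 / 7 * (11 - math.sqrt(46)))  # the two values should agree
print(kappa_bound)                            # close to 1.0977
print(a0(0.25**2) > 0, a0(1.2**2) < 0)        # sign change across the bound
```

This only reproduces the necessary condition (i); it does not by itself establish well-posedness for all \(\kappa \) below the bound.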

Well-posedness can be assured only if \(k^2\) is bounded away from the finite-dimensional eigenvalues of the Laplace operator. Otherwise, if \(k^2\) approaches the finite-dimensional eigenvalues, it is said to enter a zone of degeneracy, see the related investigation in [9, 24, 27]. In 3d, for a specific choice of \(\alpha ^B_1\) (see Appendix B), we will get \(A_0(\kappa ^2)>0\) for all \(\kappa ^2\), which is an improvement compared to the 2d case.

Now with the help of Theorem 3.4 we investigate the error of (3.3). We consider the plane wave solution u in (3.17) and its linear interpolate \(I_{X_h}u\) given in (3.18). The optimality condition (3.10) is not satisfied exactly with the test space \(Y_h =Y_h^\kappa \). But with (3.23) we can argue that a residual term \(\phi _h^\kappa \in X_h\) exists such that

$$\begin{aligned} \begin{aligned}&\int _{\Omega _h} \bigl (\nabla (I_{X_h}u)\cdot \nabla {\overline{v}}^h -k^2 (I_{X_h}u) {\overline{v}}^h\bigr )\,dx =\int _{\Omega _h} \phi _h^\kappa {\overline{v}}^h \,dx\quad \text {for all }v^h\in Y_h^\kappa , \end{aligned}\nonumber \\ \end{aligned}$$
(3.25a)
$$\begin{aligned} \begin{aligned}&\Bigl | \int _{\Omega _h} \phi _h^\kappa {\overline{v}}^h \,dx\Bigr | \le \mathrm{Er}(\kappa )\, \Vert v^h\Vert _X,\quad \mathrm{Er}(\kappa )>0,\quad \mathrm{Er}(\kappa ) =\mathrm{o}(\kappa ^7), \end{aligned}\nonumber \\ \end{aligned}$$
(3.25b)

holds (compare with (2.8)). In comparison with (3.25a), the discrete counterpart \(u^h\in X_h\) solves the homogeneous equation

$$\begin{aligned} \begin{aligned}&u^h=I_{X_h}u\quad \hbox { on}\ \partial \Omega _h,\\&\int _{\Omega _h} \bigl (\nabla u^h\cdot \nabla {\overline{v}}^h -k^2 u^h {\overline{v}}^h\bigr )\,dx =0\quad \text {for all }v^h\in Y_h^\kappa , v^h=0\text { on }\partial \Omega _h. \end{aligned} \end{aligned}$$
(3.26)

For \(\kappa ^2<\kappa _0^2\), from Theorem 3.4 the inf-sup condition (2.9) follows which in our case has the form of the LBB condition with the LBB constant \(\gamma \):

$$\begin{aligned} \begin{aligned}&\inf _{w^h\in X^0_h} \sup _{v^h\in Y_h} \frac{\Bigl | \int _{\Omega _h} \bigl (\nabla w^h\cdot \nabla {\overline{v}}^h -k^2 w^h {\overline{v}}^h\bigr )\,dx \Bigr |}{\Vert w^h\Vert _X \Vert v^h\Vert _X} \ge \gamma >0. \end{aligned} \end{aligned}$$
(3.27)

Therefore, analogously to Proposition 2.2 in Sect. 2, from (3.24) and (3.27) we infer the following result on the asymptotic error.

Theorem 3.5

Let \(\kappa ^2<\kappa _0^2\). For an arbitrary plane wave solution u, due to the asymptotic estimate of the dispersion (3.23), the error between the exact linear interpolation and the FE solution of (3.26) is given by

$$\begin{aligned} \begin{aligned}&\Vert I_{X_h}u -u^h\Vert _X =\mathrm{o}(\kappa ^7), \quad \kappa ={\textstyle \frac{kh}{2}}. \end{aligned} \end{aligned}$$
(3.28)

The assertion of Theorem 3.5 holds also in the 3d case, which will be described in Appendix B.

We note that the Dirichlet problem (3.26) can be generalized to an arbitrary mixed setting in the following way. We consider the Dirichlet, Neumann, and Robin boundary conditions stated on pairwise disjoint boundary parts \(\Gamma _h^D\cup \Gamma _h^N\cup \Gamma _h^R =\partial \Omega _h\) of the computational domain \(\Omega _h\) endowed with the mesh \(M_h\). In discretized form this leads to the following variational problem: Find \(u^h\in X_h\) such that \(u^h=I_{X_h}u_0\) on \(\Gamma _h^D\) and

$$\begin{aligned} \begin{aligned}&\int _{\Omega _h} \bigl (\nabla u^h\cdot \nabla {\overline{v}}^h -k^2 u^h {\overline{v}}^h\bigr )\,dx +\int _{\Gamma _h^R} \beta _h u^h{\overline{v}}^h \,dx =0\quad \hbox { for all}\ v^h\in Y_h^\kappa , \end{aligned} \end{aligned}$$
(3.29)

where the continuous data \(I_{X_h}u_0\) and \(\beta _h\) are given in \({\overline{\Omega }}_h\) and \(\Gamma _h^R\), respectively. Using bilinear finite elements for \(u^h\in X_h\), and the biquadratic finite elements given by (3.10), (3.13) with the weights from (3.20), (3.22) for \(v^h\in Y_h^\kappa \), the stiffness and mass matrices can be calculated according to the terms in (3.29). The high order of interpolation, which was validated for the Dirichlet problem (3.26), may not hold for the variational boundary conditions appearing in (3.29). Possible ways of improvement are discussed in Sect. 4.

3.5 The a-posteriori numerical analysis

We compare our approximation by the Petrov–Galerkin enrichment (PGE) with the standard Galerkin least squares (GLS) and other GFEM methods: the quasi-stabilized (QSFEM) method introduced in [3], and the variational multiscale (VMS) version from [30], which are well known in the literature. By \(u^h\) we denote the discrete solutions corresponding to these FE methods, which are compared with the exact solution u of the Helmholtz problem (3.1) given in the form of a plane wave. We present here the numerical results of tests computed in the unit cube \(\Omega _h =\Omega =[0,1]^d\subset {\mathbb {R}}^d\) for \(d=2\). Our observations also hold for our numerical tests in 3d.

The methods mentioned above are of linear order in the \(H^1\)-seminorm, i.e.,

$$\begin{aligned} \begin{aligned}&c_1(k,\kappa ) :=\Vert \nabla (u-u^h)\Vert _{L^2(\Omega _h;{\mathbb {C}})} \end{aligned} \end{aligned}$$
(3.30)

where \(c_1(k,\,\cdot \,) =\mathrm{O}(\kappa )\) for each \(k\in {\mathbb {R}}\), recalling that \(\kappa =\frac{kh}{2}\). The errors \(c_1(8,\kappa )\) are depicted in Fig. 4a for the different FE methods as \(\kappa \) varies according to \(\kappa =4h\) with \(h\in (2^{-9},\dots ,2^{-4})\). We observe that the curves for the GFEM methods are very close to each other and have an advantage over GLS. For comparison, we also provide here the quadratic approximation when both trial and test spaces are spanned by continuous piecewise biquadratic polynomials on the uniform quadrilateral mesh \(M_h\) over \(\Omega _h\). In this case, \(c_1(k,\,\cdot \,)=\mathrm{o}(\kappa )\). The computational cost of the proposed PGE method is the same as that of the linear FE methods, while the cost for quadratic FEM is increased.

Fig. 4 The errors in the selected norms for various FE methods

The differences between the various tested linear approximation methods become apparent when we examine the error with respect to the discrete \(\ell ^2\)-norm

$$\begin{aligned} c_2(k,\kappa ) :=\Vert I_{X_h}u-u^h\Vert _{X_h} =\Vert u-u^h\Vert _{X_h} =\Vert \{u(x^j)-u^h(x^j)\}_{j=1}^N\Vert _{\ell ^2({\mathbb {R}}^N)}\nonumber \\ \end{aligned}$$
(3.31)

given over the mesh nodes \(\{x^j\}_{j=1}^N =N_h\). These errors are depicted in Fig. 4, first, in plot (b) in dependence of \(\kappa \in (2^{-7},\dots ,2^{-2})\) for fixed \(k=8\) and, second, in plot (c) with respect to varying \(k\in (2^{3},\dots ,2^{6})\) for fixed \(\kappa =0.25\). The former case in plot (b) describes the asymptotic behavior of the tested methods as \(\kappa =\frac{kh}{2}\rightarrow 0\). Plot (c) represents the pollution effect when increasing the wave number k, even for fixed parameter \(\kappa \).
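To make the quantities in (3.31) concrete, the following sketch evaluates the parameter \(\kappa =\frac{kh}{2}\) on the mesh sizes used for \(k=8\) and computes a nodal \(\ell ^2\) error on a uniform grid over \([0,1]^2\). Several ingredients are assumptions for illustration only: the plane wave is taken in the form \(e^{ik(x_1\cos \theta +x_2\sin \theta )}\), the angle is arbitrary, and the "discrete solution" is a hypothetical stand-in (a plane wave with a slightly perturbed wave number), not the output of any of the FE methods compared here.

```python
import cmath
import math


def plane_wave(k, theta, x1, x2):
    """Plane wave with wave number k and incident angle theta (assumed form)."""
    return cmath.exp(1j * k * (x1 * math.cos(theta) + x2 * math.sin(theta)))


def nodal_values(k, theta, h):
    """Plane-wave values at the nodes of the uniform grid on [0, 1]^2."""
    n = round(1 / h)
    return [plane_wave(k, theta, i * h, j * h)
            for i in range(n + 1) for j in range(n + 1)]


def nodal_l2_error(exact, approx):
    """Discrete l^2 norm of the nodal differences, as in (3.31)."""
    return math.sqrt(sum(abs(u - v) ** 2 for u, v in zip(exact, approx)))


k, theta = 8.0, math.pi / 6  # theta is an arbitrary choice

# kappa = k*h/2 for h = 2^-9, ..., 2^-4 gives kappa = 2^-7, ..., 2^-2.
kappas = [k * 2.0 ** (-p) / 2 for p in range(9, 3, -1)]
print(kappas)

# Stand-in for a discrete solution: wave number perturbed by 0.1 percent.
h = 2.0 ** (-5)
exact = nodal_values(k, theta, h)
stand_in = nodal_values(k * 1.001, theta, h)
print(nodal_l2_error(exact, stand_in))
```

In an actual experiment, `stand_in` would be replaced by the nodal values of the computed FE solution \(u^h\).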

Fig. 5 The error of approximation of the exact interpolation solution for various FE methods

Before a detailed discussion of the curves depicted in Fig. 4b, c with respect to the discrete \(\ell ^2\)-norm, it will be helpful to give another interpretation of these data. To this end, we present in Fig. 5 the error with respect to the linear interpolate solution \(I_{X_h}u\in X_h\) in the \(H^1\)-seminorm given by

$$\begin{aligned} \begin{aligned}&c_3(k,\kappa ) :=\Vert \nabla (I_{X_h}u-u^h)\Vert _{L^2(\Omega _h;{\mathbb {C}})}. \end{aligned} \end{aligned}$$
(3.32)

This quantity expresses the accuracy of the tested FE methods with respect to the exact interpolate of the solution. In Fig. 5 we depict the respective curves of \(c_3(k,\kappa )\), with respect to varying \(\kappa \in (2^{-7},\dots ,2^{-2})\) in plot (a) for fixed \(k=8\), as well as varying \(k\in (2^{3},\dots ,2^{6})\) in plot (b) for fixed \(\kappa =0.25\).

For convenience we present in the following table some observations which can be drawn from Figs. 4 and 5.

Table 1 The numerical performance of the FE methods as \(\kappa \rightarrow 0\) for fixed k

Below we comment on the behavior of the FE methods presented in Table 1.

  • The standard GLS method has, evidently, the poorest performance.

  • The high order (here, quadratic) finite elements have a good approximation property of the exact solution while lagging behind in interpolation in the mesh nodes.

  • The computational complexity of the PGE is comparable to that of the other linear techniques, since these methods differ only in the weight factors of the basis functions.

  • The VMS, QSFEM, and PGE methods have the smallest pollution error when increasing k, even for fixed \(\kappa =\frac{kh}{2}\).

  • The QSFEM method has the best performance for large \(\kappa \sim 1\). For small \(\kappa <2^{-3}\), however, its error stops decaying and starts to grow in the tests. The reason is as follows. The corresponding stencil is represented by formula (5.7) in [3] with divided differences of the order \(\frac{\mathrm{O}(\kappa )}{\mathrm{O}(\kappa )}\), which results in numerical uncertainty of the division \(\frac{0}{0}\) as \(\kappa \rightarrow 0\). Similarly, we observe in Figs. 4 and 5 that the asymptotic decay of VMS fails for \(\kappa <2^{-5}\). For smaller \(\kappa \ll 1\), PGE may also become numerically unstable.
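The loss of accuracy in a quotient whose numerator and denominator both vanish as \(\kappa \rightarrow 0\) can be illustrated with a generic example. The sketch below does not reproduce formula (5.7) of [3]; it uses the model quotient \((1-\cos \kappa )/\kappa ^2\), chosen here only for illustration, whose limit is \(\frac{1}{2}\), and compares a naive evaluation with an algebraically equivalent form that avoids subtractive cancellation.

```python
import math


def naive_quotient(kappa):
    """(1 - cos(kappa)) / kappa^2, evaluated literally."""
    return (1.0 - math.cos(kappa)) / kappa**2


def stable_quotient(kappa):
    """Same quantity via 1 - cos(kappa) = 2 sin^2(kappa / 2)."""
    return 2.0 * math.sin(kappa / 2.0) ** 2 / kappa**2


for kappa in (1e-1, 1e-4, 1e-9):
    # For very small kappa, cos(kappa) rounds to 1 in double precision,
    # so the naive numerator loses all significant digits.
    print(kappa, naive_quotient(kappa), stable_quotient(kappa))
```

Whether a comparable reformulation exists for the stencils discussed above is not addressed by this sketch.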

From the computational tests for moderately small \(\kappa \le 0.25\) we conclude that our PGE method has the interpolation order \(\mathrm{o}(\kappa ^7)\), which justifies numerically the assertion of Theorem 3.5. It approximates the linearly interpolated solution in the most stable way among the tested FE methods.

4 Concluding remarks

Here we discuss possible further developments of our approach.

In our application to the inverse scattering problem, see [21], the Dirichlet data \(u_0\) in (3.1) represent boundary measurements for the reconstruction of u by means of a solution to the Helmholtz equation.

We comment on the extension of the formulas in Sect. 3 to Neumann and Robin boundary conditions. The weights (3.20) and (3.22) are found from the dispersion equation (3.19) at interior mesh points. In polyhedra adjacent to mesh points on the boundary, the weights need to be determined according to the dispersion equation for the boundary stencil of the mesh points in the respective patch. The corresponding dispersion equation depends on the boundary condition.

The inhomogeneous Helmholtz equation with a right-hand side f can be transformed to a homogeneous one for \(u-u_0\) with suitable \(u_0\), even in the case when f resides in the dual space \(H^{-1}(\Omega ;{\mathbb {C}})\).

The algorithm given in Sect. 2 has the potential to be applied to various PDEs expressed in variational form. We note that the result of Sect. 3 includes the Laplace equation by choosing \(k=0\). For vector-valued problems, the algorithm can be extended to Raviart–Thomas-like finite elements. Higher order finite elements can be employed within the Hermite interpolation as well.