A sharp relative-error bound for the Helmholtz $h$-FEM at high frequency

For the $h$-finite-element method ($h$-FEM) applied to the Helmholtz equation, the question of how quickly the meshwidth $h$ must decrease with the frequency $k$ to maintain accuracy as $k$ increases has been studied since the mid 80's. Nevertheless, there still do not exist in the literature any $k$-explicit bounds on the relative error of the FEM solution (the measure of the FEM error most often used in practical applications), apart from in one dimension. The main result of this paper is the sharp result that, for the lowest fixed-order conforming FEM (with polynomial degree, $p$, equal to one), the condition "$h^2 k^3$ sufficiently small" is sufficient for the relative error of the FEM solution in 2 or 3 dimensions to be controllably small (independent of $k$) for scattering of a plane wave by a nontrapping obstacle and/or a nontrapping inhomogeneous medium. We also prove relative-error bounds on the FEM solution for arbitrary fixed-order methods applied to scattering by a nontrapping obstacle, but these bounds are not sharp for $p\geq 2$. A key ingredient in our proofs is a result describing the oscillatory behaviour of the solution of the plane-wave scattering problem, which we prove using semiclassical defect measures.


Introduction
When solving the Helmholtz equation $\Delta u + k^2 u = 0$ with the $h$-version of the finite-element method (where accuracy is increased by decreasing the meshwidth $h$ while keeping the polynomial degree $p$ constant), $h$ must decrease faster than $k^{-1}$ to maintain accuracy as $k$ increases; this is the so-called "pollution effect" [4].
A thorough investigation of how quickly $h$ must decrease with the frequency $k$ to maintain accuracy as $k$ increases was performed by Ihlenburg and Babuška in the mid 90's [70,71] on the 1-d model problem. An explicit expression for the discrete Green's function for this problem is available, and Ihlenburg and Babuška used this to prove the following two sets of results:
1. The $h$-FEM is quasi-optimal in the $H^1$ semi-norm, with quasi-optimality constant independent of $k$, if $hk^2/p$ is sufficiently small; i.e. there exist $c, C > 0$, independent of $h$, $k$, and $p$, such that, if $hk^2/p \leq c$, then
$$|u - u_h|_{H^1(0,1)} \leq C \inf_{v_h \in H_h} |u - v_h|_{H^1(0,1)},$$
where $H_h$ is the appropriate conforming subspace of $H^1(0,1)$ of piecewise polynomials of degree $p$ on meshes of width $h$, and $u_h$ is the Galerkin solution; see [70,Theorem 3], [69,Theorem 4.13], [71,Theorem 3.5] (when $p = 1$ this result was proved earlier in [3,Theorem 3.2]). The numerical experiments in [70,Figures 8 and 9] then indicated that, when $p = 1$, the condition "$hk^2$ sufficiently small" for quasi-optimality is necessary.
2. Under an assumption on the data $f$ (discussed below), the relative error in the $h$-FEM can be made arbitrarily small by, when $p = 1$, making $hk^{3/2}$ sufficiently small and, when $p \geq 2$ and the data is sufficiently smooth (see [69,Remark 4.28]), making $h^{2p} k^{2p+1}$ sufficiently small. More precisely, [70,Equation 3.25], [71,Theorem 3.7], [69,Equation 4.5.15, §4.6.4, and Theorem 4.27] prove that there exists $C > 0$, independent of $h$ and $k$ (but dependent on $p$) such that, if $hk$ is sufficiently small, then the Galerkin solution $u_h$ exists and satisfies the relative-error bound (1.2), where the weighted $H^1$ norm $\|\cdot\|_{H^1_k(0,1)}$ is defined by (3.2) below. The numerical experiments in [70,Figure 11], and [69, Figure 4.13] then indicated that, when $p = 1$, the condition "$h^2 k^3$ sufficiently small" is necessary for the relative error to be bounded (in agreement with the earlier numerical experiments in [8] for small $k$).
A note on terminology: following [70,71,69], we call the regime in $h$, $k$, and $p$ where the solution is quasi-optimal (with constant independent of $k$) the asymptotic regime, and the regime where the solution is not quasi-optimal the preasymptotic regime. For example, by the results in Points 1 and 2 above, when $p = 1$ the asymptotic regime is when $hk^2$ is sufficiently small and the preasymptotic regime is when $hk^2 \gg 1$. The (asymptotic) quasi-optimality results in Point 1 above have since been generalised to Helmholtz problems in 2 and 3 dimensions (and improved in the case $p \geq 2$). Indeed, the fact that the $h$-FEM with $p = 1$ is quasi-optimal (with constant independent of $k$) in the full $H^1_k$ norm when $hk^2$ is sufficiently small was proved for the homogeneous Helmholtz equation on a bounded domain with impedance boundary conditions in [79,Proposition 8.2.7] (in the case of constant coefficients) and [61,Theorem 4.5 and Remark 4.6(ii)] (in the case of variable coefficients), and for scattering problems with variable coefficients in [50,Theorem 3]. The fact that the $h$-FEM for $p \geq 2$ is quasi-optimal when $h^p k^{p+1}$ is sufficiently small was proved for a variety of constant-coefficient Helmholtz problems in [80,Corollary 5.6], [81,Proof of Theorem 5.8], and [51,Theorem 5.1], and for a variety of problems including variable-coefficient Helmholtz problems in [25,Theorem 2.15]; the condition "$h^p k^{p+1}$ sufficiently small" is indicated to be sharp for quasi-optimality by, e.g., the numerical experiments in [25, §4.4].
In contrast, the (preasymptotic) relative-error bound (1.2) in Point 2 above has not been obtained for any Helmholtz problem in 2 or 3 dimensions, even though numerical experiments indicate that the condition "$h^{2p} k^{2p+1}$ sufficiently small" is necessary and sufficient for the relative error to be controllably small; see, e.g., [32, Left-hand side of Figure 3]. The closest-available result is the bound (1.3) on the Galerkin error in terms of the data $f$, valid if $h^{2p} k^{2p+1}$ is sufficiently small; bounds of this form have been proved in the literature (following earlier work by [104]), and for $p \geq 1$ for the variable-coefficient Helmholtz equation $\nabla \cdot (A\nabla u) + k^2 n u = -f$ in [87, §2.3] (under a nontrapping condition on $A$ and $n$).
We highlight that, while [32], [51], and [76] all prove results of the form (1.3), all the numerical experiments in these papers consider the relative error (either in the H 1 norm [32], [76], or the weighted H 1 norm (3.2) [51]), illustrating that relative error is indeed the quantity of interest in practice. An analogous situation is encountered in the preasymptotic error analyses of other Helmholtz FEMs in [44,103,14,35,34,101,18,33,102]: all these papers prove bounds on the error in terms of the data, as in (1.3), but all the numerical experiments in these papers concerning the error consider the relative error.
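The pollution effect described in this subsection can be observed in a few lines of code. The following is a minimal sketch, not the setup of [70,71]: it assumes the 1-d problem $u'' + k^2 u = 0$ on $(0,1)$ with $u(0) = 1$ and the impedance condition $u'(1) = iku(1)$ (whose exact solution is $e^{ikx}$), piecewise-linear elements ($p = 1$) on a uniform mesh, and a discrete version of the weighted $H^1_k$ norm evaluated against the nodal interpolant of the exact solution. All function names are illustrative.

```python
import cmath
import math


def solve_helmholtz_p1(k, n_elems):
    """P1 Galerkin solution of u'' + k^2 u = 0 on (0,1), u(0)=1, u'(1)=i k u(1).

    Returns the nodal values at x_0, ..., x_N (assumed model problem).
    """
    h = 1.0 / n_elems
    # Entries of K - k^2 M (stiffness minus k^2 times mass) on a uniform mesh.
    diag_interior = 2.0 / h - 2.0 * k * k * h / 3.0
    off = -1.0 / h - k * k * h / 6.0
    diag_last = 1.0 / h - k * k * h / 3.0 - 1j * k  # impedance term at x = 1

    diag = [complex(diag_interior)] * (n_elems - 1) + [diag_last]
    rhs = [0j] * n_elems
    rhs[0] = -off * 1.0  # Dirichlet value u(0) = 1 moved to the right-hand side

    # Thomas algorithm (no pivoting) for the symmetric tridiagonal system.
    for i in range(1, n_elems):
        m = off / diag[i - 1]
        diag[i] = diag[i] - m * off
        rhs[i] = rhs[i] - m * rhs[i - 1]
    u = [0j] * n_elems
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n_elems - 2, -1, -1):
        u[i] = (rhs[i] - off * u[i + 1]) / diag[i]
    return [1.0 + 0j] + u


def weighted_h1_norm(v, k):
    """Discrete H^1_k norm of a piecewise-linear function given by nodal values."""
    n = len(v) - 1
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        total += abs(v[i + 1] - v[i]) ** 2 / h  # integral of |v'|^2 on the element
        total += k * k * h * (abs(v[i]) ** 2 + abs(v[i + 1]) ** 2) / 2.0  # trapezoidal rule
    return math.sqrt(total)


def relative_error(k, n_elems):
    """Relative error of the Galerkin solution against the interpolant of exp(i k x)."""
    u_h = solve_helmholtz_p1(k, n_elems)
    h = 1.0 / n_elems
    u_exact = [cmath.exp(1j * k * i * h) for i in range(n_elems + 1)]
    diff = [a - b for a, b in zip(u_exact, u_h)]
    return weighted_h1_norm(diff, k) / weighted_h1_norm(u_exact, k)


if __name__ == "__main__":
    for k in (10.0, 40.0, 160.0):
        n_hk = int(2 * k)                                   # hk = 0.5
        n_h2k3 = int(math.ceil(k ** 1.5 / math.sqrt(2.5)))  # h^2 k^3 approx 2.5
        print(k, relative_error(k, n_hk), relative_error(k, n_h2k3))
```

Under these assumptions, keeping $hk$ fixed lets the relative error grow with $k$, whereas keeping $h^2 k^3$ fixed keeps it at roughly the same level.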

The main results of this paper and their novelty
The two main results are the following:
(a) Theorem 4.1 proves the relative-error bound (1.2) when $p = 1$ for scattering of a plane wave by a nontrapping obstacle and/or a nontrapping inhomogeneous medium (modelled by the PDE $\nabla \cdot (A\nabla u) + k^2 n u = 0$ with variable $A$ and $n$) in 2 or 3 dimensions (see Definition 2.2 below for the precise definition of the boundary-value problems considered). As highlighted above, the numerical experiments in [8,70,69] show that "$h^2 k^3$ sufficiently small" is necessary for the relative error of the $h$-FEM with $p = 1$ to be controllably small (independent of $k$), and so the result of Theorem 4.1 is the sharp bound to which the title of the paper refers.
(b) Theorem 4.2 proves for $p \geq 2$ a slightly-weaker bound than (1.2), namely the bound (1.4), for scattering of a plane wave by a nontrapping obstacle in 2 or 3 dimensions, where $C$ in (1.4) is independent of $h$ and $k$ but depends on $p$, with this dependence given explicitly in the theorem.
As discussed above, these are the first-ever frequency-explicit relative-error bounds on the Helmholtz $h$-FEM in 2 or 3 dimensions. We recall the interest (highlighted at the end of the previous subsection) from [44,104,100,32,103,14,35,34,51,101,76,18,33,102] in proving such bounds. An additional novelty of Theorem 4.1 is that it applies to the variable-coefficient Helmholtz equation, and all the constants in the relative-error bound are explicit, not only in $k$ and $h$, but also in the coefficients $A$ and $n$. The only other coefficient-explicit, preasymptotic FEM error bound on the variable-coefficient Helmholtz equation in the literature appears in [87,Theorem 2.39], where the bound (1.3) is proved for the interior impedance problem when $h^{2p} k^{2p+1}$ is sufficiently small and $A$ and $n$ are nontrapping. The only other coefficient-explicit FEM error bounds for the Helmholtz equation with variable $A$ and $n$ are in [61] and [50]. Both prove quasi-optimality under the condition "$hk^2$ sufficiently small" when $p = 1$, with [61,Theorems 4.2 and 4.5] proving this result for the interior impedance problem and [50,Theorem 3] proving this result for scattering by a nontrapping Dirichlet obstacle.
Our two main results, Theorems 4.1 and 4.2, are proved for a particular class of Helmholtz problems, namely those corresponding to scattering of a plane wave, and not for the equation $\Delta u + k^2 u = -f$ with general $f \in L^2$. We highlight that, for this latter class of problems, it is unreasonable to expect a relative-error bound such as (1.2) to hold, and thus the best one can do is prove bounds for a particular class of realistic data (as we do here). For example, consider the 1-d problem (1.1) with data $f$ given by (1.5), where $\chi$ has compact support in $(0, 1)$. The solution to (1.1) is then $u(x) = \exp(ik^n x)\chi(x)$, which oscillates on a scale of $k^{-n}$, i.e., a smaller scale than $k^{-1}$ when $n > 1$. The finite-element method with, say, $p = 1$ and $hk^{3/2}$ small (and independent of $k$) will therefore not resolve this solution, and hence a bound such as (1.2) cannot hold. With $f$ given by (1.5), $\|f\|_{L^2(0,1)} \sim k^{2n}$ and $\|u\|_{H^1_k(0,1)} \sim k^n$, so that $\|f\|_{L^2(0,1)} \gg \|u\|_{H^1_k(0,1)}$, and the error estimate (1.3) holds in this case because, although the absolute error on the left-hand side of (1.3) is large, the right-hand side of (1.3) is larger.
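The scalings quoted in this example can be checked directly. The following worked computation is a sketch under two assumptions that are not visible in the text above: that the 1-d problem (1.1) has the form $u'' + k^2 u = -f$ (the 1-d analogue of $\Delta u + k^2 u = -f$), and that $\chi$ is a fixed smooth cutoff independent of $k$; it also uses the usual weighted norm $\|v\|^2_{H^1_k} = \|v'\|^2_{L^2} + k^2\|v\|^2_{L^2}$.

```latex
\begin{align*}
u(x) &= e^{ik^n x}\chi(x), \\
u''(x) &= \big(-k^{2n}\chi(x) + 2ik^n\chi'(x) + \chi''(x)\big)e^{ik^n x}, \\
f = -(u'' + k^2 u) &= \big((k^{2n} - k^2)\chi - 2ik^n\chi' - \chi''\big)e^{ik^n x},
  \quad\text{so } \|f\|_{L^2(0,1)} \sim k^{2n} \text{ for } n > 1, \\
\|u\|^2_{H^1_k(0,1)} &= \|u'\|^2_{L^2(0,1)} + k^2\|u\|^2_{L^2(0,1)} \sim k^{2n} + k^2,
  \quad\text{so } \|u\|_{H^1_k(0,1)} \sim k^{n}, \\
\frac{\|f\|_{L^2(0,1)}}{\|u\|_{H^1_k(0,1)}} &\sim k^{n} \to \infty \quad\text{as } k \to \infty.
\end{align*}
```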
1.3 Discussion of these results in the context of using semiclassical analysis in the numerical analysis of the Helmholtz equation
In the last ∼10 years, there has been growing interest in using results about the $k$-explicit analysis of the Helmholtz equation from semiclassical analysis (a branch of microlocal analysis) to design and analyse numerical methods for the Helmholtz equation. The activity has so far occurred in, broadly speaking, five different directions:
1. The use of the results in [83] (on the rigorous $k \to \infty$ asymptotics of the solution of the Helmholtz equation in the exterior of a smooth convex obstacle with strictly positive curvature) to design and analyse $k$-dependent approximation spaces for integral-equation formulations [31,53,2,39,75,74,36,38].
2. The use of the results in [83], along with those in [72] on scattering from several convex obstacles, to analyse algorithms for multiple scattering problems [40,1,11,37].
3. The use of bounds on the Helmholtz solution operator (also known as resolvent estimates) due to [86] and [99] (with the latter using the propagation of singularities results in [82]) to prove $k$-explicit bounds on both inverses of boundary-integral operators and the inf-sup constant of the domain-based variational formulation [22,91,7,23], and also to analyse preconditioning strategies [52].
4. The use of identities introduced in [86] to prove coercivity of boundary-integral operators [94] and to introduce new coercive formulations of Helmholtz problems [93,85,55,30,56].
5. The use of bounds on the restriction of quasimodes of the Laplacian to hypersurfaces from [97,17,95,64,27,96] to prove sharp $k$-explicit bounds on boundary integral operators [48], [63, Appendix A], [45], [49], with these bounds then used to prove sharp $k$-explicit bounds on the number of iterations when GMRES is applied to boundary-integral equations [47].
The results of the present paper include a sixth direction. Namely, a key ingredient in our proofs of Theorems 4.1 and 4.2 (indeed, the ingredient that allows one to obtain a relative-error bound instead of a bound in terms of the data, such as (1.3)) is a result describing the oscillatory behaviour of the solution of the plane-wave scattering problem, which we prove using semiclassical defect measures. These measures describe where the mass in phase space of a Helmholtz solution is concentrated in the high-frequency limit (see the discussion in §9.1 below), and were introduced in [57] and [77]; see [15] for more discussion on the history of defect measures.
Figure 2.1 shows a schematic of $\Omega_-$ and the supports of $I - A$ and $1 - n$. Let the scatterer $\Omega_{sc}$ be defined by $\Omega_{sc} := \Omega_- \cup \mathrm{supp}(I - A) \cup \mathrm{supp}(1 - n)$ (i.e., the union of the shaded areas in Figure 2.1). Given $R > 0$ such that $\Omega_{sc} \subset B_R$, where $B_R$ denotes the ball of radius $R$ about the origin, let $\Omega_R := \Omega_+ \cap B_R$. Let $\Gamma_R := \partial B_R$ and let $\Gamma := \partial\Omega_-$. Let $n$ denote the outward-pointing unit normal vector field on both $\Gamma$ and $\Gamma_R$. We denote by $\partial_n$ the corresponding Neumann trace on $\Gamma$ or $\Gamma_R$ and $\partial_{n,A}$ the corresponding conormal-derivative trace. We denote by $\gamma u$ the Dirichlet trace on $\Gamma$ or $\Gamma_R$.
Definition 2.2 (Helmholtz plane-wave scattering problem) Given $k > 0$ and $a \in \mathbb{R}^d$ with $|a| = 1$, let $u^I(x) := e^{ikx\cdot a}$. Given $\Omega_-$, $A$, and $n$, as in Assumption 2.1, we say $u \in H^1_{loc}(\Omega_+)$ satisfies the Helmholtz plane-wave scattering problem if
$$\nabla \cdot (A\nabla u) + k^2 n u = 0 \ \text{ in } \Omega_+, \qquad \text{either } \gamma u = 0 \ \text{ or } \ \partial_{n,A} u = 0 \ \text{ on } \Gamma, \tag{2.3}$$
and $u^S := u - u^I$ satisfies the Sommerfeld radiation condition
$$\frac{\partial u^S}{\partial r}(x) - ik\,u^S(x) = o\big(r^{-(d-1)/2}\big) \tag{2.4}$$
as $r := |x| \to \infty$, uniformly in $\hat{x} := x/r$.
where $\langle \cdot, \cdot \rangle_{\Gamma_R}$ denotes the duality pairing on $\Gamma_R$ that is linear in the first argument and antilinear in the second. Then $\tilde{u} = u|_{\Omega_R}$, where $u$ is the solution of the Helmholtz plane-wave scattering problem of Definition 2.2.
For a proof of Lemma 2.3, see, e.g., [60,Lemma 3.3]. From here on we denote the solution of the variational problem (2.7) by $u$, so that $u$ satisfies (2.9). When Dirichlet boundary conditions are prescribed, we impose the additional condition that elements of $H_h$ are zero on $\Gamma$; in both cases we then have $H_h \subset H$. The main results, Theorems 4.1 and 4.2 below, require $\Gamma$ to be at least $C^{1,1}$. For such $\Omega_R$ it is not possible to fit $\partial\Omega_R$ exactly with simplicial elements (i.e. when each element of $T_h$ is a simplex), and fitting $\partial\Omega_R$ with isoparametric elements (see, e.g., [28, Chapter VI]) or curved elements (see, e.g., [9]) is impractical. Some analysis of non-conforming error is therefore necessary, but since this is very standard (see, e.g., [12, Chapter 10]), we ignore this issue here.
The second main result, Theorem 4.2 (for p ≥ 2 and analytic Γ ), requires the triangulation T h to be quasi-uniform in the particular sense of [81, Assumption 5.1]. Triangulations satisfying this assumption can be constructed by refining a fixed triangulation that has analytic element maps; see [81,Remark 5.2].
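The finite-element method for the variational problem (2.7) is the Galerkin method applied to (2.7), i.e.
$$\text{find } u_h \in H_h \ \text{ such that } \ a(u_h, v_h) = F(v_h) \ \text{ for all } v_h \in H_h. \tag{2.11}$$
Observe that setting $v = v_h$ in (2.9) and combining this with (2.11) we obtain the Galerkin orthogonality that
$$a(u - u_h, v_h) = 0 \ \text{ for all } v_h \in H_h. \tag{2.12}$$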

Definitions of quantities involved in the statement of the main results
Throughout the paper we assume that R ≥ R 0 > 0 for some fixed R 0 > 0 and k ≥ k 0 for some fixed k 0 > 0. For simplicity we assume throughout that Given a bounded open set D, we let the weighted H 1 norm, · H 1 k be defined by We now define quantities C DtNj , j = 1, 2, C sol , C osc , C PF , C H 2 , C int , and C MS that appear in the main results (Theorems 4.1 and 4.2). All of these are dimensionless quantities, independent of k, h, and p, but dependent on one or more of A, n, Ω − (indicated below).
for all u, v ∈ H 1 (Ω R ) and for all k ≥ k 0 , and for all φ ∈ H 1/2 (Γ R ) and for all k ≥ k 0 .
C sol We assume that A, n, and Ω − are nontrapping in the sense that there exists C sol = C sol (A, n, Ω − , R, k 0 ) such that, given f ∈ L 2 (Ω R ), the solution of the boundary value problem (BVP) and v satisfies the Sommerfeld radiation condition (2.4) (with u S replaced by v), satisfies the bound observe that the factor R on the right-hand side makes C sol dimensionless. (Remark 4.5 discusses the situation where this nontrapping assumption is removed and C sol depends on k.) This assumption holds if the obstacle Ω − and the coefficients A and n are nontrapping in the sense that all billiard trajectories (or, more precisely, Melrose-Sjöstrand generalized bicharacteristics [68, Section 24.3]) starting in an exterior neighbourhood of Ω − and evolving according to the Hamiltonian flow defined by the symbol of (2.3) escape from that neighbourhood after some uniform time. For this flow to be well-defined, Γ must be C ∞ , and A and n must be globally C 1,1 and C ∞ in a neighbourhood of Γ ; note that the flow may in general be set-valued rather than unique in cases where the boundary is permitted to be infinite-order flat. Assuming the uniqueness of the flow, an explicit expression for C sol in terms of A, n, Ω − , and R is then given in [50, Theorems 1 and 2, and Equation 6.32]. However, the bound (3.5) can be established in situations with much less smoothness; indeed, [60, Theorems 2.5, 2.7, and 2.19] establishes (3.5) for a Dirichlet C 0 star-shaped obstacle and L ∞ A and n satisfying certain monotonicity assumptions. Furthermore, our arguments in the rest of the paper do not need the flow to be well-defined on Ω sc := Ω − ∪ supp(I − A) ∪ supp(1 − n), they only require that the bound (3.5) holds. We can therefore define nontrapping in this weaker sense, and work with scatterers of much lower smoothness than in standard microlocal-analysis settings.
C osc By Theorem 9.1 below, if $A$, $n$, and $\Omega_-$ are nontrapping then there exists $C_{osc} = C_{osc}(A, n, \Omega_-)$ ('osc' standing for 'oscillation') such that for $u$ a solution to the Helmholtz plane-wave scattering problem of Definition 2.2,
$$|u|_{H^2(\Omega_R)} \leq C_{osc}\, k\, \|u\|_{H^1_k(\Omega_R)}. \tag{3.6}$$
The key point in (3.9) is that, although $v$ in (3.8) depends on $k$ via the boundary condition on $\Gamma_R$, $C_{H^2}$ is independent of $k$.
C int By, e.g., [ for all v ∈ H 2 (Ω R ), for some C int that depends only on the shape-regularity constant of the mesh. As a consequence of (3.10), the definition of · H 1 . We note here that the constants in these bounds are expressed in terms of analogous quantities to those defined above.

The main results
The first theorem holds for any p ≥ 1, but is most relevant in the case p = 1.
then the Galerkin solution u h to the variational problem (2.11) exists, is unique, and satisfies the bound where and then the Galerkin solution u h to the variational problem (2.11) exists, is unique, and satisfies the bound where The result of Theorem 4.2 might appear not to be a high-order result, since the lowest-order terms in (4.3) and (4.4) are h 2 and h, respectively. Nev- and the dominant term on the right-hand side of (4.4) is that involving k(hk) p+1 . We highlight that Theorem 4.2, along with the previous work discussed in §1.1, shows that high-order methods suffer less from the pollution effect than low-order methods.
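To make the closing remark about high-order methods concrete, the following minimal sketch computes the largest meshwidth allowed by a condition of the form $h^{2p} k^{2p+1} \leq c$ (the condition discussed in §1.1), together with the resulting number of elements per wavelength. The threshold constant `c` is hypothetical: the results above only assert that the quantity must be "sufficiently small", and the function names are illustrative.

```python
import math


def threshold_meshwidth(k, p, c=1.0):
    """Largest h satisfying h^(2p) * k^(2p+1) <= c (c is a hypothetical constant)."""
    return (c / k ** (2 * p + 1)) ** (1.0 / (2 * p))


def elements_per_wavelength(k, p, c=1.0):
    """Number of elements of width threshold_meshwidth(k, p, c) per wavelength 2*pi/k."""
    return (2.0 * math.pi / k) / threshold_meshwidth(k, p, c)


if __name__ == "__main__":
    for p in (1, 2, 3):
        for k in (10.0, 100.0, 1000.0):
            print(p, k, threshold_meshwidth(k, p), elements_per_wavelength(k, p))
```

With this scaling the number of elements per wavelength grows like $k^{1/(2p)}$, so the growth is slower for larger $p$.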

How the main results are proved
Theorems 4.1 and 4.2 are proved using the so-called elliptic-projection argument or modified duality argument, used to prove the bound (1.3) on the solution in terms of the data. We first make some remarks about the history of this argument, and then outline our new contributions.
Recall that the classic duality argument, coming out of ideas introduced in [89], proves quasi-optimality of the Helmholtz FEM, and was used in, e.g., [3,70,79,88,80,81,24,25,51,61,50]. The elliptic-projection argument is a modification of this argument that allows one to prove results in the preasymptotic regime (as opposed to the asymptotic regime). The initial ideas were introduced in the Helmholtz context in [42,43] for interior-penalty discontinuous Galerkin methods, and then further developed for the standard FEM and continuous interior-penalty methods in [100,104]. The argument has been subsequently used by [32,6,101,24,51,76] (see, e.g., the literature review in [87, §2.3]).
We note that [43] and [100] also used an error-splitting argument (with this idea called "stability-error iterative improvement" in these papers), and that error splitting ideas were also used in [32], together with the idea of using discrete Sobolev norms in the duality argument. Although we do not use these ideas in this paper, one expects that they could be used to improve the p dependence in Theorem 4.2, but see [87,Remark 2.48] for a discussion on the challenges in doing this.
Our three new contributions to the elliptic-projection argument are (i) a rigorous proof, using semiclassical defect measures, of the bound (3.6) describing the oscillatory behaviour of the solution of the plane-wave scattering problem (see Theorem 9.1 below), (ii) the proof of H 2 regularity, with constant independent of k, of the solution of Poisson's equation with the boundary condition ∂ n v = DtN k (γv) (see (3.9) and Theorem 6.1), and (iii) determining how all the constants in the elliptic-projection argument depend on A, n, Ω − , and R.
Regarding (i): oscillatory behaviour similar to (3.6) of Helmholtz solutions has been an assumption in many analyses of finite-and boundary-element methods; see, e.g., [ . These results concern the Neumann trace of the solution of the Helmholtz plane-wave scattering problem with A = I and n = 1, and are then used in [59] and [47] to analyse boundary-element methods applied to this problem. In common with (3.6), these results are obtained using semiclassical-analysis techniques.
Regarding (ii): the analogous result (H 2 regularity with constant independent of k) for Poisson's equation with the impedance boundary condition ∂ n v = ikγv is central to the elliptic-projection argument for the Helmholtz equation with impedance boundary conditions. This result was explicitly assumed in [43,Lemma 4.3], implicitly assumed in [100,104,6,24], and recently proved in [26]. Our proof of (3.9) uses (and makes A-explicit) arguments from [26], which in turn use results from [62], adapting them to deal with the operator DtN k , instead of ik, in the boundary condition.
Regarding (iii): while the standard duality argument applied to the Helmholtz equation discussed above has recently been made explicit in A, n, and Ω − in [61,50] (as discussed in §1.2), the only places in the literature where the elliptic-projection argument is made explicit in A, n, and Ω − are the present paper and [87, §2.3], leading to the coefficient-explicit preasymptotic error bounds on the Helmholtz FEM at high-frequency in Theorem 4.1 and [87, Theorem 2.39]. One area in which we expect these results to be applied is in the analysis of uncertainty quantification (UQ) algorithms for the high-frequency Helmholtz equation with random coefficients, as discussed in the following remark. The only other analyses of uncertainty quantification (UQ) algorithms for the high-frequency Helmholtz equation with random coefficients in the literature are [41] and [54] (concerning Monte Carlo and Quasi-Monte Carlo methods, respectively). Because of the issue described in the previous paragraph, these papers use formulations of the Helmholtz equation where existence and uniqueness of the Galerkin solution is established for all k, h, p, and for a class of (deterministic) coefficients ( [41] uses the interior-penalty discontinuous-Galerkin method of [42,43] and [54] uses the coercive formulation of [56]). This then ensures that the Galerkin solution exists and is unique for all realisations of the random coefficients; see the discussion at the beginning of [41, §4].

Why does Theorem 4.2 not cover scattering by an inhomogeneous medium?
In both the elliptic-projection argument and the standard duality argument, a key role is played by the quantity η(H h ) defined by (8.3) below, which describes how well solutions of the (adjoint of the) Helmholtz equation can be approximated in H h .
In the case p = 1 we estimate η(H h ) using H 2 regularity of the solution (which holds when A and Ω − satisfy the assumptions of Theorem 4.1), leading to the bound (8.5) below. When p ≥ 1, A = I, n = 1, Ω − is a Dirichlet obstacle, and Γ is analytic, [81] proved the bound (8.6) on η(H h ), and we use this result to prove Theorem 4.2. The bound (8.6) was proved via a judicious splitting of the solution [81,Theorem 4.20] into an analytic but oscillating part, and an H 2 part that behaves "well" for large frequencies, and this splitting is only available for the exterior Dirichlet problem with A = I and n = 1.
We highlight that an alternative splitting procedure valid for Helmholtz problems with variable coefficients was recently developed in [25], leading to an alternative proof of the bound on $\eta(H_h)$ (8.6) [25, Lemma 2.13]. However, this alternative procedure requires that $\mathrm{DtN}_k$ be approximated by $ik$ on $\Gamma_R$. Indeed, in [25, Proof of Lemma 2.13] the solution is expanded in powers of $k$, i.e. $u = \sum_{j=0}^{\infty} k^j u_j$, and then on $\Gamma_R$ one has $\partial_n u_{j+1} = i\gamma u_j$; this relationship between $u_{j+1}$ and $u_j$ on $\Gamma_R$ no longer holds if $\mathrm{DtN}_k$ is not approximated by $ik$.

Approximating DtN k
Implementing the operator $\mathrm{DtN}_k$ is computationally expensive, and so in practice one seeks to approximate this operator by either imposing an absorbing boundary condition on $\Gamma_R$, or using a PML. In this paper we follow the precedent established in [80,81] of, when proving new results about the FEM for exterior Helmholtz problems, first assuming that $\mathrm{DtN}_k$ is realised exactly. We remark, however, that if the two key ingredients in §4.2 (a proof of the oscillatory behaviour (3.6) and $H^2$-regularity, independent of $k$, of a Poisson problem) can be established when $\mathrm{DtN}_k$ is replaced by an absorbing boundary condition on $\Gamma_R$, then the result of Theorem 4.1 carries over to this case. When an impedance boundary condition (i.e. the simplest absorbing boundary condition) is imposed on $\Gamma_R$, the necessary Poisson $H^2$-regularity result is proved in [26], but we discuss below in Remark 9.9 the difficulties in proving (3.6) in this case.

Removing the nontrapping assumption
The only place in the proofs of Theorems 4.1 and 4.2 where the nontrapping assumption (i.e. the fact that $C_{sol}$ in (3.5) is independent of $k$) is used is in the proof of the bound (3.6) (in Theorem 9.1 below). We sketch in Remark 9.10 below how (3.6) can be proved in the trapping case (i.e. when $C_{sol}$ is not independent of $k$); the rest of the proofs of Theorems 4.1 and 4.2 then go through as before. In the case of Theorem 4.1, the requirement for the relative error to be bounded independently of $k$ would then be that $h^2 k^3 C_{sol}$ be sufficiently small. Under the strongest form of trapping, $C_{sol}$ can grow exponentially through a sequence of values of $k$ [10, §2.5], but is bounded polynomially in $k$ if a set of frequencies of arbitrarily-small measure is excluded [73, Theorem 1.1]. However, it is not clear how sharp the requirement "$h^2 k^3 C_{sol}$ sufficiently small" for the relative error to be bounded is in these cases.

Outline of the proof
As highlighted in §4.2, one of the novelties of this paper is that it makes the elliptic-projection argument explicit in the coefficients $A$ and $n$. However, this explicitness means that many of the expressions in the proofs are complicated (in the same way as the expressions in the results in Theorems 4.1 and 4.2 are complicated). In this section, therefore, we give an outline of the proof, keeping track of the dependence on $k$, $h$, and $p$, but ignoring the dependence on $A$, $n$, $\Omega_-$, and $R$. We use the notation $a \lesssim b$ when $a \leq Cb$ with $C$ independent of $k$, $h$, and $p$, but dependent on $A$, $n$, $\Omega_-$, and $R$.
As in the standard duality argument coming out of ideas introduced in [89] and then formalised in [88], our starting point is the fact that, since $a(\cdot, \cdot)$ satisfies the Gårding inequality (10.6), Galerkin orthogonality (2.12) and continuity of $a(\cdot, \cdot)$ (10.4) imply that the bound (5.1) holds for any $v_h \in H_h$. In contrast to the standard duality argument, which yields quasi-optimality, the elliptic-projection argument, which we follow, shows that, assuming $H^2$ regularity of the solution and using (3.11), if $hk^2 \eta(H_h)$ is sufficiently small then the bound (5.5) holds. In the standard elliptic-projection argument (see, e.g., [24, §5.5]) applied to the PDE $\Delta u + k^2 u = -f$, an $H^2$-regularity bound similar to (3.9) and the nontrapping bound (3.5) are combined to give $|u|_{H^2(\Omega_R)} \lesssim k \|f\|_{L^2(\Omega_R)}$, and combining this with both (5.5) and the bound $\eta(H_h) \lesssim hk$ (see (8.5) below) proves the bound (1.3) with $p = 1$ on the Galerkin error in terms of the data when $h^2 k^3$ is sufficiently small.
In contrast, in this paper we prove, using semiclassical defect measures, that the solution to the plane-wave scattering problem satisfies (3.6), i.e. $|u|_{H^2(\Omega_R)} \lesssim k \|u\|_{H^1_k(\Omega_R)}$ (see Theorem 9.1 below), and using this in (5.5), along with the bounds on $\eta(H_h)$ in Lemma 8.2, we obtain the relative-error bounds (4.2) and (4.4).
In summary, once one has proved the bound (3.6) (which we do via semiclassical analysis) and the Poisson H 2 -regularity bound (3.9) (which we do using results from [62] and properties of DtN k ), if one ignores the technicalities of making the argument explicit in A, n, Ω − , and R, then the proof of a preasymptotic relative-error bound follows via a straightforward modification of the elliptic-projection argument. Given the large and sustained interest (reviewed in §1.1) in preasymptotic relative-error bounds for the Helmholtz FEM, we believe this fact illustrates the advantage of approaching the numerical analysis of the Helmholtz equation from a perspective encompassing both numerical-analysis and semiclassical-analysis techniques.
As a first step to proving Theorem 6.1, we prove it in the case when Ω − = ∅. Lemma 6.4 Let A ∈ C 0,1 (B R , SPD) satisfy (2.1) (with Ω + replaced by B R ) and be such that supp( Then v ∈ H 2 (B R ) and where ∇A denotes the derivative of A.
Proof Let w ∈ H 1 (R d ) be the outgoing solution of the following transmission problem Since v ∈ H 2 (B R ) and A is Lipschitz, A∇v ∈ H 1 (B R ) and we can apply Lemma 6.2 with v := A∇v. Since A = I near Γ R , v = ∇v near Γ R and so the right-hand side of (6.1) becomes where we have used the boundary condition in (6.3). Now, DtN k and ∇ T commute on Γ R ; this can be seen either by rotation invariance, or by using the definition of DtN k and ∇ T in terms of Fourier series on Γ R . Therefore, the inequality (3.4) implies that the right-hand side of (6.1) is non-negative, hence The left-hand side of (6.4) equals where By the Cauchy-Schwarz inequality We therefore obtain Combining this with (6.2), (6.4), and (6.5), we obtain Using (5.4) on the second term on the right-hand side, we obtain the result.
We now use Lemma 6.4 to prove Theorem 6.1.
As a consequence of Lemma 7.1, we have for all v ∈ H, (7.4) and we then define the new norm on H, v := a (v, v).
Lemma 7.2 (Bounds on the solution of the variational problem associated with a (·, ·)) The solution of the variational problem Proof Since a (·, ·) is continuous and coercive in H, the first bound in (7.5) follows from the Lax-Milgram theorem and the fact that by the definition of · H 1 R (Ω R ) (7.3). The second bound in (7.5) follows from combining the first bound in (7.5) and the bound (3.9).
We now define the particular Galerkin projection known in the literature as the "elliptic projection" (see the discussion in §4.2).

Definition 7.3 (Elliptic projection
Since a (·, ·) is continuous and coercive in H by Lemma 7.1, the Lax-Milgram theorem implies that P h is well defined. The definition of P h then immediately implies the Galerkin-orthogonality property that Lemma 7.4 (Approximation properties of P h ) The elliptic projection P h satisfies for all u ∈ H.
Proof By the Cauchy-Schwarz inequality a (·, ·) is continuous in the · norm, and by definition, a (·, ·) is coercive in this norm. Therefore Céa's lemma implies that and (7.7) follows from the norm equivalence (7.4).
To prove (7.8) we use the standard duality argument. Given u ∈ H, let ξ be the solution of the variational problem Then, by Galerkin orthogonality (7.6) and continuity of a (·, ·), for all v h ∈ H h , By the norm equivalence (7.4), the consequence (3.11) of the definition of C int , the definition of ξ (7.9), and the second bound in (7.5), and the result (7.8) follows from combining this last inequality with (7.10).
8 Adjoint approximability

i.e. S * f is the complex-conjugate of an outgoing Helmholtz solution.

Following [88], we define the quantity η(H h ) by

observe that this definition implies that, given f ∈ L 2 (Ω R ),

Lemma 8.2 Assume that A, n, and Ω − are nontrapping (and so (3.5) holds with C sol independent of k).
9 Proof of the oscillatory-behaviour bound (3.6)

Theorem 9.1 If A, n, and Ω − are nontrapping (in the sense that the bound (3.5) holds), then the bound (3.6) holds, i.e., (9.1)

Lemma 9.2 To prove Theorem 9.1, it is sufficient to prove that there exist k 0 > 0 and C mass = C mass (A, n, Ω − , R) > 0 such that

Proof We first claim that the map k → u is continuous from [1, ∞) to H 2 (Ω R ); indeed, this follows from the well-posedness of the plane-wave scattering problem of Definition 2.2, H 2 regularity, and linearity. Therefore, the function −1 is continuous on [1, ∞), and it is sufficient to prove that the bound (9.1) (i.e., (3.6)) holds for k sufficiently large.
9.1 Overview of the ideas used in the rest of this section to prove (9.2)

We have therefore reduced proving the oscillatory-behaviour bound (3.6)/(9.1) to proving the bound (9.2), which we prove using defect measures. The precise definition of a defect measure is given in Theorem 9.3 below, but the idea is that the defect measure of a Helmholtz solution describes where the mass of the solution in phase space (i.e. the set of positions x and momenta ξ) is concentrated in the high-frequency limit. Two examples of this feature are (i) the defect measure of the plane wave u I (x) := exp(ikx·a) is the product of a delta function at ξ = a and Lebesgue measure in x (see (9.8) below), reflecting the fact that, at high frequency (and in fact at any frequency), all the mass in phase space of the plane wave is travelling in the direction a, and (ii) the defect measure of an outgoing solution of the Helmholtz equation is zero on the so-called "directly incoming set" (see Lemma 9.8 below), where this set is defined in (9.20) below as points in phase space that don't hit the scatterer when propagated backwards along the flow.
A key feature of the defect measure of a Helmholtz solution is that it is invariant under the Hamiltonian flow defined by the symbol of the PDE, as long as the flow doesn't encounter the scatterer (see Theorem 9.6 below). This is analogous to results about propagation of singularities of the wave equation, where singularities travel along the trajectories of the flow (the bicharacteristics), and the projections of these trajectories in space are the rays.
The main ingredients of our proof of (9.2) are Points (i) and (ii) above, invariance under the flow (away from the scatterer), and geometric arguments about the rays, using the fact that away from the scatterer the rays are straight lines and the flow has constant speed along the rays (see (9.12) below).
To conclude this overview, we direct the reader to [105,Chapter 5] for extensive discussion of defect measures in R d , to [16,84,50] for material on defect measures on manifolds with boundary, and to [15] for discussion on the history of defect measures.

Symbols and quantisation
Before defining defect measures, we need to define the functions on phase space (i.e. the set of positions x and momenta ξ) that the defect measure can act upon by dual pairing. These functions are called symbols, defined as functions on the cotangent bundle T * Ω + . Recall the definition of the cotangent bundle of R d : for our purposes, we can consider T * R d as {(x, ξ) : x ∈ R d , ξ ∈ R d }, i.e. the set of positions x and momenta ξ.
see, e.g., [105, §4]. The same definition holds for symbols supported away from the boundary of Ω + . We omit the analogous definition near the boundary since it is more involved; see [16, §4.2] (where it involves the so-called compressed cotangent bundle of Ω + , T * b Ω + ) and [84, §1.2]. We will not, in any event, require any specifics of the measure at the boundary in proving Theorem 9.1.

Existence of defect measures
Theorem 9.3 (Existence of defect measures [105,Theorem 5.2], [16, §4.2].) Suppose {v(k)} k0≤k<∞ is a collection of functions that is uniformly locally bounded in L 2 (Ω + ), i.e. given χ ∈ C ∞ comp (R d ) there exists C > 0, depending on χ and k 0 but independent of k, such that $\|\chi v(k)\|_{L^2(\Omega_+)} \leq C$ for all k ≥ k 0 . (9.6) Then there exists a sequence $k_\ell \to \infty$ and a non-negative Radon measure µ on T * b Ω + (depending on $k_\ell$) such that, for any symbol

In the case of a plane wave u I (x) := exp(ikx · a) with |a| = 1, a direct calculation using (9.5) and the definition of the Fourier transform shows that, for all k,

i.e. for any sequence $k_\ell \to \infty$, the corresponding defect measure of u I is the product of the Lebesgue measure in x by a delta measure at ξ = a; we therefore talk about the (as opposed to a) defect measure of u I .

The next lemma proves that, if u is the solution of the plane-wave scattering problem and χ is an arbitrary cut-off function, then χu is uniformly bounded in k (on compact subsets of Ω + ); existence of a defect measure of u then follows from Theorem 9.3. In the rest of this section, to emphasise the k-dependence of u, we write u = u(k).
Proof Let χ ∈ C ∞ comp (R d ) be such that χ = 1 in a neighbourhood of the scatterer Ω sc . Let v := u S + χu I , so that u = (1 − χ)u I + v. Since $\|u_I(k)\|_{L^2(\Omega_R)} \leq C_1(R)$ for all k > 0, the result (9.9) will follow if we prove a uniform bound on $\|v(k)\|_{L^2(\Omega_R)}$. The definition of v implies that v satisfies the Sommerfeld radiation condition, either γv = 0 or ∂ n v = 0 on Γ , and, with L A,n w := ∇ · (A∇w) + k 2 nw and [A, B] := AB − BA, since L A,n u I = 0 when 1 − χ ≠ 0. By explicit calculation, using the fact that u I (x) = exp(ikx · a), where C 1 depends on $\|A\|_{L^\infty(\Omega_R)}$, $\|\nabla A\|_{L^\infty(\Omega_R)}$, and χ, but is independent of k. The nontrapping bound (3.5) then implies that $\|v(k)\|_{L^2(\Omega_R)} \leq C_2$ with C 2 independent of k, and the result follows.

Support and invariance properties of defect measures
Recall that the semi-classical principal symbol of the Helmholtz equation (2.3) is given by

$$p(x, \xi) := \sum_{i,j=1}^{d} A_{ij}(x)\, \xi_i \xi_j - n(x) \tag{9.10}$$

(see, e.g., [105,Page 281]). In our arguments below we only consider points (x, ξ) in phase space when p = 0; this is because of the following result.

As an illustration of this, the plane wave u I (x) := exp(ikx · a) with |a| = 1 is a solution of the Helmholtz equation (2.3) with A = I and n = 1, and hence p = |ξ| 2 − 1 in this case. By (9.8), the defect measure of u I is the product of Lebesgue measure in x and a delta function at ξ = a, and thus is supported in |ξ| = 1, i.e., p = 0, as expected from Theorem 9.5.
The final result about defect measures that we need is their invariance under the flow (away from the scatterer). This result is Theorem 9.6 below; to state it, we first need to define the flow.
Away from Γ , and provided that A and n are both C 1,1 , the flow ϕ t is defined as follows: given ρ = (x 0 , ξ 0 ), ϕ t (ρ) := (x(t), ξ(t)) where (x(t), ξ(t)) is the solution of the Hamiltonian system

$$\dot{x}_i(t) = \partial_{\xi_i} p\big(x(t), \xi(t)\big), \qquad \dot{\xi}_i(t) = -\partial_{x_i} p\big(x(t), \xi(t)\big), \tag{9.11}$$

with initial condition (x(0), ξ(0)) = (x 0 , ξ 0 ), where the Hamiltonian equals p defined by (9.10). Near both Γ and places where A and n are not C 1,1 , the definition of ϕ t is more involved; this is to account for reflection or refraction. However, we do not need this definition in what follows, since our arguments take place away from these regions. In fact our arguments take place away from the scatterer Ω sc . Outside Ω sc , A = I and n = 1; thus p(x, ξ) = |ξ| 2 − 1. From (9.11), the flow satisfies $\dot{x}_i = 2\xi_i$ and $\dot{\xi}_i = 0$ and is therefore given by the straight-line motion

$$x(t) = x_0 + 2t\xi_0, \qquad \xi(t) = \xi_0. \tag{9.12}$$

The arguments below consider the flow with speed 2 (i.e. with |ξ 0 | = 1). This is without loss of generality, since away from Ω sc Theorem 9.5 implies that µ is only non-zero when |ξ| = 1. Both in the next result and later, we let π x denote projection in the x variables, i.e. π x ((x, ξ)) = x.

Theorem 9.6 (Invariance of defect measure under the flow away from the scatterer) Suppose that u(k) satisfies (9.9), and let µ be any defect measure of u(k). If A ⊂ T * R d is such that π x (ϕ s (A)) ∩ Ω sc = ∅ for s between 0 and t (i.e. the flow acting on A doesn't hit the scatterer from time 0 to time t), then µ(ϕ t (A)) = µ(A). (9.13)

Proof In the absence of the scatterer, invariance of the measure under the flow is the statement that, for b ∈ C ∞ comp (T * R d ), (9.14) and this is proved in [105,Theorem 5.4], [16,Proposition 4.4]. For this result to hold in the presence of the scatterer in a time interval 0 ≤ s ≤ t, we need the spatial projection of the integrand in (9.14) to not be supported during this time interval on Ω sc , i.e., we need the condition that

$$\pi_x\big(\operatorname{supp}(b \circ \varphi_{-s})\big) \cap \Omega_{sc} = \emptyset \quad \text{for all } 0 \leq s \leq t. \tag{9.15}$$

Under this condition, (9.14) implies that

$$\int b(\rho)\, d\mu = \int (b \circ \varphi_{-s})(\rho)\, d\mu \quad \text{for all } 0 \leq s \leq t. \tag{9.16}$$

Let 1 A denote the indicator function of a set A. By approximating 1 A by smooth symbols, (9.16) holds with b(ρ) = 1 A (ρ), provided that the condition (9.15) holds. Since ϕ −s (ρ) ∈ A iff ρ ∈ ϕ s (A), we have π x supp(1 A • ϕ −s ) = π x supp(1 ϕs(A) ) = π x ϕ s (A) , and thus (9.15) holds by the assumption in the statement of the theorem. Therefore, (9.16) implies that, for all 0 ≤ s ≤ t, $\mu(A) = \mu(\varphi_s(A))$, which implies (9.13).
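The straight-line form of the flow outside the scatterer can be checked numerically. The following is a minimal sketch, not part of the proof: it integrates Hamilton's equations for the free symbol p(x, ξ) = |ξ|² − 1 with a classical Runge-Kutta scheme and compares the result with the closed form x(t) = x_0 + 2tξ_0, ξ(t) = ξ_0. The function names (`hamiltonian_rhs`, `flow_rk4`, `straight_line_flow`) are hypothetical and chosen only for this illustration.

```python
import math


def hamiltonian_rhs(x, xi):
    """Right-hand side of Hamilton's equations for p(x, xi) = |xi|^2 - 1.

    dx/dt = dp/dxi = 2*xi and dxi/dt = -dp/dx = 0 (free space, A = I, n = 1).
    """
    return [2.0 * c for c in xi], [0.0 for _ in xi]


def _shift(v, s, d):
    """Return v + s*d componentwise."""
    return [vi + s * di for vi, di in zip(v, d)]


def flow_rk4(x0, xi0, t, steps=100):
    """Approximate the flow map at time t with the classical RK4 scheme."""
    x, xi = list(x0), list(xi0)
    dt = t / steps
    for _ in range(steps):
        k1x, k1s = hamiltonian_rhs(x, xi)
        k2x, k2s = hamiltonian_rhs(_shift(x, 0.5 * dt, k1x), _shift(xi, 0.5 * dt, k1s))
        k3x, k3s = hamiltonian_rhs(_shift(x, 0.5 * dt, k2x), _shift(xi, 0.5 * dt, k2s))
        k4x, k4s = hamiltonian_rhs(_shift(x, dt, k3x), _shift(xi, dt, k3s))
        x = [a + dt / 6.0 * (b + 2.0 * c + 2.0 * d + e)
             for a, b, c, d, e in zip(x, k1x, k2x, k3x, k4x)]
        xi = [a + dt / 6.0 * (b + 2.0 * c + 2.0 * d + e)
              for a, b, c, d, e in zip(xi, k1s, k2s, k3s, k4s)]
    return x, xi


def straight_line_flow(x0, xi0, t):
    """Closed-form flow outside the scatterer: x0 + 2*t*xi0 with xi constant."""
    return [a + 2.0 * t * b for a, b in zip(x0, xi0)], list(xi0)


if __name__ == "__main__":
    theta = 0.7
    start, direction = [3.0, 0.5], [math.cos(theta), math.sin(theta)]
    print(flow_rk4(start, direction, 1.7))
    print(straight_line_flow(start, direction, 1.7))
```

With |ξ_0| = 1 the spatial speed |dx/dt| equals 2, which is the normalisation used in the arguments that follow.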
9.3 Proof of (9.2) using defect measures

The following lemma reduces proving the bound (9.2) to proving a statement about defect measures.
Proof We prove the contrapositive. Suppose (9.2) fails; we aim to exhibit a defect measure associated to u for which (9.17) fails. Then, for any C 1 > 0, there exists a sequence (k n ) ∞ n=1 , with k n → ∞, such that we choose C 1 := 2C R,R0 . By Lemma 9.4, the sequence {u(k n )} ∞ n=1 is locally uniformly bounded and Theorem 9.3 implies that, by passing to a subsequence, there exists a defect measure µ of u associated to the subsequence, which we again denote k n . Let χ 0 , χ 1 ∈ C ∞ (R d ) be such that 0 ≤ χ 0 , χ 1 ≤ 1, and The bound (9.18) then implies that Passing to the limit n → ∞ and using the property of defect measure (9.7), we obtain that The definitions of χ 0 and χ 1 imply that and contradicting (9.17).
Before using Lemma 9.7 to prove (9.2), we prove a result (Lemma 9.8 below) about the structure of µ, exploiting the fact that u = u I + u S with u S outgoing (in the sense that it satisfies the Sommerfeld radiation condition (2.4)). To make use of this outgoing property, we need to define appropriate notions of incoming and outgoing for elements of phase space. Let I denote the directly incoming set defined by (9.20) where we recall that π x denotes projection in the x variables. That is, I is everything that never hits the scatterer under backward flow. Let

These definitions of I and Γ + do not require the generalized bicharacteristic flow ϕ t to be defined in T * Ω sc , but when the flow is defined everywhere, Γ + is the forward generalized bicharacteristic flowout of Ω sc , that is

The following lemma uses outgoingness of u S to show that, given a set E in phase space, the mass of u lying over E is either in the forward flowout Γ + or associated to the incident wave u I .

Lemma 9.8 For any Borel set E ⊂ T * Ω, µ(E \ Γ + ) = µ I (E \ Γ + ), where µ is any defect measure of u, and µ I is the defect measure of u I .
Proof Let $k_\ell$ be the sequence associated to the particular defect measure of u. By Lemma 9.4, $u_S(k_\ell)$ is uniformly locally bounded, and so there exists a subsequence $k_{\ell_m}$ and a defect measure associated to u S , denoted by µ S . Then, by linearity and (9.7), µ = µ S + µ I . It is therefore sufficient to prove that µ S (E \ Γ + ) = 0. But, by the definition of Γ + , E \ Γ + ⊂ I, and µ S (I) = 0 by [16,Proposition 3.5], [50,Lemma 3.4], since u S is outgoing.
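The verbal description of the directly incoming set ("everything that never hits the scatterer under backward flow") can be made concrete in a toy geometry. The sketch below is an illustration under stated assumptions, not the definition (9.20) itself: the scatterer is replaced by a closed ball of radius `radius` centred at the origin (a hypothetical stand-in for Ω sc), the flow is the free straight-line flow with speed 2, and touching the ball counts as hitting it. The function names are invented for this example.

```python
import math


def hits_ball_backward(x0, xi0, radius):
    """True if the backward ray x0 - 2*t*xi0, t >= 0, meets the ball |x| <= radius.

    The ball is a hypothetical stand-in for the scatterer; xi0 must be nonzero.
    """
    xi_sq = sum(c * c for c in xi0)
    # Time t >= 0 at which the backward ray is closest to the origin:
    # minimise |x0 - 2*t*xi0|^2 over t, then clamp to t >= 0.
    t_star = max(0.0, sum(a * b for a, b in zip(x0, xi0)) / (2.0 * xi_sq))
    closest = [a - 2.0 * t_star * b for a, b in zip(x0, xi0)]
    return math.sqrt(sum(c * c for c in closest)) <= radius


def in_directly_incoming_set(x0, xi0, radius):
    """Toy membership test: the backward ray from (x0, xi0) never meets the ball."""
    return not hits_ball_backward(x0, xi0, radius)


if __name__ == "__main__":
    # Moving away from the ball: traced backwards, the ray passes through it.
    print(in_directly_incoming_set([3.0, 0.0], [1.0, 0.0], 1.0))
    # Moving towards the ball: traced backwards, the ray recedes from it.
    print(in_directly_incoming_set([3.0, 0.0], [-1.0, 0.0], 1.0))
```

In this toy picture, a point heading towards the ball belongs to the incoming set, while a point heading radially away from it does not; the latter is the kind of point on which an outgoing wave is allowed to carry mass.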
Proof of (9.25) Using Lemma 9.8 and the structure of µ I , we have we obtain By the first inclusion in (9.27), with this inequality expressing the fact that any parts of the scattered wave travelling in direction a must lie in Ω sc,R,a . Combining (9.36) with (9.37) yields Since Ω sc,R,a Ω ρ , there exists δ > 0 such that |Ω ρ | − |Ω sc,R,a | ≥ δ|Ω ρ |, and thus (9.35) and (9.38) imply that (9.25) holds; the proof is complete.
Remark 9.9 (What if impedance boundary conditions are imposed on Γ R ?) If the impedance boundary condition ∂ n u S − iku S = 0 is imposed on Γ R (as an approximation of DtN k ), then there are additional reflections on Γ R [84], [46, §2], µ S has support on the incoming set, and Lemma 9.8 no longer holds.
Remark 9.10 (Proving Theorem 9.1 in the trapping case) In the trapping case, $\|u(k)\|_{L^2(\Omega_R)}$ may no longer be uniformly bounded, as it is in Lemma 9.4, since (3.5) no longer holds with C sol bounded independently of k. If a subsequence of k's exists along which $\|u(k)\|_{L^2(\Omega_R)}$ is uniformly bounded, we may obtain a contradiction by the same argument as above by considering this subsequence. Thus, we can assume, without loss of generality, that $\|u(k)\|_{L^2(\Omega_R)} \to \infty$. Now instead of defining defect measures of u(k), one can define defect measures of $u(k)/\|u(k)\|_{L^2(\Omega_R)}$. If R is sufficiently large, then the bound in [19, Theorem 1.1] (i.e. the fact that the nontrapping cut-off resolvent estimate holds, even under trapping, if the supports of the cut-offs on both sides are sufficiently far away from the scatterer) implies that $v(k) := u(k)/\|u(k)\|_{L^2(\Omega_R)}$ satisfies (9.6). Any defect measure of v(k) is then immediately non-zero, since µ(χ 2 ) ≥ 1 for any χ with supp χ ⊃ B R . Lemma 9.7 goes through as before after multiplying both sides of (9.19) by $\|u(k)\|^{-2}_{L^2(\Omega_R)}$. The main change needed to the rest of the proof is to take into account the fact that a defect measure of $u_I(k)/\|u(k)\|_{L^2(\Omega_R)}$ is zero when $\|u(k)\|_{L^2(\Omega_R)}$ grows through the sequence k associated with that measure. In this situation, however, the bound (9.23) becomes µ(T * A) ≤ µ(T * A ∩ Γ + ); combining this with (9.24) we obtain µ(T * A) ≤ 2µ(T * Ω R ), from which the key bound (9.21) (and hence the result of the theorem) follows.
for all w h ∈ H h .
Proof Let ξ = S * (u − u h ); i.e. ξ is the solution of the variational problem: find ξ ∈ H such that a(v, ξ) = (v, u − u h ) L 2 (Ω R ) for all v ∈ H.
Then, by Galerkin orthogonality (7.6) and the definition of a (·, ·) (7.1), for We choose v h = P h ξ, and then use (in the following order) (i) the Galerkin orthogonality (7.6), (ii) continuity of a (·, ·), (iii) the bound (7.8), (iv) the upper bound in the norm equivalence (7.4) and the bound (7.7), and (v) the consequence (8.4) of the definition of η to obtain that, for all w h ∈ H h , the result then follows.
Remark 10.2 (Advantage of elliptic-projection over standard duality argument) Comparing (10.2) and (10.3) we see the advantage of the elliptic-projection argument over the standard duality argument: in (10.3), Galerkin orthogonality for a (·, ·) has allowed us to obtain u − w h (with w h arbitrary) as opposed to u − u h in the first argument of the sesquilinear form on the right-hand side, leading to the bound (5.3) instead of (5.2). The price for this is that we have an additional L 2 inner product on the right-hand side of (10.3), and controlling this leads to the condition (10.1).

Lemma 10.3
Assuming that the Galerkin solution u h to the variational problem (2.11) exists, if (10.1) holds, then where

Proof Since DtN k satisfies the inequality (3.4), and A and n satisfy the inequalities (2.1) and (2.2), a(·, ·) (2.8) satisfies the Gårding inequality Using Galerkin orthogonality (2.12) and continuity of a(·, ·) (10.4), we find that (5.1) holds for any v h ∈ H h . Using first the inequality (5.4) with $\alpha = \|u - u_h\|_{H^1_k(\Omega_R)}$, $\beta = C_{\rm cont}\|u - v_h\|_{H^1_k(\Omega_R)}$, ε = A min , and then Lemma 10.1, we find that if (10.1) holds, then, for any v h ∈ H h , By the consequence (3.11) of the definition of C int and the bound (3.6)/(9.1), (10.8) Choosing v h = I h u in (10.7), using (10.8), taking the square root and using the inequality $\sqrt{a^2 + b^2} \leq a + b$ for all a, b > 0, we find the result (10.5).
Proof (Proof of Theorem 4.1) Under the assumption that the Galerkin solution u h exists, the fact that the bound (4.2) holds under the condition (4.1) follows from combining Lemma 10.3 with the bound (8.5) on η. To prove that u h exists under the condition (4.1), recall that, since the variational problem (2.11) is equivalent to a linear system of equations in a finite-dimensional space, existence of a solution follows from uniqueness. Suppose that there exists a $\tilde{u}_h \in H_h$ such that $a(\tilde{u}_h, v_h) = 0$ for all v h ∈ H h ; to prove uniqueness, we need to show that $\tilde{u}_h = 0$. Let $\tilde{u}$ be such that $a(\tilde{u}, v) = 0$ for all v ∈ H, so that $\tilde{u}_h$ is the Galerkin approximation to $\tilde{u}$. Repeating the argument in the first part of the proof, we see that if the condition (4.1) holds then the bound (4.2) holds (with u replaced by $\tilde{u}$ and u h replaced by $\tilde{u}_h$). By Lemma 2.4, $\tilde{u} = 0$, so (4.2) implies that $\tilde{u}_h = 0$ and the proof is complete.
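The role of a condition of the form "h²k³ sufficiently small" can be seen in a small experiment on the 1-d model problem recalled in the introduction. The sketch below is an illustration only and is not the 2-d/3-d setting of Theorem 4.1. It assumes the problem −u'' − k²u = 0 on (0, 1) with u(0) = 1 and the impedance condition u'(1) = iku(1), whose exact solution is exp(ikx), and it assumes the weighted norm ‖v‖² = ‖v'‖² + k²‖v‖² in L²(0, 1) as a stand-in for (3.2). The Galerkin system for piecewise-linear elements on a uniform mesh is tridiagonal and homogeneous away from the Dirichlet node, so the code solves it by a backward three-term recurrence from x = 1 followed by a rescaling, rather than by a general linear solver. The function name `relative_error_p1_fem` is hypothetical.

```python
import cmath
import math

# 3-point Gauss-Legendre rule on the reference interval [0, 1]: (node, weight).
GAUSS = [
    (0.5 - 0.5 * math.sqrt(0.6), 5.0 / 18.0),
    (0.5, 8.0 / 18.0),
    (0.5 + 0.5 * math.sqrt(0.6), 5.0 / 18.0),
]


def relative_error_p1_fem(k, n_elements):
    """Relative weighted-H1 error of the P1 Galerkin solution on a uniform mesh."""
    h = 1.0 / n_elements
    # Entries of (stiffness - k^2 * mass) for hat functions on a uniform mesh.
    off = -1.0 / h - k * k * h / 6.0
    diag = 2.0 / h - 2.0 * k * k * h / 3.0
    diag_last = 1.0 / h - k * k * h / 3.0 - 1j * k  # includes the impedance term

    # Backward recurrence: fix u_N = 1, satisfy every Galerkin equation,
    # then rescale so that the Dirichlet condition u_0 = 1 holds.
    u = [0j] * (n_elements + 1)
    u[n_elements] = 1.0 + 0j
    u[n_elements - 1] = -diag_last * u[n_elements] / off
    for j in range(n_elements - 1, 0, -1):
        u[j - 1] = -(diag * u[j] + off * u[j + 1]) / off
    scale = u[0]
    u = [value / scale for value in u]

    # Error against exp(ikx) in the norm ||v'||^2 + k^2 ||v||^2, by quadrature.
    err_sq = 0.0
    for j in range(n_elements):
        slope = (u[j + 1] - u[j]) / h
        for t, w in GAUSS:
            x = (j + t) * h
            exact = cmath.exp(1j * k * x)
            approx = u[j] * (1.0 - t) + u[j + 1] * t
            err_sq += w * h * (abs(1j * k * exact - slope) ** 2
                               + k * k * abs(exact - approx) ** 2)
    return math.sqrt(err_sq / (2.0 * k * k))  # the exact solution has norm^2 = 2 k^2


if __name__ == "__main__":
    for k in (10.0, 40.0, 160.0):
        n_fixed_hk = int(round(10 * k))            # hk = 0.1
        n_fixed_h2k3 = int(round(k ** 1.5 / 0.3))  # h^2 k^3 roughly 0.09
        print(k, relative_error_p1_fem(k, n_fixed_hk),
              relative_error_p1_fem(k, n_fixed_h2k3))
```

Printing the two columns for increasing k gives a quick way to compare a mesh with hk fixed against a mesh with h²k³ fixed; the first choice is the one affected by the pollution effect described in the introduction.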
Proof (Proof of Theorem 4.2) This is very similar to the proof of Theorem 4.1, except that we use the bound (8.6) on η(H h ) instead of (8.5).