1 Introduction

The numerical simulation of high frequency (short wavelength) acoustic and electromagnetic scattering is challenging because of the need to capture the rapid oscillations in the wave field. Conventional Boundary Element Methods (BEMs), based on piecewise-polynomial basis functions, are computationally expensive because they require a fixed number of degrees of freedom (DOFs) per wavelength to achieve accurate solutions. This leads to very large (dense) BEM matrices which are costly to store and invert.

By contrast, hybrid numerical-asymptotic (HNA) BEMs use basis functions built from piecewise polynomials on coarse meshes multiplied by certain oscillatory functions, chosen based on partial knowledge of the high frequency asymptotic solution behaviour, as described by Geometrical Optics (GO) and the Geometrical Theory of Diffraction (GTD) [4, 41, 43]. The goal of the HNA approach (reviewed in [7]) is to achieve a fixed accuracy of approximation using a number of DOFs that is relatively small and frequency-independent, or only modestly (e.g. logarithmically) frequency-dependent, making it easier to store and invert the BEM matrix at high frequencies. HNA BEMs have been successfully developed for scattering by impenetrable convex [12, 13, 33], nonconvex [10, 31] and curvilinear [44] polygons, two-dimensional screens and apertures [32], smooth convex two-dimensional [2, 5, 19,20,21] and three-dimensional [23] scatterers, three-dimensional rectangular screens [30], penetrable convex polygons [27, 28] and, recently, for certain multi-obstacle scattering problems [6].

The use of oscillatory bases in HNA methods leads to an essentially frequency-independent BEM system size. However, the BEM matrix entries now involve highly oscillatory singular integrals, which need to be evaluated efficiently if one is to achieve the “holy grail” of frequency-independent computational cost. The majority of HNA methods in the literature to date (e.g. [10, 12, 13, 19,20,21, 23, 28, 30,31,32,33, 44]) are based on Galerkin discretisations. This leads to provably stable approximations and, in many cases (e.g. [10, 12, 20, 21, 31,32,33]), provides the framework for a rigorous, frequency-explicit convergence analysis. But Galerkin testing produces high-dimensional oscillatory integrals, which complicates the development of fast quadrature routines.

In order to develop HNA methods for more practically relevant problems than those tackled so far (in particular, three-dimensional problems), it would be highly advantageous to be able to use a collocation or Nyström approach, for which numerical quadrature is simpler. Existing work in this direction includes for example the piecewise-constant collocation method for convex polygons in [1], the B-spline method for two-dimensional smooth convex scatterers in [25] (which is based on the earlier technical report [26]) and the closely related Nyström method in [5]. However, these approaches are not generally supported by rigorous stability and convergence analysis, a practical consequence of which being that it is not at all obvious how to determine collocation/quadrature point distributions leading to stable approximations. This question is particularly delicate when working with non-smooth scatterers and HNA approximation spaces built with overlapping meshes to represent different high frequency solution components (as in [1]), which can exhibit a high degree of numerical “redundancy” and hence severe ill-conditioning (see e.g. the discussion in [1, §2] and §5 below).

Our aim in this paper is to demonstrate that accurate and efficient collocation-based HNA methods can indeed be developed for high frequency scattering problems involving non-smooth scatterers. The key new idea, apparently not employed previously in HNA methods, is to stabilise the method using oversampling, and solve the resulting overdetermined collocation linear system in a weighted least-squares sense.

We present our oversampled collocation approach in the context of a specific model problem, namely high frequency two-dimensional scattering by colinear screens and apertures. For an illustration of the problem see Fig. 1. Such problems have numerous applications in acoustics, electromagnetics and water waves - for details we refer to the extensive reference list in [32], noting also the recent work on iterative Wiener-Hopf approaches to this problem in [49]. Our collocation HNA BEM for the screen problem uses the same high order HNA approximation space as the Galerkin method presented in [32], which is based on an hp refinement strategy, giving exponential accuracy with increasing polynomial degree (see Theorem 3 below). To stabilise our collocation method we take more collocation points than degrees of freedom, and solve the resulting rectangular (overdetermined) system in a weighted least-squares sense, using a truncated singular value decomposition. The one-dimensional oscillatory singular integrals appearing in the BEM matrix are evaluated accurately and efficiently using the numerical method of steepest descent [16, 17, 39], which is based on complex contour deformation, combined with generalised Gaussian quadrature [37], to handle the singularities without the need for mesh grading.

By a series of numerical experiments we demonstrate that our HNA collocation BEM can achieve a fixed solution accuracy in a computation time that is bounded (in fact, sometimes even decreasing) as the frequency tends to infinity. A key empirical observation (that we are currently unable to explain theoretically) is that the oversampling threshold, which governs the number of collocation points per DOF, does not need to increase with increasing frequency in order to maintain accuracy.

This paper has as its genesis the MSc dissertation [48] of the fourth author, in which a range of HNA approximation spaces and collocation point distribution strategies were compared for square collocation BEM matrices. Numerical instabilities encountered in [48] motivated the investigation of the oversampled collocation method presented here. The oscillatory integration in [48] was carried out using Filon quadrature (see e.g. [16, 38, 40]), combined with mesh grading to handle singularities (as in [18]).

We point out that our oversampled collocation approach differs slightly from that of the CHIEF method of [50]. The CHIEF method was developed for scattering by closed surfaces, and introduces extra collocation points in the interior of the scatterer to stabilise integral equation formulations that are ill-posed at certain values of k, corresponding to interior resonances. In the current paper we are introducing extra collocation points on the scatterer boundary, to stabilise an integral equation formulation for scattering by an open surface (in particular, the scatterer has no interior), which is well-posed for all k but suffers from ill-conditioning when approximated numerically.

The structure of the paper is as follows. In Sect. §2 we define the scattering problem and its boundary integral equation reformulation. In §3 we describe the HNA approximation space of [32], and recall the corresponding best approximation error estimate. In §4 we present our regularised collocation approach, and in §5 we discuss the philosophy behind the truncated SVD approach used to solve the least-squares problem. In §6 we outline our numerical quadrature scheme for evaluating the oscillatory and singular integrals appearing in the BEM system. In §7 we present a series of numerical experiments demonstrating the performance of our method, including an application to scattering by the middle-third Cantor set.

Fig. 1
figure 1

Real part of the total field for scattering by \(N_\varGamma =4\) colinear sound-soft screens constituting the order \(j=2\) prefractal of the middle-third Cantor set (for more details see §7.3). The incident wave propagates from top left to bottom right

2 Problem statement

We consider 2D time-harmonic acoustic scattering by a sound-soft (Dirichlet) screen

$$\begin{aligned} \varGamma =\{ (x_1,0)\in \mathbb {R}^2:x_1\in \widetilde{\varGamma }\}, \qquad \widetilde{\varGamma }= \bigcup _{j=1}^{N_{\varGamma }} (s_{2j-1},s_{2j}), \end{aligned}$$
(1)

a bounded and relatively open subset of \(\varGamma _\infty :=\{ {\mathbf{x}}=(x_1,x_2)\in \mathbb {R}^2:x_2=0\}\) comprising \(N_{\varGamma }\ge 1\) colinear disjoint open intervals, with \(0=s_1<s_2<\ldots <s_{2N_{\varGamma }}={{\,\mathrm{diam}\,}}{\varGamma }\). We assume that lengths have been nondimensionalised so that \(s_{2N_{\varGamma }}={{\,\mathrm{diam}\,}}{\varGamma }=1\). For each \(j=1,\ldots ,N_{\varGamma }\) we set \(\varGamma _j:=(s_{2j-1},s_{2j})\times \{0\}\subset \mathbb {R}^2\). The propagation domain is \(D:=\mathbb {R}^2\setminus \overline{\varGamma }\), and on \(\varGamma\) we define the normal vector \({\mathbf{n}}=(0,1)\).

As the incident wave we take \(u^i({\mathbf{x}}) :=\mathrm{e}^{\mathrm{i}k {\mathbf{x}}\cdot {\mathbf{d}}}\), where \(k>0\) is the nondimensional wavenumber and \({\mathbf{d}}=(d_1,d_2)\in \mathbb {R}^2\) is a unit direction vector. The boundary value problem (BVP) to be solved is: Find \(u\in C^2\left( D\right) \cap C(\mathbb {R}^2)\cap W^1_{\mathrm {loc}}(D)\) s.t.

$$\begin{aligned} \varDelta u+k^2u&= 0, \qquad \text{ in } D, \end{aligned}$$
(2)
$$\begin{aligned} u&= 0, \qquad \text{ on } \varGamma , \end{aligned}$$
(3)

with the scattered field \(u^s:=u-u^i\) satisfying the Sommerfeld radiation condition (see, e.g., [7, Equation (2.9)]). Here \(W^1_\mathrm{loc}(D)\) is the usual space of locally square-integrable functions whose weak gradient is also locally square-integrable. Well-posedness of the BVP (2)–(3) is proved, for example, in [52]. For an example solution see Fig. 1.

Remark 1

By Babinet’s principle, the Dirichlet screen problem (2)–(3) is equivalent to its complementary aperture problem, in which \(u^i\) impinges on an aperture \(\varGamma\) in a sound-hard (Neumann) screen occupying \(\varGamma _\infty \setminus \overline{\varGamma }\). (For a precise definition of the aperture BVP see [32, Definition 1.2].) Explicitly, with \(U^\pm = \{{\mathbf{x}}\in \mathbb {R}^2:\pm x_2>0\}\), \(u^r({\mathbf{x}}) :=\mathrm{e}^{\mathrm{i}k {\mathbf{x}}\cdot {\mathbf{d}}'}\), \(d_2<0\), and \({\mathbf{d}}':=(d_1,-d_2)\), the screen and aperture solutions u and \(u'\) are connected by the formula [32, Equation (3.8)]

$$\begin{aligned} u'({\mathbf{x}})= {\left\{ \begin{array}{ll} u^r({\mathbf{x}}) + u({\mathbf{x}}), &{} {\mathbf{x}}\in U^+,\\ u^i({\mathbf{x}}) - u({\mathbf{x}}), &{} {\mathbf{x}}\in U^-. \end{array}\right. } \end{aligned}$$
(4)

Theorem 1 below reformulates the BVP (2)–(3) as an an integral equation for the normal derivative jump \([{\partial u}/{\partial {\mathbf{n}}}]:=({\partial u}/{\partial {\mathbf{n}}})^{+} - ({\partial u}/{\partial {\mathbf{n}}})^{-}\), where \(^\pm\) respectively denote the values on the top (\(+\)) and bottom (−) of \(\varGamma\). We first clarify some notation (for more detail see [32]). For \(s\in \mathbb {R}\), we denote by \(H^{s}(\mathbb {R})\) the usual Sobolev space on \(\mathbb {R}\), which, following [8, 32], we equip with the wavenumber-dependent norm

$$\begin{aligned} \Vert u\Vert _{H_k^{s}(\mathbb {R})}^{2} :=\int _{-\infty }^\infty (k^{2}+\xi ^{2})^{s}\,|\hat{u}(\xi )|^{2}\,\mathrm{d}\xi , \end{aligned}$$
(5)

where \(\hat{u}\) denotes the Fourier transform of u. We set \(\widetilde{H}^{s}(\widetilde{\varGamma }):=\overline{C^\infty _0(\widetilde{\varGamma })}^{H^{s}(\mathbb {R})}\), equipped with the norm \(\Vert u\Vert _{H_k^{s}(\mathbb {R})}\) inherited from \(H^s(\mathbb {R})\), and \(H^s(\widetilde{\varGamma }):=\{u=U|_{\widetilde{\varGamma }}:U\in H^s(\mathbb {R})\}\), equipped with the norm \(\Vert u\Vert _{H^s_k(\widetilde{\varGamma })} = \inf \Vert U\Vert _{H^s_k(\mathbb {R})}\), where the infimum is taken over all \(U\in H^s(\mathbb {R})\) such that \(u=U|_{\widetilde{\varGamma }}\). We identify \(\varGamma \subset \varGamma _\infty\) with \(\widetilde{\varGamma }\subset \mathbb {R}\) in the natural way and define \(H^s(\varGamma _\infty ):=H^s(\mathbb {R})\), \(\widetilde{H}^s(\varGamma ):=\widetilde{H}^s(\widetilde{\varGamma })\), \(H^s(\varGamma ):=H^s(\widetilde{\varGamma })\) etc. We denote by \(\mathcal {S}:\widetilde{H}^{-1/2}(\varGamma )\rightarrow C^2(D)\cap W^1_{loc}(\mathbb {R}^2)\) and \(S:\widetilde{H}^{-1/2}(\varGamma )\rightarrow H^{1/2}(\varGamma )\) the standard single-layer potential and single-layer boundary integral operator, which for \(\phi \in L^2(\varGamma )\) have the integral representations

$$\begin{aligned} \mathcal {S}\phi ({\mathbf{x}})&= \int _\varGamma \varPhi ({\mathbf{x}},{\mathbf{y}}) \phi ({\mathbf{y}})\, \mathrm{d}s({\mathbf{y}}), \qquad {\mathbf{x}}\in \mathbb {R}^2,\\ S\phi ({\mathbf{x}})&= \int _\varGamma \varPhi ({\mathbf{x}},{\mathbf{y}}) \phi ({\mathbf{y}})\, \mathrm{d}s({\mathbf{y}}), \qquad {\mathbf{x}}\in \varGamma , \end{aligned}$$

where \(\varPhi ({\mathbf{x}},{\mathbf{y}}) := (\mathrm{i}/4)H_0^{(1)}(k|{\mathbf{x}}-{\mathbf{y}}|)\) is the fundamental solution of (2).

Theorem 1

[52, Theorem 1.7], [32, Theorem 3.1] Suppose that u is a solution of the BVP (2)–(3). Then the representation formula

$$\begin{aligned} u({\mathbf{x}})=u^i({\mathbf{x}}) -\mathcal {S}v({\mathbf{x}}), \qquad {\mathbf{x}}\in D, \end{aligned}$$
(6)

holds, with \(v=[{\partial u}/{\partial {\mathbf{n}}}]\in \widetilde{H}^{-1/2}(\varGamma )\). Furthermore, v satisfies the integral equation

$$\begin{aligned} S v=f, \end{aligned}$$
(7)

with \(f:=u^i|_\varGamma \in H^{1/2}(\varGamma )\). Conversely, suppose that \(v\in \widetilde{H}^{-1/2}(\varGamma )\) satisfies (7). Then u defined by (6) satisfies the BVP (2)–(3), and \([{\partial u}/{\partial {\mathbf{n}}}]=v\).

We note that (7) is well-posed for all \(k>0\), with no spurious frequencies. Indeed, the operator S is coercive (strongly elliptic) on \(\widetilde{H}^{-1/2}(\varGamma )\) [8].

3 Hybrid numerical-asymptotic approximation space

Our hp HNA BEM approximation space for solving the BIE (7) is identical to that considered in [32]. Its design is motivated by the following regularity result.

Theorem 2

[32, Theorem 4.1, Lemma 4.5] The solution \(v=[{\partial u}/{\partial {\mathbf{n}}}]\) of the BIE (7) can be decomposed on each screen component \(\varGamma _j\), \(j=1,\ldots ,N_{\varGamma }\), as

$$\begin{aligned} {v({\mathbf{x}}(s))} = \varPsi ({\mathbf{x}}(s))+ v_j^+(s-s_{2j-1})\mathrm{e}^{\mathrm{i}ks} +v_j^-(s_{2j}-s) \mathrm{e}^{-\mathrm{i}ks},&\quad s\in (s_{2j-1},s_{2j}), \end{aligned}$$
(8)

where \(\varPsi :=-2{{\,\mathrm{sgn}\,}}(d_2){\partial u^i}/{\partial {\mathbf{n}}}\), and the functions \(v_j^\pm (s)\) are analytic in \(\mathfrak {R}{s}>0\), and, for each \(k_0>0\), there exists a constant \(C>0\), depending only on \(k_0\), such that

$$\begin{aligned} |v_j^\pm (s)|\le C (1+k)k|ks|^{-1/2}, \qquad {\mathfrak {R}{s}}>0, \quad k\ge k_0. \end{aligned}$$
(9)

The known term \(\varPsi\) constitutes the “physical optics” approximation, describing the contribution to \(v=[{\partial u}/{\partial {\mathbf{n}}}]\) of the incident and specularly reflected waves. The bounds (9) imply that the unknown functions \(v_j^\pm\), which correspond to the amplitudes of the diffracted waves, are non-oscillatory as a function of s (see [32, Remark 4.2]), which means they can be approximated much more efficiently than the full (oscillatory) solution v. Hence, rather than approximating v itself using piecewise polynomials on a fine (k-dependent) mesh (as in conventional BEMs), our HNA approximation space uses the decomposition (8), with \(\varPsi\) evaluated analytically and \(v_j^+\) and \(v_j^-\) replaced by piecewise polynomials on coarse (k-independent) meshes, graded to account for the singularities of \(v_j^+\) at \(s_{2j-1}\), and of \(v_j^-\) at \(s_{2j}\).

Fig. 2
figure 2

Illustration of the overlapping graded meshes used to approximate the amplitudes \(v_j^\pm\) in (8), in the case where \(\varGamma\) comprises two components, \(\varGamma _1\) and \(\varGamma _2\)

To describe the meshes we use, we first define a geometrically graded mesh \(\mathcal {M}\) on the interval [0, 1] with l elements and grading parameter \(\sigma \in (0,1/2)\), and an associated space of piecewise polynomials \(\mathcal {P}_{{\mathbf{p}}}(\mathcal {M})\), by

$$\begin{aligned}&{\mathcal {M}}&:=\{x_i\}_{i=0}^l, \qquad x_0=0,\,\,x_i=\sigma ^{l-i}, \,i=1,\ldots ,l,\\&{\mathcal {P}}_{{\mathbf{p}}}({\mathcal {M}})&:=\left\{ \rho :[0,1]\rightarrow {\mathbb {C}}:{\rho }|_{\left( x_{i-1}, x_i\right) } \text { is a polynomial of degree }\le ({{\mathbf{p}}})_i,\, i=1,\ldots ,l\right\} , \end{aligned}$$

where the degree vector \({\mathbf{p}}\in \mathbb {N}_0^{l}\) is defined for a given maximum degree \(p>0\) by

$$\begin{aligned} (\mathbf{{p}})_i:= {\left\{ \begin{array}{ll} p- \left\lfloor \frac{l+1-i}{l}p\right\rfloor ,&{} 1\le i\le l-1,\\ p, &{} i=l.\\ \end{array}\right. } \end{aligned}$$
(10)

Then for each screen segment \(\varGamma _j\) of length \(L_j:=s_{2j}-s_{2j-1}\), we define the meshes

$$\begin{aligned} \mathcal {M}^-_{j} := s_{2j-1} + L_j\mathcal {M},\qquad \mathcal {M}^+_{j} := s_{2j} - L_j\mathcal {M}, \end{aligned}$$

and denote by \({\mathcal {P}}_{{\mathbf{p}}}(\mathcal {M}_j^\pm )\) the associated spaces of piecewise polynomials, defined analogously to \({\mathcal {P}}_{{\mathbf{p}}}(\mathcal {M})\) above. For an illustration of the resulting meshes see Fig. 2. Finally, our HNA approximation space for approximating the difference \(v-\varPsi\) is

$$\begin{aligned} V_N:= \bigoplus _{j=1}^{N_{\varGamma }}\left[ \mathcal {P}_{{\mathbf{p}}}(\mathcal {M}_j^+)\mathrm{e}^{\mathrm{i}k s} + \mathcal {P}_{{\mathbf{p}}}(\mathcal {M}_j^-)\mathrm{e}^{-\mathrm{i}k s}\right] , \end{aligned}$$
(11)

with dimension

$$\begin{aligned} N = 2N_{\varGamma }\sum _{i=1}^{l}[({\mathbf{p}})_i+1]. \end{aligned}$$

We then have the following best approximation error estimate.

Theorem 3

[32, Theorem 5.1] Let \(k_0>0\) and \(l\ge cp\) for some constant \(c>0\). Then for every \(\delta >0\) there exists a constant \(C>0\), depending only on \(\delta\), \(\sigma\), \(N_{\varGamma }\)and c, and a constant \(\tau >0\), depending only on \(\delta\), \(\sigma\)and c, such that

$$\begin{aligned} \inf _{{\eta }\in V_N}\Vert (v-\varPsi )-{\eta }\Vert _{\widetilde{H}_k^{-1/2}(\varGamma )} \le C (1+k) k^\delta \,\mathrm{e}^{-p\tau }, \qquad k\ge k_0. \end{aligned}$$
(12)

Theorem 3 states that \(V_N\) can approximate \(v - \varPsi\) with an error which decreases exponentially as the maximum polynomial degree \(p\) tends to infinity, with a k-independent exponent. While the constant premultiplying the exponential factor grows with increasing k, it does so only modestly (like \(O(k^{1+\delta }\))), meaning that we can achieve any specified accuracy of approximation with \(p\) growing only logarithmically in k as \(k\rightarrow \infty\). This corresponds to the number of degrees of freedom N growing like \(\log ^2{k}\) as \(k\rightarrow \infty\). We shall see later that this logarithmic growth appears unnecessary in practice.

4 Collocation method

The difference

$$\begin{aligned} \phi := v - \varPsi \end{aligned}$$

satisfies the BIE

$$\begin{aligned} S\phi = f - S\varPsi . \end{aligned}$$
(13)

Let \(\{\varphi _n\}_{n=1}^N\) denote a basis for \(V_N\). In our implementation, each basis function \(\varphi _n\) consists of an exponential \(\mathrm{e}^{\mathrm{i}ks}\) or \(\mathrm{e}^{- \mathrm{i}ks}\) multiplied by an appropriately scaled and translated Legendre polynomial supported on a single element of one of the meshes \(\mathcal {M}_j^+\) or \(\mathcal {M}_j^-\) for some \(j\in \{1,\ldots ,N_\varGamma \}\), normalised so that \(\Vert \varphi _n\Vert _{L^2(\varGamma )}=1\).

To select an element \(\sum _{n=1}^N a_n\varphi _n \in V_N\) approximating \(\phi \in \tilde{H}^{-1/2}(\varGamma )\) we ask that the BIE (13), with \(\phi\) replaced by \(\sum _{n=1}^N a_n\varphi _n \in V_N\), should hold at a set of M collocation points

$$\begin{aligned} \mathcal {C}_M:=\{c_1,\ldots ,c_M\}\subset \widetilde{\varGamma }=\bigcup _{j=1}^{N_{\varGamma }}(s_{2j-1},s_{2j}). \end{aligned}$$

As we shall see, for reasons of stability it is useful to oversample, taking \(M> N\). The resulting system of linear equations for the coefficients \(a_n\), \(n=1,\ldots ,N\), is then overdetermined, so we seek a weighted least-squares solution. Explicitly, given a set of weights \(\mathcal {W}:=\{w_m\}_{m=1}^M\) corresponding to the collocation points \(\mathcal {C}_M\) (our choice of weights will be detailed below), we define \(A\in \mathbb {C}^{M\times N}\) and \({\mathbf{b}}\in \mathbb {C}^M\) by

$$\begin{aligned} A_{m,n} := \sqrt{w_m}S\varphi _n(c_m),\quad b_m := \sqrt{w_m}\left( f(c_m)-S\varPsi (c_m)\right) , \quad&m=1,\ldots ,M, \nonumber \\ {}&n=1,\ldots , N, \end{aligned}$$
(14)

and then seek \({\mathbf{a}}=(a_1,\ldots ,a_n)^T\in \mathbb {C}^N\) minimising the residual

$$\begin{aligned} \mathcal {R}:=\Vert A{\mathbf{a}}- {\mathbf{b}}\Vert _{2}. \end{aligned}$$
(15)

This problem is highly ill-conditioned, but can be regularised using a truncated singular value decomposition (SVD), as in e.g. [22, 29, 46]. The full SVD of A takes the form

$$\begin{aligned} A= U\varSigma V^*, \end{aligned}$$

where U and V are unitary matrices, \(V^*\) denotes the conjugate transpose of V, and \(\varSigma\) is an \(M\times N\) diagonal matrix. Denote by \(\sigma _n\) the nth entry of the diagonal of \(\varSigma\). To regularise, we introduce a small threshold \(\epsilon >0\), and define a modified diagonal matrix \(\varSigma ^{\epsilon }\), which has nth entry equal to \(\sigma _n\) if \(\sigma _n/ \max _{n'=1,\ldots ,N}\sigma _{n'}>\epsilon\), and zero otherwise. This provides a regularisation

$$\begin{aligned} A^{\epsilon } = U\varSigma ^{\epsilon } V^*, \end{aligned}$$
(16)

from which we can define \({\mathbf{a}}\) via a pseudo-inverse as

$$\begin{aligned} {\mathbf{a}}= (A^\epsilon )^\dag {\mathbf{b}}= V(\varSigma ^\epsilon )^\dag U^*{\mathbf{b}}, \end{aligned}$$
(17)

where \((\varSigma ^\epsilon )^\dag\) is the (diagonal) pseudo-inverse of \(\varSigma ^{\epsilon }\) with entries \(\sigma ^\dag _n\), defined such that \(\sigma ^\dag _n=1/\sigma _n\) if \(\sigma _n/\max _{n'=1,\ldots ,N}\sigma _{n'}>\epsilon\), and \(\sigma ^\dag _n=0\) otherwise.

Regarding the placement of collocation points, while we have no theoretical results to guide us, as a result of extensive numerical experiments (including those carried out for square systems (\(M=N\)) in [48]), we arrived at the following prescription.

We first select an oversampling threshold \(C_\mathrm {OS}>1\). For each \(j\in \{1,\ldots ,N_\varGamma \}\) let \(s_{2j-1}=x^\pm _{j,0}<x^\pm _{j,1}<\cdots < x^\pm _{j,l} = s_{2j}\) denote the points of the overlapping meshes \(\mathcal {M}_j^\pm\) on \(\varGamma _j\) (cf. Fig. 2). On each element \([x^\pm _{j,i-1},x^\pm _{j,i}]\) we allocate \(M^\pm _{j,i}:={\lceil {C_\mathrm {OS}(p^\pm _{j,i} + 1)}\rceil }\) collocation points, where \(p^\pm _{j,i}\) is the largest polynomial degree included on \(V_N\) on that element. For the elements \([x^+_{j,0},x^+_{j,1}]\), ...\([x^+_{j,l-2},x^+_{j,l-1}]\) and \([x^-_{j,1},x^-_{j,2}]\), ...\([x^-_{j,l-1},x^-_{j,l}]\) we place the collocation points at the first kind Chebyshev nodes within that element. If we were to follow this same prescription for the largest elements, \([x^+_{j,l-1},x^+_{j,l}]\) and \([x^-_{j,0},x^-_{j,1}]\), the overlapping nature of the meshes would mean that collocation points could coincide with, or lie very close to other collocation points already selected on the smaller elements. To avoid this, for these largest elements we allocate \(M^+_{j,1}+M^-_{j,l}=2{\lceil {C_\mathrm {OS}(p + 1)}\rceil }\) collocation points at the first kind Chebyshev nodes in the interval \([x^+_{j,l-1},x^-_{j,1}]\) (the intersection of the two largest elements; recall that \(0<\sigma <1/2\) so this intersection is non-trivial). The total number of collocation points is then

$$\begin{aligned} M = \sum _{j=1}^{N_\varGamma }\sum _{\pm }\sum _{i=1}^{l} M^\pm _{j,i} =\sum _{j=1}^{N_\varGamma }\sum _{\pm }\sum _{i=1}^{l} \lceil {C_\mathrm {OS}(p^\pm _{j,i} + 1)}\rceil \approx C_\mathrm {OS} N, \end{aligned}$$
(18)

and our prescription ensures that

$$\begin{aligned} M \ge C_\mathrm {OS}N, \quad \text {with equality if } C_\mathrm {OS} \text { is an integer.} \end{aligned}$$
(19)

In (14) we choose weights \(\{w_m\}_{m=1}^M\) that are inversely proportional to the approximate local density of collocation points. Explicitly, suppose that [ab] is a mesh element on which we have allocated \(M_{[a,b]}\) collocation points at the first-kind Chebyshev nodes

$$\begin{aligned} c_\ell = (a+b)/2 + (b-a)\cos (\pi (\ell -1/2)/M_{[a,b]})/2,\quad \ell =1,\ldots ,M_{[a,b]}. \end{aligned}$$

Then to the points \(\{c_\ell \}\) we assign the weights \(\{w_\ell \}\) defined by

$$\begin{aligned}&w_\ell = (c_{\ell +1}-c_{\ell })/2 + (c_{\ell }-c_{\ell -1})/2,\quad \text {for }\ell =2,\ldots M_{[a,b]}-1,\nonumber \\&w_1 = (c_{2}-c_{1})/2 + (c_{1}-a),\nonumber \\&w_{M_{[a,b]}} = (b-c_{M_{[a,b]}})+(c_{M_{[a,b]}}-c_{M_{[a,b]}-1})/2. \end{aligned}$$
(20)

We motivate this choice of weights in §5, just after (22).

5 Discussion about the SVD solver

In this section we elaborate on the motivation for considering oversampling in combination with a truncated SVD in the solution method (17). It is known from the earlier analysis in [32] – recall Theorem 3 in this paper – that the best approximation in the HNA space \(V_N\) converges to \(v-\varPsi\) at an exponential rate. Briefly, oversampling and regularization ensure that a near-best approximation can also be computed numerically, in spite of potential ill-conditioning of the linear system (14).

One obvious source of ill-conditioning of the discretization is that the basis functions \(\varphi _N\) of the approximation space (11) may be close to being linearly dependent. The underlying reason is that, loosely speaking, in our discretization on each segment \(\varGamma _j\) we are combining two bases together. The impact of this potential redundancy depends on the values of the wavenumber k and the degree vector \({\mathbf{p}}\). Consider, for example, the case of small k: then the functions in the space \(\mathcal {P}_{{\mathbf{p}}}(\mathcal {M}_j^+)\mathrm{e}^{\mathrm{i}k s}\) are smooth and non-oscillatory on \(\varGamma _j\), and they may approximate the functions in the space \(\mathcal {P}_{{\mathbf{p}}}(\mathcal {M}_j^-)\mathrm{e}^{-\mathrm{i}k s}\). In the case of fixed k and a degree vector \({\mathbf{p}}\) associated with a large maximum degree \(p\) in (10), we can make a similar observation: the large degree polynomials in \(\mathcal {P}_{{\mathbf{p}}}(\mathcal {M}_j^\pm )\) may resolve the oscillations of \(\mathrm{e}^{\pm \mathrm{i}k s}\).

For linear systems with a numerical null-space, the truncated SVD solution satisfies a specific algebraic minimization property:

Lemma 1

[15, Lemma 3.1] Let \({\mathbf{a}}\) be computed by the regularized pseudo-inverse (17) with relative threshold \(\epsilon > 0\). Then

$$\begin{aligned} \Vert {\mathbf{b}}- A {\mathbf{a}}\Vert _2 \le \inf _{{\mathbf{v}}\in \mathbb {C}^N} \left\{ \Vert {\mathbf{b}}- A {\mathbf{v}}\Vert _2 + \epsilon \,\Vert A\Vert _2\,\Vert {\mathbf{v}}\Vert _2 \right\} . \end{aligned}$$
(21)

Lemma 1 shows that the regularized pseudo-inverse (17) yields a solution with small residual, provided that a coefficient vector \({\mathbf{v}}\) exists with small residual and sufficiently small norm \(\Vert {\mathbf{v}}\Vert _2\). Note that the size of the coefficients is affected by the normalization of the basis functions, and recall from §3 that for our problem we have normalized the basis functions in \(L^2(\varGamma )\). For us, the existence of suitable vectors that result in a small right-hand side in (21) is suggested by Theorem 2 and Theorem 3. Thus, ill-conditioning of A due to redundancy in the discretization does not preclude the computation of highly accurate solutions.

However, the statement of Lemma 1 is purely algebraic, and a small discrete Euclidean residual of (14) does not imply a small continuous residual of the integral equation (13) in \(H^{1/2}(\varGamma )\), from which we could infer (by the bounded invertibility of \(S:\tilde{H}^{-1/2}(\varGamma ) \rightarrow H^{1/2}(\varGamma )\)) a small approximation error of the continuous solution of (13) in \(\tilde{H}^{-1/2}(\varGamma )\).

In the simpler \(L^2\) setting one can appeal to the results on function approximations using redundant sets and a regularizing SVD solver with threshold \(\epsilon\) presented in [35, 36]. No guarantees can be given about the accuracy of interpolation, with errors possibly as large as \(1/\epsilon\) [35, Proposition 4.6]. However, oversampling by a ‘sufficient’ amount renders the approximation problem well-conditioned [36, § 6]. In particular, if functions in a space G are sampled using a family of functionals \(\{ \ell _{M,m} \}_{m=1}^M\), with \(\ell _{M,m} : G \rightarrow \mathbb {C}\), then ideally the samples should be sufficiently ‘rich’ to recover any function in G:

$$\begin{aligned} A' \Vert g \Vert ^2 \le \liminf _{M \rightarrow \infty } \sum ^{M}_{m=1} | \ell _{m,M}(g) |^2 \le \limsup _{M \rightarrow \infty } \sum ^{M}_{m=1} | \ell _{m,M}(g) |^2 \le B' \Vert g \Vert ^2,\quad \forall g \in G. \end{aligned}$$
(22)

The choice to weight matrix entries with \(\sqrt{w}_m\) in (14), and the specific choice (20) of the weights, yields \(A'=B'=1\) for \(G \subset L^2(\varGamma )\), because the discrete sums are a Riemann sum for the \(L^2\) norm.

Generalising these results to the \(\tilde{H}^{-1/2}-H^{1/2}\) setting required for the rigorous analysis of our collocation BEM remains an open problem. Therefore, aside from the motivating observations presented in this section, our choice of collocation points, and of a linear oversampling rate \(M \approx C_{\mathrm {OS}} N\) with a proportionality constant \(C_{\mathrm {OS}}\) that is independent of the wavenumber, is based largely on numerical experiments, results of which we report in §7 (and see also [48], where a number of different collocation point allocations were compared for square systems without oversampling).

6 Oscillatory quadrature

Our HNA approximation space uses oscillatory basis functions supported on large (k-independent) mesh elements. This means that assembling the discrete system (14) involves calculating highly oscillatory integrals, which may also be singular.

By applying linear changes of variable and possibly splitting the integration range, the integrals arising in (14) can all be written in terms of integrals of the general form

$$\begin{aligned} I[F;k,T,\alpha ] = \int _0^T F(t; k)\mathrm{e}^{\mathrm {i}k \alpha t}\mathrm{d}t, \end{aligned}$$
(23)

where \(k>0\) is the wavenumber, \(T>0\), \(\alpha \in [0,2]\), and \(F(\cdot ;k)\) is smooth and non-oscillatory on (0, T) but possibly logarithmically singular at, or near, the left endpoint \(t=0\). More explicitly, we shall assume that, for some \(t_0\in (-\infty ,0]\), \(F(\cdot ;k)\) is analytic in the cut-plane \(\mathbb {C}\setminus (-\infty ,t_0]\), with at most polynomial growth as \(|t|\rightarrow \infty\), and that there exists \(\tilde{c}>0\) and two functions \(F_0(t;k)\) and \(F_1(t;k)\), analytic in \(k|t-t_0|< \tilde{c}\), such that

$$\begin{aligned} F(t;k)= F_0(t;k) + F_1(t;k)\log (k|t-t_0|), \quad \text {for } k|t-t_0|< \tilde{c}. \end{aligned}$$
(24)

We explain the transformation of the integrals in (14) to the form (23) in §6.1. Then in §6.2 we outline our method for efficiently calculating integrals of the form (23).

6.1 Transformation to the general form (23)

The left-hand side of the system (14) consists of integrals of the form

$$\begin{aligned} S\varphi _n(c_m)=&{\frac{\mathrm {i}}{4}\int _{a}^{b}H^{(1)}_0(k|c_m-s|)}{\mathcal {L}_{q,[a,b]}(s)\mathrm{e}^{\pm \mathrm {i}k s }}\mathrm{d}{s}, \nonumber \\ =&\frac{\mathrm {i}}{4}\int _{a}^{b}{\frac{ H^{(1)}_0(k|c_m-s|)}{\mathrm{e}^{\mathrm {i}k|c_m-s|}}}{\mathcal {L}_{q,[a,b]}(s)}\mathrm{e}^{\mathrm {i}k (|c_m-s|\pm s)}\mathrm{d}{s}, \end{aligned}$$
(25)

where \(\varphi _n=\mathcal {L}_{q,[a,b]}(s)\mathrm{e}^{\pm \mathrm {i}k s }\) is the nth basis function, \(\mathcal {L}_{q,[a,b]}\) is the \(L^2\)-normalised Legendre polynomial of some degree q on \([a,b]={{\,\mathrm{supp}\,}}{\varphi _n}\), and \(c_m\) is the mth collocation point. To obtain a non-oscillatory prefactor we have extracted the phase of the Hankel function in (25) using [47, (10.2.5)]. To express (25) in the form (23) we then proceed by cases. In each case below, the fact that the resulting prefactor satisfies (24) follows from the small argument behaviour of the Hankel function (see e.g. [47, (10.4.3),(10.2.2),(10.8.2)]).

  • Case A: If \(c_m\le a\) then \(|c_m-s|=s-c_m\) on [ab] and a translation \(s=a+t\) puts (25) in the form (23) with \(T=b-a\), \(\alpha =1\pm 1\) (i.e. 0 for \(+\) and 2 for −) and

    $$\begin{aligned} F(t;k) = \frac{\mathrm {i}}{4}\mathrm{e}^{\mathrm {i}k(a\pm a-c_m)}\frac{ H^{(1)}_0(k(a+t-c_m))}{\mathrm{e}^{\mathrm {i}k(a+t-c_m)}}{\mathcal {L}_{q,[a,b]}(a+t)}, \end{aligned}$$

    which has a logarithmic singularity at \(t_0 = c_m-a\le 0\).

  • Case B: If \(c_m\ge b\) then the reflection \(t\mapsto -t\) puts us in Case A.

  • Case C: If \(a<c_m<b\) then splitting the integral as \(\int _a^b = \int _a^{c_m} + \int _{c_m}^b\) produces two integrals satisfying the conditions of Cases B and A respectively.

The right-hand side of (15) involves the evaluation of

$$\begin{aligned} S {\varPsi }(c_m)=\frac{{|d_2|}k}{2}\sum _{j=1}^{N_{\varGamma }}\int _{s_{2j-1}}^{s_{2j}}\frac{ H^{(1)}_0(k|c_m-s|)}{\mathrm{e}^{\mathrm {i}k|c_m-s|}}\mathrm{e}^{\mathrm {i}k (|c_m-s|+d_1s)}\mathrm{d}{s}. \end{aligned}$$
(26)

Each integral in the sum (26) can be expressed in the form (23) by essentially the same procedure described above. For example, when \(c_m\le s_{2j-1}\) we can adapt the approach of Case A to write \(\int _{s_{2j-1}}^{s_{2j}}\) in the form (23) with \(T=L_j=s_{2j}-s_{2j-1}\), \(\alpha =1+d_1\), and \(F(t;k) = \frac{{|d_2|}k}{2}\mathrm{e}^{\mathrm {i}k(s_{2j-1}-c_m)}\frac{ H^{(1)}_0(k(s_{2j-1}+t-c_m))}{\mathrm{e}^{\mathrm {i}k(s_{2j-1}+t-c_m)}}\), which has a logarithmic singularity at \(t_0 = c_m-s_{2j-1}\le 0\).

6.2 Fast evaluation of (23) using numerical steepest descent with generalised Gaussian quadrature

Depending on the values of the parameters \(k,T,\alpha\) and \(t_0\), the integrand (23) may be highly oscillatory and/or numerically singular. In this section we describe an algorithm which can compute (23) efficiently in all cases. Our approach combines the method of numerical steepest descent (see e.g. [16, 17, 39]), which uses complex contour deformation to convert oscillatory integrals into rapidly decaying ones, with generalised Gaussian quadrature (see e.g. [34, 37]), which handles the singularities. Our method is an extension of the scheme presented in [34, §4.3], which was shown to accurately calculate singular oscillatory integrals, to the case of near-singularities, where the integrand has a singularity outside of, but close to the integration range.

Our algorithm involves two parameters \(c_\mathrm{osc}>0\) and \(0<c_\mathrm{sing}\le 1\), which are used to classify the integral (23) into one of four cases (described below). The parameter \(c_\mathrm{osc}\) represents the minimum number of oscillations in the exponential factor \(\mathrm{e}^{\mathrm{i}k \alpha t}\) there need to be over the interval [0, T] for us to consider the integral oscillatory (and apply numerical steepest descent). That is, we consider the integral oscillatory when \(k\alpha T/(2\pi )>c_\mathrm{osc}\) and non-oscillatory otherwise. The parameter \(c_\mathrm{sing}\) controls how close to the integration range the singularity at \(t_0\) needs to be for us to consider the integral singular (and apply generalised Gaussian quadrature). An important factor affecting the accuracy of numerical quadrature based on polynomial approximation for non-entire functions is the distance from the integration interval to the nearest singularity, measured relative to the length of the integration interval (e.g. [53, Theorem 19.3]). So, in the non-oscillatory case we consider the integral singular when \(|t_0|/T<c_\mathrm{sing}\). But when the integral is oscillatory this is an unnecessarily stringent condition, because as \(k\alpha \rightarrow \infty\) with T fixed the main contribution to the integral comes from small neighbourhoods of the endpoints \(t=0\) and \(t=T\) of size approximately \(O(1/(k\alpha ))\) (e.g. [16, (2.1)]). Hence, in the oscillatory case we consider the integral singular when \(|t_0|k\alpha /(2\pi c_\mathrm{osc}) <c_\mathrm{sing}\). The inclusion of \(c_\mathrm{osc}\) in the denominator here ensures our classifications are compatible, in the sense that in the borderline case \(k\alpha T/(2\pi )=c_\mathrm{osc}\), the transition between singular and non-singular cases occurs at \(|t_0|/T = {c}_\mathrm{sing}\) for both the oscillatory and non-oscillatory cases.

Our algorithm also depends on one further parameter, \(N_{\mathrm {Q}}\in \mathbb {N}\), which controls the number of quadrature points used. Explicitly, we use \(N_{\mathrm {Q}}\) points for each standard Gaussian quadrature rule and \(2N_{\mathrm {Q}}\) points for each generalised Gaussian quadrature rule (since generalised Gaussian quadrature requires twice as many points as standard Gaussian quadrature to achieve the same degree of exactness, see e.g. [37]).

We now present our algorithm, along with diagrams illustrating the contour deformations involved. Here the arrows indicate the direction along which each contour should be traversed, and the dashed line represents the original integration contour [0, T].

  • Case 1: \(k\alpha T/(2\pi ) \le c_\mathrm{osc}\) and \(|t_0|/T\ge c_\mathrm{sing}\) (non-oscillatory, non-singular). We evaluate (23) using standard Gauss–Legendre quadrature with \(N_{\mathrm {Q}}\) points.

  • Case 2: \(k\alpha T/(2\pi ) \le c_\mathrm{osc}\) and \(|t_0|/T< c_\mathrm{sing}\) (non-oscillatory, singular). We write (23) as the difference of two singular integrals

    figure a

    to which we apply generalised Gauss–Legendre quadrature with \(2N_{\mathrm {Q}}\) points for each integral. (When \(t_0=0\) the first integral is not present.)

  • Case 3: \(k\alpha T/(2\pi ) > c_\mathrm{osc}\) and \(|t_0|k\alpha /(2\pi c_\mathrm{osc}) \ge c_\mathrm{sing}\) (oscillatory, non-singular). We deform the integration contour into the complex plane, writing

    figure b

    justified by our assumption that \(k\alpha \ge 0\) and that \(F(\cdot ;k)\) grows only polynomially at infinity, meaning that the integrand of (23) decays exponentially as \(\mathrm{Im}\,t\rightarrow \infty\). Parametrizing the integrals by \(t=\mathrm{i}k\alpha \xi\) (respectively \(t=T+\mathrm{i}k\alpha \xi\)) gives

    $$\begin{aligned} \int _0^T = \frac{\mathrm{i}}{k \alpha }\int _0^\infty F\left( \frac{\mathrm {i}\xi }{k\alpha }; k\right) \mathrm{e}^{- \xi }\,\mathrm{d}\xi - \frac{\mathrm{i}\mathrm{e}^{\mathrm {i}k\alpha T}}{{k\alpha }}\int _0^\infty F\left( T + \frac{\mathrm {i}\xi }{\alpha k}; k\right) \mathrm{e}^{-\xi }\,\mathrm{d}\xi ; \end{aligned}$$
    (27)

    we evaluate both semi-infinite integrals using standard Gauss–Laguerre quadrature with \(N_{\mathrm {Q}}\) points for each integral.

  • Case 4: \(k\alpha T/(2\pi ) > c_\mathrm{osc}\) and \(|t_0|k\alpha /(2\pi c_\mathrm{osc})< c_\mathrm{sing}\) (oscillatory, singular). We write the integral as

    figure c

    evaluating the first integral using generalised Gauss–Legendre quadrature with \(2N_{\mathrm {Q}}\) points, the second using generalised Gauss–Laguerre quadratureFootnote 1 with \(2N_{\mathrm {Q}}\) points, and the third using standard Gauss–Laguerre quadrature with \(N_{\mathrm {Q}}\) points. (Again, when \(t_0=0\) the first integral is not present.) Here our assumption that \(c_\mathrm{sing}\le 1\) ensures that our treatment of the third integral \(\int _{T}^{T+\mathrm{i}\infty }\) as non-singular (being evaluated by standard, rather than generalised Gauss–Laguerre quadrature) is consistent with our classification of the two integrals in Case 3 as non-singular, in the sense that if \(k\alpha T/(2\pi ) > c_\mathrm{osc}\) then

    $$\begin{aligned} k\alpha (T-t_0)/(2\pi c_\mathrm{osc}) \ge k\alpha T/(2\pi c_\mathrm{osc}) > 1 \ge c_\mathrm{sing}. \end{aligned}$$

The total number of quadrature points (a proxy for computational cost) is \(N_{\mathrm {Q}}\) in Case 1, \(4N_{\mathrm {Q}}\) in Case 2, \(2N_{\mathrm {Q}}\) in Case 3, and \(5N_{\mathrm {Q}}\) in Case 4 (excluding special cases where \(t_0=0\)). Therefore, to minimise computational cost, we should take \(c_{\mathrm {osc}}\) as large as possible, and \(c_{\mathrm {sing}}\) as small as possible, so as to be in the cheapest Case 1 (non-oscillatory and non-singular) as often as possible. But obviously one cannot take \(c_{\mathrm {osc}}\) too large, or \(c_{\mathrm {sing}}\) too small, else oscillations (respectively, singularities) will not be adequately resolved. We investigate the choice of these parameters in more detail in the next section.

6.3 Choice of parameters

To validate the quadrature scheme described in §6.2 and tune the parameters \(c_{\mathrm {osc}}\), \(c_{\mathrm {sing}}\) and \(N_{\mathrm {Q}}\) we carried out a detailed set of experiments for the model integral

$$\begin{aligned} \int _0^1 H_0^{(1)}(k(t-t_0)) \,\mathrm{d}t, \end{aligned}$$
(28)

which is of the form (23) with \(\alpha =T=1\) and \(F(t;k)=H_0^{(1)}(k(t-t_0))\mathrm{e}^{-\mathrm {i}k t}\). For all possible combinations of \(c_{\mathrm {osc}}\in \{1,\ldots ,5\}\), \(c_{\mathrm {sing}}\in \{0,0.1,0.2,\ldots ,1\}\) and \(N_{\mathrm {Q}}\in \{5,10,15,20\}\), we computed the integral (28) using our algorithm at a large number of values of the parameters k and \(t_0\) in the range \(k\in (0,100]\), \(t_0\in [-1,0]\). For each set of parameters we measured the relative error compared to a reference value for (28), computed using a composite Gauss rule with a large number of quadrature points and mesh grading to handle (near) singularities. Based on numerical experiments we believe this reference solution to be accurate to at least 14 digits.

Based on our experiments, the choices

$$\begin{aligned} c_{\mathrm {osc}}=2, \quad c_{\mathrm {sing}}=0.5, \quad N_{\mathrm {Q}}=20 \end{aligned}$$
(29)

were found to produce a scheme which agrees with the reference solution to at least 12 digits, uniformly over the range of k and \(t_0\) tested, and these are the values we use in all our numerical results in §7. As evidence of this claim we show in Fig. 3a a plot of the corresponding relative error over the studied range of k and \(t_0\). The \((k,t_0)\) plane is divided into the four regions in which Cases 1,2,3 and 4 apply. The maximum relative error over the whole plot is \(1.6\times 10^{-13}\).

For comparison we show in Fig. 3b the corresponding plot for the choices \(c_{\mathrm {osc}}=4\), \(c_{\mathrm {sing}}=0.1\) and \(N_{\mathrm {Q}}=15\). The maximum relative error over the whole plot is now \(7.5\times 10^{-8}\), and one can clearly see the accuracy degrading as one moves to the right in region 1 (increasing oscillation) and downwards in regions 1 and 3 (approaching singularity).

Fig. 3
figure 3

Plots of \(\log _{10}\) of the relative error in computing (28) using the quadrature scheme of §6.2 for two different sets of quadrature parameters \(\{c_{\mathrm {osc}}=2, c_{\mathrm {sing}}=0.5, N_{\mathrm {Q}}=20\}\) and \(\{c_{\mathrm {osc}}=4, c_{\mathrm {sing}}=0.1, N_{\mathrm {Q}}=15\}\). In both plots the labels 1-4 indicate the regions of the \((k,t_0)\) plane in which Cases 1, 2, 3 and 4 apply

As we shall show in §7, the quadrature scheme in §6.2 with the parameter choices (29) is sufficiently efficient to achieve our goal of frequency independent computational cost for the HNA collocation BEM. However, we do not claim that the quadrature scheme is completely optimal. In particular, we note that the error in numerical steepest descent (without singularities) behaves like \(O((k\alpha )^{-2N_{\mathrm {Q}}-1})\) as \(k\alpha \rightarrow \infty\) (see e.g. [16, p90]), so in Cases 3 and 4 savings could be made by reducing \(N_{\mathrm {Q}}\) as \(k\alpha\) grows. But we reserve such further optimisation for future work.

7 Numerical results

In this section we demonstrate the computational efficiency of our HNA collocation BEM via a series of numerical examples. We also include the results of experiments exploring the influence of the parameters \(C_\mathrm{OS}\) and \(\epsilon\) in the truncated SVD solver. Finally we present an application of our method to the computation of high frequency scattering by high order prefractals of the middle-third Cantor set. The code used to produce the results in this section forms part of a larger Matlab-based HNA BEM software repository, which is downloadable from github.com/AndrewGibbs/HNABEMLAB.

Throughout this section we denote by \(v_p:=\phi _p+\varPsi\) our HNA BEM approximation to the solution v of the original BIE (7), where \(\phi _p\) denotes our approximation of the solution \(\phi\) of the BIE (13), the subscript p indicating the maximum polynomial degree used in our HNA approximation space \(V_N\), and \(\varPsi\) denotes the geometrical optics contribution. We denote by \(v^\pm _{j,p}\) the corresponding numerical approximations to the amplitudes \(v^\pm _j\) of the oscillatory functions \(\mathrm{e}^{\pm \mathrm {i}k s}\) in the decomposition (8). From the boundary solution we obtain an approximation

$$\begin{aligned} u^s_p:=-\mathcal {S}v_p \end{aligned}$$

to the scattered field \(u^s\) in D (cf. (6)), and an approximation

$$\begin{aligned} u^\infty _p(\theta ) :=-\int _{\tilde{\varGamma }}\mathrm{e}^{-\mathrm {i}ks\cos \theta } v_p(s)\mathrm{d}s \end{aligned}$$
(30)

to the far-field pattern

$$\begin{aligned} u^\infty (\theta ) :=-\int _{\tilde{\varGamma }}\mathrm{e}^{-\mathrm {i}ks\cos \theta } v(s)\mathrm{d}s, \end{aligned}$$
(31)

which describes the far-field behaviour of \(u^s\) via

$$\begin{aligned} u^s({\mathbf{x}})\sim u^\infty (\theta )\frac{\mathrm{e}^{\mathrm {i}(kr +\pi /4)}}{2\sqrt{2\pi k r}},\quad \text {for }{\mathbf{x}}= r(\cos \theta ,\sin \theta ),\quad \text {as }\Vert {\mathbf{x}}\Vert _2=:r\rightarrow \infty . \end{aligned}$$

Unless otherwise stated, in our numerical results we use the following parameters:

HNA space parameters3):

We use the mesh grading parameter \(\sigma =0.15\) and the number of layers \(\ell =2(p+1)\) for each graded mesh, as in related studies (e.g. [6, 32]).

Collocation parameters4and §5):

We use the oversampling parameter \(C_{\mathrm {OS}}=1.25\) and the SVD truncation parameter \(\epsilon =10^{-8}\). These values were chosen based on the experiments summarised in §7.2.

Quadrature parameters6):

We take \(c_{\mathrm {osc}}=2\), \(c_{\mathrm {sing}}=0.5\) and \(N_{\mathrm {Q}}=20\), as discussed in §6.3 (see (29)).

7.1 High frequency performance

To illustrate the high frequency performance of our HNA method we consider first the case where the screen consists of a single unit interval \(\varGamma =(0,1)\times \{0\}\). Examples with multiple screens will be considered in §7.3. As the incident wave direction we take \({\mathbf{d}}=(1,-1)/\sqrt{2}\). Figure 4a shows the resulting total field for wavenumber \(k=128\), and Fig. 4b, c plot the boundary solution \(v_8\), along with the magnitudes \(|v^\pm _{1,8}|\) of the non-oscillatory amplitudes of the oscillatory factors \(\mathrm{e}^{\pm \mathrm {i}k s}\), for both \(k=128\) and \(k=512\).

Fig. 4
figure 4

HNA collocation BEM solution for \(\varGamma =(0,1)\times \{0\}\) and incident direction \((1,-1)/\sqrt{2}\), with \(p=8\). a Real part of the total field \(u^i + u^s_8\) for \(k=128\). b, c Real part of the boundary solution \(v_8\), along with the amplitudes \(|v^\pm _{1,8}|\), for \(k=128\) and 512 respectively

In Fig. 5a we show the relative \(L^1(\varGamma )\) error

$$\begin{aligned} \text {Rel. }L^1\text { err. on }\varGamma :=\frac{\Vert v_p-v_{12}\Vert _{L^1(\varGamma )}}{\Vert v_{12}\Vert _{L^1(\varGamma )}} \end{aligned}$$
(32)

in the BEM solution for a range of values of p and k, using \(p=12\) as reference solution. For easier comparison with the best approximation theory (Theorem 3) it would be preferable to compute the \(\tilde{H}^{-1/2}(\varGamma )\) norm (which is possible via an appropriate single layer boundary integral representation, see e.g. [11, §7]), but here we choose the \(L^1(\varGamma )\) norm because it is easier to evaluate, and also because convergence of the boundary solution in \(L^1(\varGamma )\) implies, via a straightforward integral estimation, the convergence in supremum norm of the domain solution (on compact subsets of D) and the far-field pattern (cf. (35) below), which we study in Fig. 5c, d. We compute our \(L^1(\varGamma )\) norms by composite Gaussian quadrature with 20 Gauss points per element on a graded mesh, built by intersecting the overlapping meshes used to approximate \(v_{12}\), with additional subdivision of the larger elements to ensure that no element is longer than one wavelength \(2\pi /k\).

Figure 5a demonstrates that our approximation of the boundary solution converges exponentially (in \(L^1(\varGamma )\)) as we increase the polynomial degree p. Moreover, for fixed p the \(L^1\) error is not only bounded but actually decreases with increasing k.

Fig. 5
figure 5

Convergence with respect to increasing maximum polynomial degree p, for various wavenumbers k. a \(L^1\) error on \(\varGamma\) against p, b \(L^1\) error on \(\varGamma\) against CPU time, c, d \(L^\infty\) error in the near and far field against p

In Fig. 5b we present the same results, plotted against the fourth root of the total CPU time for the BEM assembly and solve. The rationale behind this comparison is as follows. Our approximation space \(V_N\) for this problem has dimension \(N=O(p^2)\) (maximal polynomial degree p over two overlapping meshes of \(2(p+1)\) elements each). And with a fixed oversampling parameter \(C_{\mathrm {OS}}=1.25\), the number of collocation points \(M\approx C_{\mathrm {OS}}N\) also scales like \(O(p^2)\). Hence the number of elements in the BEM matrix scales like \(MN=O(p^4)\) with increasing p. The linear systems we use are moderate in size (typically \(N\le 1000\)) and are rapidly solved using our truncated SVD approach. As a result, the majority of the CPU time is spent assembling the discrete system, which means that total CPU time scales like \(O(p^4)\). Therefore, to scrutinize the performance of our numerical quadrature scheme for evaluating the BEM matrix entries by a direct comparison with Fig. 5a, we should plot the boundary error against the fourth root of the CPU time. The results in Fig. 5b demonstrate that our quadrature scheme (based on numerical steepest descent) is both accurate and efficient, and results in a fully discrete BEM for which the error at a fixed computational cost is bounded (in fact, decreases) with increasing k. The timings in Fig. 5b are for a desktop PC with an Intel i7-4790 3.60GHz 4-core CPU, with matrix assembly done in parallel using a Matlab parfor loop. They show that highly accurate solutions can be obtained for essentially arbitrarily high frequencies in just a few seconds on a standard machine.

In Fig. 5c, d we plot the corresponding near-field errors

$$\begin{aligned} \text {Rel. }L^\infty \text { err. in near-field} :=\frac{\Vert u^s_p-u^s_{12}\Vert _{L^\infty (\mathcal {O})}}{\Vert u^s_{12}\Vert _{L^\infty (\mathcal {O})}}, \end{aligned}$$
(33)

where \(\mathcal {O}\) is the circle of radius one centred at (0.5, 0), and far-field errors

$$\begin{aligned} \text {Rel. }L^\infty \text { err. in far-field} :=\frac{\Vert u^\infty _p-u^\infty _{12}\Vert _{L^\infty (0,2\pi )}}{\Vert u^\infty _{12}\Vert _{L^\infty (0,2\pi )}}. \end{aligned}$$
(34)

As for the boundary solution, we observe exponential convergence as p increases with k fixed, and bounded errors as k increases for p fixed.

Fig. 6
figure 6

Analogue of Figs. 4a, 5c, d for the case of grazing incidence

In Fig. 6 we show analogous results for the case of grazing incidence, with \({\mathbf{d}}=(0,-1)\). A plot of the corresponding total field is shown in Fig. 6a and plots of the near- and far-field errors are shown in Fig. 6b, c. As before we see exponential convergence with increasing p, and no significant increase in error for increasing k. Relative errors are an order of magnitude larger than in the non-grazing example, because in this case there is no geometrical optics component (\(\varPsi =0\)), so the scattered fields are purely diffractive and hence smaller in maximum magnitude.

Comparing Figs. 5d and 6c, we observe that in the far field the accuracy noticeably improves as k increases in the case of non-grazing incidence, but not in the case of grazing incidence. This is due to the fact that the far-field pattern \(u^\infty\) has peaks of magnitude proportional to \(|d_2|k\) at the values of \(\theta\) corresponding to the reflected and shadow directions (cf. Fig. 8a), associated with the contribution of the geometrical optics term \(\varPsi\).Footnote 2 Hence the denominator in the relative \(L^\infty\) error will be O(k) for non-grazing incidence, while the numerator is not expected to be comparably large because the contribution from \(\varPsi\) is computed exactly (up to quadrature error), and hence does not contribute to the discretization error.

7.2 Collocation parameters

Our choices of oversampling parameter \(C_{\mathrm {OS}}=1.25\) and SVD truncation parameter \(\epsilon =10^{-8}\) in the previous section were based on the results of extensive numerical testing. In this section we report a small subset of these results, to illustrate the effect of changing these parameters. All results in this section relate to the case \(\varGamma =(0,1)\times \{0\}\) with \({\mathbf{d}}=(1,-1)/\sqrt{2}\), as in Figs. 4 and 5.

Fig. 7
figure 7

Dependence of solution accuracy on the oversampling parameter \(C_\mathrm {OS}\) and the truncation parameter \(\epsilon\) for moderate (\(k=256\), left-hand plots) and large (\(k=65536\), right-hand plots) wavenumbers and varying polynomial degree p. The values of \(C_\mathrm {OS}\) used are \(\{1,1.05,1.1,\ldots ,2\}\), but the resulting ratio M/N between the number of collocation points and the number of unknowns is in general slightly larger than \(C_\mathrm {OS}\) (recall (19)). The data points corresponding to \(C_{\mathrm {OS}}=1.25\) are shown with large markers

In Fig. 7 we present plots showing the dependence of the \(L^1(\varGamma )\) solution error, for \(p\in \{2,\ldots ,8\}\), on the oversampling ratio M/N, which is a function of the oversampling parameter \(C_{\mathrm {OS}}\). The left-hand plots correspond to wavenumber \(k=256\) and the right-hand plots to \(k=65536\), while the top, middle and bottom plots correspond to the SVD truncation thresholds \(\epsilon =10^{-4}\), \(\epsilon =10^{-8}\) and \(\epsilon =10^{-12}\), respectively. In each case tests were run for \(C_{\mathrm {OS}}\in \{1,1.05,1.1,\ldots ,2\}\); note that for the non-integer values of \(C_{\mathrm {OS}}\) the resulting values of M/N are in general larger than \(C_{\mathrm {OS}}\) (recall (19)). As the reference solution we use the solution with \(p=12\) and \(C_{\mathrm {OS}}=2\).

In interpreting these plots we focus first on the results for \(C_{\mathrm {OS}}=1\) (\(M/N=1\)), which corresponds to a square system (no oversampling). The need for oversampling is clear from the fact that in all six plots the errors for \(C_{\mathrm {OS}}=M/N=1\) are not monotonically decreasing with increasing p. That is, increasing the dimension of the approximation space \(V_N\) does not necessarily lead to a more accurate solution. (Further results for this square case can be found in [48].)

By contrast, at least in plots (c)–(f) of Fig. 7, for all fixed values of \(C_{\mathrm {OS}}>1\) considered, the error is monotonically decreasing with increasing p. Since computational cost is proportional to M, it is desirable to take \(C_{\mathrm {OS}}\) as small as possible while maintaining this monotonicity. But for \(C_{\mathrm {OS}}\) close to 1 there are still visible instabilities. On the basis of our experiments it seems that \(C_{\mathrm {OS}}\ge 1.25\) is sufficient to ensure stability for all the problems we considered. (The data points corresponding to the choice \(C_{\mathrm {OS}}=1.25\) are shown with larger markers to make this transition more clear.) Significantly, and perhaps surprisingly, the amount of oversampling required to stabilise the method appears to be independent of the wavenumber k, as one sees, for example, by comparing Figs. 7c (moderate k) and 7d (larger k), and recalling the convergence plots in Fig. 5a, b, which reach wavenumber \(k=262,144\) with no discernable loss of stability.

To interpret the dependence of the results on the SVD truncation parameter \(\epsilon\), it is instructive to recall Lemma 1, which, while not directly providing information about the error in our boundary solution (as explained in §5), still provides some intuition. On the one hand, since the argument of the infimum on the right-hand side of (21) is a linearly increasing function of \(\epsilon\), we expect that if \(\epsilon\) is taken to be too large, the discrete residual for the SVD solution may be large, leading to a loss in accuracy of the boundary solution. This effect can be clearly observed by comparing Fig. 7c, d (\(\epsilon =10^{-8}\)), where we see clear exponential convergence as p increases, with Fig. 7a, b (\(\epsilon =10^{-4}\)), where errors appear to stagnate around \(p=5\). On the other hand, looking again at (21), we expect that if \(\epsilon\) is taken to be too small, the SVD solver may select a solution with a large coefficient norm, in which case our numerical solution may suffer from spurious oscillations between the collocation points, or other numerical instabilities that might increase the \(L^1(\varGamma )\) error. This degradation in accuracy for small \(\epsilon\) can be observed by comparing Fig. 7c, d (\(\epsilon =10^{-8}\)), with Fig. 7e, f (\(\epsilon =10^{-12}\)). Based on extensive testing we found that the choice \(\epsilon =10^{-8}\) gave satisfactory performance for all the examples we considered.

7.3 Scattering by a Cantor set

In this section we present an application of our HNA BEM to high frequency scattering by the middle-third Cantor set. The mathematical analysis of acoustic scattering by fractal screens, of which the Cantor set is an example, was developed recently in [9] and [11], with the latter providing a rigorous convergence theory for an approximation strategy based on conventional Galerkin BEM using piecewise constant basis functions, along with numerical results for low frequency scattering in both two and three dimensions by a range of fractal screens. For two-dimensional scattering by the Cantor set, the HNA BEM presented in the current paper allows us to investigate the high frequency regime, and to our knowledge this is the first time that high frequency BEM simulations of scattering by fractal screens have been presented in the literature. Fractals, which one might define loosely as “sets possessing geometric detail on every lengthscale”, arise in numerous scattering applications, including the human lung and other dendritic structures in medical science [42], snowflakes, ice crystals and other atmospheric particles in climate science [51], fractal antennas for electromagnetic wave transmission/reception [24], fractal piezoelectric ultrasound transducers [45], and fractal aperture problems in laser physics [14]; for further references see [11]. The case of high frequency scattering by fractals is particularly intriguing, because even though the diameter of the scatterer may be large compared to the wavelength, the scatterer always possesses sub-wavelength structure, so standard ray-based high frequency approximations do not apply. In his 1979 paper on random phase screens [3], Berry describes this case as “a new regime in wave physics” in which the scattered waves adopt “unfamiliar forms that should be studied in their own right”.

In practical applications and numerical simulations one generally works with “prefractal” approximations in which the fractal structure is truncated at some level. The standard prefractals for the middle-third Cantor set are defined as follows. Letting \(\varGamma ^{(0)}=(0,1)\times \{0\}\), for \(j\in \mathbb {N}\) we construct the jth prefractal \(\varGamma ^{(j)}\) by removing the (closed) middle third from each disjoint interval making up the previous prefractal \(\varGamma ^{(j-1)}\), so that e.g. \(\varGamma ^{(1)}=((0,1/3)\cup (2/3,1))\times \{0\}\). That the BIE solutions for sound-soft scattering on the prefractals converge to a well-defined non-zero limiting BIE solution on the underlying fractal was proved in [9, Example 8.2] (and see also [11, Prop. 6.1]).

A plot of the total field for scattering by \(\varGamma ^{(2)}\) with incident direction \({\mathbf{d}}=(1,-1)/\sqrt{2}\), computed using our collocation HNA BEM, was presented in Fig. 1. In Fig. 8 we present radial plots of

$$\begin{aligned} \max \{\log _{10}|u^\infty _8|,-1\}, \end{aligned}$$

for the first six prefractals \(\varGamma ^{(0)}, \ldots \varGamma ^{(5)}\), with \(k=1024\) and \({\mathbf{d}}=(1,-1)/\sqrt{2}\). The log scales are truncated at \(-1\) to leave space in the centre of the plots for the corresponding prefractals to be plotted. Corresponding \(L^1(\varGamma )\) convergence plots are presented in Fig. 9, where, for each \(j=0, \ldots , 5\) we take as reference solution the corresponding solution with \(p=10\). The fact that the absolute \(L^1(\varGamma )\) error in our boundary solution with \(p=8\) is smaller than \(10^{-1}\) for all \(j=0, \ldots , 5\) implies that our far-field plots in Fig. 8 are also accurate to \(10^{-1}\) absolute error, since

$$\begin{aligned} |u^\infty _p|\le \Vert v_p\Vert _{L^1(\varGamma )}. \end{aligned}$$
(35)

We note that at \(k=1024\) the lowest order prefractal \(\varGamma ^{(0)}\) is approximately 160 wavelengths long, and each component of the highest order prefractal \(\varGamma ^{(5)}\) is approximately one wavelength long. While the far-field patterns for \(\varGamma ^{(4)}\) and \(\varGamma ^{(5)}\) are similar, they are clearly not identical, which is to be expected since in refining \(\varGamma ^{(4)}\) to \(\varGamma ^{(5)}\) we are making wavelength-scale changes to the scatterer geometry. Hence for \(\varGamma ^{(5)}\) we are still in the “pre-asymptotic” regime with regard to convergence to the solution for scattering by the limiting fractal screen. (The asymptotic regime is considered in [11] using a conventional BEM approach.) We emphasize that our HNA method can handle much larger wavenumbers than \(k=1024\), but at very large wavenumbers the far-field plots are so oscillatory they are difficult to interpret, so we do not present them here.

Fig. 8
figure 8

Radial plots of \(\log _{10}|u^\infty _8(\theta )|\) for \(k=1024\) and prefractal levels \(j=0,\ldots ,5\), for scattering by the middle-third Cantor set

Fig. 9
figure 9

Convergence of the boundary solution with increasing p for \(k=1024\) and prefractal levels \(j=0,\ldots ,5\), for scattering by the middle-third Cantor set