Fast hybrid numerical-asymptotic boundary element methods for high frequency screen and aperture problems based on least-squares collocation

We present a hybrid numerical-asymptotic (HNA) boundary element method (BEM) for high frequency scattering by two-dimensional screens and apertures, whose computational cost to achieve any prescribed accuracy remains bounded with increasing frequency. Our method is a collocation implementation of the high order hp HNA approximation space of Hewett et al. (IMA J Numer Anal 35:1698–1728, 2015), where a Galerkin implementation was studied. An advantage of the current collocation scheme is that the one-dimensional highly oscillatory singular integrals appearing in the BEM matrix entries are significantly easier to evaluate than the two-dimensional integrals appearing in the Galerkin case, which leads to much faster computation times. Here we compute the required integrals at frequency-independent cost using the numerical method of steepest descent, which involves complex contour deformation. The change from Galerkin to collocation is nontrivial because naive collocation implementations based on square linear systems suffer from severe numerical instabilities associated with the numerical redundancy of the HNA basis, which produces highly ill-conditioned BEM matrices. In this paper we show how these instabilities can be removed by oversampling, and solving the resulting overdetermined collocation system in a weighted least-squares sense using a truncated singular value decomposition. On the basis of our numerical experiments, the amount of oversampling required to stabilise the method is modest (around 25% typically suffices), and independent of frequency. As an application of our method we present numerical results for high frequency scattering by prefractal approximations to the middle-third Cantor set.


Introduction
The numerical simulation of high frequency (short wavelength) acoustic and electromagnetic scattering is challenging because of the need to capture the rapid oscillations in the wave field.Conventional Boundary Element Methods (BEMs), based on piecewisepolynomial basis functions, are computationally expensive because they require a fixed number of degrees of freedom (DOFs) per wavelength to achieve accurate solutions.This leads to very large (dense) BEM matrices which are costly to store and invert.By contrast, hybrid numerical-asymptotic (HNA) BEMs use basis functions built from piecewise polynomials on coarse meshes multiplied by certain oscillatory functions, chosen based on partial knowledge of the high frequency asymptotic solution behaviour, as described by Geometrical Optics (GO) and the Geometrical Theory of Diffraction (GTD) [4,41,43].The goal of the HNA approach (reviewed in [7]) is to achieve a fixed accuracy of approximation using a number of DOFs that is relatively small and frequency-independent, or only modestly (e.g.logarithmically) frequency-dependent, making it easier to store and invert the BEM matrix at high frequencies.HNA BEMs have been successfully developed for scattering by impenetrable convex [12,13,33], nonconvex [10,31] and curvilinear [44] polygons, two-dimensional screens and apertures [32], smooth convex two-dimensional [2,5,[19][20][21] and three-dimensional [23] scatterers, three-dimensional rectangular screens [30], penetrable convex polygons [27,28] and, recently, for certain multi-obstacle scattering problems [6].
The use of oscillatory bases in HNA methods leads to an essentially frequency-independent BEM system size.However, the BEM matrix entries now involve highly oscillatory singular integrals, which need to be evaluated efficiently if one is to achieve the ''holy grail'' of frequency-independent computational cost.The majority of HNA methods in the literature to date (e.g.[10, 12, 13, 19-21, 23, 28, 30-33, 44]) are based on Galerkin discretisations.This leads to provably stable approximations and, in many cases (e.g.[10,12,20,21,[31][32][33]), provides the framework for a rigorous, frequency-explicit convergence analysis.But Galerkin testing produces high-dimensional oscillatory integrals, which complicates the development of fast quadrature routines.
In order to develop HNA methods for more practically relevant problems than those tackled so far (in particular, three-dimensional problems), it would be highly advantageous to be able to use a collocation or Nystro ¨m approach, for which numerical quadrature is simpler.Existing work in this direction includes for example the piecewise-constant collocation method for convex polygons in [1], the B-spline method for two-dimensional smooth convex scatterers in [25] (which is based on the earlier technical report [26]) and the closely related Nystro ¨m method in [5].However, these approaches are not generally supported by rigorous stability and convergence analysis, a practical consequence of which being that it is not at all obvious how to determine collocation/quadrature point distributions leading to stable approximations.This question is particularly delicate when working with non-smooth scatterers and HNA approximation spaces built with overlapping meshes to represent different high frequency solution components (as in [1]), which can exhibit a high degree of numerical ''redundancy'' and hence severe ill-conditioning (see e.g. the discussion in [1, §2] and §5 below).
Our aim in this paper is to demonstrate that accurate and efficient collocation-based HNA methods can indeed be developed for high frequency scattering problems involving non-smooth scatterers.The key new idea, apparently not employed previously in HNA methods, is to stabilise the method using oversampling, and solve the resulting overdetermined collocation linear system in a weighted least-squares sense.
We present our oversampled collocation approach in the context of a specific model problem, namely high frequency two-dimensional scattering by colinear screens and apertures.For an illustration of the problem see Fig. 1.Such problems have numerous applications in acoustics, electromagnetics and water waves -for details we refer to the extensive reference list in [32], noting also the recent work on iterative Wiener-Hopf approaches to this problem in [49].Our collocation HNA BEM for the screen problem uses the same high order HNA approximation space as the Galerkin method presented in [32], which is based on an hp refinement strategy, giving exponential accuracy with increasing polynomial degree (see Theorem 3 below).To stabilise our collocation method we take more collocation points than degrees of freedom, and solve the resulting rectangular (overdetermined) system in a weighted least-squares sense, using a truncated singular value decomposition.The one-dimensional oscillatory singular integrals appearing in the BEM matrix are evaluated accurately and efficiently using the numerical method of steepest descent [16,17,39], which is based on complex contour deformation, combined with generalised Gaussian quadrature [37], to handle the singularities without the need for mesh grading.By a series of numerical experiments we demonstrate that our HNA collocation BEM can achieve a fixed solution accuracy in a computation time that is bounded (in fact, sometimes even decreasing) as the frequency tends to infinity.A key empirical observation (that we are currently unable to explain theoretically) is that the oversampling threshold, which governs the number of collocation points per DOF, does not need to increase with increasing frequency in order to maintain accuracy.
This paper has as its genesis the MSc dissertation [48] of the fourth author, in which a range of HNA approximation spaces and collocation point distribution strategies were compared for square collocation BEM matrices.Numerical instabilities encountered in [48] motivated the investigation of the oversampled collocation method presented here.The oscillatory integration in [48] was carried out using Filon quadrature (see e.g.[16,38,40]), combined with mesh grading to handle singularities (as in [18]).
We point out that our oversampled collocation approach differs slightly from that of the CHIEF method of [50].The CHIEF method was developed for scattering by closed surfaces, and introduces extra collocation points in the interior of the scatterer to stabilise integral equation formulations that are ill-posed at certain values of k, corresponding to interior resonances.In the current paper we are introducing extra collocation points on the scatterer boundary, to stabilise an integral equation formulation for scattering by an open surface (in particular, the scatterer has no interior), which is well-posed for all k but suffers from ill-conditioning when approximated numerically.
The structure of the paper is as follows.In Sect.§2 we define the scattering problem and its boundary integral equation reformulation.In §3 we describe the HNA approximation space of [32], and recall the corresponding best approximation error estimate.In §4 we present our regularised collocation approach, and in §5 we discuss the philosophy behind the truncated SVD approach used to solve the least-squares problem.In §6 we outline our numerical quadrature scheme for evaluating the oscillatory and singular integrals appearing in the BEM system.In §7 we present a series of numerical experiments demonstrating the performance of our method, including an application to scattering by the middle-third Cantor set.

Problem statement
We consider 2D time-harmonic acoustic scattering by a sound-soft (Dirichlet) screen We assume that lengths have been nondimensionalised so that and on C we define the normal vector n ¼ ð0; 1Þ.
As the incident wave we take u i ðxÞ :¼ e ikxÁd , where k [ 0 is the nondimensional wavenumber and d ¼ ðd 1 ; d 2 Þ 2 R 2 is a unit direction vector.The boundary value problem (BVP) to be solved is: with the scattered field u s :¼ u À u i satisfying the Sommerfeld radiation condition (see, e.g., [7, Equation (2.9)]).Here W 1 loc ðDÞ is the usual space of locally square-integrable functions whose weak gradient is also locally square-integrable.Well-posedness of the BVP (2)-( 3) is proved, for example, in [52].For an example solution see Fig. 1.
We note that (7) is well-posed for all k [ 0, with no spurious frequencies.Indeed, the operator S is coercive (strongly elliptic) on e H À1=2 ðCÞ [8].
3 Hybrid numerical-asymptotic approximation space Our hp HNA BEM approximation space for solving the BIE ( 7) is identical to that considered in [32].Its design is motivated by the following regularity result.
Theorem 2 [32, Theorem 4.1, Lemma 4.5] The solution v ¼ ½ou=on of the BIE (7) can be decomposed on each screen component C j , j ¼ 1; . ..; N C , as where W :¼ À2 sgn ðd 2 Þou i =on, and the functions v AE j ðsÞ are analytic in Rs [ , and, for each k 0 [ 0, there exists a constant C [ 0, depending only on k 0 , such that The known term W constitutes the ''physical optics'' approximation, describing the contribution to v ¼ ½ou=on of the incident and specularly reflected waves.The bounds (9) imply that the unknown functions v AE j , which correspond to the amplitudes of the diffracted waves, are non-oscillatory as a function of s (see [32,Remark 4.2]), which means they can be approximated much more efficiently than the full (oscillatory) solution v. Hence, rather than approximating v itself using piecewise polynomials on a fine (k-dependent) mesh (as in conventional BEMs), our HNA approximation space uses the decomposition (8), with W evaluated analytically and v þ j and v À j replaced by piecewise polynomials on coarse (kindependent) meshes, graded to account for the singularities of v þ j at s 2jÀ1 , and of v À j at s 2j .To describe the meshes we use, we first define a geometrically graded mesh M on the interval [0, 1] with l elements and grading parameter r 2 ð0; 1=2Þ, and an associated space of piecewise polynomials P p ðMÞ, by where the degree vector p 2 N l 0 is defined for a given maximum degree p [ 0 by Then for each screen segment C j of length L j :¼ s 2j À s 2jÀ1 , we define the meshes and denote by P p ðM AE j Þ the associated spaces of piecewise polynomials, defined analogously to P p ðMÞ above.For an illustration of the resulting meshes see Fig. 2. Finally, our HNA approximation space for approximating the difference v À W is We then have the following best approximation error estimate.
Theorem 3 states that V N can approximate v À W with an error which decreases exponentially as the maximum polynomial degree p tends to infinity, with a k-independent exponent.While the constant premultiplying the exponential factor grows with increasing k, it does so only modestly (like Oðk 1þd )), meaning that we can achieve any specified accuracy of approximation with p growing only logarithmically in k as k ! 1.This corresponds to the number of degrees of freedom N growing like log 2 k as k ! 1.We shall see later that this logarithmic growth appears unnecessary in practice.

Collocation method
The difference Let fu n g N n¼1 denote a basis for V N .In our implementation, each basis function u n consists of an exponential e iks or e Àiks multiplied by an appropriately scaled and translated Legendre polynomial supported on a single element of one of the meshes M þ j or M À j for some j 2 f1; . ..; N C g, normalised so that ku n k L 2 ðCÞ ¼ 1.
To select an element P N n¼1 a n u n 2 V N approximating / 2 HÀ1=2 ðCÞ we ask that the BIE (13), with / replaced by P N n¼1 a n u n 2 V N , should hold at a set of M collocation points As we shall see, for reasons of stability it is useful to oversample, taking M [ N. The resulting system of linear equations for the coefficients a n , n ¼ 1; . ..; N, is then overdetermined, so we seek a weighted least-squares solution.Explicitly, given a set of weights W :¼ fw m g M m¼1 corresponding to the collocation points C M (our choice of weights will be detailed below), we define Fig. 2 Illustration of the overlapping graded meshes used to approximate the amplitudes v AE j in (8), in the case where C comprises two components, C 1 and and then seek a ¼ ða 1 ; . ..; This problem is highly ill-conditioned, but can be regularised using a truncated singular value decomposition (SVD), as in e.g.[22,29,46].The full SVD of A takes the form where U and V are unitary matrices, V Ã denotes the conjugate transpose of V, and R is an M Â N diagonal matrix.Denote by r n the nth entry of the diagonal of R. To regularise, we introduce a small threshold [ 0, and define a modified diagonal matrix R , which has nth entry equal to r n if r n = max n 0 ¼1;...;N r n 0 [ , and zero otherwise.This provides a regularisation from which we can define a via a pseudo-inverse as where ðR Þ y is the (diagonal) pseudo-inverse of R with entries r y n , defined such that r y n ¼ 1=r n if r n = max n 0 ¼1;...;N r n 0 [ , and r y n ¼ 0 otherwise.Regarding the placement of collocation points, while we have no theoretical results to guide us, as a result of extensive numerical experiments (including those carried out for square systems (M ¼ N) in [48]), we arrived at the following prescription.
We first select an oversampling threshold C OS [ 1.For each j 2 f1; . ..; N C g let s 2jÀ1 ¼ x AE j;0 \x AE j;1 \ Á Á Á \x AE j;l ¼ s 2j denote the points of the overlapping meshes M AE j on C j (cf.Fig. 2).On each element ½x AE j;iÀ1 ; x AE j;i we allocate M AE j;i :¼ dC OS ðp AE j;i þ 1Þe collocation points, where p AE j;i is the largest polynomial degree included on V N on that element.For the elements ½x þ j;0 ; x þ j;1 , ...½x þ j;lÀ2 ; x þ j;lÀ1 and ½x À j;1 ; x À j;2 , ...½x À j;lÀ1 ; x À j;l we place the collocation points at the first kind Chebyshev nodes within that element.If we were to follow this same prescription for the largest elements, ½x þ j;lÀ1 ; x þ j;l and ½x À j;0 ; x À j;1 , the overlapping nature of the meshes would mean that collocation points could coincide with, or lie very close to other collocation points already selected on the smaller elements.To avoid this, for these largest elements we allocate 1Þe collocation points at the first kind Chebyshev nodes in the interval ½x þ j;lÀ1 ; x À j;1 (the intersection of the two largest elements; recall that 0\r\1=2 so this intersection is non-trivial).The total number of collocation points is then and our prescription ensures that In (14) we choose weights fw m g M m¼1 that are inversely proportional to the approximate local density of collocation points.Explicitly, suppose that [a, b] is a mesh element on which we have allocated M ½a;b collocation points at the first-kind Chebyshev nodes Then to the points fc ' g we assign the weights fw ' g defined by We motivate this choice of weights in §5, just after (22).

Discussion about the SVD solver
In this section we elaborate on the motivation for considering oversampling in combination with a truncated SVD in the solution method (17).It is known from the earlier analysis in [32] -recall Theorem 3 in this paper -that the best approximation in the HNA space V N converges to v À W at an exponential rate.Briefly, oversampling and regularization ensure that a near-best approximation can also be computed numerically, in spite of potential illconditioning of the linear system (14).
One obvious source of ill-conditioning of the discretization is that the basis functions u N of the approximation space (11) may be close to being linearly dependent.The underlying reason is that, loosely speaking, in our discretization on each segment C j we are combining two bases together.The impact of this potential redundancy depends on the values of the wavenumber k and the degree vector p.Consider, for example, the case of small k: then the functions in the space P p ðM þ j Þe iks are smooth and non-oscillatory on C j , and they may approximate the functions in the space P p ðM À j Þe Àiks .In the case of fixed k and a degree vector p associated with a large maximum degree p in (10), we can make a similar observation: the large degree polynomials in P p ðM AE j Þ may resolve the oscillations of e AEiks .
For linear systems with a numerical null-space, the truncated SVD solution satisfies a specific algebraic minimization property: Lemma 1 [15, Lemma 3.1] Let a be computed by the regularized pseudo-inverse (17) with relative threshold [ 0. Then Lemma 1 shows that the regularized pseudo-inverse (17) yields a solution with small residual, provided that a coefficient vector v exists with small residual and sufficiently small norm kvk 2 .Note that the size of the coefficients is affected by the normalization of the basis functions, and recall from §3 that for our problem we have normalized the basis functions in L 2 ðCÞ.For us, the existence of suitable vectors that result in a small right-hand side in ( 21) is suggested by Theorem 2 and Theorem 3. Thus, ill-conditioning of A due to redundancy in the discretization does not preclude the computation of highly accurate solutions.However, the statement of Lemma 1 is purely algebraic, and a small discrete Euclidean residual of ( 14) does not imply a small continuous residual of the integral equation ( 13) in H 1=2 ðCÞ, from which we could infer (by the bounded invertibility of S : HÀ1=2 ðCÞ !H 1=2 ðCÞ) a small approximation error of the continuous solution of (13) in HÀ1=2 ðCÞ.
In the simpler L 2 setting one can appeal to the results on function approximations using redundant sets and a regularizing SVD solver with threshold presented in [35,36].No guarantees can be given about the accuracy of interpolation, with errors possibly as large as 1= [35,Proposition 4.6].However, oversampling by a 'sufficient' amount renders the approximation problem well-conditioned [36, § 6].In particular, if functions in a space G are sampled using a family of functionals f' M;m g M m¼1 , with ' M;m : G ! C, then ideally the samples should be sufficiently 'rich' to recover any function in G: The choice to weight matrix entries with ffiffiffi ffi w p m in ( 14), and the specific choice (20) of the weights, yields A 0 ¼ B 0 ¼ 1 for G & L 2 ðCÞ, because the discrete sums are a Riemann sum for the L 2 norm.
Generalising these results to the HÀ1=2 À H 1=2 setting required for the rigorous analysis of our collocation BEM remains an open problem.Therefore, aside from the motivating observations presented in this section, our choice of collocation points, and of a linear oversampling rate M % C OS N with a proportionality constant C OS that is independent of the wavenumber, is based largely on numerical experiments, results of which we report in §7 (and see also [48], where a number of different collocation point allocations were compared for square systems without oversampling).

Oscillatory quadrature
Our HNA approximation space uses oscillatory basis functions supported on large (kindependent) mesh elements.This means that assembling the discrete system (14) involves calculating highly oscillatory integrals, which may also be singular.By applying linear changes of variable and possibly splitting the integration range, the integrals arising in ( 14) can all be written in terms of integrals of the general form where k [ 0 is the wavenumber, T [ 0, a 2 ½0; 2, and FðÁ; kÞ is smooth and non-oscillatory on (0, T) but possibly logarithmically singular at, or near, the left endpoint t ¼ 0.More explicitly, we shall assume that, for some t 0 2 ðÀ1; 0, FðÁ; kÞ is analytic in the cutplane C n ðÀ1; t 0 , with at most polynomial growth as jtj ! 1, and that there exists c [ 0 and two functions F 0 ðt; kÞ and F 1 ðt; kÞ, analytic in kjt À t 0 j\ c, such that Fðt; kÞ ¼ F 0 ðt; kÞ þ F 1 ðt; kÞ logðkjt À t 0 jÞ; for kjt À t 0 j\ c: We explain the transformation of the integrals in (14) to the form (23) in §6.1.Then in §6.2 we outline our method for efficiently calculating integrals of the form (23).

Transformation to the general form (23)
The left-hand side of the system ( 14) consists of integrals of the form 0 ðkjc m À sjÞL q;½a;b ðsÞe AEiks ds; where u n ¼ L q;½a;b ðsÞe AEiks is the nth basis function, L q;½a;b is the L 2 -normalised Legendre polynomial of some degree q on ½a; b ¼ supp u n , and c m is the mth collocation point.To obtain a non-oscillatory prefactor we have extracted the phase of the Hankel function in (25) using [47, (10.2.5)].To express (25) in the form (23) we then proceed by cases.In each case below, the fact that the resulting prefactor satisfies (24) follows from the small argument behaviour of the Hankel function (see e.g.[47, (10.4.3),(10.2.2),(10.8.2)]). - which has a logarithmic singularity at t 0 ¼ c m À a 0. -Case B: If c m !b then the reflection t7 !À t puts us in Case A.
-Case C: If a\c m \b then splitting the integral as R b cm produces two integrals satisfying the conditions of Cases B and A respectively.
The right-hand side of (15) involves the evaluation of Each integral in the sum ( 26) can be expressed in the form ( 23) by essentially the same procedure described above.For example, when c m s 2jÀ1 we can adapt the approach of Case A to write R s 2j s2jÀ1 in the form (23) with , which has a logarithmic singularity at t 0 ¼ c m À s 2jÀ1 0.

Fast evaluation of (23) using numerical steepest descent with generalised Gaussian quadrature
Depending on the values of the parameters k; T; a and t 0 , the integrand (23) may be highly oscillatory and/or numerically singular.In this section we describe an algorithm which can compute (23) efficiently in all cases.Our approach combines the method of numerical steepest descent (see e.g.[16,17,39]), which uses complex contour deformation to convert oscillatory integrals into rapidly decaying ones, with generalised Gaussian quadrature (see e.g.[34,37]), which handles the singularities.Our method is an extension of the scheme presented in [34, §4.3], which was shown to accurately calculate singular oscillatory integrals, to the case of near-singularities, where the integrand has a singularity outside of, but close to the integration range.
Our algorithm involves two parameters c osc [ 0 and 0\c sing 1, which are used to classify the integral (23) into one of four cases (described below).The parameter c osc represents the minimum number of oscillations in the exponential factor e ikat there need to be over the interval [0, T] for us to consider the integral oscillatory (and apply numerical steepest descent).That is, we consider the integral oscillatory when kaT=ð2pÞ [ c osc and non-oscillatory otherwise.The parameter c sing controls how close to the integration range the singularity at t 0 needs to be for us to consider the integral singular (and apply generalised Gaussian quadrature).An important factor affecting the accuracy of numerical quadrature based on polynomial approximation for non-entire functions is the distance from the integration interval to the nearest singularity, measured relative to the length of the integration interval (e.g.[53,Theorem 19.3]).So, in the non-oscillatory case we consider the integral singular when jt 0 j=T\c sing .But when the integral is oscillatory this is an unnecessarily stringent condition, because as ka ! 1 with T fixed the main contribution to the integral comes from small neighbourhoods of the endpoints t ¼ 0 and t ¼ T of size approximately Oð1=ðkaÞÞ (e.g.[16, (2.1)]).Hence, in the oscillatory case we consider the integral singular when jt 0 jka=ð2pc osc Þ\c sing .The inclusion of c osc in the denominator here ensures our classifications are compatible, in the sense that in the borderline case kaT=ð2pÞ ¼ c osc , the transition between singular and non-singular cases occurs at jt 0 j=T ¼ c sing for both the oscillatory and non-oscillatory cases.
Our algorithm also depends on one further parameter, N Q 2 N, which controls the number of quadrature points used.Explicitly, we use N Q points for each standard Gaussian quadrature rule and 2N Q points for each generalised Gaussian quadrature rule (since generalised Gaussian quadrature requires twice as many points as standard Gaussian quadrature to achieve the same degree of exactness, see e.g.[37]).
We now present our algorithm, along with diagrams illustrating the contour deformations involved.Here the arrows indicate the direction along which each contour should be traversed, and the dashed line represents the original integration contour [0, T].
-Case 1: kaT=ð2pÞ c osc and jt 0 j=T !c sing (non-oscillatory, non-singular).We evaluate (23) using standard Gauss-Legendre quadrature with N Q points.-Case 2: kaT=ð2pÞ c osc and jt 0 j=T\c sing (non-oscillatory, singular).We write (23) as the difference of two singular integrals to which we apply generalised Gauss-Legendre quadrature with 2N Q points for each integral.(When t 0 ¼ 0 the first integral is not present.) -Case 3: kaT=ð2pÞ [ c osc and jt 0 jka=ð2pc osc Þ !c sing (oscillatory, non-singular).We deform the integration contour into the complex plane, writing T + i∞ i∞ justified by our assumption that ka !0 and that FðÁ; kÞ grows only polynomially at infinity, meaning that the integrand of ( 23) decays exponentially as Im t ! 1. Parametrizing the integrals by t ¼ ikan (respectively we evaluate both semi-infinite integrals using standard Gauss-Laguerre quadrature with N Q points for each integral.-Case 4: kaT=ð2pÞ [ c osc and jt 0 jka=ð2pc osc Þ\c sing (oscillatory, singular).We write the integral as evaluating the first integral using generalised Gauss-Legendre quadrature with 2N Q points, the second using generalised Gauss-Laguerre quadrature 1 with 2N Q points, and the third using standard Gauss-Laguerre quadrature with N Q points.(Again, when t 0 ¼ 0 the first integral is not present.)Here our assumption that c sing 1 ensures that our treatment of the third integral R Tþi1 T as non-singular (being evaluated by standard, rather than generalised Gauss-Laguerre quadrature) is consistent with our classification of the two integrals in Case 3 as non-singular, in the sense that if kaT=ð2pÞ [ c osc then The total number of quadrature points (a proxy for computational cost) is N Q in Case 1, 4N Q in Case 2, 2N Q in Case 3, and 5N Q in Case 4 (excluding special cases where t 0 ¼ 0).Therefore, to minimise computational cost, we should take c osc as large as possible, and c sing as small as possible, so as to be in the cheapest Case 1 (non-oscillatory and nonsingular) as often as possible.But obviously one cannot take c osc too large, or c sing too 1 In related literature it is common for generalised Gauss-Laguerre to refer to a quadrature rule for integrals of the form R 1 0 x Àm f ðxÞe Àx dx, for m [ À 1 where f is well-approximated by a polynomial, but we do not use this definition here.In this paper, we use generalised Gauss-Laguerre to mean a quadrature rule which can efficiently evaluate R 1 0 ½logðxÞf ðxÞ þ gðxÞe Àx dx, where f and g are well-approximated by polynomials.
small, else oscillations (respectively, singularities) will not be adequately resolved.We investigate the choice of these parameters in more detail in the next section.

Choice of parameters
To validate the quadrature scheme described in §6.2 and tune the parameters c osc , c sing and N Q we carried out a detailed set of experiments for the model integral which is of the form ( 23) with a ¼ T ¼ 1 and Fðt; kÞ ¼ H ð1Þ 0 ðkðt À t 0 ÞÞe Àikt .For all possible combinations of c osc 2 f1; . ..; 5g, c sing 2 f0; 0:1; 0:2; . ..; 1g and N Q 2 f5; 10; 15; 20g, we computed the integral (28) using our algorithm at a large number of values of the parameters k and t 0 in the range k 2 ð0; 100, t 0 2 ½À1; 0. For each set of parameters we measured the relative error compared to a reference value for (28), computed using a composite Gauss rule with a large number of quadrature points and mesh grading to handle (near) singularities.Based on numerical experiments we believe this reference solution to be accurate to at least 14 digits.
Based on our experiments, the choices were found to produce a scheme which agrees with the reference solution to at least 12 digits, uniformly over the range of k and t 0 tested, and these are the values we use in all our numerical results in §7.As evidence of this claim we show in Fig. 3a a plot of the corresponding relative error over the studied range of k and t 0 .The ðk; t 0 Þ plane is divided into the four regions in which Cases 1,2,3 and 4 apply.The maximum relative error over the whole plot is 1:6 Â 10 À13 .For comparison we show in Fig. 3b the corresponding plot for the choices c osc ¼ 4, c sing ¼ 0:1 and N Q ¼ 15.The maximum relative error over the whole plot is now 7:5 Â 10 À8 , and one can clearly see the accuracy degrading as one moves to the right in region 1 (increasing oscillation) and downwards in regions 1 and 3 (approaching singularity).
As we shall show in §7, the quadrature scheme in §6.2 with the parameter choices ( 29) is sufficiently efficient to achieve our goal of frequency independent computational cost for the HNA collocation BEM.However, we do not claim that the quadrature scheme is completely optimal.In particular, we note that the error in numerical steepest descent (without singularities) behaves like OððkaÞ À2NQÀ1 Þ as ka ! 1 (see e.g.[16, p90]), so in Cases 3 and 4 savings could be made by reducing N Q as ka grows.But we reserve such further optimisation for future work.

Numerical results
In this section we demonstrate the computational efficiency of our HNA collocation BEM via a series of numerical examples.We also include the results of experiments exploring the influence of the parameters C OS and in the truncated SVD solver.Finally we present an application of our method to the computation of high frequency scattering by high order prefractals of the middle-third Cantor set.The code used to produce the results in this section forms part of a larger Matlab-based HNA BEM software repository, which is downloadable github.com/AndrewGibbs/HNABEMLAB.
Throughout this section we denote by v p :¼ / p þ W our HNA BEM approximation to the solution v of the original BIE (7), where / p denotes our approximation of the solution / of the BIE (13), the subscript p indicating the maximum polynomial degree used in our HNA approximation space V N , and W denotes the geometrical optics contribution.We denote by v AE j;p the corresponding numerical approximations to the amplitudes v AE j of the oscillatory functions e AEiks in the decomposition (8).From the boundary solution we obtain an approximation u s p :¼ ÀSv p to the scattered field u s in D (cf. ( 6)), and an approximation to the far-field pattern which describes the far-field behaviour of u s via u s ðxÞ $ u 1 ðhÞ e iðkrþp=4Þ 2 ffiffiffiffiffiffiffiffiffi ffi 2pkr p ; for x ¼ rðcos h; sin hÞ; as kxk 2 ¼: r !1: Unless otherwise stated, in our numerical results we use the following parameters: HNA space parameters ( §3): We use the mesh grading parameter r ¼ 0:15 and the number of layers ' ¼ 2ðp þ 1Þ for each graded mesh, as in related studies (e.g.[6,32]).Quadrature parameters ( §6): We take c osc ¼ 2, c sing ¼ 0:5 and N Q ¼ 20, as discussed in §6.3 (see ( 29)).

High frequency performance
To illustrate the high frequency performance of our HNA method we consider first the case where the screen consists of a single unit interval C ¼ ð0; 1Þ Â f0g.Examples with multiple screens will be considered in §7.3.As the incident wave direction we take d ¼ ð1; À1Þ= ffiffi ffi 2 p . Figure 4a shows the resulting total field for wavenumber k ¼ 128, and Fig. 4b, c plot the boundary solution v 8 , along with the magnitudes jv AE 1;8 j of the nonoscillatory amplitudes of the oscillatory factors e AEiks , for both k ¼ 128 and k ¼ 512.
In Fig. 5a we show the relative L 1 ðCÞ error Rel.L 1 err.on C :¼ in the BEM solution for a range of values of p and k, using p ¼ 12 as reference solution.For easier comparison with the best approximation (Theorem 3) it would be preferable to compute HÀ1=2 ðCÞ norm (which is possible via an appropriate single layer boundary integral representation, see e.g.[11, §7]), but here we choose the L 1 ðCÞ norm because it is easier to evaluate, and also because convergence of the boundary solution in L 1 ðCÞ implies, via a straightforward integral estimation, the convergence in supremum norm of the domain solution (on compact subsets of D) and the far-field pattern (cf.(35) below), which we study in Fig. 5c, d.We compute our L 1 ðCÞ norms by composite Gaussian quadrature with 20 Gauss points per element on a graded mesh, built by intersecting the overlapping meshes used to approximate v 12 , with additional subdivision of the larger elements to ensure that no element is longer than one wavelength 2p=k.Figure 5a demonstrates that our approximation of the boundary solution converges exponentially (in L 1 ðCÞ) as we increase the polynomial degree p.Moreover, for fixed p the L 1 error is not only bounded but actually decreases with increasing k.
In Fig. 5b we present the same results, plotted against the fourth root of the total CPU time for the BEM assembly and solve.The rationale behind this comparison is as follows.Our approximation space V N for this problem has dimension N ¼ Oðp 2 Þ (maximal polynomial degree p over two overlapping meshes of 2ðp þ elements each).And with a fixed parameter C OS ¼ 1:25, the number of collocation points M % C OS N also scales like Oðp 2 Þ.Hence the number of elements in the BEM matrix scales like MN ¼ Oðp 4 Þ with increasing p.The linear systems we use are moderate in size (typically N 1000) and are rapidly solved using our truncated SVD approach.As a result, the majority of the CPU time is spent assembling the discrete system, which means that total CPU time scales like Oðp 4 Þ.Therefore, to scrutinize the performance of our numerical quadrature scheme for evaluating the BEM matrix entries by a direct comparison with Fig. 5a, we should plot the boundary error against the fourth root of the CPU time.The results in Fig. 5b demonstrate that our quadrature scheme (based on numerical steepest descent) is both accurate and efficient, and results in a fully discrete BEM for which the error at a fixed computational cost is bounded (in fact, decreases) with increasing k.The timings in Fig. 5b are for a desktop PC with an Intel i7-4790 3.60GHz 4-core CPU, with matrix assembly done in parallel using a Matlab parfor loop.They show that highly accurate solutions can be obtained for essentially arbitrarily high frequencies in just a few seconds on a standard machine.
In Fig. 5c, d where O is the circle of radius one centred at (0.5, 0), and far-field errors Rel.L 1 err. in far-field :¼ ku 1 p À u 1 12 k L 1 ð0;2pÞ ku 1  12 k L 1 ð0;2pÞ : As for the boundary solution, we observe exponential convergence as p increases with k fixed, and bounded errors as k increases for p fixed.In Fig. 6 we show analogous results for the case of grazing incidence, with d ¼ ð0; À1Þ.A plot of the corresponding total field is shown in Fig. 6a and plots of the near-and farfield errors are shown in Fig. 6b, c.As before we see exponential convergence with increasing p, and no significant increase in error for increasing k.Relative errors are an order of magnitude larger than in the non-grazing example, because in this case there is no geometrical optics component (W ¼ 0), so the scattered fields are purely diffractive and hence smaller in maximum magnitude.
Comparing Figs.5d and 6c, we observe that in the far field the accuracy noticeably improves as k increases in the case of non-grazing incidence, but not in the case of grazing incidence.This is due to the fact that the far-field pattern u 1 has peaks of magnitude proportional to jd 2 jk at the values of h corresponding to the reflected and shadow directions (cf.Fig. 8a), associated with the contribution of the geometrical optics term W. 2 Hence the denominator in the relative L 1 error will be O(k) for non-grazing incidence, while the numerator is not expected to be comparably large because the contribution from W is 2 This contribution is found by substituting WðxðsÞÞ ¼ À2ikjd 2 je ikd1s for v(s) in (31).For example, for e C ¼ ð0; 1Þ we compute 2ikjd 2 j Z 1 0 e iksðd1Àcos hÞ ds ¼ 2ikjd 2 j e ikðd1Àcos hÞ À 1 ikðd 1 À cos hÞ ; which takes the value 2ikjd 2 j when cos h ¼ d 1 .
computed exactly (up to quadrature error), and hence does not contribute to the discretization error.

Collocation parameters
Our choices oversampling parameter C OS ¼ 1:25 and SVD parameter ¼ 10 À8 in the previous section were based on the results of extensive numerical testing.In this section we report a small subset of these results, to illustrate the effect of changing these parameters.All results in this section relate to the case C ¼ ð0; 1Þ Â f0g with d ¼ ð1; À1Þ= ffiffi ffi 2 p , as in Figs. 4 and 5.In Fig. 7 we present plots showing the dependence of the L 1 ðCÞ solution error, for p 2 f2; . ..; 8g, on the oversampling ratio M/N, which is a function of the oversampling parameter C OS .The left-hand plots correspond to wavenumber k ¼ 256 and the right-hand plots to k ¼ 65536, the top, middle and bottom plots correspond to the SVD truncation thresholds ¼ 10 À4 , ¼ 10 À8 and ¼ 10 À12 , respectively.In each case tests were run for C OS 2 f1; 1:05; 1:1; . ..; 2g; note that for the non-integer values of C OS the resulting values of M/N are in general larger than C OS (recall (19)).As the reference solution we use the solution with p ¼ 12 and C OS ¼ 2.
In interpreting these plots we focus first on the results for C OS ¼ 1 (M=N ¼ 1), which corresponds to a square system (no oversampling).The need for oversampling is clear from the fact that in all six plots the errors for C OS ¼ M=N ¼ 1 are not monotonically decreasing with increasing p.That is, increasing the dimension of the approximation space  By contrast, at least in plots (c)-(f) of Fig. 7, for all fixed values of C OS [ 1 considered, the error is monotonically decreasing with increasing p.Since computational cost portional to M, it is desirable to take C OS as small as possible while maintaining this monotonicity.But for C OS close to 1 there are still visible instabilities.On the basis of our experiments it seems that C OS !1:25 is sufficient to ensure stability for all the problems we considered.(The data points corresponding to the choice C OS ¼ 1:25 are shown with larger markers to make this transition more clear.)Significantly, and perhaps surprisingly, the amount of oversampling required to stabilise the method appears to be independent of the wavenumber k, as one sees, for example, by comparing Figs.7c (moderate k) and 7d (larger k), and recalling the convergence plots in Fig. 5a, b, which reach wavenumber k ¼ 262; 144 with no discernable loss of stability.
To interpret the dependence of the results on the SVD truncation parameter , it is instructive to recall Lemma 1, which, while not directly providing information about the error in our boundary solution (as explained in §5), still provides some intuition.On the one hand, since the argument of the infimum on the right-hand side of ( 21) is a linearly increasing function of , we expect that if is taken to be too large, the discrete residual for the SVD solution may be large, leading to a loss in accuracy of the boundary solution.This effect can be clearly observed by comparing Fig. 7c, d ( ¼ 10 À8 ), where we see clear exponential convergence as p increases, with Fig. 7a, b ( ¼ 10 À4 ), where errors appear to stagnate around p ¼ 5. On the other hand, looking again at (21), we expect that if is taken to be too small, the SVD solver may select a solution with a large coefficient norm, in which case our numerical solution may suffer from spurious oscillations between the collocation points, or other numerical instabilities that might increase the L 1 ðCÞ error.This degradation in accuracy for small can be observed by comparing Fig. 7c, d ( ¼ 10 À8 ), with Fig. 7e, f ( ¼ 10 À12 ).Based on extensive testing we found that the choice ¼ 10 À8 gave satisfactory performance for all the examples we considered.

Scattering by a Cantor set
In this section we present an application of our HNA BEM to high frequency scattering by the middle-third Cantor set.The mathematical analysis of acoustic scattering by fractal screens, of which the Cantor set is an example, was developed recently in [9] and [11], with the latter providing a rigorous convergence theory for an approximation strategy based on conventional Galerkin BEM using piecewise constant basis functions, along with numerical results for low frequency scattering in both two and three dimensions by a range of fractal screens.For two-dimensional scattering by the Cantor set, the HNA BEM presented in the current paper allows us to investigate the high frequency regime, and to our knowledge this is the first time that high frequency BEM simulations of scattering by fractal screens have been presented in the literature.Fractals, which one might define loosely as ''sets possessing geometric detail on every lengthscale'', arise in numerous scattering applications, including the human lung and other dendritic structures in medical science [42], snowflakes, ice crystals and other atmospheric particles in climate science [51], fractal antennas for electromagnetic wave transmission/reception [24], fractal piezoelectric ultrasound transducers [45], and fractal aperture problems in laser physics [14]; for further references see [11].The case of high frequency scattering by fractals is particularly intriguing, because even though the diameter of the scatterer may be large compared to the wavelength, the scatterer always possesses sub-wavelength structure, so standard ray-based high frequency approximations do not apply.In his 1979 paper on In practical applications and numerical simulations one generally with ''prefractal'' approximations in which fractal structure is truncated at some level.The standard prefractals for the middle-third Cantor set are defined as follows.Letting C ð0Þ ¼ ð0; 1Þ Â f0g, for j 2 N we construct the jth prefractal C ðjÞ by removing the (closed) middle third from each disjoint interval making up the previous prefractal C ðjÀ1Þ , so that e.g.C ð1Þ ¼ ðð0; 1=3Þ [ ð2=3; 1ÞÞ Â f0g.That the BIE solutions for sound-soft scattering on the prefractals converge to a well-defined non-zero limiting BIE solution on the underlying fractal was proved in [9, Example 8.2] (and see also [11,Prop. 6

.1]).
A plot of the total field for scattering by C ð2Þ with incident direction d ¼ ð1; À1Þ= ffiffi ffi 2 p , computed using our collocation HNA BEM, was presented in Fig. 1.In Fig. 8 we present radial plots of maxflog 10 ju 1 8 j; À1g; for the first six prefractals C ð0Þ ; . ..C ð5Þ , with k ¼ 1024 and d ¼ ð1; À1Þ= ffiffi ffi 2 p .The log scales are truncated at À1 to leave space in the centre of the plots for the corresponding prefractals to be plotted.Corresponding L 1 ðCÞ convergence plots are presented in Fig. 9, where, for each j ¼ 0; . ..; 5 we take as reference solution the corresponding solution with p ¼ 10.The fact that the absolute L 1 ðCÞ error in our boundary solution with p ¼ 8 is smaller than 10 À1 for all j ¼ 0; . ..; 5 implies that our far-field plots in Fig. 8 are also accurate to 10 À1 absolute error, since ju 1 p j kv p k L 1 ðCÞ : We note that at k ¼ 1024 the lowest order prefractal C ð0Þ is approximately 160 wavelengths long, and each component of the highest order prefractal C ð5Þ is approximately one wavelength long.While the far-field patterns for C ð4Þ and C ð5Þ are similar, they are clearly not identical, which is to be expected since in refining C ð4Þ to C ð5Þ we are making wavelength-scale changes to the scatterer geometry.Hence for C ð5Þ we are still in the ''preasymptotic'' regime with regard to convergence to the solution for scattering by the limiting fractal screen.(The asymptotic regime is considered in [11] using a conventional BEM approach.)We emphasize that our HNA method can handle much larger wavenumbers than k ¼ 1024, but at very large wavenumbers the far-field plots are so oscillatory are difficult to interpret, so we do not them here.

Fig. 1
Fig. 1 Real part of the total field for scattering by N C ¼ 4 colinear sound-soft screens constituting the order j ¼ 2 prefractal of the middle-third Cantor set (for more details see §7.3).The incident wave propagates from top left to bottom right SN Partial Differential Equations and Applications

Theorem 3 [ 32 , Theorem 5 . 1 ]
Let k 0 [ 0 and l !cp for some constant c [ 0. Then for every d [ 0 there exists a constant C [ 0, depending only on d, r, N C and c, and a constant s [ 0, depending only on d, r and c, such that

Fig. 3
Fig. 3 Plots of log 10 of the relative error in computing (28) using the quadrature scheme of §6.2 for two different sets of quadrature parameters fc osc ¼ 2; c sing ¼ 0:5; N Q ¼ 20g and fc osc ¼ 4; c sing ¼ 0:1; N Q ¼ 15g.In both plots the labels 1-4 indicate the regions of the ðk; t 0 Þ plane in which Cases 1, 2, 3 and 4 apply

Fig. 4
Fig. 4 HNA collocation BEM solution for C ¼ ð0; 1Þ Â f0g and incident direction ð1; À1Þ= ffiffi ffi 2 p , with p ¼ 8. a Real part of the total field u i þ u s 8 for k ¼ 128.b, c Real part of the boundary solution v 8 , along with the amplitudes jv AE 1;8 j, for k ¼ 128 and 512 respectively

Fig. 5
Fig. 5 Convergence with respect to increasing maximum polynomial degree p, for various wavenumbers k. a L 1 error on C against p, b L 1 error on C against CPU time, c, d L 1 error in the near and far field against p

Fig. 6
Fig. 6 Analogue of Figs.4a, 5c, d for the case of grazing incidence

Fig. 7
Fig. 7 Dependence of solution accuracy the oversampling parameter C OS and truncation parameter for moderate (k ¼ 256, left-hand plots) and large (k ¼ 65536, right-hand plots) wavenumbers and varying polynomial degree p.The values of C OS used are f1; 1:05; 1:1; . ..; 2g, but the resulting ratio M/N between the number of collocation points and the number of unknowns is in general slightly larger than C OS (recall (19)).The data points corresponding to C OS ¼ 1:25 are shown with large markers

Fig. 9
Fig. 9 Convergence of the boundary solution with increasing p for k ¼ 1024 and prefractal levels j ¼ 0; . ..; 5, for scattering by the middle-third Cantor set we plot the corresponding near-field errors