1 Introduction

1.1 Boundary integral equations for Laplace’s equation

If an explicit expression for the fundamental solution of a linear PDE is known, then boundary value problems (BVPs) for that PDE can be converted to integral equations on the boundary of the domain. The main advantage of this procedure is that the dimension of the problem is reduced; indeed, the problem is converted from one on a d-dimensional domain to one on a \((d-1)\)-dimensional domain. Furthermore, if the original domain is the exterior of a bounded obstacle, then the problem is reduced from one on a d-dimensional infinite domain to one on a \((d-1)\)-dimensional finite domain.

This reduction to the boundary has both theoretical and practical benefits: on the theoretical side, C. Neumann famously used boundary integral equations (BIEs) to prove existence of the solution of the Dirichlet problem for Laplace’s equation in convex domains in [80] (see, e.g., the account in [65, Chapter 1]), and BIEs have a long history of use in the harmonic analysis literature to prove wellposedness of BVPs on rough domains (see, e.g., [13, 22, 101, 49, §2.1], [72, 68, Chapter 15], [97, Chapter 4], [70]). On the more practical side, numerical methods based on Galerkin, collocation, or numerical quadrature discretisation of BIEs, coupled with fast matrix–vector multiply and compression algorithms, and iterative solvers such as GMRES, provide spectacularly effective computational tools for solving a range of linear boundary value problems, for example in potential theory, elasticity, and acoustic and electromagnetic wave scattering (see, e.g., [4, 8, 11, 15, 21, 23, 39, 55, 85, 87, 106]).

Let \(\Phi (\textbf{x},\textbf{y})\) be the fundamental solution for Laplace’s equation:

$$\begin{aligned} \Phi (\textbf{x},\textbf{y}):= \begin{cases} \dfrac{1}{2\pi } \log \left( \dfrac{a}{|\textbf{x}-\textbf{y}|}\right) , & d= 2, \\ \dfrac{1}{(d-2)C_d|\textbf{x}-\textbf{y}|^{d-2}}, & d\ge 3, \end{cases} \end{aligned}$$
(1.1)

where \(C_d\) is the surface area of the unit sphere \(S^{d-1}\subset \mathbb {R}^d\) and \(a >0\). With \(\Gamma \) the boundary of a bounded Lipschitz domain, the boundary integral operators (BIOs) S, D, \(D'\), and H, the single-layer, double-layer, adjoint double-layer, and hypersingular operators, respectively, are defined for \(\phi \in {L^2(\Gamma )}\), \(\psi \in H^1(\Gamma )\), and \(\textbf{x}\in \Gamma \) by

$$\begin{aligned} S \phi (\textbf{x}) = \int _\Gamma \Phi (\textbf{x},\textbf{y}) \phi (\textbf{y})\, \textrm{d}s(\textbf{y}), \quad D \phi (\textbf{x}) = \int _\Gamma \frac{\partial \Phi (\textbf{x},\textbf{y})}{\partial n(\textbf{y})} \phi (\textbf{y})\, \textrm{d}s(\textbf{y}), \end{aligned}$$
(1.2)

and

$$\begin{aligned} D' \phi (\textbf{x}) = \int _\Gamma \frac{\partial \Phi (\textbf{x},\textbf{y})}{\partial n(\textbf{x})} \phi (\textbf{y})\, \textrm{d}s(\textbf{y}), \quad H \psi (\textbf{x}) = \frac{\partial }{\partial n(\textbf{x})}\int _\Gamma \frac{\partial \Phi (\textbf{x},\textbf{y})}{\partial n(\textbf{y})} \psi (\textbf{y})\, \textrm{d}s(\textbf{y}). \end{aligned}$$
(1.3)

When \(\Gamma \) is Lipschitz, the integrals in D and \(D'\) are defined as Cauchy principal values, in general only for almost all \(\textbf{x}\in \Gamma \) with respect to the surface measure \(\textrm{d}s\). The definition of H on spaces larger than \(H^1(\Gamma )\) is complicated (it must be understood either as a finite-part integral, or as the non-tangential limit of a potential; see [65, Chapter 7], [17, Page 113] respectively), but these details are not essential to the present paper. The standard mapping properties of \(S, D, D'\), and H on Sobolev spaces on \(\Gamma \) are recalled in Appendix A (see (A.3)).
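As a concrete illustration of (1.1) and of the kernel of D in (1.2), the following Python sketch evaluates \(\Phi \) and its gradient in \(\textbf{y}\), and checks that on the unit circle the double-layer kernel is the constant \(-1/(4\pi )\). The function names and the use of NumPy are choices made for this illustration and are not part of the paper; the formula \(C_d = 2\pi ^{d/2}/\Gamma (d/2)\) (with \(\Gamma (\cdot )\) here the Gamma function) for the area of the unit sphere is standard.

```python
import math
import numpy as np

def unit_sphere_area(d):
    """C_d, the surface area of S^{d-1} in R^d: 2*pi^(d/2)/Gamma(d/2)."""
    return 2.0 * math.pi ** (d / 2.0) / math.gamma(d / 2.0)

def fundamental_solution(x, y, a=1.0):
    """Phi(x, y) from (1.1); x and y are distinct points in R^d, d >= 2."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    d = x.size
    r = np.linalg.norm(x - y)
    if d == 2:
        return math.log(a / r) / (2.0 * math.pi)
    return 1.0 / ((d - 2) * unit_sphere_area(d) * r ** (d - 2))

def grad_y_fundamental_solution(x, y):
    """Gradient of Phi in y; equals (x - y)/(C_d |x - y|^d) for every d >= 2."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    d = x.size
    r = np.linalg.norm(x - y)
    return (x - y) / (unit_sphere_area(d) * r ** d)

def double_layer_kernel(x, y, n_y):
    """Kernel of D in (1.2): the derivative of Phi along the normal n(y)."""
    return float(np.dot(grad_y_fundamental_solution(x, y), np.asarray(n_y, float)))

# On the unit circle the outward normal at y is y itself.
x = np.array([1.0, 0.0])
y = np.array([math.cos(2.0), math.sin(2.0)])
kernel_on_circle = double_layer_kernel(x, y, y)   # -1/(4*pi) for any x != y
```

On the unit circle, \((\textbf{x}-\textbf{y})\cdot \textbf{y}= -|\textbf{x}-\textbf{y}|^2/2\), which is why the kernel is constant there; in particular \(D1 = -1/2\) on the unit circle with this sign convention.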

The BIE operators involved in the standard first- and second-kind BIEs for the Dirichlet and Neumann problems for Laplace’s equation are shown in Table 1; although we do not explicitly consider the Neumann problem in this paper, we use the information in this table in what follows. For the details of the right-hand sides and unknowns for the integral equations corresponding to the operators in Table 1, see, e.g., [87, §3.4], [65, Chapter 7], [93, Chapter 7], [17, §2.5]. Recall that the adjective “direct” in the table refers to equations where the unknown is either the Dirichlet or Neumann trace of the solution to the corresponding BVP, and the adjective “indirect” refers to equations where the unknown does not have immediate physical relevance.

Table 1 The integral operators involved in the standard boundary-integral-equation formulations of the interior and exterior Dirichlet and Neumann problems for Laplace’s equation

Following [87, Pages 9 and 10], we call BIEs first kind where the unknown function only appears under the integral, and second kind where the unknown function appears outside the integrand as well as inside; by this definition, the BIEs in the first and third row of Table 1 are first kind, and in the second and fourth row second kind. An alternative definition of second kind BIEs is that, in addition to the unknown function appearing outside the integrand as well as inside, the BIO is Fredholm of index zero (i.e., the Fredholm alternative applies to the BIE); see, e.g., [4, §1.1.4]. Every BIE that we describe in the paper as second-kind is second-kind in both senses above.

1.2 The Galerkin method

We focus on solving Laplace BIEs with the Galerkin method. The Galerkin method applied to the equation \(A\phi = f\), where \(\phi , f \in {{\mathcal {H}}}\), \(A:{{\mathcal {H}}}\rightarrow {{\mathcal {H}}}\) is a continuous (i.e., bounded) linear operator, and \({{\mathcal {H}}}\) is a complex Hilbert space, is: given a sequence \(({{\mathcal {H}}}_N)_{N=1}^\infty \) of finite-dimensional subspaces of \({{\mathcal {H}}}\) with \(\dim ({{\mathcal {H}}}_N)\rightarrow \infty \) as \(N\rightarrow \infty \),

$$\begin{aligned} \text { find }\phi _N \in {{\mathcal {H}}}_N \text { such that }\big (A\phi _N,\psi _N\big )_{{\mathcal {H}}}=\big (f,\psi _N\big )_{{\mathcal {H}}}\quad \text { for all }\psi _N\in {{\mathcal {H}}}_N. \end{aligned}$$
(1.4)

We say that the Galerkin method converges for the sequence \(({{\mathcal {H}}}_N)_{N=1}^\infty \) if, for every \(f\in {{\mathcal {H}}}\), the Galerkin equations (1.4) have a unique solution for all sufficiently large N and \(\phi _N\rightarrow A^{-1}f\) as \(N\rightarrow \infty \). We say that \(({{\mathcal {H}}}_N)_{N=1}^\infty \) is asymptotically dense in \({{\mathcal {H}}}\) if, for every \(\phi \in {{\mathcal {H}}}\),

$$\begin{aligned} \min _{\psi _N\in {{\mathcal {H}}}_N}\left\| \phi -\psi _N\right\| _{{{\mathcal {H}}}} \rightarrow 0 \quad \text{ as } \quad N\rightarrow \infty . \end{aligned}$$
(1.5)

A necessary condition for the convergence of the Galerkin method is that \(({{\mathcal {H}}}_N)_{N=1}^\infty \) is asymptotically dense in \({{\mathcal {H}}}\). Indeed, a standard necessary and sufficient condition (e.g., [37, Chapter II, Theorem 2.1]) for convergence of the Galerkin method is that \(({{\mathcal {H}}}_N)_{N=1}^\infty \) is asymptotically dense and that, for some \(N_0\in \mathbb {N}\) and \(C_{\textrm{dis}}>0\),

$$\begin{aligned} \frac{\Vert {{\mathcal {P}}}_N A\psi _N\Vert _{{{\mathcal {H}}}}}{\Vert \psi _N\Vert _{{\mathcal {H}}}} \ge C_{\textrm{dis}} \quad \text{ for } \text{ all } \text{ non-zero } \psi _N\in {{\mathcal {H}}}_N \text{ and } N\ge N_0, \end{aligned}$$
(1.6)

where \({{\mathcal {P}}}_N\) is orthogonal projection of \({{\mathcal {H}}}\) onto \({{\mathcal {H}}}_N\). Importantly, if (1.6) holds, then ([37, Chapter II, Equation (2.5)] or see [87, Theorem 4.2.1 and Remark 4.2.5])

$$\begin{aligned} \Vert \phi - \phi _N\Vert _{{{\mathcal {H}}}} \le \left( 1+ \frac{\Vert A\Vert _{{{\mathcal {H}}}\rightarrow {{\mathcal {H}}}}}{C_{\textrm{dis}}}\right) \min _{\psi _N\in {{\mathcal {H}}}_N}\Vert \phi -\psi _N\Vert _{{\mathcal {H}}}, \quad \text{ for } N\ge N_0, \end{aligned}$$
(1.7)

where \(\phi =A^{-1}f\) and \(\phi _N\) is the unique solution of the Galerkin equations (1.4). We note that (1.7) is known as a quasioptimal error estimate.
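The Galerkin equations (1.4), the discrete stability constant in (1.6), and the quasioptimal bound (1.7) can be checked numerically in a finite-dimensional model. The sketch below is an illustration only: it takes \({{\mathcal {H}}}= \mathbb {R}^n\) with the Euclidean inner product (a real space, for simplicity), a randomly generated test matrix in place of A, and a random subspace; none of these choices comes from the paper. With an orthonormal basis of \({{\mathcal {H}}}_N\) stored as the columns of `Q`, the best constant in (1.6) for that single subspace is the smallest singular value of the Galerkin matrix.

```python
import numpy as np

def galerkin_solve(A, f, Q):
    """Solve (1.4) in H = R^n with the Euclidean inner product.

    The columns of Q are an orthonormal basis of the subspace H_N, so the
    Galerkin matrix is Q^T A Q and the right-hand side is Q^T f.
    """
    coeffs = np.linalg.solve(Q.T @ A @ Q, Q.T @ f)
    return Q @ coeffs

def discrete_stability_constant(A, Q):
    """Smallest value of ||P_N A psi|| / ||psi|| over non-zero psi in H_N."""
    return np.linalg.svd(Q.T @ A @ Q, compute_uv=False).min()

rng = np.random.default_rng(0)
n, N = 40, 10
A = np.eye(n) + 0.05 * rng.standard_normal((n, n))   # hypothetical test operator
f = rng.standard_normal(n)
Q, _ = np.linalg.qr(rng.standard_normal((n, N)))     # orthonormal basis of H_N

phi = np.linalg.solve(A, f)                          # exact solution
phi_N = galerkin_solve(A, f, Q)                      # Galerkin solution

error = np.linalg.norm(phi - phi_N)
best = np.linalg.norm(phi - Q @ (Q.T @ phi))         # best approximation from H_N
c_dis = discrete_stability_constant(A, Q)
bound = (1.0 + np.linalg.norm(A, 2) / c_dis) * best  # right-hand side of (1.7)
```

In this model `error` lies between `best` and `bound`, which is the content of (1.7) for a single subspace.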

We now recap the main abstract theorem on convergence of the Galerkin method; this theorem uses the definition that an operator \(A:{{\mathcal {H}}}\rightarrow {{\mathcal {H}}}\) is coercive if there exists \(C_{\textrm{coer}}>0\) such that

$$\begin{aligned} \big |(A\psi ,\psi )_{{{\mathcal {H}}}}\big |\ge C_{\textrm{coer}}\left\| \psi \right\| ^2_{{{\mathcal {H}}}} \quad \text { for all }\psi \in {{\mathcal {H}}}. \end{aligned}$$
(1.8)

Theorem 1.1

(The main abstract theorem on convergence of the Galerkin method.)

  (a) If A is invertible then there exists a sequence \(({{\mathcal {H}}}_N)_{N=1}^\infty \) for which the Galerkin method converges.

  (b) If A is invertible then the following are equivalent:

    (i) The Galerkin method converges for every asymptotically-dense sequence \(({{\mathcal {H}}}_N)_{N=1}^\infty \) in \({{\mathcal {H}}}\).

    (ii) \(A=A_0+K\) where \(A_0\) is coercive and K is compact.

  (c) If A is coercive (i.e., (1.8) holds) then, for every sequence \(({{\mathcal {H}}}_N)_{N=1}^\infty \) and every \(N\in \mathbb {N}\), the Galerkin equations (1.4) have a unique solution \(\phi _N\) and, where \(\phi = A^{-1}f\),

    $$\begin{aligned} \big \Vert \phi -\phi _N\big \Vert _{{\mathcal {H}}}\le \frac{\Vert A\Vert _{{{\mathcal {H}}}\rightarrow {{\mathcal {H}}}}}{C_{\textrm{coer}}} \, \min _{\psi \in {{\mathcal {H}}}_N}\big \Vert \phi -\psi \big \Vert _{{\mathcal {H}}}, \end{aligned}$$
    (1.9)

    (so that \(\phi _N\rightarrow \phi \) as \(N\rightarrow \infty \) if \(({{\mathcal {H}}}_N)_{N=1}^\infty \) is asymptotically dense in \({{\mathcal {H}}}\)).

References for the proof

Part (a) was first proved in [64, Theorem 1]; see also [37, Chapter II, Theorem 4.1]. Part (b) was first proved in [64, Theorem 2], with this result building on results in [99]; see also [37, Chapter II, Lemma 5.1 and Theorem 5.1]. Part (c) is Céa’s Lemma, first proved in [14]. \(\square \)
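To see how Part (c) relates to the criterion (1.6), the following short calculation may help; it is a standard argument sketched here for orientation, not a quotation from the references above. Since \({{\mathcal {P}}}_N\) is the orthogonal projection onto \({{\mathcal {H}}}_N\), we have \(({{\mathcal {P}}}_N A\psi _N,\psi _N)_{{\mathcal {H}}}= (A\psi _N,\psi _N)_{{\mathcal {H}}}\) for every \(\psi _N\in {{\mathcal {H}}}_N\), so that (1.8) and the Cauchy–Schwarz inequality give

$$\begin{aligned} C_{\textrm{coer}}\Vert \psi _N\Vert ^2_{{\mathcal {H}}} \le \big |(A\psi _N,\psi _N)_{{\mathcal {H}}}\big | = \big |({{\mathcal {P}}}_N A\psi _N,\psi _N)_{{\mathcal {H}}}\big | \le \Vert {{\mathcal {P}}}_N A\psi _N\Vert _{{\mathcal {H}}}\, \Vert \psi _N\Vert _{{\mathcal {H}}}\quad \text { for all }\psi _N\in {{\mathcal {H}}}_N. \end{aligned}$$

Hence a coercive operator satisfies (1.6) with \(C_{\textrm{dis}}= C_{\textrm{coer}}\) and \(N_0=1\), for every choice of subspaces. Inserting this into (1.7) gives the constant \(1+ \Vert A\Vert _{{{\mathcal {H}}}\rightarrow {{\mathcal {H}}}}/C_{\textrm{coer}}\), which is slightly weaker than the constant in (1.9).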

1.3 The rationale for using second-kind BIEs posed in \({L^2(\Gamma )}\)

The BIOs in Table 1 are coercive in the trace spaces \(H^{\pm 1/2}(\Gamma )\) (or certain subspaces of these) for Lipschitz \(\Gamma \), thus ensuring convergence of the associated Galerkin methods by Part (c) of Theorem 1.1; this coercivity theory was established for first-kind equations by Nédélec and Planchard [79], Le Roux [56, 57], and Hsiao and Wendland [46], and for second-kind equations by Steinbach and Wendland [95]. These arguments involve transferring boundedness/coercivity properties of the PDE solution operator to the associated boundary integral operators via the trace map and layer potentials; the generality of these arguments is why coercivity holds with \(\Gamma \) only assumed to be Lipschitz, and Costabel [25] highlighted how these ideas can be traced back to the work of Gauss and Poincaré.

Despite convergence of the associated Galerkin methods, using the first-kind formulations in the trace spaces has the disadvantage that the condition numbers of the Galerkin matrices grow as the discretisation is refined; e.g., for the h-version of the Galerkin method (where convergence is obtained by decreasing the mesh-width h and keeping the polynomial degree fixed), the condition numbers grow like \(h^{-1}\); see, e.g., [87, §4.5]. The design of appropriate preconditioning strategies for these Galerkin matrices has therefore been a classic topic of study in the BIE community for over 20 years, with proposed solutions including (i) preconditioning with an opposite-order operator [94] (see also the survey [45]), (ii) using wavelets, either as an approximation space (e.g., [42, 43, 103]) or in preconditioning (e.g., [88, 98]), and (iii) using domain decomposition methods (see, e.g., [44] and the recent book [96]). Furthermore, using the second-kind formulations in the trace spaces has the disadvantage that the inner products on \(H^{\pm 1/2}(\Gamma )\) are non-local and non-trivial to evaluate; even if the basis functions \(\phi _N\) and \(\psi _N\) in (1.4) have supports only on a subset of \(\Gamma \), \((A\phi _N,\psi _N)_{{{\mathcal {H}}}}\) is an integral over all of \(\Gamma \), and the calculation of the Galerkin matrix in this case is impractical.

For the second-kind BIEs, an attractive alternative to working in the trace spaces is to work in \({L^2(\Gamma )}\). When \(\Gamma \) is \(C^1\), D and \(D'\) are compact in \({L^2(\Gamma )}\) by the results of Fabes, Jodeit, and Rivière [35, Theorems 1.2 and 1.9] and thus each of the second-kind BIOs \(\frac{1}{2}I \pm D\) and \(\frac{1}{2}I \pm D'\) is the sum of a coercive operator and a compact operator, and convergence of the associated Galerkin methods in \({L^2(\Gamma )}\) is ensured by Part (b) of Theorem 1.1. Since the \({L^2(\Gamma )}\) norm is local, \((A\phi _N,\psi _N)_{{{\mathcal {H}}}}\) is an integral over the support of \(\psi _N\), and the Galerkin matrix is much more easily computable. Furthermore, when D and \(D'\) are compact, the condition numbers of the Galerkin matrices of \(\frac{1}{2}I \pm D\) and \(\frac{1}{2}I \pm D'\) are independent of the discretisation (without preconditioning); see [4, §3.6.3], [41, §4.5.5].
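The claim that the Galerkin matrices of the second-kind operators in \({L^2(\Gamma )}\) are well conditioned on smooth \(\Gamma \) can be illustrated in the simplest possible setting. The sketch below is a toy example under stated assumptions, not the discretisation used in the cited references: \(\Gamma \) is an ellipse (the unit circle by default), the basis functions are piecewise constants on elements that are equal in the parameter, and every integral is replaced by a one-point (midpoint) rule. The operator is \(\frac{1}{2}I - D\), with the kernel of D taken from (1.1) and (1.2) for \(d=2\).

```python
import numpy as np

def galerkin_matrix(N, a=1.0, b=1.0):
    """One-point-quadrature Galerkin matrix of (1/2)I - D on an ellipse.

    The ellipse (a cos t, b sin t) is split into N elements that are equal in
    the parameter t; the basis functions are the indicator functions of the
    elements, and every integral is replaced by the midpoint rule.
    """
    t = (np.arange(N) + 0.5) * 2.0 * np.pi / N
    pts = np.column_stack((a * np.cos(t), b * np.sin(t)))
    speed = np.sqrt((a * np.sin(t)) ** 2 + (b * np.cos(t)) ** 2)   # |gamma'(t)|
    normals = np.column_stack((b * np.cos(t), a * np.sin(t))) / speed[:, None]
    w = speed * 2.0 * np.pi / N                  # approximate element lengths

    diff = pts[:, None, :] - pts[None, :, :]     # x_i - y_j
    dist2 = np.sum(diff ** 2, axis=2)
    np.fill_diagonal(dist2, 1.0)                 # placeholder; diagonal set below
    kernel = np.einsum("ijk,jk->ij", diff, normals) / (2.0 * np.pi * dist2)
    # Limit of the kernel as x -> y along the curve: -a*b / (4*pi*|gamma'|^3).
    np.fill_diagonal(kernel, -a * b / (4.0 * np.pi * speed ** 3))

    return 0.5 * np.diag(w) - w[:, None] * kernel * w[None, :]

# Condition numbers on the unit circle for increasingly fine meshes.
cond_numbers = {N: np.linalg.cond(galerkin_matrix(N)) for N in (16, 64, 256)}
```

On the unit circle the kernel of D is the constant \(-1/(4\pi )\), so with \(h=2\pi /N\) the matrix is \((h/2)I\) plus \(h^2/(4\pi )\) times the all-ones matrix; its eigenvalues are \(h/2\) and h, and the condition number equals 2 for every N.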

1.4 Convergence of the Galerkin method in \(L^2(\Gamma )\) for the standard second-kind integral equations on polyhedral and Lipschitz domains

The disadvantage of using second-kind BIEs in \({L^2(\Gamma )}\) is that convergence of the Galerkin method is harder to establish when \(\Gamma \) is only Lipschitz, or Lipschitz polyhedral. Indeed, in these cases D and \(D'\) are not compact; e.g., when \(\Gamma \) has a corner or edge their spectra are not discrete; see, e.g., [4, §8.1.3]. When \(\Gamma \) is only Lipschitz, D and \(D'\) are bounded on \(L^2(\Gamma )\) by the results on boundedness of the Cauchy integral on Lipschitz \(\Gamma \) of Coifman, McIntosh, and Meyer [22] (following earlier work by Calderón [12] on boundedness for \(\Gamma \) with small Lipschitz character). Verchota [101] showed that the operators \(\frac{1}{2}I \pm D\) and \(\frac{1}{2}I \pm D'\) are Fredholm of index zero on \(L^2(\Gamma )\); when \(\Gamma \) is connected, \(\frac{1}{2}I - D\) and \(\frac{1}{2}I- D'\) are invertible on \({L^2(\Gamma )}\) and \(\frac{1}{2}I + D\) and \(\frac{1}{2}I+ D'\) invertible on \(L^2_0(\Gamma )\), the set of \(\phi \in {L^2(\Gamma )}\) with mean value zero; see [101, Theorems 3.1 and 3.3(i)].

A long-standing open question has been

Can \(\frac{1}{2}I \pm D\) and \(\frac{1}{2}I \pm D'\) be written as the sum of a coercive operator and a compact operator in the space \(L^2(\Gamma )\) when \(\Gamma \) is only assumed to be Lipschitz?

By Part (b) of Theorem 1.1, this question is equivalent to the question: does the Galerkin method applied to \(\frac{1}{2}I \pm D\) and \(\frac{1}{2}I \pm D'\) in \({L^2(\Gamma )}\) converge for every asymptotically-dense sequence of subspaces when \(\Gamma \) is only assumed to be Lipschitz?

Until recently, this question was answered only in the following two cases, both in the affirmative: (i) \(\Gamma \) is a 2-d curvilinear polygon with each side \(C^{1,\alpha }\) for some \(0<\alpha <1\) and with each corner angle in the range \((0,2\pi )\); (ii) \(\Gamma \) is Lipschitz, with sufficiently small Lipschitz character. Regarding (i): this result was announced by Shelepov in [89], with details of the proof given in [90], and with the analogous result for polygons following from the result of Chandler [16, §3]; see, e.g., [9, Lemma 1.5]. Regarding (ii): Wendland [105, §4.2] recognised that the results of I. Mitrea [71, Lemma 1, Page 392] about the essential spectral radius could be adapted to prove this result, with this result proved in full in [19, Corollary 3.5]; for more discussion on both (i) and (ii), see [19, §1].

The recent paper [19] finally settled the question above negatively by giving examples of 2-d Lipschitz domains and 3-d star-shaped Lipschitz polyhedra for which \(\frac{1}{2}I \pm D\) and \(\frac{1}{2}I \pm D'\) cannot be written as the sum of a coercive operator and a compact operator in the space \(L^2(\Gamma )\). The 3-d star-shaped Lipschitz polyhedra are defined in [19, Definition 5.7], and called the open-book polyhedra; see Fig. 1 for an example, where we use the notation that \(\Omega _{\theta ,n}\) is the open-book polyhedron with n pages and opening angle \(\theta \). Given \(\epsilon >0\) there exists \(\theta _0\in (0,\pi ]\) such that the essential numerical range of D in \({L^2(\Gamma )}\) contains the interval \([-\sqrt{n}/2 + \epsilon , \sqrt{n}/2-\epsilon ]\) [19, Theorem 1.3] if \(0 < \theta \le \theta _0\). By the definition of the essential numerical range (see, e.g., [19, Equation 2.3]), this result implies that if \(\theta \) is sufficiently small and \(n\ge 2\), then \(\frac{1}{2}I \pm D\) and \(\frac{1}{2}I \pm D'\) cannot be written as the sum of a coercive operator and a compact operator in the space \(L^2(\Gamma )\) when \(\Gamma = \partial \Omega _{\theta ,n}\).

Fig. 1 Views from above and below of the open-book polyhedron \(\Omega _{\theta ,n}\) of [19, Definition 5.7], with \(n=4\) pages and opening angle \(\theta = \pi /2\)

Nevertheless, Part (b) of Theorem 1.1 only shows that the Galerkin method applied to these domains does not converge for every asymptotically dense sequence \(({{\mathcal {H}}}_N)_{N=1}^\infty \subset {L^2(\Gamma )}\), leaving open the possibility that all Galerkin methods used in practice (based on boundary element method discretisation [87, 93]) are in fact convergent. However, the following result from [19] clarifies that this is not the case.

Theorem 1.2

([19, Theorem 1.4]) Suppose that A is invertible but A cannot be written in the form \(A=A_0+K\), where \(A_0\) is coercive and K is compact, and that \(({{\mathcal {H}}}^*_N)_{N=1}^\infty \) is a sequence of finite-dimensional subspaces of \({{\mathcal {H}}}\), with \({{\mathcal {H}}}^*_1\subset {{\mathcal {H}}}^*_2 \subset ...\), for which the Galerkin method converges. Then there exists a sequence \(({{\mathcal {H}}}_N)_{N=1}^\infty \) of finite-dimensional subspaces of \({{\mathcal {H}}}\), with \({{\mathcal {H}}}_1\subset {{\mathcal {H}}}_2 \subset ...\), such that:

  (a) the Galerkin method does not converge for the sequence \(({{\mathcal {H}}}_N)_{N=1}^\infty \); and

  (b) for each \(N\in \mathbb {N}\),

    $$\begin{aligned} {{\mathcal {H}}}^*_N \subset {{\mathcal {H}}}_N \subset {{\mathcal {H}}}^*_{M_N},\quad \text { for some } M_N\in \mathbb {N}. \end{aligned}$$
    (1.10)

We can apply this result when \(({{\mathcal {H}}}^*_N)_{N=1}^\infty \) is a sequence of boundary element subspaces that is asymptotically dense in \({L^2(\Gamma )}\), in which case \(({{\mathcal {H}}}_N)_{N=1}^\infty \), satisfying (1.10), is also a sequence of boundary element subspaces (since \({{\mathcal {H}}}_N\subset {{\mathcal {H}}}_{M_N}^*\)) and is also asymptotically dense in \({L^2(\Gamma )}\) (since \({{\mathcal {H}}}^*_N\subset {{\mathcal {H}}}_N\)).

In summary, the results of [19] show that there exist Lipschitz and polyhedral boundaries \(\Gamma \) for which there are Galerkin methods for solving BIEs involving \(\frac{1}{2}I \pm D\) and \(\frac{1}{2}I \pm D'\) that do not converge, with these methods based on asymptotically-dense sequences \(({{\mathcal {H}}}_N)_{N=1}^\infty \subset {L^2(\Gamma )}\) of boundary element subspaces.

1.5 Motivation for the present paper and summary of the main results

Given the negative results of [19] about convergence of the Galerkin method for the standard second-kind formulations, a natural question is therefore

Do there exist second-kind BIE formulations in \(L^2(\Gamma )\) of Laplace’s equation where, with \(\Gamma \) only assumed to be Lipschitz, the operators are continuous, invertible, and can be written as the sum of a coercive operator and a compact operator?

In this paper we answer this question in the affirmative for the Laplace interior and exterior Dirichlet problems. We present new BIE formulations that are continuous and in fact coercive (i.e., not just the sum of a coercive and a compact operator) in the space \(L^2(\Gamma )\), with \(\Gamma \) only assumed to be Lipschitz; thus convergence of the Galerkin method in \({L^2(\Gamma )}\) for every asymptotically-dense sequence \(({{\mathcal {H}}}_N)_{N=1}^\infty \), plus the explicit error estimate (1.9), is ensured by Part (c) of Theorem 1.1. Furthermore, the strong property of coercivity allows us to prove that, if the Galerkin matrices are preconditioned by a specified diagonal matrix, then the number of GMRES iterations required to solve the associated linear systems to a prescribed accuracy does not increase as the discretisation is refined and N increases.

In summary, the new BIEs introduced in this paper are such that, when solving the Laplace interior and exterior Dirichlet problems on a general Lipschitz domain:

  1. Given any asymptotically-dense sequence of subspaces, the associated Galerkin method is provably convergent; and

  2. For a wide variety of subspaces, including piecewise polynomials (of arbitrary degree) on anisotropic meshes, the Galerkin matrices are provably well conditioned (with the number of GMRES iterations independent of the subspace dimension) after preconditioning by only a diagonal matrix.

Indeed, Sect. 1.4 recalled that the standard second-kind BIEs in \({L^2(\Gamma )}\) do not satisfy Point 1. Furthermore, the proposed remedies to the growth of the condition number of the first-kind BIEs in the trace spaces recapped in Sect. 1.3, although tremendously successful in many contexts, do not satisfy Point 2. Indeed, to our knowledge, there is no theory on either operator preconditioning or wavelet preconditioning of piecewise-polynomial discretisations using anisotropic meshes on general Lipschitz polyhedra. Furthermore, whilst there exists theory for domain-decomposition methods on anisotropic meshes (e.g., [44]) the preconditioners are more complicated, and expensive, than multiplication by a diagonal matrix.

Outline of the paper. Section 1.6 defines more precisely the Laplace BVPs we consider. Section 1.7 recaps results about a non-standard layer potential introduced in [13] and its non-tangential limits. Section 2 states the main results. Section 3 discusses the ideas behind the main results, and the links to other work in the literature. Section 4 proves the main results, except the parts of the proofs that are related to the wellposedness and regularity of the Laplace oblique Robin problem, with these given in Sect. 5.

Section 6 in the extended version of the present paper [18] presents results for the Helmholtz exterior Dirichlet problem (with these results corollaries of the Laplace results in Sect. 2).

1.6 Notation and statement of the BVPs

Let \({\Omega ^-}\subset \mathbb {R}^d\), \(d\ge 2\), be a bounded (not necessarily connected) Lipschitz open set, and let \({\Omega ^+}:= \mathbb {R}^d{\setminus } \overline{{\Omega ^-}}\) and \(\Gamma :=\partial {\Omega ^-}\). Let \(\textbf{n}\) be the outward-pointing unit normal vector to \({\Omega ^-}\) (so \(\textbf{n}\) points out of \({\Omega ^-}\) and into \({\Omega ^+}\)). For \(v \in H^1({\Omega ^-})\) let \(\gamma ^- v\) denote its Dirichlet trace. For \(v \in H^1({\Omega ^-},\Delta ):= \{ w: w \in H^1({\Omega ^-}), \Delta w \in L^2({\Omega ^-})\}\) let \(\partial _n^- v\) denote its Neumann trace; recall that, if \(v \in H^2({\Omega ^-})\) then \(\partial _n^- v= \textbf{n}\cdot \gamma ^- \nabla v\). Similarly, for \(v\in H^1_{\textrm{loc}}({\Omega ^+}):=\{ w: {\Omega ^+}\rightarrow \mathbb {R}: \chi w \in H^1({\Omega ^+}) \text { for all }\chi \in C^\infty _\textrm{comp}(\mathbb {R}^d)\}\), let \(\gamma ^+ v\) denote its Dirichlet trace. For \(v\in H^1_{\textrm{loc}}({\Omega ^+}, \Delta ):=\{ w: {\Omega ^+}\rightarrow \mathbb {R}: \chi w \in H^1({\Omega ^+}), \chi \Delta w \in L^2({\Omega ^+}) \text { for all }\chi \in C^\infty _\textrm{comp}(\mathbb {R}^d)\}\), let \(\partial _n^+ v\) denote its Neumann trace.

Definition 1.3

(Laplace interior Dirichlet problem (IDP)) Given \(g_D\in H^{1/2}(\Gamma )\), we say that \(u\in H^1({\Omega ^-})\) satisfies the interior Dirichlet problem (IDP) if \(\Delta u =0\) in \({\Omega ^-}\) and \(\gamma ^- u= g_D\) on \(\Gamma \).

Definition 1.4

(Laplace exterior Dirichlet problem (EDP)) With \({\Omega ^-}\) and \({\Omega ^+}\) as above, assume further that \({\Omega ^+}\) is connected. Given \(g_D\in H^{1/2}(\Gamma )\), we say that \(u\in H^1_\mathrm{{loc}}({\Omega ^+})\) satisfies the exterior Dirichlet problem (EDP) if \(\Delta u =0\) in \({\Omega ^+}\), \(\gamma ^+ u= g_D\) on \(\Gamma \), and \(u(\textbf{x})= O(1)\) when \(d=2\) and \(u(\textbf{x}) = o(|\textbf{x}|^{3-d})\) when \(d\ge 3\) as \(|\textbf{x}| \rightarrow \infty \) (uniformly in all directions \(\textbf{x}/|\textbf{x}|\)).

We make three remarks.

  (i) Recall that, by elliptic regularity (see, e.g., [65, Theorem 4.16]), the solutions of the IDP and EDP are \(C^\infty \) in \({\Omega ^-}\) and \({\Omega ^+}\) respectively. Therefore, the pointwise conditions at infinity imposed in the EDP make sense.

  (ii) For the IDP and EDP, uniqueness of the solution is shown in, e.g., [65, Corollary 8.3] and [65, Theorems 8.9 and 8.10] respectively. Existence then follows from Fredholm theory and, e.g., [65, Theorems 7.5, 7.6, and 7.15].

  (iii) The Neumann traces of the solutions of both the IDP and EDP are in \({H^{-1/2}(\Gamma )}\); see, e.g., [65, Lemma 4.3]. Later, we consider both these BVPs when the Dirichlet data is in \(H^1(\Gamma )\). The regularity result of Nečas [78, §5.1.2] (restated as Theorem B.1 below) then implies that \(\partial _n^- u\) and \(\partial _n^+ u\) (in Definitions 1.3 and 1.4 respectively) are both in \({L^2(\Gamma )}\), as opposed to just in \({H^{-1/2}(\Gamma )}\).

The IDP and EDP can equivalently be formulated in terms of non-tangential limits, with these alternative formulations standard in the harmonic-analysis literature (see, e.g., [101, Corollary 3.2], [13, §3], [68, Theorem 2], [97, Proposition 5.1]). We state these alternative formulations, and recall their equivalence, so that we can easily use results from the harmonic-analysis literature (summarised in Appendix B below).

Given \(\textbf{x}\in \Gamma \), let \(\Theta ^{\pm }(\textbf{x})\) be the non-tangential approach set to \(\textbf{x}\) from \(\Omega ^\pm \) defined by

$$\begin{aligned} \Theta ^{\pm }(\textbf{x}):= \Big \{ \textbf{y}\in \Omega ^{\pm } \,: \, |\textbf{x}-\textbf{y}|\le \min \big \{c, C \textrm{dist}(\textbf{y},\Gamma )\big \}\Big \}, \end{aligned}$$
(1.11)

for some \(c>0\) and some \(C>1\) sufficiently large depending on the Lipschitz character of \(\Omega ^\pm \). If \(u\in C(\Omega ^{\pm })\), its non-tangential maximal function \(u^*: \Gamma \rightarrow [0,\infty ]\) is defined by

$$\begin{aligned} u^*(\textbf{x}):= \sup _{\textbf{y}\in \Theta ^{\pm }(\textbf{x})}|u(\textbf{y})|, \quad \textbf{x}\in \Gamma . \end{aligned}$$
(1.12)

Define the non-tangential limit

$$\begin{aligned} \widetilde{\gamma }^\pm u(\textbf{x}):= \lim _{\textbf{y}\rightarrow \textbf{x},\, \textbf{y}\in \Theta ^\pm (\textbf{x})}u(\textbf{y}). \end{aligned}$$
(1.13)

If \(u\in C^2(\Omega ^\pm )\), \(\Delta u=0\), and \(u^* \in {L^2(\Gamma )}\), then \(\widetilde{\gamma }^\pm u(\textbf{x})\) is well-defined for almost all \(\textbf{x}\in \Gamma \) and \(\widetilde{\gamma }^\pm u \in {L^2(\Gamma )}\) by [48, Corollary 5.5] (restated as Part (ii) of Theorem B.2 below). Furthermore, if \(u\in H^s_\textrm{loc}(\Omega ^\pm )\) with \(s>1/2\), then \(\widetilde{\gamma }^\pm u =\gamma ^\pm u\) by [17, Lemma A.9] (restated as Lemma B.3 below).
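The definitions (1.11) and (1.12) can be made concrete with a small Python sketch. It is an illustration under assumptions that are not in the paper: \(\Gamma \) is the unit circle (so that the distance to \(\Gamma \) has a closed form), the constants c and C are chosen by hand, the supremum in (1.12) is replaced by a maximum over a finite list of sample points, and the caller is responsible for passing only points on the relevant side of \(\Gamma \). All function names are hypothetical.

```python
import numpy as np

def in_approach_set(x, y, dist_to_boundary, c, C):
    """Membership test for (1.11): |x - y| <= min(c, C * dist(y, Gamma))."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return bool(np.linalg.norm(x - y) <= min(c, C * dist_to_boundary(y)))

def dist_to_unit_circle(y):
    """Distance from y to the unit circle (valid inside and outside the disk)."""
    return abs(1.0 - np.linalg.norm(y))

def nontangential_max(u, x, samples, dist_to_boundary, c, C):
    """Approximate u^*(x) in (1.12) by a maximum over finitely many samples."""
    vals = [abs(u(y)) for y in samples
            if in_approach_set(x, y, dist_to_boundary, c, C)]
    return max(vals) if vals else 0.0

x = np.array([1.0, 0.0])                                 # boundary point
radial = np.array([0.9, 0.0])                            # approaches x along the normal
grazing = 0.99 * np.array([np.cos(0.3), np.sin(0.3)])    # near Gamma but far from x

radial_ok = in_approach_set(x, radial, dist_to_unit_circle, c=0.5, C=2.0)
grazing_ok = in_approach_set(x, grazing, dist_to_unit_circle, c=0.5, C=2.0)
u_star = nontangential_max(lambda y: y[0], x, [radial, grazing],
                           dist_to_unit_circle, c=0.5, C=2.0)
```

The point `radial` lies in the approach set because its distance to \(\textbf{x}\) is comparable to its distance to \(\Gamma \), whereas `grazing` is excluded: it is much closer to \(\Gamma \) than to \(\textbf{x}\), which is exactly the tangential approach that (1.11) rules out.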

Definition 1.5

(Laplace IDP formulated via non-tangential limits) Given \(g_D\in L^2(\Gamma )\), we say that \(u\in C^2({\Omega ^-})\) with \(u^* \in L^2(\Gamma )\) satisfies the IDP if \(\Delta u =0\) in \({\Omega ^-}\) and \(\widetilde{\gamma }^- u = g_D\) on \(\Gamma \).

Definition 1.6

(Laplace EDP formulated via non-tangential limits) With \({\Omega ^-}\) and \({\Omega ^+}\) as above, assume further that \({\Omega ^+}\) is connected. Given \(g_D\in L^2(\Gamma )\), we say that \(u\in C^2({\Omega ^+})\) with \(u^* \in L^2(\Gamma )\) satisfies the EDP if \(\Delta u =0\) in \({\Omega ^+}\), \(\widetilde{\gamma }^+ u = g_D\) on \(\Gamma \), and \(u(\textbf{x})= O(1)\) when \(d=2\) and \(u(\textbf{x}) = o(|\textbf{x}|^{3-d})\) when \(d\ge 3\) as \(|\textbf{x}| \rightarrow \infty \) (uniformly in all directions \(\textbf{x}/|\textbf{x}|\)).

Existence and uniqueness of the solutions of these formulations of the IDP and EDP go back to the work of Dahlberg [27], and are given explicitly in, e.g., [101, Corollary 3.2 and Lemma 3.7], [13, §3]. The following equivalence result is proved in Appendix C.

Theorem 1.7

(Equivalence of the different formulations of the IDP and EDP) If \(g_D\in H^{1/2}(\Gamma )\), then the solution of the IDP in the sense of Definition 1.3 is the solution of the IDP in the sense of Definition 1.5, and vice versa.

Similarly, if \(g_D\in H^{1/2}(\Gamma )\), then the solution of the EDP in the sense of Definition 1.4 is the solution of the EDP in the sense of Definition 1.6, and vice versa.

1.7 Recap of results about layer potentials and their non-tangential limits

Recall that the surface gradient operator on \(\Gamma \) is the unique operator such that, when \(v \in C^1(\overline{{\Omega ^-}})\), \(\nabla v= \textbf{n}(\textbf{n}\cdot \nabla v) + \nabla _{\Gamma }(\gamma ^- v)\) on \(\Gamma \) (and similarly for \(v \in C^1(\overline{{\Omega ^+}})\)); for an explicit expression for \(\nabla _{\Gamma }\) in terms of a parametrisation of \(\Gamma \), see, e.g., [17, Equation A.14].

The following results all rely on the harmonic-analysis results in [22] and [101] (see also the accounts in [68, Chapter 15], [97, Chapter 4], [49, Chapter 2, Section 2]). Define

$$\begin{aligned} \nabla _{\Gamma }S \phi (\textbf{x}):= \int _\Gamma \nabla _{\Gamma ,\textbf{x}}\Phi (\textbf{x},\textbf{y})\phi (\textbf{y})\textrm{d}s(\textbf{y}) \quad \text { for a.e. } \textbf{x}\in \Gamma , \end{aligned}$$

where the integral is understood in the principal-value sense. By [101, Theorem 1.6], \(\nabla _{\Gamma }S:{L^2(\Gamma )}\rightarrow {L^2(\Gamma )}\), with this mapping continuous, and \((\nabla _{\Gamma }S)\phi = \nabla _{\Gamma }(S\phi )\). The following potential was introduced in [13, §2]; given \(\textbf{Z}\in (L^\infty (\Gamma ))^d\) that is real-valued (which we assume throughout), let

$$\begin{aligned} \mathcal {K}_{\textbf{Z}}\phi (\textbf{x}):= \int _\Gamma \textbf{Z}(\textbf{y}) \cdot \nabla _\textbf{y}\Phi (\textbf{x},\textbf{y}) \phi (\textbf{y})\, \textrm{d}s(\textbf{y}) \quad \quad \text { for }\textbf{x}\in \mathbb {R}^d\setminus \Gamma , \end{aligned}$$
(1.14)

and let

$$\begin{aligned} K_{\textbf{Z}}\phi (\textbf{x}):= \int _\Gamma \textbf{Z}(\textbf{y})\cdot \nabla _\textbf{y}\Phi (\textbf{x},\textbf{y})\phi (\textbf{y}) \, \textrm{d}s(\textbf{y}) \quad \quad \text { for a.e. }\textbf{x}\in \Gamma , \end{aligned}$$
(1.15)

where the integral in (1.15) is understood in the principal-value sense. The results of [22] and [101] imply that \(K_{\textbf{Z}}:{L^2(\Gamma )}\rightarrow {L^2(\Gamma )}\) and, for \(\phi \in {L^2(\Gamma )}\), \(\mathcal {K}_{\textbf{Z}}\phi \in C^2(\Omega ^\pm )\), \(\mathcal {K}_{\textbf{Z}}\phi \) satisfies Laplace’s equation, and \((\mathcal {K}_{\textbf{Z}}\phi )^*\in {L^2(\Gamma )}\) with

$$\begin{aligned} \widetilde{\gamma }^{\pm } \mathcal {K}_{\textbf{Z}}\phi (\textbf{x}) = \pm \frac{1}{2}\big ( \textbf{Z}(\textbf{x})\cdot \textbf{n}(\textbf{x})\big ) \phi (\textbf{x}) + K_{\textbf{Z}}\phi (\textbf{x}) \quad \quad \text { for a.e. } \textbf{x}\in \Gamma . \end{aligned}$$
(1.16)

Observe that (i) when \(\textbf{Z}= \textbf{n}\), \(\mathcal {K}_{\textbf{Z}}= \mathcal {D}\), \(K_{\textbf{Z}}= D\), and (1.16) is the usual jump relation for the double-layer potential, and (ii) we can rewrite \(K_{\textbf{Z}}\) as

$$\begin{aligned} K_{\textbf{Z}}\phi (\textbf{x}) = \int _\Gamma \left( \textbf{Z}(\textbf{y})\cdot \textbf{n}(\textbf{y}) \frac{\partial \Phi (\textbf{x},\textbf{y})}{\partial n(\textbf{y})} \phi (\textbf{y}) + \textbf{Z}(\textbf{y}) \cdot \nabla _{\Gamma }\Phi (\textbf{x},\textbf{y}) \phi (\textbf{y})\right) \textrm{d}s(\textbf{y}).\nonumber \\ \end{aligned}$$
(1.17)

In a similar way to how the \(L^2\) adjoint of D is \(D'\) (see, e.g., [68, Chapter 15, text around Equation 4.10]), the \(L^2\) adjoint of \(K_{\textbf{Z}}\) is

$$\begin{aligned} K_{\textbf{Z}}' \phi (\textbf{x}):= \big (\textbf{Z}(\textbf{x}) \cdot \textbf{n}(\textbf{x}) \big )D' \phi (\textbf{x}) + \textbf{Z}(\textbf{x}) \cdot \nabla _{\Gamma }S \phi (\textbf{x}). \end{aligned}$$
(1.18)

The significance of the operator \(K_{\textbf{Z}}'\) is that it appears in the inner product of \(\textbf{Z}\) with the non-tangential limit of \(\nabla \mathcal {S}\), where \({{\mathcal {S}}}\) is the single-layer potential defined for \(\phi \in {L^2(\Gamma )}\) by

$$\begin{aligned} {{\mathcal {S}}}\phi (\textbf{x}):= \int _\Gamma \Phi (\textbf{x},\textbf{y})\phi (\textbf{y})\, \textrm{d}s(\textbf{y}) \quad \quad \text { for }\textbf{x}\in \mathbb {R}^d\setminus \Gamma . \end{aligned}$$
(1.19)

Indeed, by [101, Theorems 1.6 and 1.11] (see also [68, Theorem 5], [17, Equation 2.30]), for almost every \(\textbf{x}\in \Gamma \),

$$\begin{aligned} \widetilde{\gamma }^\pm \nabla {{\mathcal {S}}}\phi (\textbf{x}) = \textbf{n}(\textbf{x}) \left( \mp \frac{1}{2}I + D'\right) \phi (\textbf{x}) + \nabla _{\Gamma }(S\phi )(\textbf{x}), \end{aligned}$$
(1.20)

so that

$$\begin{aligned} \textbf{Z}(\textbf{x})\cdot \widetilde{\gamma }^\pm \nabla {{\mathcal {S}}}\phi (\textbf{x}) = \left( \mp \frac{1}{2}(\textbf{Z}(\textbf{x})\cdot \textbf{n}(\textbf{x}))I + K_{\textbf{Z}}'\right) \phi (\textbf{x}). \end{aligned}$$
(1.21)
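For completeness, the step from (1.20) to (1.21) can be written out as follows. This is only a sketch of the algebra already implied by the text: take the inner product of (1.20) with \(\textbf{Z}(\textbf{x})\), use \((\nabla _{\Gamma }S)\phi = \nabla _{\Gamma }(S\phi )\), and recognise the definition (1.18).

```latex
\begin{aligned}
\textbf{Z}\cdot \widetilde{\gamma}^\pm \nabla \mathcal{S}\phi
&= (\textbf{Z}\cdot \textbf{n})\Big(\mp \tfrac{1}{2}I + D'\Big)\phi
   + \textbf{Z}\cdot \nabla_{\Gamma}(S\phi) \\
&= \mp \tfrac{1}{2}(\textbf{Z}\cdot \textbf{n})\,\phi
   + \underbrace{(\textbf{Z}\cdot \textbf{n})\,D'\phi
   + \textbf{Z}\cdot \nabla_{\Gamma}S\phi}_{=\,K_{\textbf{Z}}'\phi \ \text{by (1.18)}} .
\end{aligned}
```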

2 Statement of the main results

2.1 New boundary integral equations for the Laplace interior and exterior Dirichlet problems on general Lipschitz domains for \(d\ge 3\)

We focus on the case \(d\ge 3\), since the question of whether or not there exist BIE formulations of the Laplace IDP and EDP that are coercive, or coercive up to a compact perturbation, on Lipschitz domains is more pressing when \(d=3\) than \(d=2\) (because of the existing convergence theory for \(\pm \frac{1}{2}I + D\) and \(\pm \frac{1}{2}I+ D'\) on curvilinear polygons [16, 89, 90] but negative results for these operators for certain 3-d star-shaped polyhedra [19] recapped in Sect. 1.4). Results for \(d=2\) are given in Sect. 2.3.

2.1.1 The interior Dirichlet problem

Given \(\textbf{Z}\in (L^\infty (\Gamma ))^d\) and \(\alpha \in \mathbb {R}\), define the integral operators \(A'_{I,\textbf{Z},\alpha }\), \(A_{I,\textbf{Z},\alpha }\), and \(B_{I,\textbf{Z},\alpha }\) by

$$\begin{aligned}&A'_{I,\textbf{Z},\alpha }:= \frac{1}{2}(\textbf{Z}\cdot \textbf{n})I - K_{\textbf{Z}}' + \alpha S, \qquad A_{I,\textbf{Z},\alpha }:= \frac{1}{2}(\textbf{Z}\cdot \textbf{n})I - K_{\textbf{Z}}+ \alpha S, \end{aligned}$$
(2.1)
$$\begin{aligned}&B_{I,\textbf{Z},\alpha }:=-(\textbf{Z}\cdot \textbf{n}) H - \textbf{Z}\cdot \nabla _{\Gamma }\left( \frac{1}{2}I + D\right) + \alpha \left( \frac{1}{2}I + D\right) , \end{aligned}$$
(2.2)

with the subscript I standing for “interior”, and the \('\) superscript indicating that \(A'_{I,\textbf{Z},\alpha }\) is the \(L^2\) adjoint of \(A_{I,\textbf{Z},\alpha }\).

Theorem 2.1

(New integral equations for Laplace IDP with \(d\ge 3\))

  1. (i)

    Direct formulation. Let u be the solution of the Laplace IDP of Definition 1.3 with \(d\ge 3\) and additionally \(g_D\in H^1(\Gamma )\). Then \(\partial _n^- u\) satisfies

    $$\begin{aligned} A'_{I,\textbf{Z},\alpha }\partial _n^- u=B_{I,\textbf{Z},\alpha }g_D. \end{aligned}$$
    (2.3)
  2. (ii)

    Indirect formulation. Given \(g_D\in {L^2(\Gamma )}\), if \(\phi \in {L^2(\Gamma )}\) satisfies

    $$\begin{aligned} A_{I,\textbf{Z},\alpha }\phi =- g_D, \end{aligned}$$
    (2.4)

    then

    $$\begin{aligned} u:= (\mathcal {K}_{\textbf{Z}}- \alpha {{\mathcal {S}}})\phi \end{aligned}$$
    (2.5)

    is the solution of the Laplace IDP of Definition 1.5.

  3. (iii)

    Continuity. \(A'_{I,\textbf{Z},\alpha }: {L^2(\Gamma )}\rightarrow {L^2(\Gamma )}\), \(A_{I,\textbf{Z},\alpha }: {L^2(\Gamma )}\rightarrow {L^2(\Gamma )}\), and \(B_{I,\textbf{Z},\alpha }:H^1(\Gamma )\rightarrow {L^2(\Gamma )}\), and these mappings are continuous.

  4. (iv)

    Coercivity up to compact perturbation. If \(\textbf{Z}\in (C(\Gamma ))^d\) and there exists \(c>0\) such that

    $$\begin{aligned} \textbf{Z}(\textbf{x}) \cdot \textbf{n}(\textbf{x}) \ge c \quad \text { for almost every } \textbf{x}\in \Gamma , \end{aligned}$$
    (2.6)

    then both \(A'_{I,\textbf{Z},\alpha }\) and \(A_{I,\textbf{Z},\alpha }\) are the sum of a coercive operator and a compact operator on \({L^2(\Gamma )}\).

  5. (v)

    Invertibility for all \(\alpha >0\). If \(\alpha >0\), \(\textbf{Z}\in (C^{0,\beta }(\Gamma ))^d\) for some \(0<\beta <1\), and there exists \(c>0\) such that (2.6) holds, then both \(A'_{I,\textbf{Z},\alpha }:{L^2(\Gamma )}\rightarrow {L^2(\Gamma )}\) and \(A_{I,\textbf{Z},\alpha }:{L^2(\Gamma )}\rightarrow {L^2(\Gamma )}\) are invertible.

  6. (vi)

    Coercivity for sufficiently large \(\alpha \). If \(\textbf{Z}\) satisfies (2.6), \(\textbf{Z}\in (C^{0,1}(\Gamma ))^d\) with Lipschitz constant \(L_{\textbf{Z}}\), and

    $$\begin{aligned} 2\alpha \ge 3d L_{\textbf{Z}}, \end{aligned}$$
    (2.7)

    then both \(A'_{I,\textbf{Z},\alpha }\) and \(A_{I,\textbf{Z},\alpha }\) are coercive on \({L^2(\Gamma )}\) with coercivity constant c/2 (with c defined by (2.6)); indeed,

    $$\begin{aligned} \big (A'_{I,\textbf{Z},\alpha }\psi ,\psi \big )_{{L^2(\Gamma )}} \ge \frac{c}{2} \left\| \psi \right\| ^2_{{L^2(\Gamma )}} \quad \text { for all real-valued } \psi \in {L^2(\Gamma )}, \end{aligned}$$
    (2.8)

    and similarly for \(A_{I,\textbf{Z},\alpha }\).

Recall that if A is real and \((A\psi ,\psi )\ge C_{\textrm{coer}}\Vert \psi \Vert ^2_{{L^2(\Gamma )}}\) for all real-valued \(\psi \in {L^2(\Gamma )}\), then \(\Re (A\phi ,\phi )\ge C_{\textrm{coer}}\Vert \phi \Vert ^2_{{L^2(\Gamma )}}\) for all complex-valued \(\phi \in {L^2(\Gamma )}\); thus (2.8) implies that \(A'_{I,\textbf{Z},\alpha }\) and \(A_{I,\textbf{Z},\alpha }\) are coercive on complex-valued \({L^2(\Gamma )}\).
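The passage from real-valued to complex-valued functions can be checked in one line. The sketch below assumes the usual sesquilinear inner product \((u,v)=\int _\Gamma u\,\overline{v}\,\textrm{d}s\) and writes \(\phi =\phi _1+\textrm{i}\phi _2\) with \(\phi _1,\phi _2\) real-valued; "A is real" is taken to mean that A maps real-valued functions to real-valued functions.

```latex
\begin{aligned}
(A\phi,\phi)
&= (A\phi_1,\phi_1) + (A\phi_2,\phi_2)
   + \mathrm{i}\big[(A\phi_2,\phi_1) - (A\phi_1,\phi_2)\big], \\
\Re (A\phi,\phi)
&= (A\phi_1,\phi_1) + (A\phi_2,\phi_2)
 \;\ge\; C_{\mathrm{coer}}\big(\|\phi_1\|^2_{L^2(\Gamma)} + \|\phi_2\|^2_{L^2(\Gamma)}\big)
 = C_{\mathrm{coer}}\|\phi\|^2_{L^2(\Gamma)} .
\end{aligned}
```

The second line uses that all four inner products on the first line are real, since A is real and \(\phi _1,\phi _2\) are real-valued.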

For any bounded Lipschitz open set \({\Omega ^-}\) there exists \(\textbf{Z}\in (C^{0,1}(\Gamma ))^d\) such that (2.6) holds; see, e.g., [40, Lemma 1.5.1.9], [78, Proof of Lemma 1.3], [18, Appendix D]. The combination of this result and Parts (iii) and (vi) of Theorem 2.1 implies that, for any bounded Lipschitz open set, there exists a BIE formulation of the Laplace IDP that is continuous and coercive in \({L^2(\Gamma )}\).

The vector field \(\textbf{Z}\) can be thought of as a “regularised normal vector”; the choice \(\textbf{Z}= \textbf{n}\) satisfies (2.6) but does not have the regularity required for Parts (iv), (v), and (vi) of Theorem 2.1 unless \({\Omega ^-}\) is, respectively, \(C^1\), \(C^{1,\beta }\), or \(C^{1,1}\). Indeed, from Parts (iv)-(vi) of the theorem we see that the stronger the property one wishes to obtain for \(A'_{I,\textbf{Z},\alpha }\) and \(A_{I,\textbf{Z},\alpha }\), the more regularity of \(\textbf{Z}\) is required. For example, coercivity up to a compact perturbation is proved for continuous \({\textbf{Z}}\) satisfying (2.6), but coercivity is proved only for Lipschitz \({\textbf{Z}}\) satisfying (2.6).

2.1.2 The exterior Dirichlet problem

Given \(\textbf{Z}\in (L^\infty (\Gamma ))^d\) and \(\alpha \in \mathbb {R}\), define the integral operators \(A'_{E,\textbf{Z},\alpha }\), \(A_{E,\textbf{Z},\alpha }\), and \(B_{E,\textbf{Z},\alpha }\) by

$$\begin{aligned}&A'_{E,\textbf{Z},\alpha }:= \frac{1}{2}(\textbf{Z}\cdot \textbf{n})I + K_{\textbf{Z}}' + \alpha S, \qquad A_{E,\textbf{Z},\alpha }:= \frac{1}{2}(\textbf{Z}\cdot \textbf{n})I + K_{\textbf{Z}}+ \alpha S, \end{aligned}$$
(2.9)
$$\begin{aligned}&B_{E,\textbf{Z},\alpha }:=(\textbf{Z}\cdot \textbf{n}) H + \textbf{Z}\cdot \nabla _{\Gamma }\left( -\frac{1}{2}I + D\right) + \alpha \left( -\frac{1}{2}I + D\right) , \end{aligned}$$
(2.10)

with the subscript E standing for “exterior”.

Theorem 2.2

(New integral equations for Laplace EDP with \(d\ge 3\))

  1. (i)

    Direct formulation. Let u be the solution of the Laplace EDP of Definition 1.4 with \(d\ge 3\) and additionally \(g_D\in H^1(\Gamma )\). Then \(\partial _n^+ u\) satisfies

    $$\begin{aligned} A'_{E,\textbf{Z},\alpha }\partial _n^+ u=B_{E,\textbf{Z},\alpha }g_D. \end{aligned}$$
    (2.11)
  2. (ii)

    Indirect formulation. Given \(g_D\in {L^2(\Gamma )}\), if \(\phi \in {L^2(\Gamma )}\) satisfies

    $$\begin{aligned} A_{E,\textbf{Z},\alpha }\phi = g_D, \end{aligned}$$
    (2.12)

    then

    $$\begin{aligned} u:= (\mathcal {K}_{\textbf{Z}}+ \alpha {{\mathcal {S}}})\phi \end{aligned}$$
    (2.13)

    is the solution of the Laplace EDP of Definition 1.6.

  3. (iii)

    Continuity. \(A'_{E,\textbf{Z},\alpha }: {L^2(\Gamma )}\rightarrow {L^2(\Gamma )}\), \(A_{E,\textbf{Z},\alpha }: {L^2(\Gamma )}\rightarrow {L^2(\Gamma )}\), and \(B_{E,\textbf{Z},\alpha }:H^1(\Gamma )\rightarrow {L^2(\Gamma )}\), and these mappings are continuous.

  4. (iv)

    Coercivity up to compact perturbation. If \(\textbf{Z}\in (C(\Gamma ))^d\) and there exists \(c>0\) such that (2.6) holds, then both \(A'_{E,\textbf{Z},\alpha }\) and \(A_{E,\textbf{Z},\alpha }\) are the sum of a coercive operator and a compact operator on \({L^2(\Gamma )}\).

  5. (v)

    Invertibility for all \(\alpha >0\). If \(\alpha >0\), \(\textbf{Z}\in (C^{0,\beta }(\Gamma ))^d\) for some \(0<\beta <1\), and there exists \(c>0\) such that (2.6) holds, then both \(A'_{E,\textbf{Z},\alpha }:{L^2(\Gamma )}\rightarrow {L^2(\Gamma )}\) and \(A_{E,\textbf{Z},\alpha }:{L^2(\Gamma )}\rightarrow {L^2(\Gamma )}\) are invertible.

  6. (vi)

    Coercivity for sufficiently large \(\alpha \). If \(\textbf{Z}\) satisfies (2.6), \(\textbf{Z}\in (C^{0,1}(\Gamma ))^d\) with Lipschitz constant \(L_{\textbf{Z}}\), and (2.7) holds, then both \(A'_{E,\textbf{Z},\alpha }\) and \(A_{E,\textbf{Z},\alpha }\) are coercive on \({L^2(\Gamma )}\) with coercivity constant c/2 (with c defined by (2.6)), in that (2.8) holds with \(A'_{I,\textbf{Z},\alpha }\) replaced by either \(A'_{E,\textbf{Z},\alpha }\) or \(A_{E,\textbf{Z},\alpha }\).

As in the case of the IDP, the existence, for any bounded Lipschitz open set \({\Omega ^-}\), of a vector field \(\textbf{Z}\in (C^{0,1}(\Gamma ))^d\) such that (2.6) holds, combined with Parts (iii) and (vi) of Theorem 2.2, implies that, for any bounded Lipschitz open set \({\Omega ^-}\) such that \({\Omega ^+}\) is connected, there exists a BIE formulation of the Laplace EDP that is continuous and coercive in \({L^2(\Gamma )}\).
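The boundary data produced by the indirect formulations (2.5) and (2.13) can be checked directly from the jump relation (1.16). The sketch below additionally assumes the standard fact that the single-layer potential is continuous across \(\Gamma \), i.e. \(\widetilde{\gamma }^\pm {{\mathcal {S}}}\phi = S\phi \); it verifies only the traces, not the remaining requirements of Definitions 1.5 and 1.6.

```latex
\begin{aligned}
\text{IDP, } u=(\mathcal{K}_{\textbf{Z}}-\alpha\mathcal{S})\phi:\quad
\widetilde{\gamma}^{-}u
&= -\tfrac{1}{2}(\textbf{Z}\cdot\textbf{n})\phi + K_{\textbf{Z}}\phi - \alpha S\phi
 = -A_{I,\textbf{Z},\alpha}\phi = g_D \quad\text{by (2.4)}, \\
\text{EDP, } u=(\mathcal{K}_{\textbf{Z}}+\alpha\mathcal{S})\phi:\quad
\widetilde{\gamma}^{+}u
&= \tfrac{1}{2}(\textbf{Z}\cdot\textbf{n})\phi + K_{\textbf{Z}}\phi + \alpha S\phi
 = A_{E,\textbf{Z},\alpha}\phi = g_D \quad\text{by (2.12)}.
\end{aligned}
```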

2.1.3 The new formulations of the IDP and EDP for \(d\ge 3\) on domains that are star-shaped with respect to a ball

When \({\Omega ^-}\) is star-shaped with respect to a ball, the coercivity results in Theorems 2.1 and 2.2 take a particularly simple form.

Definition 2.3

  1. (i)

    \(D\) is star-shaped with respect to the point \(\textbf{x}_0\) if, whenever \(\textbf{x}\in D\), the segment \([\textbf{x}_0,\textbf{x}]\subset D\).

  2. (ii)

    \(D\) is star-shaped with respect to the ball \(B_{\kappa }(\textbf{x}_0)\) if it is star-shaped with respect to every point in \(B_{\kappa }(\textbf{x}_0)\).

Lemma 2.4

([73, Lemma 5.4.1]) If \(D\) is Lipschitz with outward unit normal vector \(\varvec{\nu }\), then \(D\) is star-shaped with respect to \(B_{\kappa }(\textbf{x}_0)\), for some \(\kappa >0\), if and only if \((\textbf{x}-\textbf{x}_0) \cdot \varvec{\nu }(\textbf{x}) \ge {\kappa }\) for all \(\textbf{x}\in \partial D\) for which \(\varvec{\nu }(\textbf{x})\) is defined.

From now on, if a domain \(D\) is star-shaped with respect to \(\textbf{x}_0\), we assume (without loss of generality) that \(\textbf{x}_0=\textbf{0}\).

Theorem 2.5

(Coercivity for star-shaped domains) Let \({\Omega ^-}\subset \mathbb {R}^d\), \(d\ge 3\), be a bounded Lipschitz domain that is star-shaped with respect to a ball of radius \(\kappa \), i.e.

$$\begin{aligned} \kappa := \mathop {{\textrm{ess}} \inf }_{\textbf{x}\in \Gamma }( \textbf{x}\cdot \textbf{n}(\textbf{x})). \end{aligned}$$
(2.14)

Then

$$\begin{aligned} A'_{I,\textbf{x},\alpha }\,\,\text { and }\,\,A_{I,\textbf{x},\alpha }, \,\,\text { with } \alpha \ge -(d-2)/2, \end{aligned}$$

and

$$\begin{aligned} \quad A'_{E,\textbf{x},\alpha }\,\, \text { and }\,\, A_{E,\textbf{x},\alpha },\,\,\text { with } \alpha \ge (d-2)/2, \end{aligned}$$

are all coercive on \({L^2(\Gamma )}\) with coercivity constant \(\kappa /2\), in that (2.8) holds with c replaced by \(\kappa \), and \(A'_{I,\textbf{Z},\alpha }\) replaced by any one of \(A'_{I,\textbf{x},\alpha }, A_{I,\textbf{x},\alpha }, A'_{E,\textbf{x},\alpha }\), or \(A_{E,\textbf{x},\alpha }\).
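For a polyhedron, the quantity \(\kappa \) in (2.14) is easy to evaluate, because \(\textbf{x}\cdot \textbf{n}\) is constant on each planar face. The following Python sketch illustrates this for the cube \([-1,1]^3\) with the origin as centre; the function names and the face description (one outward unit normal and one point per face) are assumptions made here for illustration, not notation from the paper.

```python
import numpy as np

def star_shaped_kappa(unit_normals, face_points):
    """Return min over faces of x.n for a polyhedron with planar faces.

    unit_normals[k] is the outward unit normal of face k and face_points[k]
    is any point on that face; x.n is constant on a planar face, so one
    point per face suffices. A positive value means, by Lemma 2.4, that the
    polyhedron is star-shaped with respect to a ball centred at the origin.
    """
    n = np.asarray(unit_normals, dtype=float)
    p = np.asarray(face_points, dtype=float)
    return float(np.min(np.einsum("ij,ij->i", n, p)))

def alpha_thresholds(d):
    """Lower bounds on alpha in Theorem 2.5: (interior, exterior)."""
    return -(d - 2) / 2, (d - 2) / 2

# Cube [-1, 1]^3: the six outward normals are +/- e_i, and the face centres
# are the same six vectors.
cube_normals = np.vstack([np.eye(3), -np.eye(3)])
cube_points = cube_normals.copy()

kappa = star_shaped_kappa(cube_normals, cube_points)
alpha_interior, alpha_exterior = alpha_thresholds(3)
print(kappa, alpha_interior, alpha_exterior)
```

For this cube, \(\textbf{x}\cdot \textbf{n}=1\) on every face, so the coercivity constant \(\kappa /2\) of Theorem 2.5 is 1/2.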

2.2 Convergence and conditioning of the associated Galerkin methods

We now show how Theorems 2.1 and 2.2 imply that (i) the associated Galerkin methods converge (see Sect. 2.2.1), and (ii) the associated Galerkin matrices are provably well-conditioned as the discretisation is refined, without the need for operator preconditioning (see Sect. 2.2.2). We focus on the case \(d\ge 3\) and the new BIE formulations for the IDP and EDP (appearing in Theorems 2.1 and 2.2), but analogous results hold for the BIEs for star-shaped domains in Sect. 2.1.3 and also for the new BIEs for \(d=2\) in Sect. 2.3 below.

2.2.1 Convergence of the Galerkin method for the new formulations

Corollary 2.6

(Convergence of the Galerkin method) Let \(({{\mathcal {H}}}_N)_{N=1}^\infty \) denote any sequence of finite-dimensional subspaces of \({{\mathcal {H}}}:= L^2(\Gamma )\) that is asymptotically dense in \(L^2(\Gamma )\) in the sense defined in Sect. 1.2.

  1. (a)

    If \(\textbf{Z}\in (C^{0,\beta }(\Gamma ))^d\) for some \(0<\beta <1\), and there exists \(c>0\) such that (2.6) holds then, for all \(\alpha >0\), the Galerkin method (1.4) applied to any one of the BIEs (2.3), (2.4), (2.11) or (2.12) converges (in the sense defined in Sect. 1).

  2. (b)

    If, in addition, \(\textbf{Z}\) is Lipschitz and \(\alpha \) satisfies (2.7), then the Galerkin solution exists for every finite-dimensional subspace \({{\mathcal {H}}}_N\subset {L^2(\Gamma )}\) and satisfies the quasioptimal error estimate (1.9), with constant \(2\Vert A^\dag _{\textbf{Z},\alpha }\Vert _{{{L^2(\Gamma )}\rightarrow {L^2(\Gamma )}}}/c\), where \(A^\dag _{\textbf{Z},\alpha }:=A_{I,\textbf{Z},\alpha }\) for the BIEs (2.3) and (2.4), and \(A^\dag _{\textbf{Z},\alpha }:=A_{E,\textbf{Z},\alpha }\) for the BIEs (2.11) and (2.12).

Since the proof is so short, we include it here.

Proof of Corollary 2.6

(a) This follows from Parts (iv) and (v) of Theorem 2.1/Theorem 2.2 and Part (b) of Theorem 1.1. (b) This follows from Part (vi) of Theorem 2.1/Theorem 2.2 and Part (c) of Theorem 1.1. \(\square \)

We highlight that Corollary 2.6 gives the first proof of convergence of the Galerkin method, for a BIE posed in \({L^2(\Gamma )}\) and used to solve a boundary-value problem for Laplace’s equation, in which the only assumption on \(\Gamma \) is that it is Lipschitz; the same is true if \(\Gamma \) is assumed to be Lipschitz polyhedral in 3-d.

Remark 2.7

(Bounding the best approximation and Galerkin errors) For 3-d Lipschitz polyhedra the smoothness of the solution, in particular its singularities at corners and edges, is well understood (see, e.g., [104]) for the direct formulations (2.3) and (2.11), where the solution of the integral equation is \(\phi =\partial _n^\pm u\). Moreover, it is well understood how to design effective h- and hp-boundary element approximation spaces \({{\mathcal {H}}}_N\) based on graded, anisotropic meshes so as to obtain optimal best approximation error estimates (see, e.g., [30, 31, 61, 62, 104]), indeed exponential convergence of \(\min _{\psi \in {{\mathcal {H}}}_N}\big \Vert \phi -\psi \big \Vert _{{L^2(\Gamma )}}\) as a function of \(M_N:=\dim ({{\mathcal {H}}}_N)\) if the Dirichlet data \(g_D\) is the restriction to \(\Gamma \) of an analytic function (see [62, Theorem 3.1]). Further, by Part (a) of Corollary 2.6 and the quasioptimality (1.7), the same rates of convergence follow for the Galerkin error \(\Vert \phi -\phi _N\Vert _{L^2(\Gamma )}\) as long as \(\alpha >0\).

2.2.2 Solution of the Galerkin linear systems of the new formulations

Let \({{\mathcal {H}}}_N = \textrm{span} \{\psi ^{N}_1,\ldots , \psi ^{N}_{M_N}\}\), with \(M_N=\dim ({{\mathcal {H}}}_N)\) and \(\{\psi ^{N}_1,\ldots , \psi ^{N}_{M_N}\}\) a basis for \({{\mathcal {H}}}_N\). The Galerkin method (1.4) applied to (2.4) is then equivalent to the linear system

$$\begin{aligned} {\textsf{A}}\textbf{x}= \textbf{b}\end{aligned}$$
(2.15)

where

$$\begin{aligned} ({\textsf{A}})_{ij}:= \big (A^\dag _{\textbf{Z},\alpha }\psi ^{N}_j,\psi ^{N}_i\big )_{L^2(\Gamma )} \quad \text { and }\quad b_i:= -(g_D,\psi ^{N}_i)_{L^2(\Gamma )}, \quad i,j=1,\ldots , M_N,\nonumber \\ \end{aligned}$$
(2.16)

with \(A^\dag _{\textbf{Z},\alpha }:= A_{I,\textbf{Z},\alpha }\), and with the Galerkin solution \(\phi _N\) given by \(\phi _N= \sum _{j=1}^{M_N} x^{N}_j \psi ^{N}_j\), where \(\textbf{x}= (x_1^N,\ldots ,x_{M_N}^N)^T\). The Galerkin method applied to (2.3), (2.12), or (2.11), respectively, is also equivalent to (2.15), with \(A^\dag _{\textbf{Z},\alpha }:=A'_{I,\textbf{Z},\alpha }\), \(A_{E,\textbf{Z},\alpha }\), or \(A'_{E,\textbf{Z},\alpha }\) in (2.16) and with correspondingly different definitions of the right-hand side components \(b_i\).

In each case, whether \(A^\dag _{\textbf{Z},\alpha }=A_{I,\textbf{Z},\alpha }\), \(A'_{I,\textbf{Z},\alpha }\), \(A_{E,\textbf{Z},\alpha }\), or \(A'_{E,\textbf{Z},\alpha }\), the matrix \({\textsf{A}}\) defined in (2.16) is non-symmetric, and a popular method for solving such non-symmetric linear systems is the generalised minimum residual method (GMRES) [86], which we now briefly recall. Consider the abstract linear system \({\textsf{C}}\textbf{x}= \textbf{d}\) in \(\mathbb {C}^{M_N}\), where \({\textsf{C}}\in \mathbb {C}^{M_N\times M_N}\) is invertible. Let \(\textbf{x}_{0}\) be an initial guess for \(\textbf{x}\), and define the corresponding initial residual \(\textbf{r}_0:={\textsf{C}}\textbf{x}_0-\textbf{d}\) and the corresponding standard Krylov spaces by

$$\begin{aligned} {{\mathcal {K}}}^m({\textsf{C}}, \textbf{r}_0):= \textrm{span}\big \{{\textsf{C}}^j \textbf{r}_0: j = 0, \ldots , m-1\big \}. \end{aligned}$$

For \(m \ge 1\), define the mth GMRES iterate \(\textbf{x}_m\) to be the unique element of \(\textbf{x}_0+{{\mathcal {K}}}^m({\textsf{C}}, \textbf{r}_0)\) (which is \({{\mathcal {K}}}^m({\textsf{C}}, \textbf{r}_0)\) itself when \(\textbf{x}_0=\textbf{0}\)) such that its residual \(\textbf{r}_m:= {\textsf{C}}\textbf{x}_m-\textbf{d}\) satisfies the minimal residual property

$$\begin{aligned} \Vert \textbf{r}_m \Vert _2 = \min _{\textbf{y}\in \textbf{x}_0+{{\mathcal {K}}}^m({\textsf{C}}, \textbf{r}_0)} \Vert {{\textsf{C}}} {\textbf{y}} -{\textbf{d}}\Vert _2. \end{aligned}$$
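The minimal residual property can be sketched directly in NumPy. The following is a deliberately naive illustration, not an efficient GMRES implementation: it builds an orthonormal basis of the Krylov space by an Arnoldi-type Gram-Schmidt process and solves each least-squares problem from scratch. The function name `gmres_iterates` and the breakdown tolerance are assumptions made here. The residual sign convention \(\textbf{r}={\textsf{C}}\textbf{x}-\textbf{d}\) follows the text, and the iterates are sought in \(\textbf{x}_0\) plus the Krylov space, which is the Krylov space itself for zero initial guess.

```python
import numpy as np

def gmres_iterates(C, d, x0, m_max, breakdown_tol=1e-12):
    """Return ([x_1, ..., x_m], [||r_1||, ..., ||r_m||]) with r = C x - d."""
    C = np.asarray(C, dtype=complex)
    d = np.asarray(d, dtype=complex)
    x0 = np.asarray(x0, dtype=complex)

    r0 = C @ x0 - d
    r0_norm = np.linalg.norm(r0)
    iterates, residual_norms = [], []
    if r0_norm == 0.0:
        return iterates, residual_norms  # x0 already solves the system

    basis = [r0 / r0_norm]  # orthonormal basis of the Krylov space K^1
    for _ in range(m_max):
        Qm = np.column_stack(basis)  # columns span K^m(C, r0)
        # x_m = x0 + Qm y minimises ||C x_m - d||_2 = ||r0 + C Qm y||_2.
        y, *_ = np.linalg.lstsq(C @ Qm, -r0, rcond=None)
        x_m = x0 + Qm @ y
        iterates.append(x_m)
        residual_norms.append(float(np.linalg.norm(C @ x_m - d)))

        # Extend the basis: orthogonalise C q_m against q_1, ..., q_m.
        w = C @ basis[-1]
        for q in basis:
            w = w - np.vdot(q, w) * q
        w_norm = np.linalg.norm(w)
        if w_norm < breakdown_tol:
            break  # Krylov space is invariant; no further improvement
        basis.append(w / w_norm)
    return iterates, residual_norms
```

Because the Krylov spaces are nested, the residual norms returned by this sketch are non-increasing in m.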

The main result of this subsection (Theorem 2.11 below) is a result about the convergence of GMRES applied to (2.15) preconditioned by diagonal matrices. This result is proved under the following assumption in which (and subsequently) for every \(v_N\in {{\mathcal {H}}}_N\) we denote by \(\textbf{v}\in \mathbb {C}^{M_N}\) the unique vector \(\textbf{v}=(v^N_1,\ldots , v^N_{M_N})^T\) such that \(v_N = \sum _{j=1}^{M_N} v^N_j \psi ^{N}_j\).

Assumption 2.8

\(({{\mathcal {H}}}_N)_{N=1}^\infty \), and the associated bases \((\{\psi ^N_1,\ldots ,\psi ^N_{M_N}\})_{N=1}^\infty \), are such that there exists a sequence of diagonal matrices \(({\textsf{D}}_N)_{N=1}^\infty \) and \(C_1, C_2>0\), independent of N, such that

$$\begin{aligned} C_1 \big \Vert {\textsf{D}}_N^{1/2} \textbf{w}\big \Vert _2 \le \left\| w_N\right\| _{{L^2(\Gamma )}} \le C_2 \big \Vert {\textsf{D}}_N^{1/2} \textbf{w}\big \Vert _2 \quad \text { for all }w_N \in {{\mathcal {H}}}_N. \end{aligned}$$
(2.17)

Remark 2.9

(Relation of \(C_1\) and \(C_2\) in (2.17) to the mass matrix) Let \({\textsf{M}}_N\) be the mass matrix defined by

$$\begin{aligned} ({\textsf{M}}_N)_{ij}:= \big (\psi _j^{N},\psi ^{N}_i\big )_{L^2(\Gamma )}. \end{aligned}$$
(2.18)

Since \(({\textsf{M}}_N\textbf{w},\textbf{w})_2 = \Vert w_N\Vert ^2_{L^2(\Gamma )}\), and thus \(\Vert {\textsf{M}}_N^{1/2}\textbf{w}\Vert _2 = \Vert w_N\Vert _{L^2(\Gamma )}\), (2.17) implies that

$$\begin{aligned} C_1 \Vert \textbf{v}\Vert _2 \le \big \Vert {\textsf{M}}^{1/2}_N {\textsf{D}}^{-1/2}_N \textbf{v}\big \Vert _{2} \le C_2 \Vert \textbf{v}\Vert _2 \quad \text { for all }v_N \in {{\mathcal {H}}}_N; \end{aligned}$$

i.e., \({\textsf{D}}_N^{1/2}\) can be considered as a right-preconditioner for \({\textsf{M}}_N^{1/2}\), removing the N-dependence of the norms of \({\textsf{M}}_N^{1/2}\) and \({\textsf{M}}_N^{-1/2}\).

Remark 2.10

(When does Assumption 2.8 hold?) If \((\psi ^{N}_j)_{j=1}^{M_N}\) is an orthogonal basis of \({{\mathcal {H}}}_N\), then Assumption 2.8 is satisfied with \({\textsf{D}}_N= {\textsf{M}}_N\) and \(C_1=C_2=1\); therefore, if \((\psi ^{N}_j)_{j=1}^{M_N}\) is an orthonormal basis of \({{\mathcal {H}}}_N\), then Assumption 2.8 is satisfied with \({\textsf{D}}_N\) equal to the identity matrix, \({\textsf{I}}_{N}\).

Lemma 4.15 below shows that Assumption 2.8 is satisfied (and specifies the matrices \({\textsf{D}}_N\)) when \(({{\mathcal {H}}}_N)_{N=1}^\infty \) are piecewise-polynomial subspaces allowing discontinuities across elements, under very mild constraints on the sequence of meshes; in particular Lemma 4.15 covers nodal basis functions on highly anisotropic meshes, such as the meshes highlighted in Remark 2.7. We highlight that the assumption that discontinuities are allowed is made so that we can assume in the proof that each basis function is supported on only one element, but we expect the result to hold more generally. In particular, if \(d=3\) and the sequence of meshes is regular, shape-regular, and quasi-uniform (in the sense of [87, Definitions 4.1.4, 4.1.12, and 4.1.13], respectively) on a polyhedral or piecewise curved domain (in the sense of [87, Assumptions 4.3.17 and 4.3.18], respectively), then Assumption 2.8 holds for a general nodal basis with \({\textsf{D}}_N = h^{d-1}{\textsf{I}}_N\) by [87, Theorem 4.4.7].
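The best constants in (2.17) for a given pair \(({\textsf{M}}_N,{\textsf{D}}_N)\) are the square roots of the extreme eigenvalues of \({\textsf{D}}_N^{-1/2}{\textsf{M}}_N{\textsf{D}}_N^{-1/2}\). The Python sketch below computes them for discontinuous piecewise-linear nodal basis functions on a strongly graded 1-d mesh (a boundary curve when \(d=2\)). The choice \({\textsf{D}}_N=\textrm{diag}({\textsf{M}}_N)\), the mesh, and the function names are assumptions made here for illustration; they are not taken from Lemma 4.15.

```python
import numpy as np

def norm_equivalence_constants(M, D_diag):
    """Best C1, C2 in (2.17) for mass matrix M and diagonal entries D_diag."""
    s = 1.0 / np.sqrt(np.asarray(D_diag, dtype=float))
    scaled = (M * s[:, None]) * s[None, :]  # D^{-1/2} M D^{-1/2}
    eig = np.linalg.eigvalsh(scaled)        # ascending eigenvalues
    return float(np.sqrt(eig[0])), float(np.sqrt(eig[-1]))

def dg_p1_mass_matrix(lengths):
    """Mass matrix of discontinuous linear nodal functions on 1-d elements.

    Each element of length h contributes the block (h/6) [[2, 1], [1, 2]].
    """
    local = np.array([[2.0, 1.0], [1.0, 2.0]]) / 6.0
    n = len(lengths)
    M = np.zeros((2 * n, 2 * n))
    for k, h in enumerate(lengths):
        M[2 * k:2 * k + 2, 2 * k:2 * k + 2] = h * local
    return M

# Geometrically graded mesh: element lengths 1/2, 1/4, ..., 1/1024.
lengths = 0.5 ** np.arange(1, 11)
M = dg_p1_mass_matrix(lengths)

C1, C2 = norm_equivalence_constants(M, np.diag(M))
cond_M = np.linalg.cond(M)
print(C1, C2, cond_M)
```

In this example the scaled local blocks are all equal to [[1, 1/2], [1/2, 1]], so \(C_1^2=1/2\) and \(C_2^2=3/2\) regardless of the grading, whereas the condition number of the unscaled mass matrix grows with the ratio of largest to smallest element length.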

Let \(\textbf{y}_m\) be the mth iterate when the linear system

$$\begin{aligned} ({\textsf{D}}_N^{-1/2}{\textsf{A}}{\textsf{D}}_N^{-1/2})\textbf{y}= {\textsf{D}}_N^{-1/2}\textbf{b}\end{aligned}$$
(2.19)

is solved using GMRES with zero initial guess. (Since \({\textsf{D}}_N\) is diagonal, the cost of calculating the action of \({\textsf{D}}_N^{-1/2}{\textsf{A}}{\textsf{D}}_N^{-1/2}\) is dominated by the cost of calculating the action of \({\textsf{A}}\).) Let

$$\begin{aligned} \phi _N^m:=\sum _{j=1}^{M_N} ({\textsf{D}}_N^{-1/2}\textbf{y}_m)_j \psi ^{N}_j, \end{aligned}$$
(2.20)

and observe that, by (2.19), (2.15), and (2.16), the Galerkin solution \(\phi _N\) is given by

$$\begin{aligned} \phi _N=\sum _{j=1}^{M_N} ({\textsf{D}}_N^{-1/2}\textbf{y})_j \psi ^{N}_j. \end{aligned}$$
(2.21)
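The change of variables behind (2.19) to (2.21) can be sketched as follows. The helper name `solve_diagonally_scaled` is hypothetical, and a dense direct solver stands in for GMRES so that the block stays self-contained; the point is only the symmetric diagonal scaling and the recovery \(\textbf{x}={\textsf{D}}_N^{-1/2}\textbf{y}\).

```python
import numpy as np

def solve_diagonally_scaled(A, b, D_diag, solver=np.linalg.solve):
    """Solve A x = b via (D^{-1/2} A D^{-1/2}) y = D^{-1/2} b, x = D^{-1/2} y.

    D_diag holds the (positive) diagonal entries of D. `solver` is any
    routine with signature solver(matrix, rhs); a direct solve is used here
    as a placeholder for an iterative method such as GMRES.
    """
    s = 1.0 / np.sqrt(np.asarray(D_diag, dtype=float))
    A_scaled = (A * s[:, None]) * s[None, :]  # D^{-1/2} A D^{-1/2}
    b_scaled = s * b                          # D^{-1/2} b
    y = solver(A_scaled, b_scaled)
    return s * y                              # coefficients x of phi_N
```

Since the scaling only multiplies rows and columns by fixed numbers, its cost is negligible compared with applying \({\textsf{A}}\), in line with the parenthetical remark after (2.19).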

Theorem 2.11

(Convergence of GMRES applied to the linear system (2.19)) Assume that \(\textbf{Z}\) is Lipschitz, there exists \(c>0\) such that (2.6) holds, \(\alpha \) satisfies (2.7), and Assumption 2.8 holds. With \(C_1\) and \(C_2\) as in (2.17), and where \(\Vert A^\dag _{\textbf{Z},\alpha }\Vert \) denotes \(\Vert A^\dag _{\textbf{Z},\alpha }\Vert _{L^2(\Gamma )\rightarrow L^2(\Gamma )}\), let \(\beta \in [0,\pi /2)\) be defined such that

$$\begin{aligned} \cos \beta = \frac{c}{2\Vert A^\dag _{\textbf{Z},\alpha }\Vert }\left( \frac{C_1}{C_2}\right) ^{2}\quad \text { and let } \quad \gamma _\beta := 2 \sin \left( \frac{\beta }{4-2\beta /\pi }\right) \end{aligned}$$
(2.22)

(observe that \(\cos \beta \) is indeed \(\le 1\) since, by definition, \(C_1\le C_2\) and \(c/2 \le \Vert A^\dag _{\textbf{Z},\alpha }\Vert \)). Given \(\varepsilon >0\), if

$$\begin{aligned} m\ge \left( \log \left( \frac{1}{\gamma _\beta }\right) \right) ^{-1} \left[ \log \left( \frac{24 \Vert A^\dag _{\textbf{Z},\alpha }\Vert }{c}\left( \frac{C_2}{C_1}\right) ^3\right) + \log \left( \frac{1}{\varepsilon }\right) \right] , \end{aligned}$$
(2.23)

then

$$\begin{aligned} \frac{ \left\| \phi -\phi _N^m\right\| _{{L^2(\Gamma )}}}{\left\| \phi \right\| _{{L^2(\Gamma )}}} \le (1+\varepsilon )\frac{2\Vert A^\dag _{\textbf{Z},\alpha }\Vert }{c}\left( \min _{\psi \in {{\mathcal {H}}}_N}\frac{\left\| \phi -\psi \right\| _{{L^2(\Gamma )}}}{\left\| \phi \right\| _{{L^2(\Gamma )}}}\right) + \varepsilon \end{aligned}$$
(2.24)

(compare to (1.9)).

The key point about Theorem 2.11 is that both the bound on the number of iterations (2.23) and the terms on the right-hand side of (2.24) other than the best-approximation error are independent of the dimension \(M_N\). Therefore, the number of iterations required to solve systems involving \({\textsf{D}}_N^{-1/2}{\textsf{A}}{\textsf{D}}_N^{-1/2}\) to a prescribed accuracy does not increase as the discretisation is refined and \(M_N\) increases. The same property holds when the conjugate-gradient method is applied to sequences of \(M_N\times M_N\) symmetric, positive-definite matrices whose condition number is bounded independently of \(M_N\).
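To get a feel for the size of the iteration bound, (2.22) and (2.23) can be evaluated numerically. The Python sketch below does this for given values of \(c\), \(\Vert A^\dag _{\textbf{Z},\alpha }\Vert \), \(C_1\), \(C_2\), and \(\varepsilon \); the function name and the sample values are assumptions made here, and the degenerate case \(\cos \beta =1\) (where \(\gamma _\beta =0\)) is excluded for simplicity.

```python
import math

def gmres_iteration_bound(c, norm_A, C1, C2, eps):
    """Return (gamma_beta, smallest integer m satisfying (2.23))."""
    cos_beta = c / (2.0 * norm_A) * (C1 / C2) ** 2
    if not 0.0 < cos_beta < 1.0:
        raise ValueError("this sketch requires 0 < cos(beta) < 1")
    beta = math.acos(cos_beta)                                  # in (0, pi/2)
    gamma = 2.0 * math.sin(beta / (4.0 - 2.0 * beta / math.pi)) # (2.22)
    rhs = (math.log(24.0 * norm_A / c * (C2 / C1) ** 3)
           + math.log(1.0 / eps)) / math.log(1.0 / gamma)       # (2.23)
    return gamma, math.ceil(rhs)

# Sample values (hypothetical): c = ||A|| = 1 and C1 = C2 = 1 give
# cos(beta) = 1/2, i.e. beta = pi/3 and gamma_beta = 2 sin(pi/10).
gamma, m = gmres_iteration_bound(c=1.0, norm_A=1.0, C1=1.0, C2=1.0, eps=1e-3)
print(gamma, m)
```

None of the inputs depends on \(M_N\), which is exactly why the resulting iteration count is independent of the discretisation.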

Remark 2.12

(Bounds on the condition number) Recall that, in general, a bound on the condition number for a nonnormal matrix cannot be used to rigorously prove results about the convergence of GMRES applied to that matrix; see, e.g., [60, Page 165], [32, Page 3]. We have no reason to expect that \({\textsf{A}}\) is normal, so to prove Theorem 2.11 we crucially use the coercivity of \(A_{I,\textbf{Z},\alpha }\).

Nevertheless, since there is a long history of studying the condition numbers of second-kind integral equations posed on \({L^2(\Gamma )}\), we record that in the course of proving Theorem 2.11 we prove that, where \(C:= \Vert A^\dag _{\textbf{Z},\alpha }\Vert _{L^2(\Gamma )\rightarrow L^2(\Gamma )}\),

$$\begin{aligned} {{\,\textrm{cond}\,}}\big ( {\textsf{D}}_N^{-1/2}{\textsf{A}}{\textsf{D}}_N^{-1/2}\big ) \le \frac{2 C}{c}\left( \frac{C_2}{C_1}\right) ^2, \end{aligned}$$

where \({{\,\textrm{cond}\,}}({\textsf{B}}):= \Vert {\textsf{B}}\Vert _2 \Vert {\textsf{B}}^{-1}\Vert _2\) (see (4.41) and (4.42) below); i.e., \({{\,\textrm{cond}\,}}( {\textsf{D}}_N^{-1/2}{\textsf{A}}{\textsf{D}}_N^{-1/2})\) is bounded independently of the dimension \(M_N\). Furthermore, by the arguments in [6, §III], [63, Equation B.8], and [4, Equation 3.6.166],

$$\begin{aligned} {{\,\textrm{cond}\,}}({\textsf{A}}) \le \frac{2C}{c}\, {{\,\textrm{cond}\,}}({\textsf{M}}_N); \end{aligned}$$
(2.25)

recall that for a piecewise polynomial boundary element approximation space (in dimensions \(d=2\) or 3) using nodal basis functions on a quasiuniform mesh (with these terms defined in Sect. 4.6.2), \({{\,\textrm{cond}\,}}({\textsf{M}}_N)\) is independent of the dimension \(M_N\) (see, e.g., the proof of Part (i) of Lemma 4.15 or [87, Remark 4.5.3]).

Remark 2.13

(Calculating the entries of the Galerkin matrices for the new BIEs) Calculating the entries of the Galerkin matrices for the new BIEs requires evaluating integrals involving only the operators S, D, and \(D'\). Indeed, for the direct BIE (2.3), the expression (1.18) shows that constructing the Galerkin matrix requires evaluating integrals involving the operators above, and evaluating integrals of the form

$$\begin{aligned} \int _\Gamma \textbf{Z}(\textbf{x})\cdot \nabla _{\Gamma }\big (S \psi _j(\textbf{x})\big ) \psi _i (\textbf{x})\, \textrm{d}s(\textbf{x}) \end{aligned}$$
(2.26)

where \(\psi _i, \psi _j \in {{\mathcal {H}}}_N\). It is shown in [91, §4.3] that, using integration by parts, the integral (2.26) can be evaluated in terms of integrals involving derivatives of \(\psi _i\) and values (but not derivatives) of \(S\psi _j\). Constructing the Galerkin matrix of the indirect BIE (2.4) requires evaluating the integral

$$\begin{aligned} \int _\Gamma \big (K_{\textbf{Z}}\psi _j(\textbf{x})\big ) \psi _i (\textbf{x})\, \textrm{d}s(\textbf{x}), \end{aligned}$$
(2.27)

where \(K_{\textbf{Z}}\) is defined by (1.15). Using the expansion \(\textbf{Z}= \sum _{i=1}^d Z_i \textbf{e}_i\) in the definition of \(K_{\textbf{Z}}\), we have

$$\begin{aligned} K_{\textbf{Z}}\phi (\textbf{x})= \sum _{i=1}^d \textbf{e}_i \cdot \int _\Gamma \nabla _\textbf{y}\Phi (\textbf{x},\textbf{y})Z_i(\textbf{y})\phi (\textbf{y}) \, \textrm{d}s(\textbf{y}). \end{aligned}$$

Given \(\textbf{x}\in \Gamma \), \(\textbf{e}_i = (\textbf{e}_i\cdot \textbf{n}(\textbf{x})) \textbf{n}(\textbf{x}) + \textbf{e}_T(\textbf{x})\), where \(\textbf{e}_T(\textbf{x})\) is tangent to \(\Gamma \) at \(\textbf{x}\). Thus,

$$\begin{aligned} K_{\textbf{Z}}\phi (\textbf{x}) = -\sum _{i=1}^d\Big ( (\textbf{e}_i\cdot \textbf{n}(\textbf{x})) D'\big (Z_i \phi \big )(\textbf{x}) + \textbf{e}_i\cdot \nabla _\Gamma S \big (Z_i \phi \big )(\textbf{x})\Big ); \end{aligned}$$

the integral (2.27) can therefore be evaluated in terms of integrals only involving \(D'\) and (by the discussion above regarding (2.26)) S.
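The step leading to the last displayed formula can be spelled out as follows. The sketch uses that \(\Phi (\textbf{x},\textbf{y})\) depends only on \(\textbf{x}-\textbf{y}\), so that \(\nabla _\textbf{y}\Phi =-\nabla _\textbf{x}\Phi \), and assumes the standard definition \(D'\psi (\textbf{x})=\int _\Gamma \partial \Phi (\textbf{x},\textbf{y})/\partial n(\textbf{x})\,\psi (\textbf{y})\,\textrm{d}s(\textbf{y})\), with integrals understood in the principal-value sense where necessary.

```latex
\begin{aligned}
\textbf{e}_i\cdot\nabla_{\textbf{y}}\Phi(\textbf{x},\textbf{y})
&= -\,\textbf{e}_i\cdot\nabla_{\textbf{x}}\Phi(\textbf{x},\textbf{y})
 = -(\textbf{e}_i\cdot\textbf{n}(\textbf{x}))\,
     \frac{\partial\Phi(\textbf{x},\textbf{y})}{\partial n(\textbf{x})}
   - \textbf{e}_i\cdot\nabla_{\Gamma,\textbf{x}}\Phi(\textbf{x},\textbf{y}), \\
\int_\Gamma \textbf{e}_i\cdot\nabla_{\textbf{y}}\Phi(\textbf{x},\textbf{y})\,
   Z_i(\textbf{y})\phi(\textbf{y})\,\mathrm{d}s(\textbf{y})
&= -(\textbf{e}_i\cdot\textbf{n}(\textbf{x}))\,D'\big(Z_i\phi\big)(\textbf{x})
   - \textbf{e}_i\cdot\nabla_{\Gamma}S\big(Z_i\phi\big)(\textbf{x}).
\end{aligned}
```

Summing the second line over \(i=1,\ldots ,d\) gives the stated expression for \(K_{\textbf{Z}}\phi \).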

2.3 New boundary integral equations for the Laplace interior and exterior Dirichlet problems on general 2-d Lipschitz domains

The biggest difference in going from \(d\ge 3\) to \(d=2\) is that the single-layer potential is no longer o(1) at infinity, and is only O(1) for a restricted class of densities; see (4.23), (4.24) below. In this section, we first outline what parts of the \(d\ge 3\) results in Sect. 2.1 immediately carry over to \(d=2\). We then present modifications of the integral equations in Theorem 2.1 and Theorem 2.2 that are coercive for general 2-d Lipschitz domains when \(\alpha \) is sufficiently large.

Inspecting the proof of Theorem 2.1 in Sect. 4, we see that Parts (i), (ii), (iii), and (iv) hold when \(d=2\) (i.e., everything apart from invertibility (v) and coercivity for sufficiently large \(\alpha \) (vi)).

Similarly, inspecting the proof of Theorem 2.2 in Sect. 4, we see that Parts (i), (iii), (iv), and (v) hold when \(d=2\) (i.e., everything apart from the indirect formulation (ii) and coercivity for sufficiently large \(\alpha \) (vi)), although, firstly, \(\alpha u_\infty \) must be added to the right-hand side of the BIE (2.11), where \(u_\infty \) is the limit of u at infinity and, secondly, Part (v) holds when \(d=2\) provided the constant a in the fundamental solution (1.1) is not equal to the capacity of \(\Gamma \), \(\textrm{Cap}_\Gamma \) (defined in, e.g., [65, Page 263]), which holds, in particular, if \(a>\mathop {\textrm{diam}}(\Gamma )\).

Let

$$\begin{aligned} P_\Gamma \phi (\textbf{x}):= \frac{1}{|\Gamma |} \int _\Gamma \phi (\textbf{y})\, \textrm{d}s(\textbf{y})= \frac{1}{|\Gamma |} \big (\phi ,1\big )_{L^2(\Gamma )}\quad \text { for }\textbf{x}\in \Gamma ; \end{aligned}$$
(2.28)

i.e., \(P_\Gamma \phi \) is the mean value of \(\phi \). Observe that \(P_\Gamma ^2=P_\Gamma \) and \(P_\Gamma '=P_\Gamma \). Let \(Q_\Gamma :=I-P_\Gamma \).
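A discrete analogue of \(P_\Gamma \) and \(Q_\Gamma \) makes the stated properties easy to check. The Python sketch below assumes, purely for illustration, that \(\phi \) is piecewise constant on boundary elements with measures `weights`, so that the \(L^2(\Gamma )\) inner product of two such functions is the weighted sum of products of their element values; the function name is hypothetical.

```python
import numpy as np

def mean_projection(weights):
    """Matrices of P_Gamma and Q_Gamma = I - P_Gamma on element values.

    weights[k] is the measure of element k, so |Gamma| = sum(weights) and
    (P phi)_k = sum_j weights[j] * phi[j] / |Gamma| for every k.
    """
    w = np.asarray(weights, dtype=float)
    n = w.size
    P = np.outer(np.ones(n), w) / w.sum()
    Q = np.eye(n) - P
    return P, Q

weights = np.array([0.1, 0.4, 0.25, 0.25])
P, Q = mean_projection(weights)
W = np.diag(weights)  # Gram matrix of the L^2(Gamma) inner product

is_projection = np.allclose(P @ P, P)        # P^2 = P
is_self_adjoint = np.allclose(W @ P, P.T @ W)  # P' = P in the weighted product
annihilates_constants = np.allclose(Q @ np.ones(weights.size), 0.0)
print(is_projection, is_self_adjoint, annihilates_constants)
```

In this setting \(Q_\Gamma \) removes the mean value, which is the role it plays in the modified operators of the next theorem.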

We give two theorems: the first for general 2-d Lipschitz domains, the second for 2-d star-shaped Lipschitz domains. Recall that we are assuming throughout that \(\textbf{Z}\) is real-valued.

Theorem 2.14

(New integral equations for Laplace IDP and EDP in 2-d) Suppose that \(\textbf{Z}\in (L^\infty (\Gamma ))^2\) and \(\alpha ,\beta \in \mathbb {R}\).

  1. (i)

    IDP direct formulation. Let u be the solution of the Laplace IDP of Definition 1.3 with \(d=2\) and \(g_D\in H^1(\Gamma )\). Let

    $$\begin{aligned} T'_{I, \textbf{Z},\alpha ,\beta }:= Q_\Gamma A'_{I,\textbf{Z},\alpha }Q_\Gamma + \beta P_\Gamma . \end{aligned}$$
    (2.29)

    Then \(\partial _n^- u\) satisfies

    $$\begin{aligned} T'_{I,\textbf{Z},\alpha ,\beta }(\partial _n^- u) =Q_\Gamma B_{I,\textbf{Z},\alpha }\,g_D. \end{aligned}$$
    (2.30)
  2. (ii)

    IDP indirect formulation. Let

    $$\begin{aligned} T_{I, \textbf{Z},\alpha ,\beta }:= Q_\Gamma A_{I,\textbf{Z},\alpha }Q_\Gamma + \beta P_\Gamma . \end{aligned}$$

    Given \(g_D\in {L^2(\Gamma )}\), if \(\phi \) satisfies

    $$\begin{aligned} T_{I,\textbf{Z},\alpha ,\beta }\phi =- g_D, \end{aligned}$$
    (2.31)

    then, if \(d=2\),

    $$\begin{aligned} u:= (\mathcal {K}_{\textbf{Z}}-\alpha {{\mathcal {S}}}) Q_\Gamma \phi + P_\Gamma A_{I,\textbf{Z},\alpha }Q_\Gamma \phi - \beta P_\Gamma \phi \end{aligned}$$
    (2.32)

    is the solution of the Laplace IDP of Definition 1.5.

  3. (iii)

    EDP direct formulation. Let u be the solution of the Laplace EDP of Definition 1.4 with \(d=2\) and \(g_D\in H^1(\Gamma )\). Let

    $$\begin{aligned} T'_{E, \textbf{Z},\alpha ,\beta }:= Q_\Gamma A'_{E,\textbf{Z},\alpha }Q_\Gamma + \beta P_\Gamma . \end{aligned}$$
    (2.33)

    Then \(\partial _n^+ u\) satisfies

    $$\begin{aligned} T'_{E,\textbf{Z},\alpha ,\beta }(\partial _n^+ u) =Q_\Gamma B_{E,\textbf{Z},\alpha }\,g_D. \end{aligned}$$
    (2.34)
  4. (iv)

    EDP indirect formulation. Let

    $$\begin{aligned} T_{E, \textbf{Z},\alpha ,\beta }:= Q_\Gamma A_{E,\textbf{Z},\alpha }Q_\Gamma + \beta P_\Gamma . \end{aligned}$$

    Given \(g_D\in {L^2(\Gamma )}\), if \(\phi \) satisfies

    $$\begin{aligned} T_{E,\textbf{Z},\alpha ,\beta }\phi =g_D, \end{aligned}$$
    (2.35)

    then, if \(d=2\),

    $$\begin{aligned} u:= (\mathcal {K}_{\textbf{Z}}+\alpha {{\mathcal {S}}}) Q_\Gamma \phi - P_\Gamma A_{E,\textbf{Z},\alpha }Q_\Gamma \phi + \beta P_\Gamma \phi \end{aligned}$$
    (2.36)

    is the solution of the Laplace EDP of Definition 1.6.

  5. (v)

    Coercivity. If \(\textbf{Z}\in (C^{0,1}(\Gamma ))^2\) satisfies (2.6), \(\alpha \) satisfies (2.7), and \(\beta =c/2\), then \(T'_{I,\textbf{Z},\alpha ,\beta }\), \(T_{I,\textbf{Z},\alpha ,\beta }\), \(T'_{E,\textbf{Z},\alpha ,\beta }\), and \(T_{E,\textbf{Z},\alpha ,\beta }\) are all coercive on \({L^2(\Gamma )}\) with coercivity constant c/2.
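
The role of the choice \(\beta =c/2\) in Part (v) can be sketched as follows. Assume (this is a hypothesis of the sketch, inferred from the proofs in Sect. 4 rather than stated in the theorem) that A, standing for any one of \(A'_{I,\textbf{Z},\alpha }\), \(A_{I,\textbf{Z},\alpha }\), \(A'_{E,\textbf{Z},\alpha }\), \(A_{E,\textbf{Z},\alpha }\), satisfies \((A\psi ,\psi )_{L^2(\Gamma )}\ge (c/2)\Vert \psi \Vert ^2_{L^2(\Gamma )}\) for all real-valued \(\psi \in {L^2(\Gamma )}\) with \(P_\Gamma \psi =0\). Then, with \(T:= Q_\Gamma A Q_\Gamma + \beta P_\Gamma \) and using \(Q_\Gamma '=Q_\Gamma \) and \(P_\Gamma '=P_\Gamma =P_\Gamma ^2\), for real-valued \(\phi \in {L^2(\Gamma )}\),

$$\begin{aligned} \big (T\phi ,\phi \big )_{L^2(\Gamma )}&= \big (A Q_\Gamma \phi ,Q_\Gamma \phi \big )_{L^2(\Gamma )} + \beta \Vert P_\Gamma \phi \Vert ^2_{L^2(\Gamma )} \\&\ge \frac{c}{2}\Vert Q_\Gamma \phi \Vert ^2_{L^2(\Gamma )} + \frac{c}{2}\Vert P_\Gamma \phi \Vert ^2_{L^2(\Gamma )} = \frac{c}{2}\Vert \phi \Vert ^2_{L^2(\Gamma )}. \end{aligned}$$

The last equality uses that \(P_\Gamma \phi \) and \(Q_\Gamma \phi \) are orthogonal in \({L^2(\Gamma )}\).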

Theorem 2.15

(New integral equations for 2-d star-shaped domains)

  1. (i)

    IDP direct formulation. Let u be the solution of the Laplace IDP of Definition 1.3 with \(d=2\) and \(g_D\in H^1(\Gamma )\). Then \(\partial _n^- u\) satisfies

    $$\begin{aligned} \left( A'_{I,\textbf{x},0} - \frac{|\Gamma |}{4\pi }P_\Gamma \right) \partial _n^- u=B_{I,\textbf{x},0} \,g_D. \end{aligned}$$
    (2.37)
  2. (ii)

    IDP indirect formulation. Given \(g_D\in {L^2(\Gamma )}\), if \(\phi \) satisfies

    $$\begin{aligned} \left( A_{I,\textbf{x},0} - \frac{|\Gamma |}{4\pi }P_\Gamma \right) \phi =- g_D, \end{aligned}$$
    (2.38)

    then, if \(d=2\),

    $$\begin{aligned} u:= \mathcal {K}_{\textbf{x}}\phi + \frac{|\Gamma |}{4\pi }P_\Gamma \phi \end{aligned}$$

    is the solution of the Laplace IDP of Definition 1.5.

  3. (iii)

    EDP direct formulation. Let u be the solution of the Laplace EDP of Definition 1.4 with \(d=2\) and \(g_D\in H^1(\Gamma )\). Then \(\partial _n^+ u\) satisfies

    $$\begin{aligned} \left( A'_{E,\textbf{x},0} + \frac{|\Gamma |}{4\pi }P_\Gamma \right) \partial _n^+ u=B_{E,\textbf{x},0} \,g_D. \end{aligned}$$
    (2.39)
  4. (iv)

    EDP indirect formulation. Given \(g_D\in {L^2(\Gamma )}\), if \(\phi \) satisfies

    $$\begin{aligned} \left( A_{E,\textbf{x},0} + \frac{|\Gamma |}{4\pi }P_\Gamma \right) \phi = g_D, \end{aligned}$$
    (2.40)

    then, if \(d=2\),

    $$\begin{aligned} u:= \mathcal {K}_{\textbf{x}}\phi + \frac{|\Gamma |}{4\pi }P_\Gamma \phi \end{aligned}$$

    is the solution of the Laplace EDP of Definition 1.6.

  5. (v)

    Coercivity. If \({\Omega ^-}\) is star-shaped with respect to a ball of radius \(\kappa \) (i.e. (2.14) holds), then each of the integral operators on the left-hand sides of (2.37), (2.38), (2.39), and (2.40) is coercive on \({L^2(\Gamma )}\) with coercivity constant \(\kappa /2\).

3 Discussion of the ideas behind the new BIEs and links to previous work

3.1 How the BIEs arise

The indirect BIE (2.4) for the IDP arises from imposing the boundary condition on the ansatz \(u= (\mathcal {K}_{\textbf{Z}}-\alpha {{\mathcal {S}}})\phi \) via taking the nontangential limit. Similarly, the indirect BIE (2.12) for the EDP arises from the ansatz \(u= (\mathcal {K}_{\textbf{Z}}+\alpha {{\mathcal {S}}})\phi \). For the indirect BIEs for \(d=2\) in Theorem 2.14, the idea is the same, except now a) the density in the ansatz is not a general \(L^2(\Gamma )\) function (so that \({{\mathcal {S}}}\phi \) has the correct behaviour at infinity), and b) extra terms are added to the ansatz to ensure that the resulting BIE is still coercive on \({L^2(\Gamma )}\).
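
For instance, the constants in the 2-d ansatz (2.32) can be checked as follows. This is a sketch that assumes, consistently with the indirect BIE for the IDP having the form \(A_{I,\textbf{Z},\alpha }\phi =-g_D\), that the interior non-tangential limit of \((\mathcal {K}_{\textbf{Z}}-\alpha {{\mathcal {S}}})\psi \) is \(-A_{I,\textbf{Z},\alpha }\psi \) for \(\psi \in {L^2(\Gamma )}\). Since \(P_\Gamma A_{I,\textbf{Z},\alpha }Q_\Gamma \phi \) and \(\beta P_\Gamma \phi \) are constants (and hence harmonic), taking the limit of (2.32) gives

$$\begin{aligned} \widetilde{\gamma }^- u&= -A_{I,\textbf{Z},\alpha }Q_\Gamma \phi + P_\Gamma A_{I,\textbf{Z},\alpha }Q_\Gamma \phi - \beta P_\Gamma \phi \\&= -Q_\Gamma A_{I,\textbf{Z},\alpha }Q_\Gamma \phi - \beta P_\Gamma \phi = -T_{I,\textbf{Z},\alpha ,\beta }\phi , \end{aligned}$$

which equals \(g_D\) exactly when \(\phi \) satisfies (2.31). The density \(Q_\Gamma \phi \) in the potentials has mean zero, which is the restriction described in a), and the two constant terms are the extra terms described in b).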

For the direct BIE (2.3) for the IDP, recall that \(u={{\mathcal {S}}}\partial _n^- u- {{\mathcal {D}}}\gamma ^- u\) by Green’s integral representation. The direct BIE (2.3) then arises from considering

$$\begin{aligned} -\textbf{Z}\cdot \widetilde{\gamma }^- (\nabla u ) + \alpha \gamma ^- u. \end{aligned}$$

Similarly, the direct BIE (2.11) for the EDP arises from considering

$$\begin{aligned} \textbf{Z}\cdot \widetilde{\gamma }^+ (\nabla u ) + \alpha \gamma ^+ u, \end{aligned}$$

with \(u = -{{\mathcal {S}}}\partial _n^+ u+ {{\mathcal {D}}}\gamma ^+ u\). Alternatively, since (informally) \(\textbf{Z}\cdot \nabla = (\textbf{Z}\cdot \textbf{n}) \partial _n + \textbf{Z}\cdot \nabla _{\Gamma }\), the direct BIE (2.3) can be obtained by adding (i) \((\textbf{Z}\cdot \textbf{n})\) multiplied by the standard direct second-kind BIE

$$\begin{aligned} \left( \frac{1}{2}I - D'\right) \partial _n^- u= -H\gamma ^- u, \end{aligned}$$
(3.1)

(ii) \(-\textbf{Z}\cdot \nabla _{\Gamma }\) applied to the standard direct first-kind BIE

$$\begin{aligned} S\partial _n^- u= \left( \frac{1}{2}I + D\right) \gamma ^- u, \end{aligned}$$
(3.2)

and (iii) \(\alpha \) multiplied by (3.2). Similar considerations hold for the direct BIE (2.11), and the 2-d direct BIEs of Theorems 2.14 and 2.15, where, additionally, one uses that \(P_\Gamma (\partial ^{\pm }_n u)=0\) (see Lemma 4.10).

3.2 The other BVPs solved by the new BIEs

The BIEs introduced in Sect. 2 to solve the Dirichlet problem can be used to solve other Laplace BVPs. Although the focus of this paper is on solving the Dirichlet problem, we highlight this fact here since these other BVPs affect the properties of the new BIEs.

For example, the BIO \(A'_{E,\textbf{Z},\alpha }\) used to solve the EDP in Theorem 2.2 can also be used to solve the Laplace interior oblique Robin problem, i.e., the problem of finding u in \({\Omega ^-}\) satisfying \(\Delta u=0\) and

$$\begin{aligned} (\textbf{Z}\cdot \textbf{n}) \partial _n^- u+ \textbf{Z}\cdot \nabla _\Gamma ( \gamma ^- u) + \alpha \, \gamma ^- u= g \quad \text { on } \Gamma ; \end{aligned}$$
(3.3)

see Definition 5.1 and Theorem 5.5 below. Similarly, the BIO \(A'_{I,\textbf{Z},\alpha }\) used to solve the IDP in Theorem 2.1 can also be used to solve the Laplace exterior oblique Robin problem; see Definition 5.2 and Theorem 5.6 below. This relationship means that the injectivity results implicit in Part (v) of Theorems 2.1 and 2.2 are obtained by proving uniqueness of these oblique Robin problems; see Sect. 5.3.

3.3 The use of similar BIEs by Calderón [13] and Medková [67]

Calderón [13] used indirect versions of the BIEs in Theorems 2.1 and 2.2 with \(\alpha =0\) to prove wellposedness results about the Dirichlet problem and the oblique derivative problem (i.e., (3.3) with \(\alpha =0\)) with data in \(L^p(\Gamma )\). Indeed, [13] posed the ansatz \(u= \mathcal {K}_{\textbf{Z}}\phi \) for the IDP, which gives the BIE \(A_{I, \textbf{Z}, 0} \phi =-g_D\) [13, Page 39], and posed the ansatz \(u={{\mathcal {S}}}\phi \) for the oblique derivative problem, which gives the BIE \(A'_{E,\textbf{Z},0} \phi = g\) [13, Page 45]. Furthermore, Medková [67, §5.23] posed the ansatz \(u = {{\mathcal {S}}}\phi \) for the interior oblique Robin problem, resulting in \(A'_{I,\textbf{Z},\alpha }\phi = -g\).

In both [13] and [67], the BIOs are proved to be Fredholm of index zero on \(L^2(\Gamma )\); see [13, Page 39] (where the result is proved to hold on a slightly wider range of \(L^p(\Gamma )\) spaces) and [67, Proposition 5.23.2].

3.4 The main new properties of the BIEs of this paper: coercivity for appropriate \(\alpha \)

Building on the work of Calderón and Medková, we show that the BIOs are not only Fredholm of index zero on \(L^2(\Gamma )\), but invertible for general Lipschitz domains as soon as \(\alpha >0\), and, crucially, coercive if \(\alpha \) is chosen appropriately (so also coercive plus compact for all \(\alpha >0\)). For star-shaped domains this coercivity can be proved using a simple modification of Calderón’s proof that the BIOs are Fredholm of index zero (see Lemmas 4.1 and 4.2 below). For general domains this coercivity (for appropriate \(\alpha \)) is proved using Rellich-type identities (with this method also giving an alternative proof of coercivity for star-shaped domains). Recall that identities arising from multiplying \(\Delta u\) by a derivative of u are associated with the name Rellich, due to Rellich’s introduction of the multiplier \(\textbf{x}\cdot \nabla u\) for the Helmholtz equation in [84]; these identities have been well-used in the study of the Laplace, Helmholtz, and other elliptic equations, see, e.g., the overviews in [49, Pages 111 and 112], [17, §5.3], [74, §1.4]. Verchota [101] famously used Rellich identities to prove invertibility of \(\frac{1}{2}I - D\) and \(\frac{1}{2}I - D'\) on \({L^2(\Gamma )}\) (see Remark 4.9 below) and Medková [67, §5.23] also used Rellich identities to prove that \(A'_{I, \textbf{Z}, \alpha }\) is invertible for sufficiently large \(\alpha \) [67, Lemma 5.23.1, Prop. 5.23.2, Theorem 5.23.4].

Our coercivity results are proved using the identity arising from multiplying \(\Delta u\) by \(\textbf{Z}\cdot \nabla u + \alpha u\) (see Lemma 4.6 below); our use of a multiplier that is a linear combination of u and a derivative of u is inspired by the use of such multipliers by Morawetz [75,76,77], and the particular identity we use also appears as [54, Equation 2.28]. As recalled in Sect. 1.3, the idea of proving coercivity of Laplace BIOs in the trace spaces goes back to Nédélec and Planchard [79], Le Roux [56], Hsiao and Wendland [46], and Steinbach and Wendland [95], with this method based on using Green’s identity (i.e. multiplying \(\Delta u\) by u). The idea of proving coercivity of second-kind BIOs in \({L^2(\Gamma )}\) using Rellich-type identities was introduced in [91] for a particular Helmholtz BIE on star-shaped domains and then further developed in [92] for the standard second-kind Helmholtz BIE on smooth convex domains. The main differences between [91, 92] and the present paper are that (i) [91, 92] only consider direct BIEs for the exterior Helmholtz Dirichlet problem whereas the present paper considers direct and indirect BIEs for the interior and exterior Laplace Dirichlet problems and (ii) [91, 92] only prove coercivity under geometric restrictions on \(\Gamma \) (which is somewhat expected for the high-frequency Helmholtz equation; see [7, 20, §6.3.2]), namely star-shapedness with respect to a ball for [91] and strict convexity and a piecewise analytic \(C^3\) boundary for [92], whereas the present paper proves coercivity of Laplace BIOs for general Lipschitz domains.

3.5 Combined-potential ansatz for solutions of Laplace’s equation

A key difference between the indirect BIEs in the present paper and those in [13] is that ours arise from the ansatz \(u= (\mathcal {K}_{\textbf{Z}}-\alpha {{\mathcal {S}}})\phi \) for the solution of the Laplace IDP, whereas [13] poses the ansatz \(u = \mathcal {K}_{\textbf{Z}}\phi \). We saw in the discussion above that the presence of the parameter \(\alpha \)—i.e., the fact that we use a combined-potential ansatz—is crucial for proving coercivity of our BIOs.

The combined-potential ansatz is also crucial to proving uniqueness for cases where coercivity does not hold. Indeed, using a linear combination of double- and single-layer potentials to find solutions of the Helmholtz equation is standard, and goes back to [10, 58, 81], with the motivation to ensure uniqueness at all wavenumbers. Using such a combination for Laplace’s equation is less common, but this was done by D. Mitrea in [69, Theorem 5.1] and subsequently by Medková in [66]. The rationale for this combined ansatz is similar, namely that the standard indirect second-kind equations (based on a double-layer-potential ansatz) have non-trivial null spaces for multiply connected domains (with these characterised in [53, 69]) but the BIOs resulting from a combined double- and single-layer potential ansatz are invertible no matter the topology of \({\Omega ^-}\); see [67, Theorem 5.15.2] (for \(d\ge 3\)) and [67, Theorem 5.15.3] (for \(d=2\)). The BIOs in Sect. 2 are also invertible (and even, for appropriate \(\alpha \), coercive) no matter the topology of \({\Omega ^-}\).

4 Proofs of the main results

In this section we prove all of the results in Sect. 2 apart from the invertibility results in Part (v) of Theorems 2.1 and 2.2. As discussed in Sect. 3, these invertibility results are equivalent to uniqueness of the Laplace interior and exterior oblique Robin problems, and these uniqueness results are proved in Sect. 5. Indeed, Part (v) of Theorem 2.1 follows from Corollary 5.11, and Part (v) of Theorem 2.2 follows from Corollary 5.9.

4.1 Proofs of Parts (i), (ii), and (iii) of Theorems 2.1 and 2.2

For Part (i) of Theorem 2.1, first recall that the standard direct BIEs for the IDP (corresponding to the top left of Table 1) are (3.2) and (3.1). If \(g_D\in H^1(\Gamma )\), then \(\partial _n^- u\in {L^2(\Gamma )}\) (by Theorem B.1), and then the mapping properties (A.3a) of S and D imply that both sides of (3.2) are in \(H^1(\Gamma )\). Taking the surface gradient, \(\nabla _{\Gamma }\), of (3.2) yields the (vector) integral equation in \(({L^2(\Gamma )})^d\)

$$\begin{aligned} \nabla _{\Gamma } S(\partial _n^- u)= \nabla _{\Gamma }\left( \frac{1}{2}I + D\right) \gamma ^- u. \end{aligned}$$
(4.1)

Taking \((\textbf{Z}\cdot \textbf{n})\) times the scalar equation (3.1), minus \(\textbf{Z}\) dot the vector equation (4.1), plus \(\alpha \) times (3.2) yields (2.3). The proof of Part (i) of Theorem 2.2 (i.e., that (2.11) holds) is very similar.
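
Written out, this combination of (3.1), (4.1), and (3.2) reads as follows (a sketch; the identification of the two bracketed operators with \(A'_{I,\textbf{Z},\alpha }\) and \(B_{I,\textbf{Z},\alpha }\) is as asserted in the text and is not re-derived here):

$$\begin{aligned}&\left[ (\textbf{Z}\cdot \textbf{n})\left( \frac{1}{2}I - D'\right) - \textbf{Z}\cdot \nabla _{\Gamma }S + \alpha S\right] \partial _n^- u\\&\qquad = \left[ -(\textbf{Z}\cdot \textbf{n})H - \textbf{Z}\cdot \nabla _{\Gamma }\left( \frac{1}{2}I + D\right) + \alpha \left( \frac{1}{2}I + D\right) \right] \gamma ^- u. \end{aligned}$$

Every term is in \({L^2(\Gamma )}\) when \(g_D=\gamma ^- u\in H^1(\Gamma )\), by the mapping properties recalled above.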

For Part (ii) of Theorem 2.1, first recall that \(\mathcal {K}_{\textbf{Z}}\phi \) and \({{\mathcal {S}}}\phi \) are both in \(C^2({\Omega ^-})\) and satisfy Laplace’s equation (for \(\mathcal {K}_{\textbf{Z}}\) this was recalled in Sect. 1.7). When \(\phi \in {L^2(\Gamma )}\), \({{\mathcal {S}}}\phi \in H^{3/2}({\Omega ^-})\) by (A.2) and then \(({{\mathcal {S}}}\phi )^*\in {L^2(\Gamma )}\) by Part (iii) of Theorem B.2. As recalled in Sect. 1.7, \((\mathcal {K}_{\textbf{Z}}\phi )^* \in {L^2(\Gamma )}\) by [101], and thus u defined by (2.5) satisfies \(u^*\in {L^2(\Gamma )}\). To show that \(\phi \) satisfies the BIE (2.4), we take the non-tangential limit of (2.5), using (1.16) and that, by Lemma B.3, \(\widetilde{\gamma }^- ({{\mathcal {S}}}\phi ) = \gamma ^-({{\mathcal {S}}}\phi )\), where \(\gamma ^-({{\mathcal {S}}}\phi )\) is given by the first jump relation in

$$\begin{aligned} \gamma ^{\pm }{{\mathcal {S}}}= S, \qquad \partial _n^{\pm }{{\mathcal {S}}}= \mp \frac{1}{2}I + D'. \end{aligned}$$
(4.2)

(see, e.g., [65, Page 219] or [17, Equation 2.41]).

Part (ii) of Theorem 2.2 follows in an analogous way, except that we now need to show that u defined by (2.13) satisfies \(u(\textbf{x})= o(|\textbf{x}|^{3-d})\) when \(d=3\) as \(|\textbf{x}|\rightarrow \infty \); these asymptotics follow from the first bound in (4.23) and the bound

$$\begin{aligned} |\mathcal {K}_{\textbf{Z}}\phi (\textbf{x})| = O(|\textbf{x}|^{1-d}) \quad \text { as }|\textbf{x}|\rightarrow \infty , \end{aligned}$$
(4.3)

which is proved in a similar way to the bound on the double-layer potential in [87, Equation 3.23].

Part (iii) of both Theorems 2.1 and 2.2 follows from combining: (a) the definitions of \(A'_{I,\textbf{Z},\alpha }\) (2.1) and \(A'_{E,\textbf{Z},\alpha }\) (2.9) in terms of \(K_{\textbf{Z}}'\) and S; (b) the definitions of \(A_{I,\textbf{Z},\alpha }\) (2.1) and \(A_{E,\textbf{Z},\alpha }\) (2.9) in terms of \(K_{\textbf{Z}}\) and S; (c) the definitions of \(B_{I,\textbf{Z},\alpha }\) (2.2) and \(B_{E,\textbf{Z},\alpha }\) (2.2) in terms of D, H, and \(\nabla _{\Gamma }\); (d) the continuity of \(K_{\textbf{Z}}:{{L^2(\Gamma )}\rightarrow {L^2(\Gamma )}}\) (and hence also of \(K_{\textbf{Z}}':{{L^2(\Gamma )}\rightarrow {L^2(\Gamma )}}\)) recalled in Sect. 1.7; (e) the continuity of \(S: L^2(\Gamma )\rightarrow L^2(\Gamma )\), \(H: H^1(\Gamma )\rightarrow L^2(\Gamma )\), and \(D: H^1(\Gamma )\rightarrow H^1(\Gamma )\) (and hence also of \(\nabla _{\Gamma }D: H^1(\Gamma )\rightarrow L^2(\Gamma )\)), recalled in (A.3).

4.2 Proofs of Part (iv) of Theorems 2.1 and 2.2 (coercivity up to a compact perturbation)

Lemma 4.1

If \(d\ge 2\), \(\Gamma \) is Lipschitz and \(\textbf{Z}\in (C(\Gamma ))^d\) then \(K_{\textbf{Z}}+ K_{\textbf{Z}}'\) is compact in \({L^2(\Gamma )}\). Thus there exists a compact operator \( C:{L^2(\Gamma )}\rightarrow {L^2(\Gamma )}\) such that

$$\begin{aligned} \big ( K_\textbf{Z}\phi ,\phi \big )_{{L^2(\Gamma )}}= \big (C \phi ,\phi \big )_{{L^2(\Gamma )}} \quad \text{ for } \text{ all } \text{ real-valued } \phi \in {L^2(\Gamma )}. \end{aligned}$$

Part (iv) of both Theorems 2.1 and 2.2 follows by combining Lemma 4.1 with the assumption (2.6) and the fact that S is compact on \({L^2(\Gamma )}\) (via the mapping property in (A.3a) with \(s=1/2\)).
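
The second statement of Lemma 4.1 follows from the first by taking \(C:= \frac{1}{2}(K_{\textbf{Z}}+K_{\textbf{Z}}')\). As a sketch, assuming (as the notation suggests) that \(K_{\textbf{Z}}'\) is the adjoint of \(K_{\textbf{Z}}\) with respect to the real \(L^2(\Gamma )\) inner product, for real-valued \(\phi \in {L^2(\Gamma )}\),

$$\begin{aligned} \big (K_{\textbf{Z}}\phi ,\phi \big )_{L^2(\Gamma )} = \big (\phi ,K_{\textbf{Z}}'\phi \big )_{L^2(\Gamma )} = \big (K_{\textbf{Z}}'\phi ,\phi \big )_{L^2(\Gamma )}, \quad \text {so that}\quad \big (K_{\textbf{Z}}\phi ,\phi \big )_{L^2(\Gamma )} = \frac{1}{2}\Big (\big (K_{\textbf{Z}}+K_{\textbf{Z}}'\big )\phi ,\phi \Big )_{L^2(\Gamma )}. \end{aligned}$$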

Proof of Lemma 4.1

Since \(\Phi (\textbf{x},\textbf{y})\) is a function of \(|\textbf{x}-\textbf{y}|\), \(\nabla _\textbf{x}\Phi (\textbf{x},\textbf{y})= - \nabla _\textbf{y}\Phi (\textbf{x},\textbf{y})\); the definitions of \(K_\textbf{Z}\) (1.15) and \(K'_\textbf{Z}\) (1.18) then imply that, for all \(\phi \in {L^2(\Gamma )}\),

$$\begin{aligned} \big ( K_\textbf{Z}+ K_{\textbf{Z}}'\big )\phi (\textbf{x})= \int _\Gamma \big (\textbf{Z}(\textbf{y})-\textbf{Z}(\textbf{x})\big )\cdot \nabla _\textbf{y}\Phi (\textbf{x},\textbf{y}) \phi (\textbf{y})\, \textrm{d}s(\textbf{y}). \end{aligned}$$
(4.4)

If \(\textbf{Z}\in (C^{0,\beta }(\Gamma ))^d\) for \(\beta >0\), then the kernel of the integral on the right-hand side of (4.4) is weakly singular, and thus the operator is compact on \({L^2(\Gamma )}\) by, e.g., the combination of [83, Part 3 of the theorem on Page 49] and the Riesz-Thorin interpolation theorem (see, e.g., [36, Theorem 6.27]), where the latter is used to verify the hypothesis of the former. Therefore, the result of this lemma follows if we can show that if, for all \(\beta >0\), \(K_{\textbf{Z}} + K'_{\textbf{Z}}\) is compact for all \(\textbf{Z}\in (C^{0,\beta }(\Gamma ))^d\), then \(K_{\textbf{Z}} + K'_{\textbf{Z}}\) is compact for all \(\textbf{Z}\in (C(\Gamma ))^d\).

Given \(\textbf{Z}\in (C(\Gamma ))^d\), there exist \(\beta >0\) and \(\textbf{Z}_\ell \in (C^{0,\beta }(\Gamma ))^d\) for all \(\ell \in \mathbb {N}\) such that \(\Vert \textbf{Z}_\ell -\textbf{Z}\Vert _{L^\infty }\rightarrow 0\) as \(\ell \rightarrow \infty \). By (1.18), the operator \(K_{\textbf{Z}}'\) can be written \(K_\textbf{Z}' = \textbf{Z}\cdot \textbf{T}\), where \(\textbf{T}:L^2(\Gamma )\rightarrow (L^2(\Gamma ))^d\) is bounded by the results of [22] and [101] (as discussed in Sect. 1.7). Let \(K_{\textbf{Z}_\ell }' = \textbf{Z}_\ell \cdot \textbf{T}\); then

$$\begin{aligned}{} & {} \left\| K_{\textbf{Z}_\ell }' - K_{\textbf{Z}}'\right\| _{L^2(\Gamma )\rightarrow {L^2(\Gamma )}}= \left\| \textbf{Z}_\ell \cdot \textbf{T}- \textbf{Z}\cdot \textbf{T}\right\| _{{L^2(\Gamma )}\rightarrow {L^2(\Gamma )}} \\{} & {} \quad \qquad \le \left\| \textbf{Z}_\ell -\textbf{Z}\right\| _{(L^\infty (\Gamma ))^d} \left\| \textbf{T}\right\| _{{L^2(\Gamma )}\rightarrow {L^2(\Gamma )}}\rightarrow 0 \end{aligned}$$

as \(\ell \rightarrow \infty \). Therefore also \(K_{\textbf{Z}_\ell } \rightarrow K_{\textbf{Z}}\), so that \(K_{\textbf{Z}_\ell } + K_{\textbf{Z}_\ell }' \rightarrow K_{\textbf{Z}}+ K_{\textbf{Z}}'\). Since the space of compact operators is closed, \(K_{\textbf{Z}}+ K_{\textbf{Z}}'\) is compact. \(\square \)

4.3 Proof of Theorem 2.5 (coercivity for \({\Omega ^-}\) that are star-shaped with respect to a ball)

Theorem 2.5 is an immediate consequence of combining (i) the following special case of Lemma 4.1, (ii) the definitions of \(A'_{I,\textbf{Z},\alpha }\) and \(A_{I,\textbf{Z},\alpha }\) in (2.1) and \(A'_{E,\textbf{Z},\alpha }\) and \(A_{E,\textbf{Z},\alpha }\) in (2.9), and (iii) the inequality \((S\phi ,\phi )_{{L^2(\Gamma )}}\ge 0\) for all \(\phi \in {L^2(\Gamma )}\). The inequality in (iii) is well-known, following from Green’s identity, and is a special case of Lemma 4.4 below with \({\widetilde{\textbf{Z}}}=\textbf{0}\).

Lemma 4.2

(Key lemma for coercivity for star-shaped \({\Omega ^-}\)) Let \(\Gamma \) be Lipschitz. If \(d\ge 3\) then

$$\begin{aligned}{} & {} K_\textbf{x}+ K_\textbf{x}'+ (d-2)S=0 \,\,\text {and thus}\,\, \big (K_\textbf{x}\phi ,\phi \big )_{{L^2(\Gamma )}} + \frac{d-2}{2}\big (S\phi ,\phi \big )_{L^2(\Gamma )}\nonumber \\{} & {} \quad =0\,\,\text{ for } \text{ all } \text{ real-valued } \phi \in {L^2(\Gamma )}. \end{aligned}$$
(4.5)

If \(d=2\) then

$$\begin{aligned}{} & {} K_\textbf{x}+ K_\textbf{x}'+ \frac{|\Gamma |}{2\pi }P_\Gamma =0 \quad \text {and thus} \,\,\left( \left( K_\textbf{x}+ \frac{|\Gamma |}{4\pi }P_\Gamma \right) \phi ,\phi \right) _{{L^2(\Gamma )}}\nonumber \\{} & {} \quad = 0 \,\, \text{ for } \text{ all } \text{ real-valued } \phi \in {L^2(\Gamma )}, \end{aligned}$$
(4.6)

where \(P_\Gamma \) is defined by (2.28).

Proof of Lemma 4.2

By (1.1), when \(d\ge 3\), \((\textbf{y}-\textbf{x})\cdot \nabla _\textbf{y}\Phi (\textbf{x},\textbf{y}) = -(d-2)\Phi (\textbf{x},\textbf{y})\), and when \(d=2\), \((\textbf{y}-\textbf{x})\cdot \nabla _\textbf{y}\Phi (\textbf{x},\textbf{y}) =-1/(2\pi )\). The results then follow from (4.4) with \(\textbf{Z}(\textbf{x})=\textbf{x}\). \(\square \)
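
The two kernel identities used in this proof can be verified directly from (1.1). The following is a sketch, writing \(r:=|\textbf{x}-\textbf{y}|\) (notation introduced here only for brevity) and using \(\nabla _\textbf{y}r = (\textbf{y}-\textbf{x})/r\):

$$\begin{aligned} d\ge 3:\quad \nabla _\textbf{y}\Phi (\textbf{x},\textbf{y})&= -\frac{\textbf{y}-\textbf{x}}{C_d\, r^{d}}, \qquad (\textbf{y}-\textbf{x})\cdot \nabla _\textbf{y}\Phi (\textbf{x},\textbf{y}) = -\frac{1}{C_d\, r^{d-2}} = -(d-2)\Phi (\textbf{x},\textbf{y}), \\ d=2:\quad \nabla _\textbf{y}\Phi (\textbf{x},\textbf{y})&= -\frac{\textbf{y}-\textbf{x}}{2\pi r^{2}}, \qquad (\textbf{y}-\textbf{x})\cdot \nabla _\textbf{y}\Phi (\textbf{x},\textbf{y}) = -\frac{1}{2\pi }. \end{aligned}$$

Inserting these into (4.4) with \(\textbf{Z}(\textbf{x})=\textbf{x}\), and assuming the standard definition \(S\phi (\textbf{x})=\int _\Gamma \Phi (\textbf{x},\textbf{y})\phi (\textbf{y})\,\textrm{d}s(\textbf{y})\) (which is given outside this section), yields \((K_\textbf{x}+K_\textbf{x}')\phi = -(d-2)S\phi \) when \(d\ge 3\), and \((K_\textbf{x}+K_\textbf{x}')\phi = -\frac{1}{2\pi }\int _\Gamma \phi \,\textrm{d}s = -\frac{|\Gamma |}{2\pi }P_\Gamma \phi \) when \(d=2\). Note that the constant a in (1.1) does not enter the 2-d identity.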

Remark 4.3

(Link with the work of Fabes, Sand, and Seo [34]) The analogue of (4.5) when \(\Gamma \) is the graph of a function (i.e., the boundary of a hypograph) appears in the first sentence after the first displayed equation on [34, Page 133]. Indeed, the analogue of the operator \(K_{\textbf{Z}}'\) for the hypograph with \(\textbf{Z}= \textbf{e}_d\) (i.e., the unit vector pointing in the \(x_d\) direction) arises in [34] when they apply the Rellich identity (4.9) below with \(u= {{\mathcal {S}}}\phi \), as part of their proof that \(\lambda I - D'\) is invertible on \(L^2(\Gamma )\) for \(\lambda \in \mathbb {R}\) with \(|\lambda |\ge 1/2\).

4.4 Proof of Part (vi) of Theorems 2.1 and 2.2 (coercivity for general \({\Omega ^-}\))

Lemma 4.4

(Key lemma for coercivity for general \({\Omega ^-}\)) Suppose that \({\Omega ^-}\subset \mathbb {R}^d\) is Lipschitz, \({\widetilde{\textbf{Z}}}\in (W^{1,\infty }(\mathbb {R}^d))^d\) with compact support, and \(\alpha \in \mathbb {R}\) satisfies the lower bound

$$\begin{aligned} 2\alpha \ge 2\left( \sup _{\textbf{x}\in \mathbb {R}^d} \big \Vert D{\widetilde{\textbf{Z}}}(\textbf{x}) \big \Vert _2 \right) + \big \Vert \nabla \cdot {\widetilde{\textbf{Z}}}\big \Vert _{L^\infty (\mathbb {R}^d)} \end{aligned}$$
(4.7)

(where \(D{\widetilde{\textbf{Z}}}\) is the matrix with (i, j)th element \(\partial _i \widetilde{Z}_j\) and \(\Vert \cdot \Vert _2\) denotes the operator norm on \(\mathbb {R}^{d\times d}\) induced by the Euclidean norm on \(\mathbb {R}^d\)). If \(d\ge 3\) then

$$\begin{aligned} \pm \big ( K_{{\widetilde{\textbf{Z}}}} \phi ,\phi )_{{L^2(\Gamma )}} + \alpha (S \phi ,\phi )_{{L^2(\Gamma )}} \ge 0 \end{aligned}$$
(4.8)

for all real-valued \(\phi \in {L^2(\Gamma )}\). If \(d=2\), then (4.8) holds for all real-valued \(\phi \in {L^2(\Gamma )}\) with \(P_\Gamma \phi =0\), where \(P_\Gamma \) is defined by (2.28).
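
The lower bound (4.7) is exactly what makes the volume integrand in the integrated Rellich-type identity (4.16) below non-negative for harmonic functions. As a sketch: for almost every \(\textbf{x}\) and every \(\varvec{\xi }\in \mathbb {R}^d\) (a generic vector, introduced here for illustration), \(|\partial _i \widetilde{Z}_j \xi _i \xi _j|\le \Vert D{\widetilde{\textbf{Z}}}(\textbf{x})\Vert _2 |\varvec{\xi }|^2\), so that

$$\begin{aligned} \big (2\alpha - \nabla \cdot {\widetilde{\textbf{Z}}}\big )|\varvec{\xi }|^2 + 2\,\partial _i \widetilde{Z}_j \xi _i \xi _j \ge \left( 2\alpha - \big \Vert \nabla \cdot {\widetilde{\textbf{Z}}}\big \Vert _{L^\infty (\mathbb {R}^d)} - 2\sup _{\textbf{x}\in \mathbb {R}^d}\big \Vert D{\widetilde{\textbf{Z}}}(\textbf{x})\big \Vert _2\right) |\varvec{\xi }|^2 \ge 0. \end{aligned}$$

Taking \(\varvec{\xi }=\nabla v\) with \(\Delta v=0\) and \(\alpha \) constant shows that the right-hand side of (4.16) is then non-negative.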

We first show how the coercivity results of Theorems 2.1 and 2.2 are a consequence of Lemma 4.4 combined with the following lemma.

Lemma 4.5

Given \(\textbf{Z}\in (C^{0,1}(\Gamma ))^d\) with non-zero Lipschitz constant, there exists a compactly supported \({\widetilde{\textbf{Z}}}_\textrm{ext}\in (C^{0,1}(\mathbb {R}^d))^d\) with the same Lipschitz constant as \(\textbf{Z}\) and such that \({\widetilde{\textbf{Z}}}_\textrm{ext}|_\Gamma = \textbf{Z}\).

The proof of Lemma 4.5 is given in [18, Appendix D] (i.e., the extended version of the present paper). Note that, by the Kirszbraun theorem [51, 100], \(\textbf{Z}\in (C^{0,1}(\Gamma ))^d\) can be extended to a function \(\textbf{Z}_\textrm{ext}\in (C^{0,1}(\mathbb {R}^d))^d\) with the same (non-zero) Lipschitz constant, so to prove Lemma 4.5 we only need to show that there exists an extension with compact support.

Proof of Part (vi) of Theorems 2.1 and 2.2 assuming Lemmas 4.4 and 4.5

Given \(\textbf{Z}\), by Lemma 4.5 there exists a compactly-supported Lipschitz extension of \(\textbf{Z}\) to \(\mathbb {R}^d\) with the same Lipschitz constant; call this \({\widetilde{\textbf{Z}}}\). This \({\widetilde{\textbf{Z}}}\) then satisfies the assumptions of Lemma 4.4, and the inequality (2.7) then ensures that (4.7) holds (where we have used the inequality \(\Vert A\Vert _2^2 \le \sum _i \sum _j |(A)_{ij}|^2\) to show that \(\sup _\textbf{x}\Vert D{\widetilde{\textbf{Z}}}(\textbf{x})\Vert _2\le d L_{\textbf{Z}}\)). Thus (4.8) holds (with \(K_{{\widetilde{\textbf{Z}}}}\) replaced by \(K_{\textbf{Z}}\)) and the coercivity results follow from the definitions of \(A'_{I,\textbf{Z},\alpha }\) and \(A_{I,\textbf{Z},\alpha }\) (2.1) and \(A'_{E,\textbf{Z},\alpha }\) and \(A_{E,\textbf{Z},\alpha }\) (2.9) and the inequality (2.6) on \(\textbf{Z}\cdot \textbf{n}\). \(\square \)
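
The bound quoted in parentheses can be sketched as follows, assuming that \(L_{\textbf{Z}}\) denotes the Lipschitz constant of \(\textbf{Z}\) (and hence of \({\widetilde{\textbf{Z}}}\)), so that \(|\partial _i \widetilde{Z}_j(\textbf{x})|\le L_{\textbf{Z}}\) for almost every \(\textbf{x}\):

$$\begin{aligned} \big \Vert D{\widetilde{\textbf{Z}}}(\textbf{x})\big \Vert _2^2 \le \sum _{i=1}^d \sum _{j=1}^d \big |\partial _i \widetilde{Z}_j(\textbf{x})\big |^2 \le d^2 L_{\textbf{Z}}^2, \qquad \big |\nabla \cdot {\widetilde{\textbf{Z}}}(\textbf{x})\big | \le \sum _{i=1}^d \big |\partial _i \widetilde{Z}_i(\textbf{x})\big | \le d L_{\textbf{Z}}. \end{aligned}$$

Under this assumption the right-hand side of (4.7) is at most \(3 d L_{\textbf{Z}}\).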

The proof of Lemma 4.4 is based on the following identity. The relationship of this identity to other similar identities in the literature is discussed in Sect. 3.4, and we note, in particular, that this identity appears as [54, Equation 2.28]; for completeness we include the short proof.

Lemma 4.6

(Rellich-type identity) Let v be a real-valued \(C^2\) function on some open set \(D\subset \mathbb {R}^d\), \(d\ge 2\). Let \({\widetilde{\textbf{Z}}}\in (C^1(D))^d\) and \(\alpha \in C^1(D)\) and let both be real-valued. Then, with the summation convention,

$$\begin{aligned} 2 \,{{\mathcal {Z}}}v \Delta v&= \nabla \cdot \Big [ 2\, {{\mathcal {Z}}}v \nabla v - |\nabla v|^2{\widetilde{\textbf{Z}}}\Big ] - \big (2 \alpha -\nabla \cdot {\widetilde{\textbf{Z}}}\big )|\nabla v|^2- 2 \partial _i \widetilde{Z}_j \partial _i v \partial _j v \nonumber \\&\quad - 2 v\nabla \alpha \cdot \nabla v, \end{aligned}$$
(4.9)

where

$$\begin{aligned} {{\mathcal {Z}}}v:= \big ({\widetilde{\textbf{Z}}}\cdot \nabla v + \alpha v\big ). \end{aligned}$$
(4.10)

Proof

Splitting \({{\mathcal {Z}}}v\) into its component parts, we see that the identity (4.9) is the sum of the identities

$$\begin{aligned} 2 \,{\widetilde{\textbf{Z}}}\cdot \nabla v \Delta v = \nabla \cdot \left[ 2\, ({\widetilde{\textbf{Z}}}\cdot \nabla v)\,\nabla v - |\nabla v|^2{\widetilde{\textbf{Z}}}\right] + \big (\nabla \cdot {\widetilde{\textbf{Z}}}\big )|\nabla v|^2- 2 \partial _i \widetilde{Z}_j \partial _i v \partial _j v \end{aligned}$$
(4.11)

and

$$\begin{aligned} 2\alpha v \Delta v = \nabla \cdot \left[ 2 \alpha v\nabla v \right] - 2\alpha |\nabla v|^2- 2v\nabla \alpha \cdot \nabla v. \end{aligned}$$
(4.12)

To prove (4.12), expand the divergence on the right-hand side. The identity (4.11) is obtained by combining the identities

$$\begin{aligned} {\widetilde{\textbf{Z}}}\cdot \nabla v \Delta v = \nabla \cdot \left[ ({\widetilde{\textbf{Z}}}\cdot \nabla v) \nabla v\right] - \partial _i \widetilde{Z}_j \partial _i v \partial _j v - \nabla v \cdot ({\widetilde{\textbf{Z}}}\cdot \nabla ) \nabla v \end{aligned}$$
(4.13)

and

$$\begin{aligned} 2\nabla v \cdot \Big ( {\widetilde{\textbf{Z}}}\cdot \nabla \Big ) \nabla v= \nabla \cdot \left( |\nabla v|^2{\widetilde{\textbf{Z}}}\right) - (\nabla \cdot {\widetilde{\textbf{Z}}}) |\nabla v|^2, \end{aligned}$$
(4.14)

which can both be proved by expanding the divergences on the right-hand sides. \(\square \)
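
For completeness, the two expansions are (a sketch, with the summation convention and for \({\widetilde{\textbf{Z}}}\) and v smooth enough that all the derivatives exist):

$$\begin{aligned} \nabla \cdot \left[ ({\widetilde{\textbf{Z}}}\cdot \nabla v) \nabla v\right]&= \partial _i\big ( \widetilde{Z}_j \partial _j v\, \partial _i v\big ) = \partial _i \widetilde{Z}_j \partial _i v \partial _j v + \widetilde{Z}_j\, \partial _i\partial _j v\, \partial _i v + ({\widetilde{\textbf{Z}}}\cdot \nabla v) \Delta v, \\ \nabla \cdot \left( |\nabla v|^2{\widetilde{\textbf{Z}}}\right)&= (\nabla \cdot {\widetilde{\textbf{Z}}}) |\nabla v|^2 + \widetilde{Z}_j \partial _j \big (\partial _i v\, \partial _i v\big ) = (\nabla \cdot {\widetilde{\textbf{Z}}}) |\nabla v|^2 + 2 \widetilde{Z}_j\, \partial _i\partial _j v\, \partial _i v, \end{aligned}$$

where \(\widetilde{Z}_j\, \partial _i\partial _j v\, \partial _i v = \nabla v \cdot ({\widetilde{\textbf{Z}}}\cdot \nabla ) \nabla v\). Rearranging the first line gives (4.13) and the second gives (4.14); substituting (4.14) into twice (4.13) then gives (4.11).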

For the proof of Lemma 4.4, we need the identity (4.9) integrated over a Lipschitz domain when v is the single-layer potential. As a step towards this, we prove the following lemma.

Lemma 4.7

(Integrated version of the identity) Let D be a Lipschitz domain with outward-pointing unit normal vector \(\varvec{\nu }\). Define

$$\begin{aligned} V(D):=\Big \{v: \,v\in H^1(D), \; \Delta v \in L^2(D),\; \gamma v \in H^1(\partial D),\; \partial _{\nu } v\in L^2(\partial D) \Big \}.\nonumber \\ \end{aligned}$$
(4.15)

If \(v\in V(D)\), \({\widetilde{\textbf{Z}}}\in (W^{1,\infty }(D))^d\) (i.e. \(\widetilde{Z}_i\) and \(\partial _i \widetilde{Z}_j \in L^\infty (D)\) for \(i,j=1,\ldots ,d\)), \(\alpha \in W^{1,\infty }(D)\), and v, \({\widetilde{\textbf{Z}}}\), and \(\alpha \) are all real-valued, then

$$\begin{aligned}&\int _{\partial D} \left[ ({\widetilde{\textbf{Z}}}\cdot \varvec{\nu }) \left( \left( \partial _{\nu } v\right) ^2 - |\nabla _{\Gamma }v|^2 \right) + 2 \Big ({\widetilde{\textbf{Z}}}\cdot \nabla _{\Gamma }(\gamma v) + \alpha (\gamma v)\Big ) \partial _{\nu } v\right] \textrm{d}s\nonumber \\&\qquad = \int _D \bigg (2 \,{{\mathcal {Z}}}v \Delta v +2 \, \partial _i \widetilde{Z}_j \partial _i v \partial _j v + 2 \, v \nabla \alpha \cdot \nabla v +\left( 2 \alpha -\nabla \cdot {\widetilde{\textbf{Z}}}\right) |\nabla v|^2\bigg ) \textrm{d}\textbf{x}. \end{aligned}$$
(4.16)

Recall that, when D is Lipschitz, we can identify \(W^{1,\infty }(D)\) with \(C^{0,1}({\overline{D}})\) (see, e.g., [33, §4.2.3, Theorem 5]), and understand \({\widetilde{\textbf{Z}}}\) and \(\alpha \) on \(\partial D\) in (4.16) by restriction without needing a trace operator.

Proof of Lemma 4.7

We first assume that \({\widetilde{\textbf{Z}}}\) and \(\alpha \) are as in the statement of the lemma, but \(v\in {{\mathcal {D}}}(\overline{D}):=\{ U|_D: U \in C^\infty (\mathbb {R}^d)\}\). Recall that the divergence theorem \(\int _D\nabla \cdot {\textbf {F}}\,\textrm{d}\textbf{x}= \int _{{\partial D}} {\textbf {F}} \cdot \varvec{\nu }\,\textrm{d}s\) is valid when \({\textbf {F}} \in (C^1(\overline{D}))^d\) [65, Theorem 3.34], and thus for \(\textbf{F}\in (H^1(D))^d\) by the density of \(C^1(\overline{D})\) in \(H^1(D)\) [65, Theorem 3.29] and the continuity of the trace operator from \(H^1(D)\) to \(H^{1/2}({\partial D})\) [65, Theorem 3.37]. Recall also that the product of an \(H^1(D)\) function and a \(W^{1,\infty }(D)\) function is in \(H^1(D)\), and the usual product rule for differentiation holds for such functions. Thus \(\textbf{F}= 2 {{\mathcal {Z}}}v \nabla v - |\nabla v|^2{\widetilde{\textbf{Z}}}\) is in \((H^1(D))^d\) and then (4.9) implies that \(\nabla \cdot \textbf{F}\) is given by the integrand on the right-hand side of (4.16). Furthermore,

$$\begin{aligned} \gamma \textbf{F}\cdot \varvec{\nu }=({\widetilde{\textbf{Z}}}\cdot \varvec{\nu }) \left( \left( \frac{\partial v}{\partial \nu }\right) ^2 - |\nabla _{\Gamma }v|^2 \right) + 2\, \big ({\widetilde{\textbf{Z}}}\cdot \nabla _{\Gamma }v + \alpha v\big ) \frac{\partial v}{\partial \nu } \end{aligned}$$

on \(\partial D\), where we have used the fact that \(\nabla v= \varvec{\nu }(\partial v/\partial \nu ) + \nabla _{\Gamma }v\) on \(\partial D\) for \(v\in {{\mathcal {D}}}(\overline{D})\); the identity (4.16) then follows from the divergence theorem.
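
In more detail, the boundary term is obtained as follows (a sketch for \(v\in {{\mathcal {D}}}(\overline{D})\), using that \(\nabla _{\Gamma }v\) is tangential, so that \(|\nabla v|^2 = (\partial v/\partial \nu )^2 + |\nabla _{\Gamma }v|^2\) and \({\widetilde{\textbf{Z}}}\cdot \nabla v = ({\widetilde{\textbf{Z}}}\cdot \varvec{\nu })\,\partial v/\partial \nu + {\widetilde{\textbf{Z}}}\cdot \nabla _{\Gamma }v\) on \(\partial D\)):

$$\begin{aligned} \textbf{F}\cdot \varvec{\nu }&= 2\big ({\widetilde{\textbf{Z}}}\cdot \nabla v + \alpha v\big )\frac{\partial v}{\partial \nu } - |\nabla v|^2\, {\widetilde{\textbf{Z}}}\cdot \varvec{\nu }\\&= 2({\widetilde{\textbf{Z}}}\cdot \varvec{\nu })\left( \frac{\partial v}{\partial \nu }\right) ^2 + 2\big ({\widetilde{\textbf{Z}}}\cdot \nabla _{\Gamma }v + \alpha v\big )\frac{\partial v}{\partial \nu } - ({\widetilde{\textbf{Z}}}\cdot \varvec{\nu })\left( \left( \frac{\partial v}{\partial \nu }\right) ^2 + |\nabla _{\Gamma }v|^2\right) , \end{aligned}$$

and collecting the terms multiplying \({\widetilde{\textbf{Z}}}\cdot \varvec{\nu }\) gives the expression displayed above.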

The result for \(v\in V(D)\) then follows from (i) the density of \({{\mathcal {D}}}(\overline{D})\) in \(V(D)\) [26, Lemmas 2 and 3] and (ii) the fact that (4.16) is continuous in v with respect to the topology of \(V(D)\). \(\square \)

Proof of Lemma 4.4

As discussed in Sect. 3, our strategy is to mimic the classic method of “transferring” coercivity properties of the PDE formulation to the BIOs in the trace spaces, but with Green’s identity

$$\begin{aligned} -\int _D u \Delta u \, \textrm{d}\textbf{x}= \int _D |\nabla u|^2\,\textrm{d}\textbf{x}- \int _{\partial D} \gamma u\, \partial _n u\, \textrm{d}s, \end{aligned}$$
(4.17)

replaced by the integrated version of the Rellich-type identity (4.9). That is, we apply the integrated version of (4.9), namely (4.16), with v replaced by \(u={{\mathcal {S}}}\phi \) (with \(\phi \in {L^2(\Gamma )}\)), and D first equal to \({\Omega ^-}\), and then equal to \({\Omega ^+}\cap B_R\), where \(R> \sup _{\textbf{x}\in {\Omega ^-}}|\textbf{x}|\). At this stage we let \({\widetilde{\textbf{Z}}}\) be a general real-valued \(W^{1,\infty }(\mathbb {R}^3)\) vector field with compact support, and let \(\alpha \) be an arbitrary real constant. That (4.16) holds with v replaced by \(u={{\mathcal {S}}}\phi \), with \(\phi \) real-valued, can be justified by using the results of [48] and [17, Appendix A] recapped in Appendix Sect. B. Indeed, when \(\phi \in {L^2(\Gamma )}\), \(u= {{\mathcal {S}}}\phi \in H^{3/2}(D)\) when \(D= {\Omega ^-}\) or \({\Omega ^+}\cap B_R\) by the first mapping property in (A.2); then \(u\in V(D)\) by Corollary B.5, and (4.16) holds by Lemma 4.7.

We have therefore established that (4.16) holds when \(D= {\Omega ^-}\) or \({\Omega ^+}\cap B_R\) and \(u= {{\mathcal {S}}}\phi \) for \(\phi \in {L^2(\Gamma )}\) that is real-valued. That is, with the identity (4.9) written as \(\nabla \cdot \textbf{Q}=P\),

$$\begin{aligned} \int _\Gamma \textbf{Q}_-\cdot \textbf{n}\,\textrm{d}s = \int _{{\Omega ^-}} P \,\textrm{d}\textbf{x}\end{aligned}$$
(4.18)

and

$$\begin{aligned} -\int _\Gamma \textbf{Q}_+\cdot \textbf{n}\,\textrm{d}s + \int _{{\Gamma _R}} Q_R \,\textrm{d}s = \int _{{\Omega ^+}\cap B_R} P \,\textrm{d}\textbf{x}, \end{aligned}$$
(4.19)

where (remembering that \(\Delta u =0\) and \(\alpha \) is a constant)

$$\begin{aligned} P&= 2 \, \partial _i \widetilde{Z}_j \partial _i u \partial _j u + \big (2 \alpha - \nabla \cdot {\widetilde{\textbf{Z}}}\big )|\nabla u|^2, \end{aligned}$$
(4.20)
$$\begin{aligned} \textbf{Q}_{\pm }\cdot \textbf{n}&= ({\widetilde{\textbf{Z}}}\cdot \textbf{n}) \left( \big (\partial _n^{\pm }u\big )^2 - |\nabla _{\Gamma }(\gamma ^\pm u)|^2\right) + 2\, \Big ({\widetilde{\textbf{Z}}}\cdot \nabla _{\Gamma }(\gamma ^\pm u) +\alpha \gamma ^\pm u\Big )\partial _n^{\pm }u. \end{aligned}$$
(4.21)

If R is chosen large enough so that \(\textrm{supp}\,{\widetilde{\textbf{Z}}}\subset B_R\), then

$$\begin{aligned} Q_R = \textbf{Q}\cdot \widehat{\textbf{x}}= 2\alpha \,u\frac{\partial u}{\partial r}\quad \text { for }\textbf{x}\in {\Gamma _R}, \end{aligned}$$
(4.22)

where we have used the fact that u is \(C^\infty \) in a neighbourhood of \({\Gamma _R}\) (either by elliptic regularity or directly by the definition of the single-layer potential (1.19)) to justify writing \(\partial u/\partial r\) in place of some appropriate trace.

Adding (4.18) and (4.19) yields

$$\begin{aligned} \int _\Gamma (\textbf{Q}_- - \textbf{Q}_+) \cdot \textbf{n}\,\textrm{d}s + \int _{{\Gamma _R}} Q_R \,\textrm{d}s = \int _{{\Omega ^-}} P \, \textrm{d}\textbf{x}+ \int _{{\Omega ^+}\cap B_R} P \, \textrm{d}\textbf{x}. \end{aligned}$$

Now if \(d\ge 3\) and \(\phi \in {L^2(\Gamma )}\), then

$$\begin{aligned} |{{\mathcal {S}}}\phi (\textbf{x})|= O(|\textbf{x}|^{2-d})\quad \text { and }\quad |\nabla {{\mathcal {S}}}\phi (\textbf{x})| = O(|\textbf{x}|^{1-d}) \end{aligned}$$
(4.23)

as \(|\textbf{x}|\rightarrow \infty \), uniformly in all directions \(\textbf{x}/|\textbf{x}|\). If \(d=2\) then

$$\begin{aligned} {{\mathcal {S}}}\phi (\textbf{x})= & {} \frac{1}{2\pi }\log \left( \frac{a}{|\textbf{x}|}\right) \left( \phi ,1\right) _{{L^2(\Gamma )}} + O(|\textbf{x}|^{-1})\quad \text { and }\quad \nonumber \\ \nabla {{\mathcal {S}}}\phi (\textbf{x})= & {} -\frac{1}{2\pi |\textbf{x}|} \left( \phi ,1\right) _{{L^2(\Gamma )}} + O(|\textbf{x}|^{-2}) \end{aligned}$$
(4.24)

as \(|\textbf{x}|\rightarrow \infty \), uniformly in all directions \(\textbf{x}/|\textbf{x}|\); these asymptotics are proved for \(d=2,3\) in, e.g., [93, Lemma 6.21] (see also [87, Equations 3.22 and 3.23] for \(d=3\)); the proof of (4.23) for \(d\ge 4\) is analogous. Recalling the definition of \(P_\Gamma \) (2.28) and the assumption that \(P_\Gamma \phi =0\) when \(d=2\), we see that, by (4.22), \(\int _{{\Gamma _R}}Q_R \,\textrm{d}s=O(R^{2-d})\) for \(d\ge 3\) and \(\int _{{\Gamma _R}}Q_R \,\textrm{d}s=O(R^{-2})\) for \(d=2\) as \(R\rightarrow \infty \). Thus, in this limit,

$$\begin{aligned} \int _\Gamma (\textbf{Q}_- - \textbf{Q}_+) \cdot \textbf{n}\,\textrm{d}s = \int _{{\Omega ^-}\cup {\Omega ^+}} P \, \textrm{d}\textbf{x}. \end{aligned}$$
(4.25)
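The decay rates of the integral over \({\Gamma _R}\) can be checked by counting powers of R. The following sketch uses only (4.22), (4.23), (4.24), and the fact that the surface measure of \({\Gamma _R}\) is proportional to \(R^{d-1}\); it also takes \(P_\Gamma \phi =0\) to mean \((\phi ,1)_{{L^2(\Gamma )}}=0\), which is how the assumption is used above. For \(d\ge 3\),

$$\begin{aligned} \left| \int _{{\Gamma _R}} Q_R \,\textrm{d}s\right| \le 2|\alpha |\, |{\Gamma _R}| \sup _{{\Gamma _R}}|u| \sup _{{\Gamma _R}}\left| \frac{\partial u}{\partial r}\right| = O(R^{d-1})\, O(R^{2-d})\, O(R^{1-d}) = O(R^{2-d}), \end{aligned}$$

while for \(d=2\) the leading terms in (4.24) vanish, so that \(u=O(R^{-1})\) and \(\nabla u = O(R^{-2})\) on \({\Gamma _R}\), and the same count gives \(O(R)\,O(R^{-1})\,O(R^{-2}) = O(R^{-2})\).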

The expressions for \(\textbf{Q}_\pm \cdot \textbf{n}\) (4.21) and the single-layer potential jump relations (4.2) then imply that

$$\begin{aligned} \int _\Gamma (\textbf{Q}_- - \textbf{Q}_+) \cdot \textbf{n}\,\textrm{d}s = 2\Big ( \big (({\widetilde{\textbf{Z}}}\cdot \textbf{n})D' + {\widetilde{\textbf{Z}}}\cdot \nabla _{\Gamma } S+ \alpha S\big ) \phi , \phi \Big )_{{L^2(\Gamma )}}. \end{aligned}$$
(4.26)

A key identity to help one see this is

$$\begin{aligned} \left( \partial _n^- u(\textbf{x})\right) ^2-\left( \partial _n^+ u(\textbf{x})\right) ^2 = 2\,\phi (\textbf{x})\,\big (D^\prime \phi (\textbf{x})\big ) \quad \text { for a.e. }\textbf{x}\in \Gamma , \end{aligned}$$

which can be established using \(a^2 -b^2 = (a-b)(a+b)\) and the jump relations (4.2) for \(\partial ^{\pm }_n u\).
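Written out, and assuming that the jump relations (4.2) take the standard form \(\partial _n^{\pm }{{\mathcal {S}}}\phi = (\mp \frac{1}{2}I + D')\phi \) (they are not restated in this section), the computation is

$$\begin{aligned} \left( \partial _n^- u\right) ^2-\left( \partial _n^+ u\right) ^2&= \big (\partial _n^- u - \partial _n^+ u\big )\big (\partial _n^- u + \partial _n^+ u\big ) \\&= \left( \left( \tfrac{1}{2}I + D'\right) \phi - \left( -\tfrac{1}{2}I + D'\right) \phi \right) \left( \left( \tfrac{1}{2}I + D'\right) \phi + \left( -\tfrac{1}{2}I + D'\right) \phi \right) \\&= \phi \cdot 2 D'\phi \end{aligned}$$

almost everywhere on \(\Gamma \).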

Combining (4.25), (4.26), and (2.28), we therefore have that

$$\begin{aligned}{} & {} 2 \Big ( \big (({\widetilde{\textbf{Z}}}\cdot \textbf{n})D' + {\widetilde{\textbf{Z}}}\cdot \nabla _{\Gamma } S+ \alpha S\big ) \phi , \phi \Big )_{{L^2(\Gamma )}} \nonumber \\{} & {} \quad = \int _{{\Omega ^-}\cup {\Omega ^+}}\Big ( 2 \partial _i \widetilde{Z}_j \partial _i u \partial _j u + \big (2 \alpha - \nabla \cdot {\widetilde{\textbf{Z}}}\big )|\nabla u|^2\Big ) \textrm{d}\textbf{x}. \end{aligned}$$
(4.27)

Using the Cauchy-Schwarz inequality and the definition of the matrix 2-norm for the term involving \(2 \, \partial _i \widetilde{Z}_j \partial _i u \partial _j u= 2 \,\nabla u\cdot (D{\widetilde{\textbf{Z}}}\, \nabla u)\), and then standard results about integrals for both this term and the term involving \(\nabla \cdot {\widetilde{\textbf{Z}}}\), we find that the right-hand side of (4.27) is

$$\begin{aligned} \ge \left( 2\alpha - \left( 2\sup _{\textbf{x}\in \mathbb {R}^d} \big \Vert D{\widetilde{\textbf{Z}}}\big \Vert _2 + \big \Vert \nabla \cdot {\widetilde{\textbf{Z}}}\big \Vert _{L^\infty (\mathbb {R}^d)}\right) \right) \int _{{\Omega ^-}\cup {\Omega ^+}} |\nabla u|^2 \, \textrm{d}\textbf{x}. \end{aligned}$$

Therefore, choosing \(\alpha \) to satisfy the lower bound (4.7) establishes the lemma with the \(+\) sign in (4.8). Multiplying (4.27) by \(-1\) and letting \(\alpha \mapsto -\alpha \) we see again that if \(\alpha \) satisfies (4.7) then this modified right-hand side is \(\ge 0\), which establishes the lemma with the \(-\) sign in (4.8). \(\square \)
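The pointwise inequality behind the lower bound on the right-hand side of (4.27) can be sanity-checked numerically. The sketch below is only an illustration: it draws random matrices standing in for \(D{\widetilde{\textbf{Z}}}\) and random vectors standing in for \(\nabla u\) at a single point (the names `DZ`, `grad_u`, and the value of `alpha` are hypothetical), and confirms that the integrand is never smaller than the claimed lower bound.

```python
import numpy as np

rng = np.random.default_rng(2)
d, alpha = 3, 0.7  # hypothetical dimension and constant alpha

worst_gap = np.inf
for _ in range(1000):
    DZ = rng.standard_normal((d, d))   # stands in for the Jacobian of Z at a point
    grad_u = rng.standard_normal(d)    # stands in for grad u at the same point
    div_Z = np.trace(DZ)
    g2 = grad_u @ grad_u

    # Integrand of (4.27): 2 grad_u . (DZ grad_u) + (2 alpha - div Z) |grad_u|^2
    integrand = 2.0 * grad_u @ (DZ @ grad_u) + (2.0 * alpha - div_Z) * g2
    # Lower bound from Cauchy-Schwarz and the matrix 2-norm
    lower = (2.0 * alpha - (2.0 * np.linalg.norm(DZ, 2) + abs(div_Z))) * g2

    worst_gap = min(worst_gap, integrand - lower)
```

A non-negative `worst_gap` (up to rounding) is what the argument predicts; integrating the pointwise bound and replacing the pointwise norms by their suprema gives the displayed estimate.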

Remark 4.8

(Recovering the results of Lemma 4.2 for \(d\ge 3\)) If \(d\ge 3\) and \({\widetilde{\textbf{Z}}}=\textbf{x}\), (4.27) becomes

$$\begin{aligned} 2 \Big ( \big ((\textbf{x}\cdot \textbf{n})D' + \textbf{x}\cdot \nabla _{\Gamma } S+ \alpha S\big ) \phi , \phi \Big )_{{L^2(\Gamma )}} = (2+ 2\alpha -d)\int _{{\Omega ^-}\cup {\Omega ^+}}|\nabla u|^2\, \textrm{d}\textbf{x}.\nonumber \\ \end{aligned}$$
(4.28)

This is because, despite the additional terms in the analogue of (4.22) coming from \({\widetilde{\textbf{Z}}}\) no longer having compact support, it turns out that \(\int _{{\Gamma _R}}Q_R \,\textrm{d}s =O(R^{2-d})\) as \(R\rightarrow \infty \) as before. Letting \(\alpha =(d-2)/2\) in (4.28) and recalling the definition (1.18) of \(K_{\textbf{Z}}'\), we obtain the second equality in (4.5).
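The factor \(2+2\alpha -d\) in (4.28) comes from a one-line computation: for \({\widetilde{\textbf{Z}}}=\textbf{x}\) we have \(\partial _i \widetilde{Z}_j = \delta _{ij}\) and \(\nabla \cdot {\widetilde{\textbf{Z}}}= d\), so the integrand on the right-hand side of (4.27) becomes

$$\begin{aligned} 2\, \delta _{ij}\, \partial _i u\, \partial _j u + (2\alpha - d)|\nabla u|^2 = (2 + 2\alpha - d)|\nabla u|^2, \end{aligned}$$

which vanishes identically when \(\alpha = (d-2)/2\).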

Remark 4.9

(Link with Verchota’s proof of invertibility of \(\frac{1}{2}I- D'\) on \({L^2(\Gamma )}\) ) Verchota’s proof that \(\frac{1}{2}I-D'\) is invertible on \({L^2(\Gamma )}\) when \(\Gamma \) is Lipschitz in [101, Theorem 3.1] relies on the inequalities

$$\begin{aligned} \left\| \left( \frac{1}{2}I - D'\right) \phi \right\| _{L^2(\Gamma )}\lesssim \left\| \left( \frac{1}{2}I + D'\right) \phi \right\| _{L^2(\Gamma )}\lesssim \left\| \left( \frac{1}{2}I - D'\right) \phi \right\| _{L^2(\Gamma )},\nonumber \\ \end{aligned}$$
(4.29)

which hold for all \(\phi \in {L^2(\Gamma )}\) for \(d\ge 3\) and for all \(\phi \in {L^2(\Gamma )}\) with \(P_\Gamma \phi =0\) for \(d=2\), and where the omitted constants depend only on the Lipschitz character of \({\Omega ^-}\). (Note that [101, Theorem 2.1] proves the slightly weaker result that

$$\begin{aligned} \left\| \left( \frac{1}{2}I \pm D_0'\right) \phi \right\| _{L^2(\Gamma )}\lesssim \left\| \left( \frac{1}{2}I \mp D_0'\right) \phi \right\| _{L^2(\Gamma )}+ \left| \int _\Gamma S_0\phi \, \textrm{d}s\right| , \end{aligned}$$

but the final term on the right-hand side can be eliminated; see [68, Chapter 15, Corollary 1, Page 273] when \(\Gamma \) is the graph of a function and [3, Corollary 2.20] for \({\Omega ^-}\) bounded.)

The inequalities in (4.29) can be obtained by applying the following Dirichlet-to-Neumann and Neumann-to-Dirichlet map bounds with \(u={{\mathcal {S}}}\phi \) and using the jump relations (4.2).

  (i) If \(u\in H^1({\Omega ^-})\) is such that \(\Delta u=0\) in \({\Omega ^-}\), \(\gamma ^- u\in H^1(\Gamma )\), and \(\partial _n^- u\in {L^2(\Gamma )}\), then

    $$\begin{aligned} \left\| \nabla _{\Gamma }(\gamma ^- u)\right\| _{L^2(\Gamma )}\lesssim \left\| \partial _n^- u\right\| _{{L^2(\Gamma )}} \lesssim \left\| \nabla _{\Gamma }(\gamma ^- u)\right\| _{L^2(\Gamma )}. \end{aligned}$$
    (4.30)

  (ii) If \(u\in H^1_\mathrm{{loc}}({\Omega ^+})\) is such that \(\Delta u=0\) in \({\Omega ^+}\), \(\gamma ^+ u\in H^1(\Gamma )\), \(\partial _n^+ u\in {L^2(\Gamma )}\), and \(u(\textbf{x})= O(|\textbf{x}|^{2-d})\) for \(d\ge 3\) and \(u(\textbf{x})= O(|\textbf{x}|^{-1})\) for \(d=2\), then

    $$\begin{aligned} \left\| \nabla _{\Gamma }(\gamma ^+ u)\right\| _{L^2(\Gamma )}\lesssim \left\| \partial _n^+ u\right\| _{{L^2(\Gamma )}} \lesssim \left\| \nabla _{\Gamma }(\gamma ^+ u)\right\| _{L^2(\Gamma )}. \end{aligned}$$
    (4.31)

The link with our proofs of coercivity of our new BIEs comes from the fact that the bounds (4.30) and (4.31) can be proved using the identity (4.9) with \(\alpha =0\) and \({\widetilde{\textbf{Z}}}\) the vector field of Lemma 4.5; see, e.g., [3, Corollary 2.20].

4.5 Proofs of Theorems 2.14 and 2.15 (the 2d results)

Lemma 4.10

If u is the solution of the IDP then \(P_\Gamma (\partial _n^- u)=0\). If u is the solution of the EDP and \(d=2\), then \(P_\Gamma (\partial _n^+ u)=0\).

Proof

The result for the IDP follows from applying Green’s second identity to u and the constant function. The result for the EDP when \(d=2\) follows in a similar way, using the arguments in the proof of [65, Theorem 8.9] to deal with the integral at infinity. Alternatively, the result for the EDP when \(d=2\) is proved in [52, Proof of Theorem 6.10]; see [52, Equation 6.10]. \(\square \)
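As a formal sketch of the interior case (ignoring the regularity needed to justify each step, and taking \(P_\Gamma \psi =0\) to be equivalent to \(\int _\Gamma \psi \,\textrm{d}s=0\), which is an assumption about (2.28) since that definition is not restated here), Green’s second identity applied to u and the constant function 1 in \({\Omega ^-}\) reads

$$\begin{aligned} \int _\Gamma \big (\partial _n^- u \cdot 1 - \gamma ^- u \, \partial _n 1\big )\, \textrm{d}s = \int _{{\Omega ^-}} \big (\Delta u \cdot 1 - u\, \Delta 1\big )\, \textrm{d}\textbf{x}= 0, \end{aligned}$$

since \(\Delta u=0\) and all derivatives of the constant function vanish; hence \(\int _\Gamma \partial _n^- u\, \textrm{d}s=0\).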

Proof of Theorem 2.14

For Parts (i) and (iii), arguing exactly as in the proofs of Part (i) of Theorems 2.1 and 2.2 gives

$$\begin{aligned} A'_{I,\textbf{Z},\alpha }\partial _n^- u=B_{I,\textbf{Z},\alpha }g_D \quad \text { and }\quad A'_{E,\textbf{Z},\alpha }\partial _n^+ u=B_{E,\textbf{Z},\alpha }g_D + \alpha u_\infty , \end{aligned}$$
(4.32)

where \(u_\infty \) is the limit of the solution of the EDP at infinity and we use Green’s integral representation \(u(\textbf{x}) = -{{\mathcal {S}}}\partial _n^+ u(\textbf{x}) + {{\mathcal {D}}}\gamma ^+ u(\textbf{x}) + u_\infty \) for \(\textbf{x}\in {\Omega ^+}\) and \(d=2\). The BIEs (2.30) and (2.34) then follow by applying \(Q_\Gamma = I-P_\Gamma \) to the equations in (4.32) and then using that \(P_\Gamma \partial ^{\pm }_n u=0\) by Lemma 4.10, so that \(\partial ^{\pm }_n u = Q_\Gamma \partial ^{\pm }_n u\).

For Part (ii), taking the non-tangential limit of u defined by (2.32) and using the jump relations (1.16) and (4.2) (similar to the proof of Part (ii) of Theorem 2.1) and the fact that \(Q_\Gamma = I-P_\Gamma \), we obtain that \(\gamma ^- u =g_D\) if the BIE (2.31) holds. Exactly as in the analogous proof for \(d\ge 3\) in Sect. 4.1, \(\mathcal {K}_{\textbf{Z}}\psi \) and \({{\mathcal {S}}}\psi \) with \(\psi \in {L^2(\Gamma )}\) are in \(C^2({\Omega ^-})\), have non-tangential maximal functions in \({L^2(\Gamma )}\), and satisfy Laplace’s equation; therefore u defined by (2.32) inherits these properties.

The proof of Part (iv) is very similar to the proof of Part (ii), except that we now need to show that u defined by (2.36) satisfies \(u(\textbf{x}) = O(1)\) as \(|\textbf{x}|\rightarrow \infty \); these asymptotics follow from the first bound in (4.24) (since \(P_\Gamma Q_\Gamma \phi =0\)) and the bound (4.3).

To see Part (v), arguing as in the proof of Part (vi) of Theorems 2.1 and 2.2 below Lemma 4.5, but using (4.8) with \(d=2\), we see that \((A\psi ,\psi )_{{L^2(\Gamma )}}\ge (c/2)\Vert \psi \Vert ^2_{{L^2(\Gamma )}}\) for all real-valued \(\psi \in L^2_0(\Gamma ):= \{\phi \in {L^2(\Gamma )}:P_\Gamma \phi =0\}\) if \(\alpha \) satisfies (2.7), where A denotes any of \(A_{I,\textbf{Z},\alpha }\), \(A'_{I,\textbf{Z},\alpha }\), \(A_{E,\textbf{Z},\alpha }\), or \(A'_{E,\textbf{Z},\alpha }\). Part (v) then follows from the fact that if \((A\psi ,\psi )_{{L^2(\Gamma )}}\ge (c/2)\Vert \psi \Vert ^2_{{L^2(\Gamma )}}\) for all real-valued \(\psi \in L^2_0(\Gamma )\) (so that A is coercive on \(L^2_0(\Gamma )\) with coercivity constant c/2), then \(Q_\Gamma A Q_\Gamma + cP_\Gamma /2\) is coercive on \({L^2(\Gamma )}\) with coercivity constant c/2. Indeed, since \(Q_\Gamma '=Q_\Gamma \), \(P_\Gamma ^2=P_\Gamma \), \(P_\Gamma '=P_\Gamma \), and \(P_\Gamma Q_\Gamma =0\), it follows that, for all real-valued \(\psi \in {L^2(\Gamma )}\), \(Q_\Gamma \psi \in L^2_0(\Gamma )\) and

$$\begin{aligned} \left( \left( Q_\Gamma A Q_\Gamma + \frac{c}{2}P_\Gamma \right) \psi ,\psi \right) _{{L^2(\Gamma )}}&=\big ( A Q_\Gamma \psi ,Q_\Gamma \psi \big )_{{L^2(\Gamma )}} +\frac{c}{2}\big (P_\Gamma ^2 \psi ,\psi \big )_{{L^2(\Gamma )}} \\&\ge \frac{c}{2}\left\| Q_\Gamma \psi \right\| ^2_{{L^2(\Gamma )}} + \frac{c}{2}\left\| P_\Gamma \psi \right\| ^2_{{L^2(\Gamma )}} = \frac{c}{2}\left\| \psi \right\| ^2_{L^2(\Gamma )}. \end{aligned}$$

\(\square \)
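The last step of the proof has a direct finite-dimensional analogue that can be checked numerically. In the sketch below, which is an illustration only, \(P_\Gamma \) is modelled by the orthogonal projection `P` onto constant vectors in \(\mathbb {R}^n\) (an assumption about (2.28)), and `A` is a hypothetical matrix that is coercive with constant c/2 on the mean-zero subspace but not on the whole space. The names `P`, `Q`, `G`, `K`, and `coercivity_constant` are not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, c = 6, 1.0

ones = np.ones((n, 1))
P = ones @ ones.T / n      # orthogonal projection onto constant vectors
Q = np.eye(n) - P          # projection onto mean-zero vectors

G = rng.standard_normal((n, n))
G = G @ G.T                # symmetric positive semidefinite part
K = rng.standard_normal((n, n))
K = K - K.T                # skew-symmetric part: (K psi, psi) = 0 for real psi

# (A psi, psi) >= (c/2)|psi|^2 when P psi = 0, but the -50 P term destroys
# coercivity in the constant direction.
A = 0.5 * c * np.eye(n) + G + K - 50.0 * P


def coercivity_constant(M):
    """Largest constant a with (M psi, psi) >= a |psi|^2 for all real psi."""
    return np.linalg.eigvalsh(0.5 * (M + M.T)).min()


const_A = coercivity_constant(A)
const_modified = coercivity_constant(Q @ A @ Q + 0.5 * c * P)
```

Here `const_A` is negative, whereas `const_modified` equals c/2 up to rounding, in agreement with the displayed chain of inequalities.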

Proof of Theorem 2.15

For Parts (i) and (iii), taking \(\textbf{Z}=\textbf{x}\) and \(\alpha =0\) in (4.32) yields

$$\begin{aligned} A'_{I,\textbf{x},0} \partial _n^- u=B_{I,\textbf{x},0} \,g_D \quad \text { and }\quad A'_{E,\textbf{x},0} \partial _n^+ u=B_{E,\textbf{x},0} \,g_D. \end{aligned}$$

Since \(P_\Gamma \partial ^{\pm }_n u=0\) by Lemma 4.10, the BIEs (2.37) and (2.39) follow.

The proofs of Parts (ii) and (iv) follow in the same way as the proofs of Parts (ii) and (iv) of Theorem 2.14, namely by taking non-tangential limits of u, using the jump relations (1.16) and (4.2), and using the asymptotics (4.3) for the exterior problem.

Part (v) follows immediately from using the second equation in (4.6). \(\square \)

4.6 Proof of the results in Sect. 2.2.2 (the conditioning results)

4.6.1 Proof of Theorem 2.11

Theorem 2.11 is a special case of the following general theorem about GMRES applied to Galerkin linear systems of a continuous and coercive operator on a Hilbert space. We first establish some notation.

As in Sect. 1.2, we consider the Galerkin method applied to the equation \(A\phi = f\), where \(\phi , f \in {{\mathcal {H}}}\), \(A:{{\mathcal {H}}}\rightarrow {{\mathcal {H}}}\) is a continuous (i.e. bounded) linear operator, and \({{\mathcal {H}}}\) is a Hilbert space over \(\mathbb {C}\). Let \({{\mathcal {H}}}_N \subset {{\mathcal {H}}}\) be such that \({{\mathcal {H}}}_N = \textrm{span} \{\psi ^{N}_1,\ldots , \psi ^{N}_{M_N}\}\), with \(M_N=\dim ({{\mathcal {H}}}_N)\) and \(\{\psi ^{N}_1,\ldots , \psi ^{N}_{M_N}\}\) a basis for \({{\mathcal {H}}}_N\). The Galerkin matrix of A is then defined by \(({\textsf{A}})_{ij}:=(A\psi ^{N}_j,\psi ^{N}_i)_{{{\mathcal {H}}}}\), \(i,j=1,\ldots , M_N\) (compare to (2.16)).

The rest of the set up of Sect. 2.2.2 then holds exactly as stated; i.e., we consider the equation \({\textsf{A}}\textbf{x}= \textbf{b}\), let \(\textbf{y}_m\) be the mth iterate when the linear system (2.19) is solved using GMRES with zero initial guess, let \(\textbf{r}_m\) denote the corresponding residual, and let \(\phi _N^{m}\) be defined by (2.20), so that the Galerkin solution \(\phi _N\) is given by (2.21).

Theorem 4.11

(Convergence of GMRES applied to the Galerkin linear system of a continuous and coercive operator) Suppose that \(A:{{\mathcal {H}}}\rightarrow {{\mathcal {H}}}\) is coercive (i.e., there exists \(C_{\textrm{coer}}>0\) such that (1.8) holds) and Assumption 2.8 holds with \(\Vert \cdot \Vert _{{L^2(\Gamma )}}\) in (2.17) replaced by \(\Vert \cdot \Vert _{{{\mathcal {H}}}}\). With \(C_1\) and \(C_2\) as in (2.17), let \(\beta \in [0,\pi /2)\) be defined such that

$$\begin{aligned} \cos \beta = \frac{C_{\textrm{coer}}}{\Vert A\Vert _{{{\mathcal {H}}}\rightarrow {{\mathcal {H}}}}}\left( \frac{C_1}{C_2}\right) ^{2}\quad \text { and let } \quad \gamma _\beta := 2 \sin \left( \frac{\beta }{4-2\beta /\pi }\right) . \end{aligned}$$
(4.33)

Given \(\varepsilon >0\), if

$$\begin{aligned} m\ge \left( \log \left( \frac{1}{\gamma _\beta }\right) \right) ^{-1} \left[ \log \left( \frac{12 \Vert A\Vert _{{{\mathcal {H}}}\rightarrow {{\mathcal {H}}}}}{C_{\textrm{coer}}}\left( \frac{C_2}{C_1}\right) ^3\right) + \log \left( \frac{1}{\varepsilon }\right) \right] , \end{aligned}$$
(4.34)

then

$$\begin{aligned} \frac{ \left\| \phi -\phi _N^m\right\| _{{{\mathcal {H}}}}}{\left\| \phi \right\| _{{{\mathcal {H}}}}} \le (1+\varepsilon )\frac{\Vert A\Vert _{{{\mathcal {H}}}\rightarrow {{\mathcal {H}}}}}{C_{\textrm{coer}}}\left( \min _{\psi \in {{\mathcal {H}}}_N}\frac{\left\| \phi -\psi \right\| _{{{\mathcal {H}}}}}{\left\| \phi \right\| _{{{\mathcal {H}}}}}\right) + \varepsilon . \end{aligned}$$
(4.35)
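To make the iteration count in (4.34) concrete, the following sketch evaluates \(\beta \) and \(\gamma _\beta \) from (4.33) and the smallest integer m satisfying (4.34). The function names and the sample constants are hypothetical; the sketch assumes \(0<\cos \beta <1\) so that \(\log (1/\gamma _\beta )\) is finite and positive.

```python
import math


def gamma_beta(beta):
    """gamma_beta from (4.33) for an angle beta in [0, pi/2)."""
    return 2.0 * math.sin(beta / (4.0 - 2.0 * beta / math.pi))


def gmres_iteration_bound(norm_a, c_coer, c1, c2, eps):
    """Smallest integer m satisfying (4.34); assumes 0 < cos(beta) < 1."""
    cos_beta = (c_coer / norm_a) * (c1 / c2) ** 2
    if not 0.0 < cos_beta < 1.0:
        raise ValueError("this sketch needs 0 < cos(beta) < 1")
    g = gamma_beta(math.acos(cos_beta))
    prefactor = 12.0 * (norm_a / c_coer) * (c2 / c1) ** 3
    m = math.ceil((math.log(prefactor) + math.log(1.0 / eps)) / math.log(1.0 / g))
    return m, g, prefactor


# Hypothetical values: ||A|| = 2, C_coer = 1, orthonormal basis (C_1 = C_2 = 1).
m, g, prefactor = gmres_iteration_bound(2.0, 1.0, 1.0, 1.0, 1e-3)
```

For these sample values \(\cos \beta = 1/2\), so \(\beta = \pi /3\) and \(\gamma _\beta = 2\sin (\pi /10)\approx 0.618\); the resulting m is the first integer for which `prefactor * g**m` drops below \(\varepsilon \).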

The first step in proving Theorem 4.11 is to establish the following relationship between the error \(\left\| \phi -\phi _N^m\right\| _{{{\mathcal {H}}}}\), the GMRES relative residual \(\left\| \textbf{r}_m\right\| _2/\left\| \textbf{r}_0\right\| _2\), and the Galerkin error \(\left\| \phi -\phi _N\right\| _{{{\mathcal {H}}}}\).

Lemma 4.12

Suppose that \(A:{{\mathcal {H}}}\rightarrow {{\mathcal {H}}}\) is coercive (i.e., there exists \(C_{\textrm{coer}}>0\) such that (1.8) holds) and Assumption 2.8 holds with \(\Vert \cdot \Vert _{{L^2(\Gamma )}}\) in (2.17) replaced by \(\Vert \cdot \Vert _{{{\mathcal {H}}}}\). If \(C_1\) and \(C_2\) are as in (2.17) and \(\phi _N^m\) is defined by (2.20), then

$$\begin{aligned} \frac{ \left\| \phi -\phi _N^m\right\| _{{{\mathcal {H}}}}}{\left\| \phi \right\| _{{{\mathcal {H}}}}}\le & {} \left( 1 + \frac{\Vert A\Vert _{{{\mathcal {H}}}\rightarrow {{\mathcal {H}}}}}{C_{\textrm{coer}}}\left( \frac{C_2}{C_1}\right) ^3 \frac{\left\| \textbf{r}_m\right\| _2}{\left\| \textbf{r}_0\right\| _2}\right) \frac{\left\| \phi -\phi _N\right\| _{{{\mathcal {H}}}}}{\left\| \phi \right\| _{{{\mathcal {H}}}}} \nonumber \\{} & {} + \frac{\Vert A\Vert _{{{\mathcal {H}}}\rightarrow {{\mathcal {H}}}}}{C_{\textrm{coer}}}\left( \frac{C_2}{C_1}\right) ^3 \frac{\left\| \textbf{r}_m\right\| _2}{\left\| \textbf{r}_0\right\| _2}. \end{aligned}$$
(4.36)

The right-hand side of (4.36) contains the relative residual \(\left\| \textbf{r}_m\right\| _2/\left\| \textbf{r}_0\right\| _2\). The following bound, from [5], gives sufficient conditions on m for this relative residual to be controllably small; recall that this bound is an improvement of the so-called “Elman estimate” from [28, 29].

Theorem 4.13

(Elman-type estimate for GMRES from [5]) Let \({\textsf{C}}\) be an \(M_N\times M_N\) matrix with \(0\notin W({\textsf{C}})\), where

$$\begin{aligned} W({\textsf{C}}):= \big \{ \langle {\textsf{C}}\textbf{v}, \textbf{v}\rangle : \textbf{v}\in \mathbb {C}^{M_N}, \Vert \textbf{v}\Vert _2=1\big \} \end{aligned}$$

is the field of values, also called the numerical range, of \({\textsf{C}}\). Let \(\beta \in [0,\pi /2)\) be such that

$$\begin{aligned} \cos \beta \le \frac{\textrm{dist}\big (0, W({\textsf{C}})\big )}{\Vert {\textsf{C}}\Vert _2} \end{aligned}$$
(4.37)

(observe that \(\cos \beta \) is indeed \(\le 1\) by the definition of \(W({\textsf{C}})\)) and, given \(\beta \), let

$$\begin{aligned} \gamma _\beta := 2 \sin \left( \frac{\beta }{4-2\beta /\pi }\right) . \end{aligned}$$

Let \(\textbf{r}_m\) be the mth GMRES residual, as defined in Sect. 2.2.2. Then

$$\begin{aligned} \frac{\Vert \textbf{r}_m\Vert _{2}}{\Vert \textbf{r}_0\Vert _{2}} \le \left( 2 + \frac{2}{\sqrt{3}}\right) \big (2+ \gamma _\beta \big ) \,(\gamma _\beta )^m \le 12 (\gamma _\beta )^m. \end{aligned}$$
(4.38)
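The bound (4.38) can be compared with actual GMRES residuals on a small example. The sketch below is an illustration under stated assumptions: the test matrix `C` is a hypothetical real perturbation of the identity, the residuals are computed naively by dense least squares over the Krylov space rather than by a production GMRES implementation, and \(\cos \beta \) is taken to be the smallest eigenvalue of the symmetric part of `C` divided by \(\Vert {\textsf{C}}\Vert _2\), which is a lower bound for the ratio in (4.37) and is therefore admissible.

```python
import numpy as np


def gmres_relative_residuals(C, b, m_max):
    """Relative residuals ||r_m||_2 / ||r_0||_2 for zero initial guess.

    Naive dense version: minimise ||b - C y||_2 over y in the Krylov space
    span{b, C b, ..., C^(m-1) b}, which is what GMRES computes at step m.
    """
    krylov = [b]
    for _ in range(m_max - 1):
        krylov.append(C @ krylov[-1])
    rel = []
    for m in range(1, m_max + 1):
        basis, _ = np.linalg.qr(np.column_stack(krylov[:m]))
        coeffs, *_ = np.linalg.lstsq(C @ basis, b, rcond=None)
        rel.append(np.linalg.norm(b - C @ basis @ coeffs) / np.linalg.norm(b))
    return np.array(rel)


rng = np.random.default_rng(0)
n, m_max = 10, 6
N = rng.standard_normal((n, n))
C = np.eye(n) + 0.5 * N / np.linalg.norm(N, 2)  # hypothetical test matrix
b = rng.standard_normal(n)

# For real C, Re<Cv, v> >= lambda_min((C + C^T)/2) |v|^2 for complex v, so this
# ratio is a lower bound for dist(0, W(C)) / ||C||_2.
cos_beta = np.linalg.eigvalsh(0.5 * (C + C.T)).min() / np.linalg.norm(C, 2)
beta = np.arccos(cos_beta)
gamma = 2.0 * np.sin(beta / (4.0 - 2.0 * beta / np.pi))

rel = gmres_relative_residuals(C, b, m_max)
bound = (2.0 + 2.0 / np.sqrt(3.0)) * (2.0 + gamma) * gamma ** np.arange(1, m_max + 1)
```

Each entry of `rel` should lie below the corresponding entry of `bound`; the bound is typically far from sharp for a matrix this close to the identity.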

Proof of Lemma 4.12

We first use continuity and coercivity of A to obtain bounds on the norm of \({\textsf{D}}_N^{-1/2}{\textsf{A}}{\textsf{D}}_N^{-1/2}\) and its inverse. By the definition (2.16),

$$\begin{aligned} \big ({\textsf{A}}\textbf{v},\textbf{w}\big )_2 = \big (A v_N,w_N\big )_{{{\mathcal {H}}}} \quad \text { for all }v_N, w_N \in {{\mathcal {H}}}_N. \end{aligned}$$

Using this, along with the norm equivalence (2.17), we find that, for all \(\textbf{v}, \textbf{w}\in \mathbb {C}^{M_N}\),

$$\begin{aligned} \big |\big ({\textsf{A}}\textbf{v},\textbf{w}\big )_2\big | \le \left\| A\right\| _{{{\mathcal {H}}}\rightarrow {{\mathcal {H}}}} \left\| v_N\right\| _{{{\mathcal {H}}}} \left\| w_N\right\| _{{{\mathcal {H}}}} \le \left\| A\right\| _{{{\mathcal {H}}}\rightarrow {{\mathcal {H}}}}(C_2)^2 \big \Vert {\textsf{D}}_N^{1/2} \textbf{v}\big \Vert _2 \big \Vert {\textsf{D}}_N^{1/2} \textbf{w}\big \Vert _2 \end{aligned}$$

and

$$\begin{aligned} \big |\big ({\textsf{A}}\textbf{v},\textbf{v}\big )_2\big | \ge C_{\textrm{coer}}\left\| v_N\right\| _{{{\mathcal {H}}}}^2 \ge C_{\textrm{coer}}(C_1)^2 \big \Vert {\textsf{D}}_N^{1/2}\textbf{v}\big \Vert _2^2. \end{aligned}$$

Letting \(\widetilde{\textbf{v}}= {\textsf{D}}_N^{1/2}\textbf{v}\) and \(\widetilde{\textbf{w}}= {\textsf{D}}_N^{1/2}\textbf{w}\), we therefore have that, for all \(\widetilde{\textbf{v}}, \widetilde{\textbf{w}}\in \mathbb {C}^{M_N}\),

$$\begin{aligned} \big | \big ( {\textsf{D}}_N^{-1/2}{\textsf{A}}{\textsf{D}}_N^{-1/2}\widetilde{\textbf{v}},\widetilde{\textbf{w}}\big )_2 \big | \le \left\| A\right\| _{{{\mathcal {H}}}\rightarrow {{\mathcal {H}}}}(C_2)^2 \left\| \widetilde{\textbf{v}}\right\| _2 \left\| \widetilde{\textbf{w}}\right\| _2 \end{aligned}$$
(4.39)

and

$$\begin{aligned} \big | \big ( {\textsf{D}}_N^{-1/2}{\textsf{A}}{\textsf{D}}_N^{-1/2}\widetilde{\textbf{v}},\widetilde{\textbf{v}}\big )_2 \big | \ge C_{\textrm{coer}}(C_1)^2 \left\| \widetilde{\textbf{v}}\right\| _2^2. \end{aligned}$$
(4.40)

The inequalities (4.39) and (4.40) then imply that

$$\begin{aligned}{} & {} \big \Vert {\textsf{D}}_N^{-1/2}{\textsf{A}}{\textsf{D}}_N^{-1/2}\big \Vert _2 \le \left\| A\right\| _{{{\mathcal {H}}}\rightarrow {{\mathcal {H}}}} (C_2)^2 \quad \text { and }\quad \nonumber \\{} & {} \quad \textrm{dist}\Big (0, W({\textsf{D}}_N^{-1/2}{\textsf{A}}{\textsf{D}}_N^{-1/2})\Big )\ge C_{\textrm{coer}}(C_1)^2, \end{aligned}$$
(4.41)

with the second inequality and the Lax–Milgram theorem then implying that

$$\begin{aligned} \big \Vert \big ({\textsf{D}}_N^{-1/2}{\textsf{A}}{\textsf{D}}_N^{-1/2}\big )^{-1}\big \Vert _2 \le \frac{1}{C_{\textrm{coer}}(C_1)^2}. \end{aligned}$$
(4.42)

We now prove (4.36). By the definitions of \(\textbf{y}_m\) (see (2.19)) and \(\textbf{r}_m\),

$$\begin{aligned} \textbf{r}_m ={\textsf{D}}_N^{-1/2}{\textsf{A}}{\textsf{D}}_N^{-1/2}(\textbf{y}_m-\textbf{y}) \end{aligned}$$

and (since \(\textbf{y}_0=\textbf{0}\)) \(\textbf{r}_0 = -{\textsf{D}}_N^{-1/2}\textbf{b}= -{\textsf{D}}_N^{-1/2}{\textsf{A}}{\textsf{D}}_N^{-1/2}\textbf{y}\). Therefore, by (4.42) and the first bound in (4.41),

$$\begin{aligned} \left\| \textbf{y}_m-\textbf{y}\right\| _2&\le \big \Vert \big ({\textsf{D}}_N^{-1/2}{\textsf{A}}{\textsf{D}}_N^{-1/2}\big )^{-1}\big \Vert _2 \left\| \textbf{r}_m\right\| _2\nonumber \\&\le \frac{1}{C_{\textrm{coer}}(C_1)^2} \left( \frac{\left\| \textbf{r}_m\right\| _2}{\left\| \textbf{r}_0\right\| _2}\right) \left\| \textbf{r}_0\right\| _2 \nonumber \\&\le \frac{1}{C_{\textrm{coer}}(C_1)^2} \left( \frac{\left\| \textbf{r}_m\right\| _2}{\left\| \textbf{r}_0\right\| _2}\right) \big \Vert {\textsf{D}}_N^{-1/2}{\textsf{A}}{\textsf{D}}_N^{-1/2}\big \Vert _2 \left\| \textbf{y}\right\| _2\nonumber \\&\le \frac{\left\| A\right\| _{{{\mathcal {H}}}\rightarrow {{\mathcal {H}}}}}{C_{\textrm{coer}}}\left( \frac{C_2}{C_1}\right) ^2 \left( \frac{\left\| \textbf{r}_m\right\| _2}{\left\| \textbf{r}_0\right\| _2}\right) \left\| \textbf{y}\right\| _2. \end{aligned}$$
(4.43)

Next, the definition of \(\phi _N^m\) (2.20), the expression for \(\phi _N\) (2.21), and the norm equivalence (2.17) imply that

$$\begin{aligned} \frac{ \left\| \phi _N^m-\phi _N\right\| _{{{\mathcal {H}}}} }{ \left\| \phi _N\right\| _{{{\mathcal {H}}}} } \le \frac{C_2}{C_1} \frac{ \big \Vert {\textsf{D}}_N^{1/2}({\textsf{D}}_N^{-1/2}(\textbf{y}_m-\textbf{y}))\big \Vert _2 }{ \big \Vert {\textsf{D}}_N^{1/2}{\textsf{D}}_N^{-1/2}\textbf{y}\big \Vert _2 } = \frac{C_2}{C_1} \frac{ \big \Vert \textbf{y}_m-\textbf{y}\big \Vert _2 }{ \big \Vert \textbf{y}\big \Vert _2 }, \end{aligned}$$

and then combining this with (4.43) we obtain

$$\begin{aligned} \frac{ \left\| \phi _N^m-\phi _N\right\| _{{{\mathcal {H}}}} }{ \left\| \phi _N\right\| _{{{\mathcal {H}}}} } \le \frac{\left\| A\right\| _{{{\mathcal {H}}}\rightarrow {{\mathcal {H}}}}}{C_{\textrm{coer}}}\left( \frac{C_2}{C_1}\right) ^3 \left( \frac{\left\| \textbf{r}_m\right\| _2}{\left\| \textbf{r}_0\right\| _2}\right) . \end{aligned}$$

Combining this last inequality with the triangle inequality, we obtain that

$$\begin{aligned} \left\| \phi -\phi _N^m\right\| _{{{\mathcal {H}}}}&\le \left\| \phi -\phi _N\right\| _{{{\mathcal {H}}}} + \left\| \phi _N-\phi _N^m\right\| _{{{\mathcal {H}}}},\\&\le \left\| \phi -\phi _N\right\| _{{{\mathcal {H}}}}+ \frac{\left\| A\right\| _{{{\mathcal {H}}}\rightarrow {{\mathcal {H}}}}}{C_{\textrm{coer}}}\left( \frac{C_2}{C_1}\right) ^3 \left( \frac{\left\| \textbf{r}_m\right\| _2}{\left\| \textbf{r}_0\right\| _2}\right) \left\| \phi _N\right\| _{{{\mathcal {H}}}}, \end{aligned}$$

and then the result (4.36) follows by another use of the triangle inequality. \(\square \)
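The matrix bounds (4.41) and (4.42) used in the first part of the proof can be illustrated in a finite-dimensional setting. In this sketch, \({{\mathcal {H}}}\) is modelled by \(\mathbb {R}^n\) with the Euclidean inner product, the basis functions are the columns of a hypothetical, deliberately badly scaled matrix `Psi`, and \({\textsf{D}}_N\) is taken to be the diagonal of the Gram matrix. That choice of \({\textsf{D}}_N\) is an assumption made for illustration and is not claimed to be the one specified in Lemma 4.15; the constants \(C_1\) and \(C_2\) are then the sharpest ones for which the norm equivalence used above holds.

```python
import numpy as np

rng = np.random.default_rng(1)
n, M = 12, 5

# Hypothetical coercive operator on R^n and its continuity/coercivity constants.
N = rng.standard_normal((n, n))
A_op = np.eye(n) + 0.4 * N / np.linalg.norm(N, 2)
norm_A = np.linalg.norm(A_op, 2)
c_coer = np.linalg.eigvalsh(0.5 * (A_op + A_op.T)).min()

# Badly scaled (non-orthogonal) basis: v_N = Psi @ v.
Psi = rng.standard_normal((n, M)) * np.array([1.0, 10.0, 0.1, 3.0, 0.5])
gram = Psi.T @ Psi
D_inv_sqrt = np.diag(np.diag(gram) ** -0.5)

# Sharpest constants in C1 |D^{1/2} v| <= |v_N| <= C2 |D^{1/2} v|.
eigs = np.linalg.eigvalsh(D_inv_sqrt @ gram @ D_inv_sqrt)
C1, C2 = np.sqrt(eigs[0]), np.sqrt(eigs[-1])

# Galerkin matrix (A psi_j, psi_i) and its diagonally scaled version.
galerkin = Psi.T @ A_op @ Psi
scaled = D_inv_sqrt @ galerkin @ D_inv_sqrt

norm_scaled = np.linalg.norm(scaled, 2)
dist_lower = np.linalg.eigvalsh(0.5 * (scaled + scaled.T)).min()
norm_inv = np.linalg.norm(np.linalg.inv(scaled), 2)
```

The quantities `norm_scaled`, `dist_lower`, and `norm_inv` can then be compared with \(\Vert A\Vert (C_2)^2\), \(C_{\textrm{coer}}(C_1)^2\), and \(1/(C_{\textrm{coer}}(C_1)^2)\) respectively; `dist_lower` is the smallest eigenvalue of the symmetric part, which bounds the distance from 0 to the field of values from below.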

Proof of Theorem 4.11

By Part (c) of Theorem 1.1, the Galerkin error \(\left\| \phi -\phi _N\right\| _{{{\mathcal {H}}}}\) satisfies the quasioptimal error estimate (1.9). The definition of \(\beta \) in (4.33) and the bounds on \({\textsf{D}}_N^{-1/2}{\textsf{A}}{\textsf{D}}_N^{-1/2}\) in (4.41) imply that (4.37) is satisfied with \({\textsf{C}}= {\textsf{D}}_N^{-1/2}{\textsf{A}}{\textsf{D}}_N^{-1/2}\); note that here it is important that \({{\mathcal {H}}}\) is a Hilbert space over \(\mathbb {C}\), so that continuity and coercivity of A control \(W({\textsf{A}})\) (which involves \({\textsf{A}}\) applied to vectors in \(\mathbb {C}^{M_N}\)).

Using both (1.9) and the relative-residual bound (4.38) in (4.36), we obtain that

$$\begin{aligned} \frac{ \left\| \phi -\phi _N^m\right\| _{{{\mathcal {H}}}}}{\left\| \phi \right\| _{{{\mathcal {H}}}}} \le&\left( 1 + \frac{12\left\| A\right\| _{{{\mathcal {H}}}\rightarrow {{\mathcal {H}}}}}{C_{\textrm{coer}}}\left( \frac{C_2}{C_1}\right) ^3 (\gamma _\beta )^m\right) \frac{\Vert A\Vert _{{{\mathcal {H}}}\rightarrow {{\mathcal {H}}}}}{C_{\textrm{coer}}}\ \min _{\psi \in {{\mathcal {H}}}_N}\frac{\left\| \phi -\psi \right\| _{{{\mathcal {H}}}}}{\left\| \phi \right\| _{{{\mathcal {H}}}}} \\&+ \frac{12\left\| A\right\| _{{{\mathcal {H}}}\rightarrow {{\mathcal {H}}}}}{C_{\textrm{coer}}}\left( \frac{C_2}{C_1}\right) ^3 (\gamma _\beta )^m. \end{aligned}$$

Given \(\varepsilon >0\), if m satisfies (4.34), then

$$\begin{aligned} \frac{12\left\| A\right\| _{{{\mathcal {H}}}\rightarrow {{\mathcal {H}}}}}{C_{\textrm{coer}}}\left( \frac{C_2}{C_1}\right) ^3 (\gamma _\beta )^m \le \varepsilon \end{aligned}$$

and thus the bound (4.35) holds. \(\square \)

4.6.2 Conditions under which Assumption 2.8 holds

Our result about the convergence of GMRES applied to the Galerkin matrices of the new formulations, namely Theorem 2.11, is proved under Assumption 2.8, which is an assumption about the sequence of finite-dimensional subspaces \(({{\mathcal {H}}}_N)_{N=1}^\infty \) and their associated bases. Recall from Sect. 2.2.2 that Assumption 2.8 holds, indeed with \({\textsf{D}}_N\) the identity matrix, for any sequence \(({{\mathcal {H}}}_N)_{N=1}^\infty \) (and in any dimension \(d\ge 2\)) provided that the bases we choose are orthonormal. But many standard implementations of boundary element approximation methods use non-orthogonal bases, particularly bases of so-called nodal basis functions (e.g., [2, 38, 93, Page 216], [87, Pages 205 and 280]). We show as Lemma 4.15 below that Assumption 2.8 holds (moreover specifying the diagonal matrices \({\textsf{D}}_N\)) under mild constraints on the sequence of meshes when the approximation space allows discontinuities across elements. In particular, Lemma 4.15 holds when nodal basis functions are used, including for sequences of highly anisotropic meshes.

To specify the conditions under which Assumption 2.8 holds, we recall the notion of a surface mesh on \(\Gamma \), and aspects of the standard implementation of boundary element methods, including the notion of a reference element (for the moment, until we indicate otherwise, our results hold for any dimension \(d\ge 2\)). Following, e.g., [87, Defn. 4.1.2], we call \(\mathcal {G}\) a mesh of \(\Gamma \) if \(\mathcal {G}\) is a set of finitely many disjoint, relatively open, topologically regular subsets of \(\Gamma \) that cover \(\Gamma \) in the sense that \(\Gamma = \cup _{\tau \in \mathcal {G}} \overline{\tau }\), and are such that the relative boundary of each \(\tau \in \mathcal {G}\) has zero surface measure. We call the elements of \(\mathcal {G}\) the (boundary) elements of the mesh and, for \(\tau \in \mathcal {G}\), set

$$\begin{aligned} h_\tau := \mathop {\textrm{diam}}(\tau ) \quad \text{ and } \quad s_\tau := |\tau |, \end{aligned}$$

where \(|\tau |\) denotes the \((d-1)\)-dimensional surface measure of \(\tau \), and set \(h:= \max _{\tau \in \mathcal {G}} h_\tau \).

We assume moreover that, for each \(\tau \in \mathcal {G}\), there exists a mapping \(\chi _\tau :\widehat{\tau }\rightarrow \tau \), for some \(\widehat{\tau }\in \mathcal {R}\), the finite set of reference elements, that is bijective and at least bi-Lipschitz, so that \(\chi _\tau ^{-1}:\tau \rightarrow \widehat{\tau }\) is well-defined and also bi-Lipschitz. Here, by a reference element, \(\widehat{\tau }\), we mean, generically, some bounded, open, topologically regular subset of \(\mathbb {R}^{d-1}\), but with the idea that, in practical implementations, \(\widehat{\tau }\) is a polyhedron, usually the unit cube \(\widehat{\tau } = (0,1)^{d-1}\) or the unit simplex \(\widehat{\tau } = \{\widehat{\textbf{x}}\in (0,1)^{d-1}:\widehat{\textbf{x}}_1+\ldots + \widehat{\textbf{x}}_{d-1} < 1\}\). (In the case \(d=2\) it is usual to take \(\mathcal {R}=\{{\hat{\tau }}\}\) with \(\widehat{\tau } = (0,1)\).) For each \(\tau \in \mathcal {G}\) let \(J_\tau \in (L^\infty (\Gamma ))^{d\times (d-1)}\) denote the Jacobian of \(\chi _\tau \). Importantly, for every \(f\in L^1(\tau )\), where \(\widehat{\tau } \in \mathcal {R}\) is the domain of \(\chi _\tau \),

$$\begin{aligned} \int _\tau f(\textbf{x}) \,\textrm{d}s(\textbf{x}) = \int _{\widehat{\tau }} f(\chi _\tau (\widehat{\textbf{x}}))g_\tau (\widehat{\textbf{x}})\, \textrm{d}\widehat{\textbf{x}}, \quad \text{ where } \quad g_\tau := \big (\det (J_\tau ^TJ_\tau )\big )^{1/2}\in L^\infty (\Gamma );\nonumber \\ \end{aligned}$$
(4.44)

in particular \(s_\tau = \int _{\widehat{\tau }} g_\tau (\widehat{\textbf{x}})\, \textrm{d}\widehat{\textbf{x}}\), so that

$$\begin{aligned} g^-_\tau |\widehat{\tau }| \le s_\tau \le g^+_\tau |\widehat{\tau }|, \quad \text{ where } \quad g^+_\tau := \mathop {{\textrm{ess}} \sup }_{\widehat{\textbf{x}}\in \widehat{\tau }} g_\tau (\widehat{\textbf{x}}), \quad g^-_\tau := \mathop {{\textrm{ess}} \inf }_{\widehat{\textbf{x}}\in \widehat{\tau }} g_\tau (\widehat{\textbf{x}}).\nonumber \\ \end{aligned}$$
(4.45)
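As a small illustration of (4.44) and (4.45), the following sketch evaluates \(g_\tau = (\det (J_\tau ^TJ_\tau ))^{1/2}\) for a flat triangle in \(\mathbb {R}^3\) (so \(d=3\)) mapped affinely from the unit simplex. The function names and the sample vertices are hypothetical choices made for this example only; for an affine map \(g_\tau \) is constant, so \(g^-_\tau =g^+_\tau \) and (4.45) reduces to \(s_\tau = g_\tau |\widehat{\tau }|\).

```python
import math

def surface_element(col1, col2):
    """Return sqrt(det(J^T J)) for a 3x2 Jacobian J with columns col1 and col2."""
    e = sum(a * a for a in col1)                 # entry (1,1) of J^T J
    f = sum(a * b for a, b in zip(col1, col2))   # entry (1,2) of J^T J
    g = sum(b * b for b in col2)                 # entry (2,2) of J^T J
    return math.sqrt(max(e * g - f * f, 0.0))

def affine_triangle(v0, v1, v2):
    """g_tau and s_tau for chi(xhat) = v0 + xhat_1 (v1 - v0) + xhat_2 (v2 - v0)."""
    col1 = [b - a for a, b in zip(v0, v1)]
    col2 = [b - a for a, b in zip(v0, v2)]
    g_tau = surface_element(col1, col2)   # constant because chi is affine
    ref_measure = 0.5                     # |tau_hat| for the unit simplex in R^2
    return g_tau, g_tau * ref_measure

# Hypothetical sample elements (illustration only).
g_flat, s_flat = affine_triangle((0, 0, 0), (2, 0, 0), (0, 3, 0))
g_tilt, s_tilt = affine_triangle((1, 0, 0), (0, 1, 0), (0, 0, 1))
```

Here `s_flat` is the area of a right triangle with legs 2 and 3, and `s_tilt` is the area of an equilateral triangle with side \(\sqrt{2}\).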

For \(p\in \mathbb {N}_0\) and \({\widehat{\tau }}\in \mathcal {R}\) let \(\mathbb {P}_p^{{\widehat{\tau }}}\) denote some finite-dimensional set of polynomials \(\psi :\widehat{\tau }\rightarrow \mathbb {R}\) that contains the polynomials of (total) degree \(\le p\). When \(\widehat{\tau }\) is a simplex one usually takes \(\mathbb {P}_p^{{\widehat{\tau }}}\) to be the set of polynomials of total degree \(\le p\); when \(\widehat{\tau }\) is a cube one usually takes \(\mathbb {P}_p^{{\widehat{\tau }}}\) to be the set of polynomials of coordinate degree \(\le p\); see, e.g., [38, Page 1494, penultimate displayed equation]. Following [87, Defn. 4.1.17], given a mesh \(\mathcal {G}\) on \(\Gamma \), define the boundary element approximation space, \(S_\mathcal {G}^p\), of discontinuous piecewise polynomials of degree \(\le p\) on \(\mathcal {G}\), by

$$\begin{aligned} S^p_\mathcal {G}:= \big \{\psi \in L^\infty (\Gamma ):\psi |_\tau \circ \chi _\tau \in \mathbb {P}_p^{{\widehat{\tau }}}, \text{ for } \text{ all } \tau \in \mathcal {G}, \text{ where } {\widehat{\tau }} \text{ is } \text{ the } \text{ domain } \text{ of } \chi _\tau \big \}.\nonumber \\ \end{aligned}$$
(4.46)

Where \(P_{{\widehat{\tau }}}=\dim (\mathbb {P}_p^{{\widehat{\tau }}})\) and \(M=\dim (S^p_{\mathcal {G}})\), we equip \(S^p_\mathcal {G}\) with a basis \(\{\psi _1,\ldots , \psi _M\}\) constructed as follows. For each \(\widehat{\tau }\in \mathcal {R}\) choose a basis \(\{\psi ^{\widehat{\tau }}_1, \ldots ,\psi ^{\widehat{\tau }}_{P_{{\widehat{\tau }}}}\}\) for \(\mathbb {P}_p^{{\widehat{\tau }}}\) (for example, a nodal basis as in [2, 38]). For each \(\tau \in \mathcal {G}\), where \(\widehat{\tau }\) is the domain of \(\chi _\tau \), define \(\psi ^{\tau }_j\in L^\infty (\Gamma )\), for \(j=1,\ldots ,P_{{\widehat{\tau }}}\), by

$$\begin{aligned} \psi ^\tau _j(\textbf{x}):= {\left\{ \begin{array}{ll} \psi ^{\widehat{\tau }}_j\big (\chi _\tau ^{-1}(\textbf{x})\big ), &{} \textbf{x}\in \tau ,\\ 0, &{} \textbf{x}\in \Gamma \setminus \tau . \end{array}\right. } \end{aligned}$$
(4.47)

Then set

$$\begin{aligned} \big \{\psi _1,\ldots , \psi _M\big \} = \big \{\psi ^\tau _j:\tau \in \mathcal {G}, j\in \{1,\ldots ,P_{{\widehat{\tau }}}\}\big \}, \end{aligned}$$
(4.48)

noting that (see, e.g., [38, p. 1495]) \(\{\psi _1,\ldots , \psi _M\}\) is a nodal basis if each \(\{\psi ^{\widehat{\tau }}_1, \ldots ,\psi ^{\widehat{\tau }}_{P_{{\widehat{\tau }}}}\}\), \(\widehat{\tau }\in \mathcal {R}\), is a nodal basis.
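The construction (4.47), (4.48) can be sketched in code as a numbering of the pairs \((\tau ,j)\). This is only an illustration: (4.48) defines the basis as a set, so the ordering used below, the zero-based indices, and the names `local_dimension` and `global_numbering` are assumptions of the sketch. The local dimensions are those of the usual choices mentioned above (total degree \(\le p\) on a simplex, coordinate degree \(\le p\) on a cube), with `ref_dim` standing for \(d-1\).

```python
from math import comb

def local_dimension(kind, p, ref_dim):
    """Dimension P of the usual polynomial space on a reference element in R^ref_dim."""
    if kind == "simplex":
        return comb(p + ref_dim, ref_dim)   # polynomials of total degree <= p
    if kind == "cube":
        return (p + 1) ** ref_dim           # polynomials of coordinate degree <= p
    raise ValueError("unknown reference element: " + kind)

def global_numbering(element_kinds, p, ref_dim):
    """Map each pair (element index, local index j) to a global basis index."""
    numbering = {}
    for tau, kind in enumerate(element_kinds):
        for j in range(local_dimension(kind, p, ref_dim)):
            numbering[(tau, j)] = len(numbering)
    return numbering

# Hypothetical mixed mesh: three triangles and two quadrilaterals, p = 1, d = 3.
kinds = ["simplex"] * 3 + ["cube"] * 2
numbering = global_numbering(kinds, p=1, ref_dim=2)
M = len(numbering)
```

Because the space consists of discontinuous piecewise polynomials, no degrees of freedom are shared between elements, so \(M\) is simply the sum of the local dimensions.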

Consider now the case that we keep \(\mathcal {R}\), p, and the bases \(\{\psi ^{\widehat{\tau }}_1, \ldots ,\psi ^{\widehat{\tau }}_{P_{{\widehat{\tau }}}}\}\), \(\widehat{\tau }\in \mathcal {R}\), fixed but use a sequence of meshes \(\mathcal {G}_N\), \(N\in \mathbb {N}\), with associated approximation spaces \({{\mathcal {H}}}_N:= S^p_{\mathcal {G}_N}\) that are such that \(h_N:= \max _{\tau \in \mathcal {G}_N} h_\tau \rightarrow 0\) as \(N\rightarrow \infty \), i.e. we consider the h-version of the boundary-element method. Lemma 4.15 below applies in this regime under the following assumption on the constants \(g^\pm _\tau \) defined by (4.45) (this assumption is the first half, Equation 3.5a, of [38, Assumption 3.1]).

Assumption 4.14

There exists a constant \(c_1\ge 1\) such that, for every \(N\in \mathbb {N}\) and \(\tau \in \mathcal {G}_N\),

$$\begin{aligned} g^+_\tau \le c_1 g^-_\tau ; \end{aligned}$$
(4.49)

equivalently, there exists a constant \(c_2\ge 1\) such that, for every \(N\in \mathbb {N}\) and \(\tau \in \mathcal {G}_N\),

$$\begin{aligned} c_2^{-1}s_\tau \le g_\tau (\widehat{\textbf{x}}) \le c_2 s_\tau \quad \text{ for } \text{ almost } \text{ all } \widehat{\textbf{x}}\in {\hat{\tau }}. \end{aligned}$$
(4.50)

We make two remarks about Assumption 4.14.

  1. (i)

    The claimed equivalence of (4.49) and (4.50) follows from (4.45) (precisely, if (4.49) holds then (4.50) holds with \(c_2= c_1\max (|\widehat{\tau }|, |\widehat{\tau }|^{-1})\), and if (4.50) holds then (4.49) holds with \(c_1= c_2^2\)).

  2. (ii)

    Because \(\chi _\tau \) is bi-Lipschitz, (4.49) holds for every \(\tau \in \mathcal {G}_N\) for some \(c_1\ge 1\) (not necessarily independent of \(\tau \) and N). In particular (4.49) holds with \(c_1 =1\) if each \(\chi _\tau \) is affine, so that Assumption 4.14 holds in that case (see also the discussion below [38, Assumption 3.1]).
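To illustrate Assumption 4.14 for a non-affine map, the following sketch considers a planar quadrilateral that is the image of the unit square under a bilinear map, so that \(|\widehat{\tau }|=1\) and \(g_\tau =|\det J_\tau |\). The corner coordinates, the function names, and the use of midpoint samples to estimate \(g^\pm _\tau \) are assumptions made for the example; the sampled maximum and minimum only approximate the essential supremum and infimum.

```python
def bilinear_jacobian_det(corners, x1, x2):
    """|det J| for chi = (1-x1)(1-x2) a + x1 (1-x2) b + x1 x2 c + (1-x1) x2 d."""
    a, b, c, d = corners
    col1 = [(1 - x2) * (b[k] - a[k]) + x2 * (c[k] - d[k]) for k in range(2)]
    col2 = [(1 - x1) * (d[k] - a[k]) + x1 * (c[k] - b[k]) for k in range(2)]
    return abs(col1[0] * col2[1] - col1[1] * col2[0])

def shape_constants(corners, n=50):
    """Sample g_tau at midpoints of an n-by-n grid on the unit square."""
    pts = [(i + 0.5) / n for i in range(n)]
    samples = [bilinear_jacobian_det(corners, x1, x2) for x1 in pts for x2 in pts]
    g_plus, g_minus = max(samples), min(samples)
    s_tau = sum(samples) / len(samples)   # midpoint rule, using |tau_hat| = 1
    c1 = g_plus / g_minus                 # smallest c_1 compatible with the samples
    return samples, g_minus, g_plus, s_tau, c1

# Hypothetical quadrilateral with corners listed anticlockwise.
corners = ((0.0, 0.0), (2.0, 0.0), (3.0, 2.0), (0.0, 1.0))
samples, g_minus, g_plus, s_tau, c1 = shape_constants(corners)
c2 = c1   # c_2 = c_1 max(|tau_hat|, 1/|tau_hat|) with |tau_hat| = 1
```

For this quadrilateral the Jacobian determinant is \(2+2\widehat{x}_1+\widehat{x}_2\), so the midpoint rule reproduces the area \(s_\tau = 7/2\) up to rounding, and the sampled values satisfy (4.45) and (4.50) with \(c_2=c_1\).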

For the following lemma, recall that the matrix \({\textsf{A}}\) is given by (2.16) with \(A^\dag _{\textbf{Z},\alpha }\) equal to one of \(A_{I,\textbf{Z},\alpha }\), \(A'_{I,\textbf{Z},\alpha }\), \(A_{E,\textbf{Z},\alpha }\), or \(A'_{E,\textbf{Z},\alpha }\).

Lemma 4.15

(Conditions under which Assumption 2.8 holds) Suppose that, while keeping \(\mathcal {R}\), p, and the bases \(\{\psi ^{\widehat{\tau }}_1, \ldots ,\psi ^{\widehat{\tau }}_{P_{{\widehat{\tau }}}}\}\), \(\widehat{\tau }\in \mathcal {R}\), fixed, we use a sequence of meshes \(\mathcal {G}_N\), \(N\in \mathbb {N}\), with associated approximation spaces \({{\mathcal {H}}}_N:= S^p_{\mathcal {G}_N}\) and bases (4.48) that are such that \(h_N:= \max _{\tau \in \mathcal {G}_N} h_\tau \rightarrow 0\) as \(N\rightarrow \infty \) and Assumption 4.14 holds. Then the following is true.

  1. (i)

    Assumption 2.8 holds with \(({\textsf{D}}_N)_{ii}:= s_{i}\), \(i=1,\ldots , M_N\), where \(s_i:=s_\tau \) if \(\psi _i\) is supported in \(\tau \), with \(C_1:=c_2^{-1/2}c_{\mathcal {R}}^{-1/2}\), \(C_2:= c_2^{1/2}c_{\mathcal {R}}^{1/2}\), where \(c_{\mathcal {R}}\ge 1\) depends only on the bases \(\{\psi ^{\widehat{\tau }}_1, \ldots ,\psi ^{\widehat{\tau }}_{P_{\widehat{\tau }}}\}\), \(\widehat{\tau }\in \mathcal {R}\).

  2. (ii)

    If, in addition, \(d\ge 3\), (2.6) holds for some \(c>0\), and \(\alpha \) satisfies (2.7), then Assumption 2.8 holds also with \(({\textsf{D}}_N)_{ii}:= |{\textsf{A}}_{ii}|\), for \(i=1,\ldots , M_N\), with

    $$\begin{aligned} C_1:= C_-C^{-1/2} c_2^{-1}\quad \text { and }\quad C_2:= C_+c^{-1/2}c_2, \end{aligned}$$
    (4.51)

    where \(C:=\Vert A^\dag _{\textbf{Z},\alpha }\Vert _{L^2(\Gamma )\rightarrow L^2(\Gamma )}\) and \(C_\pm >0\) depend only on the bases \(\{\psi ^{\widehat{\tau }}_1, \ldots ,\psi ^{\widehat{\tau }}_{P_{{\widehat{\tau }}}}\}\), \(\widehat{\tau }\in \mathcal {R}\).

Part (ii) of Lemma 4.15 is proved for \(d\ge 3\) using the coercivity results of Part (vi) of both Theorems 2.1 and 2.2. An analogous result holds for \(d=2\) using the coercivity results of Part (v) of Theorem 2.14, but we omit this for brevity.

Proof of Lemma 4.15

  1. (i)

    Given \(w_N = \sum _{\ell =1}^{M_N}w^N_{\ell } \psi _{\ell }^N \in {{\mathcal {H}}}_N\), by (4.48),

    $$\begin{aligned} w_N= \sum _{\tau \in \mathcal {G}_N}\sum _{j=1}^{P_{{\widehat{\tau }}}} w_{j}^\tau \psi _{j}^\tau \end{aligned}$$
    (4.52)

    for some coefficients \(\{w_j^\tau \}_{j=1}^{P_{{\widehat{\tau }}}}\), and where \(\widehat{\tau }\) is the domain of \(\chi _\tau \). Thus, by (4.44), for all \(\tau \in \mathcal {G}_N\),

    $$\begin{aligned} \big \Vert w_N|_\tau \big \Vert ^2_{L^2(\tau )} = \int _{\widehat{\tau }} |\widehat{\psi }(\widehat{\textbf{x}})|^2 g_\tau (\widehat{\textbf{x}})\, \textrm{d}\widehat{\textbf{x}}, \end{aligned}$$
    (4.53)

    where

    $$\begin{aligned} \widehat{\psi }(\widehat{\textbf{x}}):= w_N\big (\chi _\tau (\widehat{\textbf{x}})\big ) = \sum _{j=1}^{P_{{\widehat{\tau }}}} w_j^\tau \psi ^{\widehat{\tau }}_j(\widehat{\textbf{x}}) \quad \text { for }\widehat{\textbf{x}}\in \widehat{\tau } \end{aligned}$$
    (4.54)

    (and we have used (4.52) and (4.47)). For every \(\widehat{\tau }\in \mathcal {R}\), every \(\widehat{\phi }\in \mathbb {P}_p^{{\widehat{\tau }}}\) can be written as \(\widehat{\phi }= \sum _{j=1}^{P_{{\widehat{\tau }}}} a_j\psi ^{\widehat{\tau }}_j\) for some unique vector \(\textbf{a}=(a_1,\ldots ,a_{P_{{\widehat{\tau }}}})^T\). With \(\mathbb {P}_p^{{\widehat{\tau }}}\) equipped with the norm \(\Vert \cdot \Vert _{\widehat{\tau }}\) defined by

    $$\begin{aligned} \Vert \widehat{\phi }\Vert ^2_{\widehat{\tau }} = \sum _{j=1}^{P_{{\widehat{\tau }}}} a_j^2, \end{aligned}$$
    (4.55)

    since \(\mathbb {P}_p^{{\widehat{\tau }}}\) is finite-dimensional, there exists \(c_{\widehat{\tau }}\ge 1\) such that

    $$\begin{aligned} c_{\widehat{\tau }}^{-1}\big \Vert \widehat{\phi }\big \Vert ^2_{\widehat{\tau }}\le \big \Vert \widehat{\phi }\big \Vert ^2_{L^2(\widehat{\tau })}\le c_{\widehat{\tau }}\big \Vert \widehat{\phi }\big \Vert ^2_{\widehat{\tau }} \quad \text { for all }\widehat{\phi }\in \mathbb {P}_p^{{\widehat{\tau }}}. \end{aligned}$$
    (4.56)

    Therefore, using (4.50) and (4.56) (with \(\widehat{\phi }=\widehat{\psi }\)) in (4.53), we have that, for all \(\tau \in \mathcal {G}_N\),

    $$\begin{aligned} c^{-1}_{\widehat{\tau }} c^{-1}_2s_\tau \big \Vert \widehat{\psi }\big \Vert ^2_{\widehat{\tau }}\le & {} c_2^{-1}s_\tau \big \Vert \widehat{\psi }\big \Vert ^2_{L^2(\widehat{\tau })} \le \big \Vert w_N|_\tau \big \Vert ^2_{L^2(\tau )}\nonumber \\\le & {} c_2s_\tau \big \Vert \widehat{\psi }\big \Vert ^2_{L^2(\widehat{\tau })} \le c_2s_\tau c_{\widehat{\tau }}\big \Vert \widehat{\psi }\big \Vert ^2_{\widehat{\tau }}. \end{aligned}$$
    (4.57)

    Furthermore, by (4.54) and (4.55),

    $$\begin{aligned} \Vert \widehat{\psi }\Vert ^2_{\widehat{\tau }} = \sum _{j=1}^{P_{{\widehat{\tau }}}} (w_j^\tau )^2. \end{aligned}$$
    (4.58)

    If \(({\textsf{D}}_N)_{ii}:= s_{i}\), \(i=1,\ldots , M_N\), then, by (4.52),

    $$\begin{aligned} \big \Vert {\textsf{D}}_N^{1/2} \textbf{w}\big \Vert ^2_{2} = \sum _{\tau \in \mathcal {G}_N}s_\tau \sum _{j=1}^{P_{{\widehat{\tau }}}} (w_j^\tau )^2. \end{aligned}$$
    (4.59)

    Therefore, combining (4.57), (4.58), and (4.59), we see that, with the choice \(({\textsf{D}}_N)_{ii}:= s_{i}\), \(i=1,\ldots , M_N\), Assumption 2.8 holds with \(C_1=c^{-1/2}_2c_{\mathcal {R}}^{-1/2}\) and \(C_2=c_2^{1/2}c^{1/2}_{\mathcal {R}}\), where \(c_{\mathcal {R}}:= \max _{\widehat{\tau } \in \mathcal {R}} c_{\widehat{\tau }}\).

  2. (ii)

    By the coercivity and continuity of \(A^\dag _{\textbf{Z},\alpha }\) from Theorem 2.1 or Theorem 2.2,

    $$\begin{aligned} \frac{c}{2}\Vert \psi ^{N}_i\Vert ^2_{L^2(\Gamma )}\le \big |\big (A^\dag _{\textbf{Z},\alpha }\psi ^{N}_i,\psi ^{N}_i\big )_{L^2(\Gamma )}\big | \le C\Vert \psi ^{N}_i\Vert ^2_{L^2(\Gamma )}; \end{aligned}$$

    therefore, since \(|{\textsf{A}}_{ii}|=|(A^\dag _{\textbf{Z},\alpha }\psi ^{N}_i,\psi ^{N}_i)_{L^2(\Gamma )}|\),

    $$\begin{aligned} \frac{c}{2}\Vert \psi ^{N}_i\Vert ^2_{L^2(\Gamma )}\le |{\textsf{A}}_{ii}|\le C\Vert \psi ^{N}_i\Vert ^2_{L^2(\Gamma )}. \end{aligned}$$
    (4.60)

    Further, where \(\tau \) is the support of \(\psi ^N_i\) and \(\widehat{\tau }\) is the domain of \(\chi _\tau \), by (4.44),

    $$\begin{aligned} \Vert \psi ^{N}_i\Vert ^2_{L^2(\Gamma )} = \int _{\tau } |\psi ^\tau _j(\textbf{x})|^2 \, \textrm{d}s(\textbf{x}) = \int _{\widehat{\tau }} |\psi ^{\widehat{\tau }}_j(\widehat{\textbf{x}})|^2 g_\tau (\widehat{\textbf{x}})\, \textrm{d}\widehat{\textbf{x}}, \end{aligned}$$

    for some \(j\in \{1,...,P_{{\widehat{\tau }}}\}\), so that, by (4.50),

    $$\begin{aligned} c_- c^{-1}_2 s_\tau \le \Vert \psi ^{N}_i\Vert ^2_{L^2(\Gamma )}\le c_+ c_2 s_\tau , \end{aligned}$$
    (4.61)

    where

    $$\begin{aligned} c_+:= \max _{\widehat{\tau } \in \mathcal {R}, j=1,\ldots ,P_{{\widehat{\tau }}}} \Vert \psi ^{\widehat{\tau }}_j\Vert _{L^2(\widehat{\tau })}^2 \quad \text { and }\quad c_-:= \min _{\widehat{\tau } \in \mathcal {R}, j=1,\ldots ,P_{{\widehat{\tau }}}} \Vert \psi ^{\widehat{\tau }}_j\Vert _{L^2(\widehat{\tau })}^2. \end{aligned}$$

    Thus, combining (4.60) and (4.61), we have that

    $$\begin{aligned} \frac{c}{2}c_- c^{-1}_2 s_\tau \le |{\textsf{A}}_{ii}|\le Cc_+ c_2 s_\tau \quad \text{ for } i=1,\ldots , M_N. \end{aligned}$$

    Therefore, if \(({\textsf{D}}_N)_{ii}:= |{\textsf{A}}_{ii}|\), for \(i=1,\ldots , M_N\), then, by (4.52),

    $$\begin{aligned} \frac{c}{2}c_-c_2^{-1}\sum _{\tau \in \mathcal {G}_N}s_\tau \sum _{j=1}^{P_{{\widehat{\tau }}}} (w_j^\tau )^2 \le \big \Vert {\textsf{D}}_N^{1/2} \textbf{w}\big \Vert ^2_{2} \le Cc_+c_2 \sum _{\tau \in \mathcal {G}_N}s_\tau \sum _{j=1}^{P_{{\widehat{\tau }}}} (w_j^\tau )^2. \end{aligned}$$

    By (4.58) and (4.57), Assumption 2.8 therefore holds with \(C_1\) and \(C_2\) given by (4.51).

\(\square \)
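Part (i) of Lemma 4.15 can be checked numerically in a simple setting. The sketch below takes \(d=2\), straight elements mapped affinely from \(\widehat{\tau }=(0,1)\) (so that \(g_\tau =s_\tau \) and (4.50) holds with \(c_2=1\)), \(p=1\), and the reference basis \(\{1-t,\,t\}\), for which (4.56) holds with \(c_{\widehat{\tau }}=6\) because the reference mass matrix has eigenvalues \(1/6\) and \(1/2\). It is assumed here, consistently with the proof above but without access to the statement itself, that Assumption 2.8 has the form \(C_1\Vert {\textsf{D}}_N^{1/2}\textbf{w}\Vert _2\le \Vert w_N\Vert _{L^2(\Gamma )}\le C_2\Vert {\textsf{D}}_N^{1/2}\textbf{w}\Vert _2\). The graded element lengths and random coefficients are hypothetical test data.

```python
import random

def squared_norms(lengths, coeffs):
    """Return (||w_N||^2 in L^2(Gamma), ||D_N^{1/2} w||_2^2) with (D_N)_ii = s_i."""
    l2_sq = 0.0
    d_sq = 0.0
    for s, (w1, w2) in zip(lengths, coeffs):
        # Exact integral of (w1 (1 - t) + w2 t)^2 over (0, 1), scaled by g_tau = s_tau.
        l2_sq += s * (w1 * w1 + w1 * w2 + w2 * w2) / 3.0
        d_sq += s * (w1 * w1 + w2 * w2)
    return l2_sq, d_sq

random.seed(0)
lengths = [2.0 ** (-k) for k in range(30)]   # strongly graded element sizes
coeffs = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in lengths]
l2_sq, d_sq = squared_norms(lengths, coeffs)

c_R, c_2 = 6.0, 1.0
C1_sq = 1.0 / (c_2 * c_R)   # C_1^2 from Part (i)
C2_sq = c_2 * c_R           # C_2^2 from Part (i)
```

The ratio `l2_sq / d_sq` stays between `C1_sq` and `C2_sq` however strongly the mesh is graded, which is the content of Part (i) in this special case.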

Remark 4.16

(The novelty of Lemma 4.15) Similar results to Lemma 4.15 are given in [38], where, for a continuous, coercive, and symmetric sesquilinear form, scaling by the diagonal part of the Galerkin matrix (as in Part (ii) of Lemma 4.15) is used to remove the ill-conditioning of the Galerkin matrix due to mesh degeneracy; see [38, Equations 1.5–1.7].

The advantage of the results of [38] compared to those of Lemma 4.15 is that [38] works in \(H^s(\Gamma )\) for \(|s|\le 1\), whereas Lemma 4.15 only works in \(L^2(\Gamma )\). However, Lemma 4.15 works with rather general meshes in arbitrary dimensions, subject only to Assumption 4.14, whereas [38] imposes the following conditions on the mesh: (i) the mesh is regular in the sense of [87, Definition 4.1.4], see [38, Page 1495], and (ii) the mesh satisfies [38, Assumptions 3.1 and 3.2], with the latter requiring, e.g., that neighbouring mesh elements have comparable aspect ratios.

5 Wellposedness and regularity results for the Laplace interior and exterior oblique Robin problems

5.1 Statement of the Laplace interior and exterior oblique Robin problems

Definition 5.1

(The Laplace interior oblique Robin problem (IORP)) With \({\Omega ^-}\) as in Sect. 1.6, given \(g\in L^2(\Gamma )\), \(\textbf{Z}\in (L^\infty (\Gamma ))^{d}\), and \(\alpha \in L^\infty (\Gamma )\), find \(u\in H^1({\Omega ^-})\) with \(\gamma ^- u\in H^1(\Gamma )\) and \(\partial _n^- u\in {L^2(\Gamma )}\) such that \(\Delta u =0\) in \({\Omega ^-}\) and

$$\begin{aligned} (\textbf{Z}\cdot \textbf{n}) \partial _n^- u+ \textbf{Z}\cdot \nabla _\Gamma ( \gamma ^- u) + \alpha \, \gamma ^- u= g \quad \text { on } \Gamma . \end{aligned}$$
(5.1)

Definition 5.2

(The Laplace exterior oblique Robin problem (EORP)) With \({\Omega ^+}\) as in Sect. 1.6, given \(g\in L^2(\Gamma )\), \(\textbf{Z}\in (L^\infty (\Gamma ))^{d}\), and \(\alpha \in L^\infty (\Gamma )\), find \(u\in H^1_{\textrm{loc}}({\Omega ^+})\) with \(\gamma ^+ u\in H^1(\Gamma )\) and \(\partial _n^+ u\in {L^2(\Gamma )}\) such that \(\Delta u =0\) in \({\Omega ^+}\),

$$\begin{aligned} (\textbf{Z}\cdot \textbf{n}) \partial _n^+ u+ \textbf{Z}\cdot \nabla _\Gamma ( \gamma ^+ u) - \alpha \, \gamma ^+ u= g \quad \text { on } \Gamma , \end{aligned}$$
(5.2)

and, as \(|\textbf{x}| \rightarrow \infty \), \(u(\textbf{x})= O(1)\) when \(d=2\) and \(u(\textbf{x}) = o(|\textbf{x}|^{3-d})\) when \(d\ge 3\) (uniformly in all directions \(\textbf{x}/|\textbf{x}|\)).

A regularity result of Nečas [78] (stated as Theorem B.1 below) implies that either of the requirements \(\partial _n^- u\in {L^2(\Gamma )}\) and \(\gamma ^- u\in H^1(\Gamma )\) in Definition 5.1 can be removed; similarly in Definition 5.2.

The IORP and EORP can also be formulated in terms of non-tangential maximal functions and non-tangential limits (similar to the case of the Dirichlet problem discussed in Sect. 1.6). We now give this alternative formulation for the IORP and prove that it is equivalent to Definition 5.1; this equivalence is necessary to use results from the harmonic-analysis literature on the standard Laplace oblique derivative problem (see Theorem 5.13 below). The alternative formulation for the EORP and proof of equivalence to Definition 5.2 are completely analogous and are omitted.

Definition 5.3

(The Laplace IORP via non-tangential limits) With \({\Omega ^-}\) as in Sect. 1.6, given \(g\in L^2(\Gamma )\), \(\textbf{Z}\in (L^\infty (\Gamma ))^{d}\), and \(\alpha \in L^\infty (\Gamma )\), find \(u\in C^2({\Omega ^-})\) with \((\nabla u)^*\in (L^2(\Gamma ))^d\) such that \(\Delta u =0\) in \({\Omega ^-}\) and

$$\begin{aligned} \textbf{Z}\cdot \widetilde{\gamma }^- (\nabla u ) + \alpha \widetilde{\gamma }^-u = g \quad \text { on } \Gamma , \end{aligned}$$
(5.3)

where \(\widetilde{\gamma }^-\) is the non-tangential limit defined by (1.13).

Theorem 5.4

(Equivalence of the different formulations of the IORP) The formulations of the IORP in Definitions 5.1 and 5.3 are equivalent (i.e., if u is a solution to the IORP in the sense of Definition 5.1, then it is a solution in the sense of Definition 5.3, and vice versa).

Proof

If u is a solution of the IORP in the sense of Definition 5.1, then \(u\in C^\infty ({\Omega ^-})\) by elliptic regularity. Furthermore, \(u\in H^{3/2}({\Omega ^-})\) by Lemma B.4, and then \((\nabla u)^* \in (L^2(\Gamma ))^d\) by Part (iii) of Theorem B.2. By Lemma B.3, \(\widetilde{\gamma }^-u = \gamma ^- u\), and, by Lemma B.4,

$$\begin{aligned} \widetilde{\gamma }^-(\nabla u) = \textbf{n}\partial _n^- u+ \nabla _\Gamma (\gamma ^- u) \quad \text { almost everywhere on } \Gamma . \end{aligned}$$
(5.4)

Therefore

$$\begin{aligned} \textbf{Z}\cdot \widetilde{\gamma }^- (\nabla u) + \alpha \widetilde{\gamma }^- u = (\textbf{Z}\cdot \textbf{n}) \partial _n^- u+ \textbf{Z}\cdot \nabla _\Gamma (\gamma ^- u) + \alpha \gamma ^- u, \end{aligned}$$
(5.5)

so that the boundary condition (5.3) is equivalent to (5.1); therefore, u is a solution of the IORP in the sense of Definition 5.3.

Conversely, if u is a solution of the IORP in the sense of Definition 5.3, then \(u\in H^{3/2}({\Omega ^-})\) by Part (iii) of Theorem B.2. Then Lemma B.4 implies that \(\partial _n^- u\in {L^2(\Gamma )}\), \(\gamma ^- u\in H^1(\Gamma )\), and (5.4) holds. Hence (5.5) holds and the boundary condition (5.1) is equivalent to (5.3); therefore, u is a solution of the IORP in the sense of Definition 5.1. \(\square \)

5.2 Link between the IORP/EORP and the BIEs in Theorems 2.1, 2.2

Theorem 5.5

(\(A^\prime _{I,\textbf{Z},\alpha }\) can be used to solve the EORP for \(d\ge 3\)) If \(d\ge 3\) then the single-layer potential \(u={{\mathcal {S}}} \phi \) with density \(\phi \in L^2(\Gamma )\) satisfies the exterior oblique Robin problem (Definition 5.2) if and only if

$$\begin{aligned} A^\prime _{I,\textbf{Z},\alpha }\phi = -g. \end{aligned}$$
(5.6)

Conversely, if \(d\ge 3\) and u satisfies the EORP, then \(u={{\mathcal {S}}} \phi \) for some \(\phi \in L^2(\Gamma )\) that satisfies (5.6).

Theorem 5.6

(\(A^\prime _{E,\textbf{Z},\alpha }\) can be used to solve the IORP for \(d\ge 2\)) The single-layer potential \(u={{\mathcal {S}}} \phi \), with density \(\phi \in L^2(\Gamma )\), satisfies the IORP (Definition 5.1) if and only if

$$\begin{aligned} A^\prime _{E,\textbf{Z},\alpha }\phi = g. \end{aligned}$$
(5.7)

Conversely, if u satisfies the IORP, then, provided \(a\ne \textrm{Cap}_\Gamma \) when \(d=2\), \(u={{\mathcal {S}}} \phi \), where \(\phi \in L^2(\Gamma )\) satisfies (5.7).

Proof of Theorem 5.5

If \(d \ge 3\) and \(u={{\mathcal {S}}} \phi \) with \(\phi \in L^2(\Gamma )\), then by, e.g., [17, Theorem 2.14] \(u\in C^2({\Omega ^+})\) and \(\Delta u =0\) in \({\Omega ^+}\), and, by (4.23), \(u(\textbf{x})=O(|\textbf{x}|^{2-d})\) as \(|\textbf{x}|\rightarrow \infty \), uniformly in \(\textbf{x}/|\textbf{x}|\). By, e.g., [17, Theorem 2.14], \(u\in H^1_{\textrm{loc}}({\Omega ^+})\) and, by the jump relations (4.2) and the definition of \(K_{\textbf{Z}}'\) (1.18), (5.2) holds if and only if \(\phi \) satisfies (5.6). Conversely, if u satisfies the EORP, then, by the invertibility of S recalled in Lemma A.1 below, \(\phi := S^{-1} \gamma ^+ u\in L^2(\Gamma )\). Defining \(v:={{\mathcal {S}}}\phi \), v satisfies the Laplace exterior Dirichlet problem with boundary data \(\gamma ^+ v=\gamma ^+{{\mathcal {S}}}\phi = S\phi = \gamma ^+ u\), so that \(v=u\) by uniqueness for the EDP. As established in the first part of the proof, since u satisfies the EORP, \(\phi \) satisfies (5.6). \(\square \)

Proof of Theorem 5.6

This is very similar to the proof of Theorem 5.5, except that now we can also consider \(d=2\), since (by definition) there are no conditions at infinity imposed on the solution of the IORP. \(\square \)

Theorem 5.7

Let \(P_\textrm{DtN}^\pm :H^1(\Gamma )\rightarrow {L^2(\Gamma )}\) denote the Dirichlet-to-Neumann maps for Laplace’s equation in \(\Omega ^\pm \); i.e., the maps \(g_D\mapsto \partial _n^\pm u\) for u as in Definitions 1.3/1.4 respectively. Let \(P_\textrm{ItD}^{-, \alpha , \textbf{Z}}: {L^2(\Gamma )}\rightarrow H^1(\Gamma )\) denote the map \(g\mapsto \gamma ^- u\) where u is as in Definition 5.1. Let \(P_\textrm{ItD}^{+, \alpha , \textbf{Z}}: {L^2(\Gamma )}\rightarrow H^1(\Gamma )\) denote the map \(g\mapsto \gamma ^+ u\) where u is as in Definition 5.2. Then, as operators on \({L^2(\Gamma )}\),

$$\begin{aligned} \big ( A'_{E,\textbf{Z},\alpha }\big )^{-1} = \frac{1}{\textbf{Z}\cdot \textbf{n}}I - \left( P^+_\textrm{DtN} + \frac{1}{ \textbf{Z}\cdot \textbf{n}}\big ( \alpha +\textbf{Z}\cdot \nabla _{\Gamma }\big ) \right) P^{-, \alpha , \textbf{Z}}_\textrm{ItD} \end{aligned}$$
(5.8)

and

$$\begin{aligned} \big ( A'_{I,\textbf{Z},\alpha }\big )^{-1} = \frac{1}{\textbf{Z}\cdot \textbf{n}}I - \left( P^-_\textrm{DtN} + \frac{1}{ \textbf{Z}\cdot \textbf{n}}\big ( -\alpha +\textbf{Z}\cdot \nabla _{\Gamma }\big ) \right) P^{+, \alpha , \textbf{Z}}_\textrm{ItD}. \end{aligned}$$
(5.9)

Proof

We first prove (5.8). Suppose \(A'_{E,\textbf{Z},\alpha }\phi =g\) with \(\phi ,g\in {L^2(\Gamma )}\) and let \(u:= {{\mathcal {S}}}\phi \). Then \(\gamma ^+ u=\gamma ^- u= P^{-,\alpha ,\textbf{Z}}_\textrm{ItD} g\) by the first jump relation in (4.2) and Theorem 5.6. By the second jump relation in (4.2), the definition of \(P_\textrm{DtN}^+\), and the boundary condition (5.1),

$$\begin{aligned} \phi&= \partial _n^- u- \partial _n^+ u, \\&= \frac{1}{\textbf{Z}\cdot \textbf{n}} \big ( g - \textbf{Z}\cdot \nabla _{\Gamma }(\gamma ^- u) - \alpha \gamma ^- u\big ) - P_\textrm{DtN}^+ \gamma ^+ u= \frac{1}{\textbf{Z}\cdot \textbf{n}}g \\&\quad - \left( P_\textrm{DtN}^+ + \frac{\alpha + \textbf{Z}\cdot \nabla _{\Gamma }}{\textbf{Z}\cdot \textbf{n}}\right) P_\textrm{ItD}^{-,\alpha ,\textbf{Z}}g, \end{aligned}$$

which implies (5.8). The proof of (5.9) is then very similar, using Theorem 5.5 instead of Theorem 5.6. \(\square \)
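For completeness, the computation behind (5.9) can be sketched as follows, under the assumption that the jump relations (4.2) are used in the same form as in the proof of (5.8), namely \(\gamma ^+ u=\gamma ^- u\) and \(\phi = \partial _n^- u- \partial _n^+ u\) for \(u={{\mathcal {S}}}\phi \). Suppose that \(d\ge 3\) and \(A'_{I,\textbf{Z},\alpha }\phi =-g\) with \(\phi ,g\in {L^2(\Gamma )}\), and let \(u:= {{\mathcal {S}}}\phi \). By Theorem 5.5, u satisfies the EORP, so that \(\gamma ^- u=\gamma ^+ u= P^{+,\alpha ,\textbf{Z}}_\textrm{ItD} g\). Rearranging the boundary condition (5.2) for \(\partial _n^+ u\) and using the definition of \(P_\textrm{DtN}^-\),

$$\begin{aligned} \phi&= \partial _n^- u- \partial _n^+ u= P_\textrm{DtN}^- \gamma ^- u- \frac{1}{\textbf{Z}\cdot \textbf{n}} \big ( g - \textbf{Z}\cdot \nabla _{\Gamma }(\gamma ^+ u) + \alpha \gamma ^+ u\big ) \\&= -\frac{1}{\textbf{Z}\cdot \textbf{n}}g + \left( P_\textrm{DtN}^- + \frac{-\alpha + \textbf{Z}\cdot \nabla _{\Gamma }}{\textbf{Z}\cdot \textbf{n}}\right) P_\textrm{ItD}^{+,\alpha ,\textbf{Z}}g. \end{aligned}$$

Since \(\phi = -( A'_{I,\textbf{Z},\alpha })^{-1}g\), multiplying by \(-1\) gives (5.9). The sign of \(\alpha \) differs from that in (5.8) because \(\alpha \) enters (5.2) with the opposite sign to (5.1).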

5.3 Statement of the wellposedness results and implications for the BIEs in Theorems 2.1 and 2.2

Theorem 5.8

(Uniqueness for the IORP) Suppose that, for some \(\beta \in (0,1]\), \(\textbf{Z}\in (C^{0,\beta }(\Gamma ))^d\) and \(\alpha \in C^{0,\beta }(\Gamma )\) and that, for some constants \(c, c_0>0\),

$$\begin{aligned} \textbf{Z}(\textbf{x})\cdot \textbf{n}(\textbf{x}) \ge c \quad \text{ for } \text{ almost } \text{ every } \textbf{x}\in \Gamma \quad \text{ and } \quad \alpha (\textbf{x}) \ge c_0 \quad \text { for }\textbf{x}\in \Gamma . \end{aligned}$$
(5.10)

Then the IORP has at most one solution.

Corollary 5.9

(Existence for the IORP and invertibility of \(A'_{E,\textbf{Z},\alpha }\)) If the assumptions of Theorem 5.8 hold and \(a\ne \textrm{Cap}_\Gamma \) when \(d=2\), then \(A'_{E,\textbf{Z},\alpha }\) is invertible and the IORP has exactly one solution.

Theorem 5.10

(Uniqueness for the EORP) Suppose that, for some \(\beta \in (0,1]\), \(\textbf{Z}\in (C^{0,\beta }(\Gamma ))^d\) and \(\alpha \in C^{0,\beta }(\Gamma )\), and that (5.10) holds, for some constants \(c, c_0>0\). Then the EORP has at most one solution.

Corollary 5.11

(Existence for the EORP and invertibility of \(A'_{I,\textbf{Z},\alpha }\)) If the assumptions of Theorem 5.10 hold and \(d\ge 3\), then the EORP has exactly one solution and \(A'_{I,\textbf{Z},\alpha }\) is invertible.

5.4 Proofs of Theorems 5.8 and 5.10

Recall that, for \(1\le p\le \infty \), \(H^{1,p}(\Gamma ):=\{\phi \in L^p(\Gamma ):\nabla _\Gamma \phi \in L^p(\Gamma )\}\) is a Banach space with the norm \(\Vert \phi \Vert _{H^{1,p}(\Gamma )}:= \Vert \phi \Vert _{L^p(\Gamma )} + \Vert \nabla _\Gamma \phi \Vert _{L^p(\Gamma )} \). Note that \(H^1(\Gamma )=H^{1,2}(\Gamma )\), with equivalence of norms.

The following result is standard in potential theory on Lipschitz domains; see, e.g., [102, Page 203].

Lemma 5.12

Suppose that \(\textbf{Z}\in (C(\Gamma ))^d\) and the first of the bounds (5.10) holds for some \(c>0\). Then, for each \(\textbf{x}\in \Gamma \) there exist \(R>0\), \(F\in C^{0,1}(\mathbb {R}^{d-1})\), and a rotated coordinate system \(0{\tilde{x}}_1...{\tilde{x}}_d\), with origin at \(\textbf{x}\) and with the \({\tilde{x}}_d\) axis pointing in the direction \(\textbf{Z}(\textbf{x})\), such that, where \({\tilde{y}}^\prime := ({\tilde{y}}_1,...,{\tilde{y}}_{d-1})\),

$$\begin{aligned} B_R(\textbf{x})\cap {\Omega ^+}= & {} B_R(\textbf{x})\cap \{\textbf{y}= ({\tilde{y}}^\prime ,{\tilde{y}}_d):{\tilde{y}}_d > F({\tilde{y}}^\prime )\}, \quad \\ B_R(\textbf{x})\cap {\Omega ^-}= & {} B_R(\textbf{x})\cap \{\textbf{y}= ({\tilde{y}}^\prime ,{\tilde{y}}_d):{\tilde{y}}_d < F({\tilde{y}}^\prime )\}. \end{aligned}$$

The following key regularity estimate follows immediately from [82, 102].

Theorem 5.13

(Regularity for the interior oblique derivative problem.) Suppose that \(\textbf{Z}\in (C^{0,\beta }(\Gamma ))^d\) for some \(\beta \in (0,1]\), the first inequality in (5.10) holds for some constant \(c>0\), and u satisfies the Laplace oblique derivative problem (i.e., the IORP in the special case \(\alpha =0\)) with data g.

  1. (i)

    If \(g\in C^{0,\beta }(\Gamma )\), then \(u\in C^{1,\gamma }(\overline{{\Omega ^-}})\) for some \(\gamma \in (0,\beta ]\) depending only on \({\Omega ^-}\).

  2. (ii)

    If \(g\in L^p(\Gamma )\) with \(2\le p <\infty \), then \((\nabla u)^*\in (L^p(\Gamma ))^d\).

Proof

  1. (i)

    It is known from [13, Section 4], [50, 82, 102] that if the first inequality in (5.10) holds, then the Laplace oblique derivative problem has a solution if and only if g satisfies finitely-many linear conditions (i.e., conditions of the form \((g,\phi _j)_\Gamma =0\), \(j=1,...,N\), for some \(N\in \mathbb {N}\) and \(\phi _1,...,\phi _N\in L^2(\Gamma )\)). If \(\textbf{Z}\in (C^{0,\beta }(\Gamma ))^d\) and u is a solution for particular data \(g\in C^{0,\beta }(\Gamma )\), the finitely-many linear conditions on g are satisfied, and u can be written as \(u=u_P+u_H\) where \(u_P\) is the particular solution studied in [82], which is shown in [82, §3] to satisfy \(u_P \in C^{1,\gamma }({\Omega ^-})\) for some \(\gamma \in (0,\beta ]\) (dependent on \({\Omega ^-}\)), and \(u_H\) is a solution of the homogeneous oblique derivative problem, which is shown in [102, Corollary 2.7] to be constant in \({\Omega ^-}\).

  2. (ii)

    This follows from arguing as in (i), but replacing the results of [82] for Hölder continuous g by those of [13] for \(g\in L^p(\Gamma )\) with \(2-\epsilon<p<2+\epsilon \) (for some \(\epsilon >0\) dependent on \({\Omega ^-}\)) and [50] for \(g\in L^p(\Gamma )\) with \(p>2\) (note that while the results of [13, 50] only require that \(\textbf{Z}\) is continuous, [102, Corollary 2.7] requires \(\textbf{Z}\) to be Hölder continuous).

\(\square \)

Theorem 5.14

(Regularity for the IORP) Suppose that \(\textbf{Z}\in (C^{0,\beta }(\Gamma ))^d\) and \(\alpha \in C^{0,\beta }(\Gamma )\) for some \(\beta \in (0,1]\), \(\textbf{Z}\) satisfies the first inequality in (5.10) for some \(c>0\), and u satisfies the Laplace interior oblique Robin problem with data \(g\in C^{0,\beta }(\Gamma )\). Then \(u\in C^{1,\gamma }(\overline{{\Omega ^-}})\) for some \(\gamma \in (0,\beta ]\).

Proof

Suppose that the conditions of the theorem are satisfied, in particular that u satisfies the IORP with data \(g\in C^{0,\beta }(\Gamma )\), for some \(\beta \in (0,1]\). Suppose also that \(2\le p <\infty \) and that \(\gamma ^- u\in L^p(\Gamma )\). Then since, clearly, \(g\in L^p(\Gamma )\), u is a solution of the Laplace oblique derivative problem with data in \(L^{p}(\Gamma )\). Therefore, by Part (ii) of Theorem 5.13, \((\nabla u)^* \in (L^{p}(\Gamma ))^d\) and thus \(\gamma ^- u\in H^{1,p}(\Gamma )\) by Corollary B.7. This implies, by the Sobolev embedding theorem [1, Chapter V, Equations 6 and 4], that, if \(p\ge d-1\), then \(\gamma ^- u\in L^{q}(\Gamma )\) for all \(2\le q<\infty \), while, if \(p<d-1\), then \(\gamma ^- u\in L^{q}(\Gamma )\) for \(2\le q \le p_0\) where \(1/p_0 = 1/p -1/(d-1)\). Since, certainly, \(\gamma ^- u\in {L^2(\Gamma )}\) (as \(\gamma ^- u\in H^1(\Gamma )\) by definition of the IORP), applying the above argument at most a finite number of times leads to the conclusion that \(\gamma ^- u\in H^{1,q}(\Gamma )\) for all \(2\le q<\infty \). But this implies, by the Sobolev embedding theorem [1, Chapter V, Equation 9], that \(\gamma ^- u\in C^{0,\beta '}(\Gamma )\) for all \(0<\beta '<1\). Thus u is a solution of the Laplace oblique derivative problem with data in \(C^{0,\beta }(\Gamma )\), and thus the result that \(u\in C^{1,\gamma }(\overline{{\Omega ^-}})\) for some \(\gamma \in (0,\beta ]\) follows from Part (i) of Theorem 5.13. \(\square \)
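The bootstrap in the proof above terminates after finitely many steps because each application of the Sobolev embedding lowers \(1/p\) by \(1/(d-1)\). The following sketch lists the exponents generated by the rule \(1/p_0 = 1/p - 1/(d-1)\), starting from \(p=2\) and stopping once \(p\ge d-1\), at which point the embedding gives \(\gamma ^- u\in L^q(\Gamma )\) for all \(2\le q<\infty \). The function name and the sample dimensions are illustrative assumptions, not part of the proof.

```python
from fractions import Fraction

def bootstrap_exponents(d, p_start=2):
    """Exponents p visited by the iteration 1/p_0 = 1/p - 1/(d-1) until p >= d-1."""
    m = d - 1                      # dimension of the boundary Gamma
    p = Fraction(p_start)
    chain = [p]
    while p < m:
        # Here p < d-1, so 1/p - 1/(d-1) > 0 and the new exponent is well defined.
        p = 1 / (1 / p - Fraction(1, m))
        chain.append(p)
    return chain

chain_d3 = bootstrap_exponents(3)    # no iteration needed: 2 >= d-1
chain_d6 = bootstrap_exponents(6)    # two iterations
chain_d10 = bootstrap_exponents(10)  # four iterations
```

Exact rational arithmetic is used so that the stopping test \(p\ge d-1\) is not affected by rounding.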

Corollary 5.15

(Regularity for the EORP) Suppose that \(\textbf{Z}\in (C^{0,\beta }(\Gamma ))^d\) and \(\alpha \in C^{0,\beta }(\Gamma )\) for some \(\beta \in (0,1]\), \(\textbf{Z}\) satisfies the first inequality in (5.10) for some \(c>0\), and u satisfies the Laplace exterior oblique Robin problem with data \(g\in C^{0,\beta }(\Gamma )\). Then \(u\in C^{1,\gamma }(\overline{{\Omega ^+}\cap B_R})\) for all \(R>0\) and for some \(\gamma \in (0,\beta ]\).

Proof

Since \({\Omega ^-}\) is bounded, \(\Gamma \subset B_r\) for some \(r>0\). Suppose that u satisfies the EORP and choose \(R_2>R_1>R_0>r\) and \(\chi \in C_\textrm{comp}^\infty (\mathbb {R}^d)\) with \(\chi (\textbf{x})=1\) for \(|\textbf{x}|\le R_0\) and \(\chi (\textbf{x})=0\) for \(|\textbf{x}|\ge R_1\). Let \(v(\textbf{x}):= \chi (\textbf{x}) u(\textbf{x})\) for \(\textbf{x}\in G:={\Omega ^+}\cap B_{R_2}\), so that, in particular, \(v=u\) in \({\Omega ^+}\cap B_{R_0}\). The idea now is to create a solution of an IORP on G, and then use the interior regularity result of Theorem 5.14. Since u is harmonic in \({\Omega ^+}\),

$$\begin{aligned} \Delta v = F:= 2\nabla \chi \cdot \nabla u + u \Delta \chi \quad \text{ in } G. \end{aligned}$$

Since \(\chi \in C_\textrm{comp}^\infty (\mathbb {R}^d)\) and \(u\in C^{\infty }({\Omega ^+})\), \(F\in C_\textrm{comp}^\infty (G)\). Therefore

$$\begin{aligned} {\widehat{v}}(\textbf{x}):= -\int _{G} \Phi (\textbf{x},\textbf{y})F(\textbf{y})\, \textrm{d}\textbf{y}, \quad \text { for }\textbf{x}\in G, \end{aligned}$$

satisfies \({\widehat{v}}\in C^2(\overline{G})\) and \(\Delta {\widehat{v}} = F\) in G. Let \(w(\textbf{x}):= v(\textbf{x})-{\widehat{v}}(\textbf{x})\) for \(\textbf{x}\in G\), and define \({\widetilde{\textbf{Z}}}\in (C^{0,\beta }(\partial G))^d\) and \({\widetilde{\alpha }}\in C^{0,\beta }(\partial G)\) by \({\widetilde{\textbf{Z}}}:= - \textbf{Z}\) on \(\Gamma \), \({\widetilde{\textbf{Z}}}(\textbf{x}):= \textbf{x}\), for \(\textbf{x}\in \partial B_{R_2}\), and \({\widetilde{\alpha }}:= \alpha \) on \(\Gamma \), \({\widetilde{\alpha }}:= 0\) on \(\partial B_{R_2}\). Then \(w\in H^1(G)\) with trace \(\gamma w\in H^1(\partial G)\), \(\Delta w=0\) in G, and

$$\begin{aligned} ({\widetilde{\textbf{Z}}}\cdot \textbf{n}) \partial _\textbf{n}w + {\widetilde{\textbf{Z}}} \cdot \nabla _\Gamma ( \gamma w) + {\widetilde{\alpha }} \, \gamma w = {\widetilde{g}} \text { on } \partial G, \end{aligned}$$

where \(\textbf{n}\) is the unit normal pointing out of G and \({\widetilde{g}}\in C^{0,\beta }(\partial G)\) is defined by \({\widetilde{g}}:= -({\widetilde{\textbf{Z}}}\cdot \nabla {\widehat{v}} +{\widetilde{\alpha }} {\widehat{v}}) - g\) on \(\Gamma \), and by \({\widetilde{g}}:= -({\widetilde{\textbf{Z}}}\cdot \nabla {\widehat{v}} +{\widetilde{\alpha }} {\widehat{v}})\) on \(\partial B_{R_2}\). Theorem 5.14 implies that, for some \(\gamma \in (0,\beta ]\), \(w\in C^{1,\gamma }(\overline{G})\), so that \(v \in C^{1,\gamma }(\overline{G})\) and \(u \in C^{1,\gamma }(\overline{{\Omega ^+}\cap B_{R_0}})\). Since u is harmonic in \({\Omega ^+}\), \(u \in C^{1,\gamma }(\overline{{\Omega ^+}\cap B_{R}})\) for every \(R>0\). \(\square \)
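The boundary condition on \(\partial B_{R_2}\) used in the proof above can be checked directly. The following is only a sketch; it assumes the splitting of the gradient into normal and tangential parts, \(\nabla w = \textbf{n}\, \partial _\textbf{n}w + \nabla _\Gamma (\gamma w)\), which is valid here because w is \(C^1\) up to \(\partial B_{R_2}\). Since \(R_2>R_1\), \(\chi =0\), and hence \(v=0\) and \(w=-{\widehat{v}}\), in a neighbourhood of \(\partial B_{R_2}\). Therefore, on \(\partial B_{R_2}\), where \({\widetilde{\alpha }}=0\),

$$\begin{aligned} ({\widetilde{\textbf{Z}}}\cdot \textbf{n}) \partial _\textbf{n}w + {\widetilde{\textbf{Z}}} \cdot \nabla _\Gamma ( \gamma w) + {\widetilde{\alpha }} \, \gamma w = {\widetilde{\textbf{Z}}}\cdot \nabla w = -{\widetilde{\textbf{Z}}}\cdot \nabla {\widehat{v}} = {\widetilde{g}}. \end{aligned}$$

Moreover, \({\widetilde{\textbf{Z}}}\cdot \textbf{n} = \textbf{x}\cdot (\textbf{x}/R_2) = R_2>0\) on \(\partial B_{R_2}\), so that \({\widetilde{\textbf{Z}}}\cdot \textbf{n}\) is bounded below by a positive constant on this part of \(\partial G\).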

We can now prove Theorems 5.8 and 5.10 and Corollaries 5.9 and 5.11.

Proof of Theorem 5.8

Suppose that u satisfies the IORP with \(g=0\) and that, without loss of generality, u is real-valued. To show that \(u=0\) it is enough to show that \(u\le 0\) in \({\Omega ^-}\), since this implies, by the same argument applied to \(-u\), that also \(u\ge 0\), and hence \(u=0\). By Theorem 5.14, \(u\in C^1(\overline{{\Omega ^-}})\) (indeed \(\nabla u\) is Hölder continuous). By the maximum principle, since \(u\in C^2({\Omega ^-})\cap C(\overline{{\Omega ^-}})\) is harmonic in \({\Omega ^-}\), the maximum value of u in \(\overline{{\Omega ^-}}\) is attained at some point \(\textbf{x}_0\in \Gamma \). Since \(u\in C^1(\overline{{\Omega ^-}})\) it follows from (5.1) with \(g=0\) that

$$\begin{aligned} \alpha (\textbf{x}_0)u(\textbf{x}_0) = -\textbf{Z}(\textbf{x}_0)\cdot \nabla u(\textbf{x}_0)=-\lim _{h\rightarrow 0^+}\frac{u(\textbf{x}_0)-u(\textbf{x}_0-h\textbf{Z}(\textbf{x}_0))}{h}. \end{aligned}$$

Since \(\textbf{Z}\) is continuous and \(\textbf{Z}\cdot \textbf{n}\ge c> 0\) almost everywhere on \(\Gamma \), \(\textbf{x}_0-h\textbf{Z}(\textbf{x}_0) \in {\overline{{\Omega ^-}}}\) for all sufficiently small \(h>0\) by Lemma 5.12, so that \(\textbf{Z}(\textbf{x}_0)\cdot \nabla u(\textbf{x}_0)\ge 0\) since u attains its global maximum at \(\textbf{x}_0\). Since \(\alpha (\textbf{x}_0) >0\), it follows that \(u(\textbf{x}_0)\le 0\), so that \(u\le 0\) in \({\Omega ^-}\). \(\square \)
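The sign argument at the end of this proof can be spelled out as follows (a sketch that uses only the facts established above). Since \(\textbf{x}_0-h\textbf{Z}(\textbf{x}_0) \in {\overline{{\Omega ^-}}}\) for all sufficiently small \(h>0\) and u attains its maximum over \(\overline{{\Omega ^-}}\) at \(\textbf{x}_0\), each difference quotient is non-negative, so that

$$\begin{aligned} \textbf{Z}(\textbf{x}_0)\cdot \nabla u(\textbf{x}_0) = \lim _{h\rightarrow 0^+}\frac{u(\textbf{x}_0)-u(\textbf{x}_0-h\textbf{Z}(\textbf{x}_0))}{h} \ge 0, \quad \text{ and hence } \quad \alpha (\textbf{x}_0)u(\textbf{x}_0) = -\textbf{Z}(\textbf{x}_0)\cdot \nabla u(\textbf{x}_0) \le 0. \end{aligned}$$

Dividing by \(\alpha (\textbf{x}_0)>0\) gives \(u(\textbf{x}_0)\le 0\), and therefore \(u(\textbf{x})\le u(\textbf{x}_0)\le 0\) for all \(\textbf{x}\in \overline{{\Omega ^-}}\).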

Proof of Theorem 5.10

Suppose that u satisfies the EORP with \(g=0\) and, without loss of generality, is real-valued. As in the proof of Theorem 5.8, it is enough to show that \(u\le 0\) in \({\Omega ^+}\). We recall that, when \(d=2\), the condition that u is bounded on \({\Omega ^+}\) implies that, for some \(u_\infty \in \mathbb {R}\),

$$\begin{aligned} u(\textbf{x}) = u_\infty + O(|\textbf{x}|^{-1}) \quad \text{ as } |\textbf{x}|\rightarrow \infty , \end{aligned}$$

uniformly in \(\textbf{x}/|\textbf{x}|\), and that

$$\begin{aligned} u_\infty = \frac{1}{2\pi R}\int _{\partial B_R} u\, \textrm{d}s \end{aligned}$$
(5.11)

if \(\Gamma \subset B_R\) [52, Equation 6.11].
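As an illustration of (5.11) only (a hypothetical example, not taken from the results above, which assumes that \(\textbf{0}\in {\Omega ^-}\) and ignores the boundary condition on \(\Gamma \)), consider \(u(\textbf{x}):= u_\infty + x_1/|\textbf{x}|^2\), which is harmonic and bounded in \({\Omega ^+}\). Writing \(\textbf{x}=R(\cos \theta ,\sin \theta )\) on \(\partial B_R\),

$$\begin{aligned} \frac{1}{2\pi R}\int _{\partial B_R} u\, \textrm{d}s = u_\infty + \frac{1}{2\pi R}\int _0^{2\pi } \frac{\cos \theta }{R}\, R\, \textrm{d}\theta = u_\infty , \end{aligned}$$

in agreement with (5.11), and \(u(\textbf{x})-u_\infty = O(|\textbf{x}|^{-1})\) as \(|\textbf{x}|\rightarrow \infty \).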

By Corollary 5.15, \(u\in C^1(\overline{{\Omega ^+}})\). By the maximum principle, since \(u\in C^2({\Omega ^+})\cap C(\overline{{\Omega ^+}})\) is harmonic in \({\Omega ^+}\), the maximum value of u in \(\overline{{\Omega ^+}}\) is attained on \(\Gamma \) or, when \(d=2\), \(u(\textbf{x})\le u_\infty \) for \(\textbf{x}\in {\Omega ^+}\). If the maximum is attained on \(\Gamma \), the result that \(u\le 0\) follows by arguing as in the proof of Theorem 5.8. Therefore, it is sufficient to prove that the maximum is attained on \(\Gamma \) when \(d=2\). If \(u(\textbf{x})\le u_\infty \) for \(\textbf{x}\in {\Omega ^+}\), then (5.11) implies that \(u(\textbf{x})= u_\infty \) for \(|\textbf{x}|\ge R\) if \(\Gamma \subset B_R\), so that the maximum is attained in \({\Omega ^+}\). The maximum principle (see, e.g., [52, Theorem 6.8]) then implies that u is constant in \({\Omega ^+}\), so that the maximum is also attained on \(\Gamma \). \(\square \)
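The step in the proof above from (5.11) to \(u=u_\infty \) can be sketched as follows. If \(u\le u_\infty \) in \({\Omega ^+}\) and \(\Gamma \subset B_R\), then (5.11) gives

$$\begin{aligned} \int _{\partial B_R} \big (u_\infty - u\big )\, \textrm{d}s = 2\pi R\, u_\infty - \int _{\partial B_R} u\, \textrm{d}s = 0. \end{aligned}$$

Since the integrand \(u_\infty -u\) is continuous and non-negative on \(\partial B_R\), it vanishes identically there. The same argument applies with R replaced by any \(R'\ge R\), so that \(u(\textbf{x})=u_\infty \) for \(|\textbf{x}|\ge R\).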

The following proofs of Corollaries 5.9 and 5.11 use the fact that, if \(\alpha \in L^\infty (\Gamma )\), \(\textbf{Z}\) is continuous, and (2.6) (i.e., the first lower bound in (5.10)) holds, then \(A'_{I,\textbf{Z},\alpha }\) and \(A'_{E,\textbf{Z},\alpha }\) are Fredholm of index zero by Parts (iii) and (iv) of Theorems 2.1 and 2.2, respectively. Although these two theorems are stated for \(d=3\), Parts (iii) and (iv) also hold when \(d=2\) (as noted at the beginning of Sect. 2.3).

Proof of Corollary 5.9

If we can prove invertibility of \(A'_{E,\textbf{Z},\alpha }\), then existence of a solution to the IORP follows from Theorem 5.5. Since \(A'_{E,\textbf{Z},\alpha }\) is Fredholm of index zero on \({L^2(\Gamma )}\), by the Fredholm alternative (see, e.g., [65, Theorem 2.27]), to prove invertibility it is sufficient to prove injectivity. Assume that \(A'_{E,\textbf{Z},\alpha }\phi =0\) for some \(\phi \in L^2(\Gamma )\). By Theorem 5.6, \(u:= {{\mathcal {S}}}\phi \) satisfies the IORP with \(g=0\), and, by Theorem 5.8, \(u=0\) in \({\Omega ^-}\). Therefore \(\gamma ^- u=0\), and the first jump relation in (4.2) implies that \(S\phi =0\). Lemma A.1 then implies that \(\phi =0\), and the proof is complete. \(\square \)

Proof of Corollary 5.11

The proof is very similar to that of Corollary 5.9, except that now we only work in \(d\ge 3\), since Theorem 5.5 requires \(d\ge 3\). \(\square \)

Remark 5.16

(The results of [59]) Although not directly used to prove the results in this section, the results of [59] concern the Laplace IORP in Lipschitz domains with Hölder continuous \(\textbf{Z}\) and g, and we comment here on their relevance to the results above.

The results of [59] give an alternative route to uniqueness of the solution of the IORP (i.e., to proving Theorem 5.8). Indeed, in the proof of Theorem 5.8, once we have established that \(u\in C^1(\overline{{\Omega ^-}})\) (by using Theorem 5.14), uniqueness follows from [59, Theorem 3.2]. The reason we argue as we do in the proof of Theorem 5.8 is that this argument easily carries over to the proof of uniqueness for the EORP (Theorem 5.10), whereas [59, Theorem 3.2] concerns only the IORP.

Furthermore, [59, Theorem 3.2] implies that, under the assumptions of Theorem 5.14, there exists \(\beta _0<1\), depending only on the Lipschitz constant of \({\Omega ^-}\), such that if \(\beta < \beta _0\) then \(u \in C^{1,\beta }(\overline{{\Omega ^-}})\).

Remark 5.17

(Additional uniqueness results for the EORP with \(\textbf{Z}=\textbf{x}\)) The coercivity result of Theorem 2.5 allows us to extend the range of \(\alpha \) for which the solution of the EORP is unique when \(\textbf{Z}=\textbf{x}\) and \(d\ge 3\). Indeed, Theorem 2.5 implies that \(A'_{I,\textbf{Z},\alpha }\) is injective when \(\textbf{Z}=\textbf{x}\), \(\alpha (\textbf{x})\ge -(d-2)/2\) for almost every \(\textbf{x}\in \Gamma \), and \(d\ge 3\) (for example, when \(d=3\) the condition is \(\alpha \ge -1/2\) almost everywhere on \(\Gamma \)). Then, using Theorem 5.5 and arguing as at the end of the proof of Corollary 5.9, we see that the solution of the EORP is unique under these conditions. This result proves uniqueness for certain non-positive values of \(\alpha \), which are not covered by Theorem 5.10.

5.5 Link between the IORP/EORP and the BIEs in Theorem 2.14

Lemma 5.18

Given \(g\in {L^2(\Gamma )}\), if \(\phi \) satisfies

$$\begin{aligned} T_{E,\textbf{Z},\alpha ,\beta }' \phi = g, \end{aligned}$$
(5.12)

(with \(T_{E,\textbf{Z},\alpha ,\beta }'\) defined by (2.33)) and \(d=2\), then

$$\begin{aligned} u = {{\mathcal {S}}}Q_\Gamma \phi + \frac{\beta }{\alpha } P_\Gamma \phi - \frac{1}{\alpha }P_\Gamma A'_{E,\textbf{Z},\alpha }Q_\Gamma \phi \end{aligned}$$
(5.13)

satisfies the IORP.

Lemma 5.19

Given \(g\in {L^2(\Gamma )}\), if \(\phi \) satisfies

$$\begin{aligned} T_{I,\textbf{Z},\alpha ,\beta }' \phi = -g, \end{aligned}$$
(5.14)

(with \(T_{I,\textbf{Z},\alpha ,\beta }'\) defined by (2.29)) and \(d=2\), then

$$\begin{aligned} u = {{\mathcal {S}}}Q_\Gamma \phi + \frac{\beta }{\alpha } P_\Gamma \phi - \frac{1}{\alpha }P_\Gamma A'_{I,\textbf{Z},\alpha }Q_\Gamma \phi \end{aligned}$$
(5.15)

satisfies the EORP.

Proofs of Lemmas 5.18, 5.19

The fact that u given by (5.13)/(5.15) is \(C^2\) and satisfies Laplace’s equation follows from, e.g., [17, Theorem 2.14]. The condition that \(u= O(1)\) at infinity for the EORP follows from the asymptotics (4.24), the definition of \(P_\Gamma \) (2.28), and the fact that \(Q_\Gamma := I- P_\Gamma \). The BIEs (5.12)/(5.14) follow from the jump relations (4.2) and the definitions of \(A'_{I,\textbf{Z},\alpha }\) (2.1), \(A'_{E,\textbf{Z},\alpha }\) (2.9), and \(K_{\textbf{Z}}'\) (1.18). \(\square \)

Remark 5.20

(Link with the work of Medková [67]) In [67, Theorem 5.23.5], the solution of the IORP is sought in the form (5.13) without the final term on the right-hand side, resulting in the BIE \((A'_{E,\textbf{Z},\alpha }Q_\Gamma + \beta P_\Gamma )\phi =g\); the BIO in this BIE is then proved to be invertible on \({L^2(\Gamma )}\) if \(\beta =\alpha \) and \(\alpha \) is sufficiently large [67, Theorem 5.23.5]. The advantage of including the final term on the right-hand side of (5.13) is that the resulting BIO \(T_{E,\textbf{Z},\alpha ,\beta }'\) is not just invertible when \(\alpha \) is sufficiently large, but also coercive, by Part (v) of Theorem 2.14.