Adaptive BEM with inexact PCG solver yields almost optimal computational costs

We consider the preconditioned conjugate gradient method (PCG) with optimal preconditioner in the framework of the boundary element method (BEM) for elliptic first-kind integral equations. Our adaptive algorithm steers the termination of PCG as well as the local mesh-refinement. Besides convergence with optimal algebraic rates, we also prove almost optimal computational complexity. In particular, we provide an additive Schwarz preconditioner which can be computed in linear complexity and which is optimal in the sense that the condition numbers of the preconditioned systems are uniformly bounded. As model problem serves the 2D or 3D Laplace operator and the associated weakly-singular integral equation with energy space $\widetilde{H}^{-1/2}(\Gamma)$. The main results also hold for the hyper-singular integral equation with energy space $H^{1/2}(\Gamma)$.


Model problem.
Let Ω ⊂ R^d with d = 2, 3 be a bounded Lipschitz domain with polyhedral boundary ∂Ω. Let Γ ⊆ ∂Ω be a (relatively) open and connected subset. Given f : Γ → R, we seek the density φ : Γ → R of the weakly-singular integral equation

  (V φ)(x) := ∫_Γ G(x − y) φ(y) dy = f(x)  for all x ∈ Γ,  (1)

where G(·) denotes the fundamental solution of the Laplace operator, i.e., G(z) = −(1/2π) log |z| for d = 2 resp. G(z) = 1/(4π|z|) for d = 3. Given a triangulation T_• of Γ, we employ a lowest-order Galerkin boundary element method (BEM) to compute a T_•-piecewise constant function φ_• ∈ P0(T_•) such that

  ⟨V φ_•, ψ_•⟩ = ⟨f, ψ_•⟩  for all ψ_• ∈ P0(T_•).  (3)

With the numbering T_• = {T_1, ..., T_N}, consider the standard basis {χ_{•,j} : j = 1, ..., N} of P0(T_•) consisting of the characteristic functions χ_{•,j} of T_j ∈ T_•. We make the ansatz φ_• = Σ_{j=1}^N x_j χ_{•,j}. Then, the Galerkin formulation (3) is equivalent to the linear system

  A_• x_• = b_•  with  A_•[k, j] = ⟨V χ_{•,j}, χ_{•,k}⟩  and  b_•[k] = ⟨f, χ_{•,k}⟩,  (5)

where the matrix A_• ∈ R^{N×N} is symmetric and positive definite. For a given initial triangulation T_0, we consider an adaptive mesh-refinement strategy of the type

  solve → estimate → mark → refine,

which generates a sequence T_ℓ of successively refined triangulations for all ℓ ∈ N_0. We note that the condition number of the Galerkin matrix A_ℓ from (5) depends on the number of elements of T_ℓ as well as on the minimal and maximal element diameter. Therefore, the step solve requires an efficient preconditioner as well as an appropriate iterative solver.
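The loop solve → estimate → mark → refine can be sketched as plain pseudocode. The following minimal Python sketch uses hypothetical callables `solve`, `estimate`, `mark`, and `refine`; these names are illustrative placeholders, not the paper's interface.

```python
def adaptive_loop(T, solve, estimate, mark, refine, steps=4):
    """Sketch of the adaptive strategy: solve -> estimate -> mark -> refine.
    All four callables are hypothetical placeholders for the Galerkin
    solve (5), the local error estimator, the marking strategy, and
    local bisection of the marked elements."""
    x = solve(T)
    for _ in range(steps):
        eta = estimate(T, x)   # elementwise error indicators
        M = mark(T, eta)       # marked elements
        T = refine(T, M)       # locally refined mesh
        x = solve(T)           # Galerkin solution on the new mesh
    return T, x
```

In the paper, the step solve is additionally replaced by single PCG iterations with their own stopping criterion, which is the core of Algorithm 5 below.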
1.2. State of the art. In the last decade, the mathematical understanding of adaptive mesh-refinement has matured. We refer to [Dör96,MNS00,BDD04,Ste07,CKNS08,FFP14] for some milestones for adaptive finite element methods for second-order linear elliptic equations, [Gan13, FKMP13, FFK + 14, FFK + 15, AFF + 17] for adaptive BEM, and [CFPP14] for a general framework of rate-optimality of adaptive mesh-refining algorithms. The interplay between adaptive mesh-refinement, optimal convergence rates, and inexact solvers has been addressed and analyzed for adaptive FEM for linear problems in [Ste07,ALMS13,AGL13], for eigenvalue problems in [CG12], and recently also for strongly monotone nonlinearities in [GHPS17]. In particular, all available results for adaptive BEM [Gan13, FKMP13, FFK + 14, FFK + 15, AFF + 17] assume that the Galerkin system (5) is solved exactly. Instead, the present work analyzes an adaptive algorithm which steers both, the local mesh-refinement and the iterations of the PCG algorithm.
In principle, it is known [CFPP14, Section 7] that convergence and optimal convergence rates are preserved if the linear system is solved inexactly, but with sufficient accuracy. The purpose of this work is to guarantee the latter by incorporating an appropriate stopping criterion for the PCG solver into the adaptive algorithm. Moreover, to prove that the proposed algorithm does not only lead to optimal algebraic convergence rates, but also to (almost) optimal computational costs, we provide an appropriate symmetric and positive definite preconditioner P_• ∈ R^{N×N} such that
• first, the matrix-vector product with P_•^{-1} can be computed at linear cost;
• second, the system matrix P_•^{-1/2} A_• P_•^{-1/2} of the preconditioned linear system

  P_•^{-1/2} A_• P_•^{-1/2} x̃_• = P_•^{-1/2} b_•  (7)

has a uniformly bounded condition number which is independent of T_•.
Then, x_• = P_•^{-1/2} x̃_• solves the original system (5). To that end, we exploit the multilevel structure of adaptively generated meshes in the framework of adaptive Schwarz methods. For hyper-singular integral equations, such a multilevel additive Schwarz preconditioner has been proposed and analyzed in [FFPS17a, FMPR15] for d = 2, 3, and for weakly-singular integral equations in [FFPS17b] for d = 2. The present work closes this gap by analyzing an optimal additive Schwarz preconditioner for weakly-singular integral equations for d = 3. We note that the proofs of [FFPS17a, FFPS17b] do not transfer to weakly-singular integral equations for d = 3. Instead, we build on recent results for finite element discretizations [HWZ12, AGS16], which are then transferred to the present BEM setting by use of an abstract concept from [Osw99].
1.3. Outline and main results. Section 2 introduces the functional analytic framework and fixes the necessary notation. Section 3 states our main results. In Section 3.1, we define a local multilevel additive Schwarz preconditioner (24) for a sequence of locally refined meshes. Theorem 3 states that the ℓ2-condition number of the preconditioned systems is uniformly bounded for all these meshes, i.e., the preconditioner is optimal. In Section 3.2, we first state our adaptive algorithm which steers the local mesh-refinement as well as the stopping of the PCG iteration (Algorithm 5). Theorem 8 proves
• that the overall error in the energy norm can be controlled a posteriori,
• that the quasi-error (which consists of energy norm error plus error estimator) is linearly convergent in each step of the adaptive algorithm (i.e., independently of whether the algorithm decides for local mesh-refinement or for one step of the PCG iteration),
• that the quasi-error even decays with optimal rate (i.e., with each possible algebraic rate) with respect to the degrees of freedom, i.e., Algorithm 5 is rate optimal in the sense of, e.g., [Ste07, CKNS08, FKMP13, CFPP14].
Finally, Section 3.3 considers the computational costs. Under realistic assumptions on the treatment of the arising discrete integral operators, Corollary 10 states that the quasi-error converges at almost optimal rate (i.e., with rate s − ε for all ε > 0 if rate s > 0 is possible for the exact Galerkin solution) with respect to the computational costs, i.e., Algorithm 5 requires almost optimal computational time. Section 4 underpins our theoretical findings by some 2D and 3D experiments. The proof of Theorem 3 is given in Section 5; the proofs of Theorem 8 and Corollary 10 are given in Section 6. The final Section 7 shows that our main results also apply to the hyper-singular integral equation.
Here, h_T := diam(T) > 0 denotes the Euclidean diameter of T, i.e., the length of the line segment T.
We employ the extended bisection algorithm from [AFF+13]. For a mesh T_• and a set of marked elements M_• ⊆ T_•, let T_∘ := refine(T_•, M_•) be the coarsest mesh such that all marked elements T ∈ M_• have been refined, i.e., M_• ⊆ T_• \ T_∘. We write T_∘ ∈ refine(T_•) if there exist n ∈ N_0, conforming triangulations T_(0), ..., T_(n), and corresponding sets of marked elements M_(j) ⊆ T_(j) such that T_• = T_(0), T_(j+1) = refine(T_(j), M_(j)) for all j = 0, ..., n − 1, and T_∘ = T_(n), i.e., T_∘ is obtained from T_• by finitely many steps of refinement. Note that the bisection algorithm from [AFF+13] guarantees, in particular, that all T_∘ ∈ refine(T_•) are uniformly γ-shape regular, where γ depends only on T_•.
2.4. Mesh-refinement for 3D BEM. For d = 3, a mesh T_• of Γ is a conforming triangulation into non-degenerate compact surface triangles. In particular, we avoid hanging nodes. To ease the presentation, we suppose that the elements T ∈ T_• are flat. The triangulation is called γ-shape regular if

  max_{T ∈ T_•} diam(T) / h_T ≤ γ.

Here, diam(T) denotes the Euclidean diameter of T and h_T := |T|^{1/2} with |T| being the two-dimensional surface measure. Note that γ-shape regularity implies that h_T ≤ diam(T) ≤ γ h_T and hence excludes anisotropic elements.
For 3D BEM, we employ 2D newest vertex bisection (NVB) to refine triangulations locally; see [Ste08,KPP13] for details on the refinement algorithm and Figure 1 for an illustration. For a mesh T • and M • ⊆ T • , we employ the same notation T • := refine(T • , M • ) resp. T • ∈ refine(T • ) as for d = 2.
2.6. Preconditioned conjugate gradient method (PCG). Suppose that P_•, A_• ∈ R^{N×N} are symmetric and positive definite matrices. Given b_• ∈ R^N and an initial guess x_{•0}, PCG (see [GVL13, Algorithm 11.5.1]) aims to approximate the solution x_• ∈ R^N to (5). We note that each step of PCG has the following computational costs:
• O(N) cost for vector operations (e.g., assignment, addition, scalar product),
• computation of one matrix-vector product with A_•,
• computation of one matrix-vector product with P_•^{-1}.
Let x̃_• ∈ R^N be the solution to (7) and recall that x_• = P_•^{-1/2} x̃_•. We note that PCG formally applies the conjugate gradient method (CG, see [GVL13, Algorithm 11.3.2]) to the matrix Ã_• := P_•^{-1/2} A_• P_•^{-1/2} and the right-hand side b̃_• := P_•^{-1/2} b_•. The iterates x_{•k} ∈ R^N of PCG (applied to P_•, A_•, b_•, and the initial guess x_{•0}) and the iterates x̃_{•k} of CG (applied to Ã_•, b̃_•, and the initial guess x̃_{•0} := P_•^{1/2} x_{•0}) are related by x̃_{•k} = P_•^{1/2} x_{•k}; see [GVL13, Section 11.5]. Moreover, direct computation proves that the energy norms of the errors coincide, i.e., ‖x̃_• − x̃_{•k}‖_{Ã_•} = ‖x_• − x_{•k}‖_{A_•}. Consequently, [GVL13, Theorem 11.3.3] for CG (applied to Ã_•, b̃_•, x̃_{•0}) yields the following lemma for PCG (which follows from the implicit steepest descent approach of CG).
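For orientation, the following NumPy sketch implements PCG in the above form, with exactly one product with A_• and one application of P_•^{-1} per step; the function name and interface are ours, not taken from [GVL13].

```python
import numpy as np

def pcg(A, b, x0, apply_Pinv, tol=1e-10, maxiter=None):
    """Preconditioned CG (sketch in the spirit of [GVL13, Alg. 11.5.1]).
    `apply_Pinv(r)` realizes the matrix-vector product with P^{-1};
    for an optimal multilevel preconditioner this costs O(N) per call."""
    x = x0.astype(float).copy()
    r = b - A @ x                 # residual
    z = apply_Pinv(r)             # preconditioned residual
    p = z.copy()                  # search direction
    rz = r @ z
    for _ in range(maxiter or len(b)):
        Ap = A @ p                # one product with A per step
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol:
            break
        z = apply_Pinv(r)         # one product with P^{-1} per step
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x
```

With `apply_Pinv = lambda r: r` this reduces to plain CG; plugging in the action of the multilevel preconditioner from Section 3.1 yields the solver steered by Algorithm 5.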
If the matrix A_• ∈ R^{N×N} stems from the Galerkin discretization (5) for T_• = {T_1, ..., T_N}, there is a one-to-one correspondence of vectors y_• ∈ R^N and discrete functions ψ_• := Σ_{j=1}^N (y_•)_j χ_{•,j} ∈ P0(T_•).

2.7. Optimal preconditioners. We say that P_• is an optimal preconditioner if C_pcg ≥ 1 in the ℓ2-condition number estimate (18) depends only on the γ-shape regularity of T_• and the initial mesh T_0 (and is hence essentially independent of the mesh T_•).

Main results
3.1. Optimal additive Schwarz preconditioner. In this work, we consider multilevel additive Schwarz preconditioners that build on the adaptive mesh-hierarchy.
Let E_• denote the set of all nodes (d = 2) resp. edges (d = 3) of the mesh T_• which do not belong to the relative boundary ∂Γ. Only for Γ = ∂Ω, E_• contains all nodes resp. edges of T_•. For E ∈ E_•, let T_± ∈ T_• denote the two unique elements with T_+ ∩ T_− = E. We define the Haar-type function ϕ_{•,E} ∈ P0(T_•) (associated to E ∈ E_•) as the function which is supported on T_+ ∪ T_− and takes values of opposite sign on T_+ and T_−, scaled by |E|, where |E| := 1 for d = 2 and |E| := diam(E) for d = 3. Note that ∫_Γ ϕ_{•,E} dx = 0. For d = 3, we additionally suppose that the orientation of each edge E is arbitrary but fixed. We choose T_+ ∈ T_• such that ∂T_+ and E ⊂ ∂T_+ have the same orientation.
Given a mesh T_0, suppose that T_ℓ is a sequence of locally refined meshes, i.e., for all ℓ ∈ N_0, there exists a set M_ℓ ⊆ T_ℓ such that T_{ℓ+1} = refine(T_ℓ, M_ℓ). Then, define for each level ℓ the index sets of nodes/edges which consist of the new (interior) nodes/edges plus some of their neighbours. We note the following subspace decomposition, which is, in general, not direct.
Lemma 2. With X_ℓ := P0(T_ℓ) and X_{ℓ,E} := span{ϕ_{ℓ,E}}, the space X_L admits the (local multilevel) subspace decomposition (23).

Additive Schwarz preconditioners are based on (not necessarily direct) subspace decompositions. Following the standard theory (see, e.g., [TW05, Chapter 2]), (23) yields a (local multilevel) preconditioner. To provide its matrix formulation, let I_{k,ℓ} ∈ R^{#T_ℓ × #T_k} be the matrix representation of the canonical embedding P0(T_k) → P0(T_ℓ) for k < ℓ. Let H_ℓ ∈ R^{#T_ℓ × #E_ℓ} denote the matrix that represents the Haar-type functions with respect to the basis of characteristic functions. Since only two coefficients per column are non-zero, H_ℓ is sparse, while I_{k,ℓ} is non-sparse in general. Finally, define the (non-invertible) diagonal matrix D_ℓ ∈ R^{#E_ℓ × #E_ℓ} whose non-zero entries correspond to the selected nodes/edges. Then, the matrix representation of the preconditioner associated to (23) reads as in (24). For d = 2, the subsequent Theorem 3 is already proved in [FFPS17b, Section III.B] for Γ = ∂Ω and in [Füh14, Section 6.3] for Γ ⊊ ∂Ω. For d = 3, we need the following additional assumptions:
• First, suppose that Ω ⊂ R^3 is simply connected and Γ = ∂Ω.
• Second, let T̂_0 be a conforming triangulation of Ω into non-degenerate compact simplices such that T_0 = T̂_0|_Γ is the induced boundary partition on Γ.
Then, the following theorem is our first main result. The proof is given in Section 5.
Theorem 3. Under the foregoing assumptions, the preconditioner P_L from (24) is optimal, i.e., estimate (18) holds, where C_pcg ≥ 1 depends only on Ω and T_0, but is independent of L ∈ N.
We stress that the matrix in (24) will never be assembled in practice. The PCG algorithm only needs the action of P_L^{-1} on a vector. This can be computed recursively by using the embeddings I_{ℓ,ℓ+1}, which are, in fact, sparse. Up to (storing and) inverting A_0 on the coarse mesh, the evaluation of P_L^{-1} x can be done in O(#T_L) operations; see, e.g., [FFPS17a, Section 3.1] for a detailed discussion. If the mesh T_L is fine compared to the initial mesh T_0 (or if A_0 is realized with, e.g., H-matrix techniques), then the computational costs and storage requirements associated with A_0 can be neglected.
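The recursive evaluation can be illustrated for a preconditioner of the generic multilevel form P_L^{-1} = I_{0,L} A_0^{-1} I_{0,L}^T + Σ_ℓ I_{ℓ,L} H_ℓ D_ℓ H_ℓ^T I_{ℓ,L}^T. The NumPy sketch below is an assumption-laden illustration of this structure; the exact index sets and scalings of (24) differ.

```python
import numpy as np

def apply_Pinv(x, I, H, D, solve_A0):
    """Evaluate y = P_L^{-1} x recursively for a multilevel additive Schwarz
    preconditioner of the generic form
        P_L^{-1} = I_{0,L} A_0^{-1} I_{0,L}^T + sum_l I_{l,L} H_l D_l H_l^T I_{l,L}^T.
    I[l] is the (sparse) embedding of level l into level l+1 (l = 0, ..., L-1),
    H[l] the Haar matrix and D[l] the diagonal scaling (as a 1D array) on
    level l = 1, ..., L; solve_A0 realizes the coarse solve with A_0."""
    L = len(I)
    r = [None] * (L + 1)          # r[l] = I_{l,L}^T x (restrictions of x)
    r[L] = x
    for l in range(L, 0, -1):
        r[l - 1] = I[l - 1].T @ r[l]
    y = solve_A0(r[0])            # coarse-level contribution
    for l in range(1, L + 1):     # add local contributions, prolongate upwards
        y = I[l - 1] @ y + H[l] @ (D[l] * (H[l].T @ r[l]))
    return y
```

Since each level touches only O(#T_ℓ) entries and the mesh sizes sum geometrically for typical adaptive hierarchies, the total cost is linear in #T_L, in line with the discussion above.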
Remark 4. Our proof for d = 3 requires additional assumptions on Ω, Γ = ∂Ω, and T_0. As stated above, the case d = 2 allows for a different proof (which, however, does not transfer to d = 3) and can thus avoid these assumptions; see [FFPS17b, Füh14]. We believe that Theorem 3 also holds for d = 3 and Γ ⊊ ∂Ω. This is also underpinned by a numerical experiment in Section 4.4. The mathematical proof, however, remains open.
3.2. Optimal convergence of adaptive algorithm. We analyze the following adaptive strategy which is driven by the weighted-residual error estimator (14). We note that Algorithm 5 as well as the following results are independent of the precise preconditioning strategy as long as the employed preconditioners are optimal; see Section 2.7.
(ii) Do one step of the PCG algorithm with the optimal preconditioner P_j to obtain the iterate φ_{jk} from φ_{j(k−1)}.
Output: Sequences of successively refined triangulations T_j, discrete solutions φ_{jk}, and corresponding error estimators η_j(φ_{jk}), for all j ≥ 0 and k ≥ 0.
Remark 7. Let Q := {(j, k) ∈ N_0 × N_0 : index (j, k) is used in Algorithm 5}. It holds that (0, 0) ∈ Q. Moreover, for each j with (j, 0) ∈ Q, let k(j) := max{k ∈ N_0 : (j, k) ∈ Q} be the number of PCG iterations on level j. If j is clear from the context, we abbreviate k := k(j), e.g., φ_{jk} := φ_{jk(j)}. In particular, it holds that φ_{jk} = φ_{(j+1)0}. Since PCG (like any Krylov method) provides the exact solution after at most #T_j steps, it follows that 1 ≤ k(j) < ∞. Finally, we define the ordering (j′, k′) ≤ (j, k) if and only if φ_{j′k′} is computed by Algorithm 5 no later than φ_{jk}. Moreover, let |(j, k)| ∈ N_0 be the total number of PCG iterations until the computation of φ_{jk}. Note that j′ > j and |(j′, k′)| = |(j, k)| imply that j′ = j + 1, k = k(j), and k′ = 0, and hence φ_{j′k′} = φ_{jk}.
Theorem 8. The output of Algorithm 5 satisfies the following assertions (a)-(c). The constants C_rel, C_eff > 0 depend only on q_pcg, Γ, and the uniform γ-shape regularity of T_j ∈ refine(T_0); the constants C_lin > 1 and 0 < q_lin < 1 depend additionally on θ and λ; and C_opt > 0 depends additionally on s, T_0, and Λ_{0k}.

3.3. Almost optimal computational complexity.
Suppose that we use H²-matrices for the efficient treatment of the discrete single-layer integral operator. Recall that the storage requirements (resp. the cost of one matrix-vector multiplication) of an H²-matrix are of order O(N p²), where N is the matrix size and p ∈ N is the local block rank. For H²-matrices (unlike H-matrices), these costs are, in particular, independent of a possibly unbalanced binary tree which underlies the hierarchical data structure [Hac15].
For a mesh T_• ∈ T, we employ the local block rank p = O(log(1 + #T_•)) to ensure that the matrix compression is asymptotically exact as N = #T_• → ∞, i.e., the error between the exact matrix and the H²-matrix decays exponentially fast; see [Hac15]. We stress that we neglect this error in the following and assume that the matrix-vector multiplication (based on the H²-matrix) yields the exact matrix-vector product.
The computational cost for storing A_• (as well as for one matrix-vector multiplication) is O((#T_•) log²(1 + #T_•)). In an idealized optimal case, the computation of φ_• is hence at least of cost O((#T_•) log²(1 + #T_•)). We consider the computational costs for one step of Algorithm 5:
• We assume that one step of the PCG algorithm with the employed optimal preconditioner is of cost O((#T_j) log²(1 + #T_j)); cf. the preconditioner from Section 3.1.
• We assume that we can compute η_j(ψ_j) for any ψ_j ∈ P0(T_j) (by means of numerical quadrature) at cost O((#T_j) log²(1 + #T_j)).
• Clearly, the Dörfler marking in Step (v) can be done in O((#T_j) log(1 + #T_j)) operations by sorting. Moreover, for C_mark = 2, Stevenson [Ste07] proposed a realization of the Dörfler marking based on binning, which can be performed at linear cost O(#T_j).
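The sorting-based realization of Dörfler marking can be sketched as follows; this is a generic implementation, not tied to the paper's code.

```python
import numpy as np

def doerfler_mark(eta2, theta):
    """Doerfler marking by sorting, O(N log N): return a set M of minimal
    cardinality with theta * sum(eta2) <= sum(eta2[M]), where eta2 holds the
    squared elementwise indicators eta_j(T)^2."""
    order = np.argsort(eta2)[::-1]              # indicators in decreasing order
    cum = np.cumsum(eta2[order])
    m = int(np.searchsorted(cum, theta * cum[-1])) + 1
    return order[:m]
```

The binning variant of [Ste07] replaces the sort by grouping indicators into geometrically scaled buckets, trading exact minimality (up to the factor C_mark = 2) for linear cost.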
• Finally, the mesh-refinement in Step (vi) can be done in linear complexity O(#T_j) if the data structure is appropriate.
Overall, one step of Algorithm 5 is thus done in O((#T_j) log²(1 + #T_j)) operations. However, an adaptive step (j, k) ∈ Q depends on the full history of previous steps.
• Hence, the cumulative computational complexity for the adaptive step (j, k) ∈ Q is of order O(Σ_{(j′,k′) ≤ (j,k)} (#T_{j′}) log²(1 + #T_{j′})).
The following corollary proves that Algorithm 5 does not only lead to convergence of the quasi-error Λ_{jk} with optimal rate with respect to the degrees of freedom (see Theorem 8), but also with almost optimal rate with respect to the computational costs.
Corollary 10. For j ∈ N_0, let T̂_{j+1} = refine(T̂_j, M̂_j) with arbitrary M̂_j ⊆ T̂_j and T̂_0 = T_0. Let s > 0 and suppose that the corresponding error estimator η̂_j(φ̂_j) converges at rate s with respect to the single-step computational costs. Suppose that λ and θ satisfy the assumptions of Theorem 8(c). Then, the quasi-errors Λ_{jk} generated by Algorithm 5 converge almost at rate s with respect to the cumulative computational costs.
In Figure 3, we compare Algorithm 5 for different values of θ and λ as well as uniform mesh-refinement. Uniform mesh-refinement leads only to the rate O(N^{−1/2}), while adaptivity, independently of the values of θ and λ, regains the optimal rate O(N^{−3/2}). A naive initial guess in Step (vi) of Algorithm 5 (i.e., φ_{(j+1)0} := 0) leads to a logarithmic growth of the number of PCG iterations, whereas for nested iteration φ_{(j+1)0} := φ_{jk} (as formulated in Algorithm 5) the number of PCG iterations stays uniformly bounded; cf. Figure 4. Finally, Figure 2 shows the condition numbers for an artificial refinement towards the left end point and for Algorithm 5 with λ = 10^{−3} and θ = 0.5.
Then, u admits a generic singularity at the reentrant corner. The exact solution φ of (1) is just the normal derivative of the solution u.
Note that u admits a singularity along the reentrant edge. The exact solution φ of (1) is just the normal derivative of the exact solution u.
In Figure 9, we compare Algorithm 5 with different values of θ and λ to uniform mesh-refinement. Uniform mesh-refinement leads only to a reduced rate of O(N^{−1/2}), while adaptivity, independently of θ and λ, leads to the improved rate of approximately O(N^{−2/3}). While one would expect O(N^{−3/4}) for smooth exact solutions, this would require anisotropic elements along the reentrant edge for the present solution φ = ∂_n u. Since NVB guarantees uniform γ-shape regularity of the meshes, the latter is not possible, which caps the achievable rate. Finally, Figure 8 shows the condition numbers for (diagonal or additive Schwarz) preconditioning and no preconditioning for artificial refinements towards one reentrant corner or the reentrant edge, as well as the condition numbers of the matrices arising from Algorithm 5 with λ = 10^{−3} and θ = 0.5.

For the numerical solution of the Galerkin system, we employ PCG with the additive Schwarz preconditioner from Section 3.1. We note that Theorem 3 does not cover this setting. In particular, the proposed additive Schwarz preconditioner from Section 3.1 appears to be optimal in practice, while the mathematical optimality proof still remains open for screens; cf. Figure 10.
In Figure 11, we compare Algorithm 5 with different values of θ and λ to uniform mesh-refinement. We see that uniform mesh-refinement leads only to a reduced rate of O(N^{−1/4}), while adaptivity, independently of θ and λ, leads to the improved rate of approximately O(N^{−1/2}).

In Figure 12, we aim to underpin the almost optimal computational complexity of Algorithm 5 (see Corollary 10). To this end, we plot the error estimator η_j(φ_{jk}) over the cumulative computational costs for θ = 0.4 and λ ∈ {1, 10^{−3}}. The negative impact of the logarithmic terms on the (preasymptotic) convergence rate is clearly visible.

Proof of Theorem 3 (Optimal Multilevel Preconditioner)
For d = 2, we refer to [FFPS17b, Füh14] and thus focus only on d = 3 and Γ = ∂Ω. Due to our additional assumption, T_0 = T̂_0|_Γ is the restriction of a conforming simplicial triangulation T̂_0 of Ω to the boundary Γ. Moreover, 2D NVB refinement of T_0 (on the boundary Γ) is a special case of 3D NVB refinement of T̂_0 (in the volume Ω) plus restriction to the boundary; see, e.g., [Ste08]. Hence, each mesh T_• ∈ T = refine(T_0) is the restriction of a conforming NVB refinement T̂_• ∈ T̂ := refine(T̂_0), i.e., T_• = T̂_•|_Γ. Throughout, let T̂_• ∈ T̂ be the coarsest such extension of T_• ∈ T. Recall that NVB is a binary refinement rule. Therefore, T_∘ ∈ refine(T_•) also implies that T̂_∘ ∈ refine(T̂_•). Finally, we note that all triangulations T̂_• ∈ T̂ are uniformly γ̂-shape regular, where γ̂ depends only on T̂_0. Our argument adapts ideas from [HM12], where a subspace decomposition for the lowest-order Nédélec space ND¹(T̂_•) (see, e.g., [HZ09]) in H(curl; Ω) implies a decomposition of the corresponding discrete trace space. While the original idea dates back to [Osw99], a nice summary of the argument is found in [HM12, Section 2].
Remark 11. (i) Our proof is based on the construction of an extension operator from P0_*(T_•) to ND¹(T̂_•); see Lemma 13 below. It is not clear if such an operator can be constructed for the case Γ ⊊ ∂Ω.
(ii) In [HJHM15], a subspace decomposition of the lowest-order Raviart-Thomas space RT⁰(T̂_•) (see, e.g., [XCN09]) in H(div; Ω) implies a decomposition of the corresponding normal trace space P0(T_•). Due to the different scaling properties of the Raviart-Thomas basis functions (in the H(div; Ω) norm) and their normal traces (in the H^{−1/2}(Γ) norm), this argument does not apply in our case.
Lemma 13. There exists a linear operator E_• : P0_*(T_•) → ND¹(T̂_•) such that curl(E_• ψ_•) · n|_Γ = ψ_• together with a corresponding stability bound. The constant C > 0 depends only on the γ-shape regularity of T_•.
Combining these results, we conclude the proof.

Abstract additive Schwarz preconditioners.
Let X denote some finite-dimensional Hilbert space with norm ‖·‖_X and subspace decomposition X = Σ_{i∈I} X_i, where I is a finite index set. The additive Schwarz operator is given by S = Σ_{i∈I} S_i, where S_i is the X-orthogonal projection onto X_i, i.e., ⟨S_i x, x_i⟩_X = ⟨x, x_i⟩_X for all x_i ∈ X_i, where ⟨·, ·⟩_X denotes the scalar product on X. Then, the operator S is positive definite and symmetric (with respect to ⟨·, ·⟩_X). Define the multilevel norm

  |||x|||² := inf{ Σ_{i∈I} ‖x_i‖²_X : x = Σ_{i∈I} x_i with x_i ∈ X_i }.

It is proved, e.g., in [Osw94, Theorem 16] that, if there exists C > 0 with

  C^{−1} ‖x‖²_X ≤ |||x|||² ≤ C ‖x‖²_X  for all x ∈ X,

then the extreme eigenvalues of S^{−1} (and hence those of S) are bounded (from above and below). In particular, the additive Schwarz operator S is optimal in the sense that its condition number (ratio of largest and smallest eigenvalues) depends only on C > 0.
Let S denote the matrix representation of S. Then, the norm equivalence from above and the latter observations imply that the condition number of S is bounded. The abstract theory on additive Schwarz operators given in [TW05, Chapter 2] shows that S has the form S = P^{−1} A, where A is the Galerkin matrix of ⟨·, ·⟩_X. Therefore, boundedness of the condition number of S implies optimality of the preconditioner P^{−1}.
We shortly discuss the matrix representation (24) of the additive Schwarz preconditioner P^{−1}. Following [TW05, Chapter 2], let A_i denote the Galerkin matrix of ⟨·, ·⟩_X restricted to X_i, and let I_i denote the matrix that realizes the embedding X_i → X. We consider the matrix representation of S_i : X → X_i ⊂ X. Let x ∈ X with coordinate vector x, and let x_i ∈ X_i be arbitrary with coordinate vector x_i. The defining relation of S_i then reads in matrix-vector form (with S_i being the matrix representation of S_i) as

  x_iᵀ A_i S_i x = x_iᵀ I_iᵀ A x for all x_i,

or equivalently

  A_i S_i x = I_iᵀ A x.

Since A_i is invertible, we have that S_i x = A_i^{−1} I_iᵀ A x. Note that the range of the operator S_i is X_i, and correspondingly for the matrix representation S_i. We therefore apply the embedding I_i and obtain the representation

  S = Σ_{i∈I} I_i A_i^{−1} I_iᵀ A = P^{−1} A,  i.e.,  P^{−1} = Σ_{i∈I} I_i A_i^{−1} I_iᵀ.

To finally prove (24), note that for one-dimensional subspaces X_i, A_i reduces to a diagonal entry of the Galerkin matrix. Overall, we thus derive the matrix representation (24).
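For a small dense illustration of this representation (our own helper, with dense embedding matrices; in practice only the action of P^{-1} is ever implemented):

```python
import numpy as np

def schwarz_pinv(A, embeddings):
    """Assemble P^{-1} = sum_i I_i A_i^{-1} I_i^T for a subspace decomposition
    encoded by embedding matrices I_i, with A_i = I_i^T A I_i the Galerkin
    matrix of the subspace X_i."""
    Pinv = np.zeros_like(A, dtype=float)
    for Ii in embeddings:
        Ai = Ii.T @ A @ Ii                       # Galerkin matrix of X_i
        Pinv += Ii @ np.linalg.solve(Ai, Ii.T)   # I_i A_i^{-1} I_i^T
    return Pinv
```

For one-dimensional subspaces spanned by the unit vectors, this reproduces the Jacobi (diagonal) preconditioner, matching the remark that A_i then reduces to a diagonal entry of A.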

Subspace decomposition of ND¹(T̂_•) in H(curl; Ω)
The following result is taken from [HWZ12, Theorem 4.1]; see also the references therein. In particular, we note that their proof requires the assumption that Ω is simply connected.
Then, it holds that Moreover, it holds that where C > 0 depends only on Ω and T 0 . 5.4. Subspace decomposition of P 0 (T • ) in H −1/2 (Γ). It remains to prove the following proposition to conclude the proof of Theorem 3.
Proof of lower estimate in (44). Let ψ ∈ P0(T_L) with an arbitrary decomposition (45). Note that X_{ℓ,E} ⊂ P0_*(T_ℓ). Recall the extension operator E_ℓ from Lemma 13. Define v_* := Σ_{ℓ=1}^{L} Σ_{E∈E_ℓ} E_ℓ ψ_{ℓ,E} ∈ Y_L. Then, curl v_* · n|_Γ = ψ_*, and hence ψ_* is controlled by |||v_*|||²_{Y_L} due to the continuity of the trace operator in H(div; Ω). Moreover, the triangle inequality, the lower bound from Proposition 14, and Lemma 13 bound |||v_*|||²_{Y_L} by the sum over the contributions ψ_{ℓ,E}. Taking the infimum over all possible decompositions (45), we derive the lower estimate in (44) by the definition (41) of the multilevel norm.

Proof of Theorem 8 (Rate Optimality of Adaptive Algorithm)
In the spirit of [CFPP14], we give an abstract analysis, where the precise problem and discretization (i.e., Galerkin BEM with piecewise constants for the weakly-singular integral equation for the 2D and 3D Laplacian) enter only through certain properties of the error estimator. These properties are explicitly stated in Section 6.1, before Section 6.2 provides general PCG estimates. The remaining Sections 6.3-6.6 then only exploit this abstract framework to prove Theorem 8 and Corollary 10.
6.1. Axioms of adaptivity. In this section, we recall some structural properties of the residual error estimator (14) which have been identified in [CFPP14] to be important and sufficient for the numerical analysis of Algorithm 5. For the proof, we refer to [FKMP13, FFK+14]. We only note that (A4) already implies (A3) with C_rel ≤ C_drl in general; see [CFPP14, Section 3.3].
For ease of notation, let T 0 be the fixed initial mesh of Algorithm 5. Let T := refine(T 0 ) be the set of all possible meshes that can be obtained by successively refining T 0 .
Proposition 16. There exist constants C_stb, C_red, C_rel > 0 and 0 < q_red < 1, which depend only on Γ and the γ-shape regularity, such that the following properties (A1)-(A4) hold:
(A1) stability on non-refined element domains: For each mesh T_• ∈ T, all refinements T_∘ ∈ refine(T_•), arbitrary discrete functions v_• ∈ P0(T_•) and v_∘ ∈ P0(T_∘), and an arbitrary set U_• ⊆ T_• ∩ T_∘ of non-refined elements, it holds that

  |η_∘(U_•, v_∘) − η_•(U_•, v_•)| ≤ C_stb |||v_∘ − v_•|||.

(A2) reduction on refined element domains: For each mesh T_• ∈ T, all refinements T_∘ ∈ refine(T_•), and arbitrary v_• ∈ P0(T_•) and v_∘ ∈ P0(T_∘), it holds that

  η_∘(T_∘ \ T_•, v_∘) ≤ q_red η_•(T_• \ T_∘, v_•) + C_red |||v_∘ − v_•|||.

(A3) reliability: For each mesh T_• ∈ T, the error of the exact discrete solution φ_• ∈ P0(T_•) of (11) is controlled by

  |||φ − φ_•||| ≤ C_rel η_•(φ_•).

(A4) discrete reliability: For each mesh T_• ∈ T and all refinements T_∘ ∈ refine(T_•), there exists a set R_{•,∘} ⊆ T_• with T_• \ T_∘ ⊆ R_{•,∘} and #R_{•,∘} ≤ C_drl #(T_• \ T_∘) such that the difference of φ_∘ ∈ P0(T_∘) and φ_• ∈ P0(T_•) is controlled by

  |||φ_∘ − φ_•||| ≤ C_drl η_•(R_{•,∘}, φ_•).

6.2. Energy estimates for the PCG solver. This section collects some auxiliary results which rely on the use of PCG and, in particular, of PCG with an optimal preconditioner. We first note the following Pythagoras identity.
and x_{•k} the iterates of the PCG algorithm. There holds the Pythagoras identity

  ‖x_• − x_{•j}‖²_{A_•} = ‖x_• − x_{•k}‖²_{A_•} + ‖x_{•k} − x_{•j}‖²_{A_•}  for all 0 ≤ j ≤ k.  (49)

Proof. According to the definition of PCG (and CG), the error x_• − x_{•k} is A_•-orthogonal to the Krylov space of the previous search directions, which contains x_{•k} − x_{•j}. Together with (17) and (20), this proves (49).
The following lemma collects some estimates which follow from the contraction property (19) of PCG.
According to [AFF+17], the weighted-residual error estimator is efficient up to data oscillations. Let G_j denote the Galerkin projection onto P0(T_j), and let Π_j : L²(Γ) → P0(T_j) be the L²-orthogonal projection. With the Céa lemma and a duality argument (see, e.g., [CP06, Theorem 4.1]), we obtain a corresponding approximation estimate. Combining the latter estimates with Lemma 18(iv), we prove the efficiency estimate (27).

Proof of Theorem 8(b).
The following lemma is the heart of the proof of Theorem 8(b).
Proof. The proof is split into five steps.
Step 4. We use the definition φ_{(j+1)0} := φ_{jk} from Step (vi) of Algorithm 5 to obtain the splitting (62). For the first summand of (62), we use stability (A1) and reduction (A2). Together with the Dörfler marking strategy in Step (v) of Algorithm 5 and M_j ⊆ T_j \ T_{j+1}, we obtain estimator reduction. With this and stability (A1), the Young inequality and Lemma 18(ii) yield the first bound. For the second summand of (62), we apply the Pythagoras identity (61) together with Lemma 18(i). Combining (62)-(65), we end up with the desired contraction estimate. Using the same arguments as in Step 2, we obtain the claim. This concludes the proof of (51).
Proof of Theorem 8(b). The proof is split into three steps.
Step 1. Let j ∈ N. Recall the Pythagoras identity (61). We use stability (A1) and Step (iv) of Algorithm 5 to bound the estimator. With the Pythagoras identity (49), we may argue similarly to bound ∆_{jk}. Hence, it follows that ∆_{jk} ≲ ∆_{j(k−1)}.

Step 2. From Step 1, Lemma 19, and the geometric series (for the sum over k), we obtain a bound for the sum over all indices up to (j′, k′). For k′ < k(j′), inequality (52) and the geometric series (for the sum over j) yield the desired estimate. For k′ = k(j′), inequality (52), the geometric series, and Step 1 yield the same bound.

Step 3. According to the proof of [CFPP14, Lemma 4.9], estimate (66) guarantees (and is even equivalent to) the existence of 0 < q_lin < 1 such that the quasi-error ∆_{jk} is linearly convergent. Clearly, it holds that Λ_{jk} ≲ ∆_{jk}^{1/2} for all (j, k) ∈ Q. This and (67) conclude the proof.
6.5. Proof of Theorem 8(c). The proof of optimal convergence rates requires the following additional properties of the mesh-refinement strategy. For 3D BEM (and 2D NVB from Section 2.4), these properties are verified in [BDD04, Ste07, Ste08], and any assumption on T_0 is removed in [KPP13]. For 2D BEM (and the extended 1D bisection from Section 2.3), these properties are verified in [AFF+13].
(R1) splitting property: Each refined element is split into at least 2 and at most C_son ≥ 2 many sons, i.e., for all T_• ∈ T and all M_• ⊆ T_•, the refined mesh T_∘ = refine(T_•, M_•) satisfies

  #(T_• \ T_∘) + #T_• ≤ #T_∘ ≤ C_son #T_•.

(R2) overlay estimate: For all meshes T ∈ T and T_•, T_∘ ∈ refine(T), there exists a common refinement T_• ⊕ T_∘ ∈ refine(T_•) ∩ refine(T_∘) with

  #(T_• ⊕ T_∘) ≤ #T_• + #T_∘ − #T.

(R3) mesh-closure estimate: There exists C_mesh > 0 such that the sequence T_j with corresponding M_j ⊆ T_j, which is generated by Algorithm 5, satisfies

  #T_j − #T_0 ≤ C_mesh Σ_{ℓ=0}^{j−1} #M_ℓ  for all j ∈ N.
Another lemma, which we need for the proof of Theorem 8(c), shows that the iterates φ_{•k} of Algorithm 5 are close to the exact Galerkin approximation φ_• ∈ P0(T_•).
Finally, we need the following lemma which immediately shows "⇐=" in (31).
Lemma 22. Suppose (R1). For j ∈ N_0, let T̂_{j+1} = refine(T̂_j, M̂_j) with arbitrary, but non-empty M̂_j ⊆ T̂_j and T̂_0 = T_0. Let Q ⊆ N_0 × N_0 be an index set and φ_{jk} ∈ P0(T̂_j) for all (j, k) ∈ Q. Let s > 0 and suppose that the corresponding quasi-errors Λ_{jk} decay at rate s with respect to the number of elements. Then, it follows that ‖φ‖_{A_s} < ∞.
Proof. Due to the Pythagoras identity (61) and stability (A1), the exact Galerkin error is controlled by the quasi-error. Additionally, [BHP17, Lemma 22] provides a counting argument for the meshes T̂_j. Given N ∈ N_0, there exists an index j ∈ N_0 such that the number of elements is comparable to N. With (74)-(76), the approximation error on the corresponding meshes is bounded. Since the upper bound is finite and independent of N, this implies that ‖φ‖_{A_s} < ∞.
Proof of Theorem 8(c). With Lemma 22, it only remains to prove the implication "=⇒" in (31). The proof is split into three steps, where we may suppose that ‖φ‖_{A_s} < ∞.
Combining the last two estimates, we conclude the proof.
The Lax-Milgram theorem yields existence and uniqueness of u_• ∈ S¹(T_•) solving the corresponding Galerkin formulation. The associated weighted-residual error estimator is reliable; see [CS95, Car97] for d = 2 resp. [CMPS04] for d = 3.
In [Füh14,FFPS17a], optimal additive Schwarz preconditioners are derived for this setting. Hence, Algorithm 5 can also be used in the present setting. We refer to [FFK + 15, Section 3.3] for the fact that the axioms of adaptivity (A1)-(A4) from Proposition 16 remain valid for the hyper-singular integral equation. All other arguments in Section 6 rely only on general properties of the PCG algorithm (Section 6.2), the properties (A1)-(A4), and the Hilbert space setting of ||| · |||. Overall, this proves that our main results (Theorem 8 and Corollary 10) also cover the hyper-singular integral equation.