Compression of boundary integral operators discretized by anisotropic wavelet bases

Abstract. The present article is devoted to wavelet matrix compression for boundary integral equations when using anisotropic wavelet bases for the discretization. We provide a compression scheme which retains only O(N) relevant matrix coefficients in the system matrix without deteriorating the accuracy offered by the underlying Galerkin scheme. Here, N denotes the number of degrees of freedom in the related trial spaces. We validate our theoretical findings by numerical results.


Introduction
Many problems from engineering result in partial differential equations, which can often be solved efficiently by using the finite element method [3,4]. However, if the loading of the equation is zero, one may under some circumstances transform the partial differential equation in a domain into an integral equation on its boundary. Such an integral equation can then be solved using the boundary element method [23,26]. There are many practical problems which can be treated with the boundary element method, such as, for example, the Laplace equation or linear elasticity problems [22,26], scattering problems [6], and homogenization problems [1,5,19].
A huge advantage of the boundary element method is that the integration domain under consideration is reduced from an n-dimensional domain to an (n − 1)-dimensional hypersurface. This brings a significant reduction in the number of degrees of freedom but, since the integral kernels are in general nonlocal, also densely populated matrices.
To overcome the issue of dense matrices, fast boundary element methods have been developed, such as the adaptive cross approximation [2], the fast multipole method [12], or the wavelet matrix compression [8,24]. A comparison of these methods with respect to their advantages and disadvantages can be found in [14], for example. The computational cost of all these methods scales linearly or log-linearly in the number of degrees of freedom. Indeed, the wavelet matrix compression has been shown to have linear cost complexity, compare [8]. Moreover, the boundary integral operator is s∗-compressible [27], with the consequence of quasi-optimal convergence of the adaptive wavelet boundary element method [9,11,17,28].
However, all these works consider isotropic wavelets, meaning that the mesh of the underlying multiscale hierarchy consists of isotropic elements. Therefore, anisotropic singularities cannot be resolved properly. This leads to a loss in the convergence rate if the solution of the boundary integral equation exhibits such singularities. Anisotropic singularities, however, appear if the boundary under consideration contains edges, as is the case, for example, for Fichera's corner [17]. This gives rise to considering anisotropic tensor product wavelets, which are allowed to refine in one coordinate direction whilst staying coarse in the other coordinate direction. With such wavelet functions, the disadvantage of the isotropic wavelets can be overcome.
Anisotropic tensor wavelets for boundary integral equations have first been considered in [13] in the context of sparse tensor product approximations. In [20], a compression scheme has been developed for both isotropic and nonisotropic boundary integral operators which are discretized with respect to sparse tensor product spaces. This scheme leads to an essentially linear number of degrees of freedom therein, provided that the underlying integro-differential operator is of order 2q with 2q > (√5 − 1)/2 > 0. We, on the other hand, will construct in the present article a linearly scaling compression scheme for integral operators of arbitrary order which are discretized with respect to the full tensor product space. In particular, our compression estimates can be used to improve the results from [20]. Note, however, that the computation of the matrix entries of the compressed system matrix is not a topic of the present article. This can be done by using the techniques and results of [15,29].
The rest of the article is structured as follows. In Section 2, we introduce the boundary integral equation to be solved. Then, in Section 3, we define the anisotropic wavelet basis we shall use for the discretization on the unit square. Estimates on the size of the entries of the respective Galerkin matrix with respect to the unit square are derived in Section 4. The wavelet matrix compression is proposed in Section 5. The number of the remaining nonzero matrix entries is counted in Section 6. In Section 7, we generalize the wavelet matrix compression to the boundary of a Lipschitz domain. Consistency and convergence of the wavelet matrix compression are proven in Section 8. In Section 9, we provide numerical experiments to validate the results derived. Finally, in Section 10, we state concluding remarks.
Throughout the article, we avoid generic constants by using the notation A ≲ B, which means that A is bounded by a constant multiple of B; similarly, we write A ≳ B if and only if B ≲ A. If A ≲ B and B ≲ A, we write A ∼ B. Moreover, if j, j′ ∈ N_0^2 are given multiindices, the inequality j ≤ j′ is understood componentwise. Especially, the notation j < j′ means that j ≤ j′ and j ≠ j′. Finally, we set 1 := (1, 1).

Problem Formulation
2.1. Parametrization. Throughout this article, we consider a bounded, piecewise smooth domain Ω ⊆ R³ with Lipschitz boundary Γ := ∂Ω. We assume that Γ can be decomposed into r four-sided, smooth patches Γ_i, i = 1, . . ., r, such that Γ = ⋃_{i=1}^r Γ_i. This decomposition needs to be admissible, meaning that the intersection Γ_i ∩ Γ_j is for i ≠ j either empty, a common vertex, or a common edge of both Γ_i and Γ_j, cf. Figure 1. We next choose smooth diffeomorphisms γ_i : □ := [0, 1]² → Γ_i, uniformly bounded by constants c_i and C_i. This parametrization should fulfill the matching condition that γ_i and γ_j coincide, up to orientation, at a common edge of two neighbouring patches Γ_i and Γ_j.
2.2. Boundary integral equation. In the following, we intend to compute the solution u of the boundary integral equation Au = f, where A : H^q(Γ) → H^{−q}(Γ) is a given boundary integral operator. Typically, the kernel k of A is asymptotically smooth of order 2q, that is, k is singular only on the diagonal {x = y} and smooth apart from it, in the sense that |∂_x^α ∂_y^β k(x, y)| ≲ |x − y|^{−(2+2q+|α|+|β|)}, provided that 2 + 2q + |α| + |β| > 0. We assume that A is bounded and strongly elliptic on H^q(Γ), meaning that there exists a uniform constant c > 0 such that for any u ∈ H^q(Γ), we have ⟨Au, u⟩ ≥ c∥u∥²_{H^q(Γ)}.

Furthermore, for the sake of convenience, the operator A is assumed to be injective. If this is not the case, but its kernel is finite-dimensional and known in advance, then one can consider A as an operator on a suitable factor space and the presented approach is still valid. This is the case, for example, for any interior Neumann problem, where the kernel consists of all constant functions.

A practical example which can be written as a boundary integral equation is the homogeneous Laplace equation with Dirichlet boundary data in three spatial dimensions, i.e., Δv = 0 in Ω and v = g on Γ, for given boundary data g ∈ H^{1/2}(Γ). It is well known that this problem is uniquely solvable. As described in detail in [23,26], for example, we may write v ∈ H¹(Ω) as a layer potential of an unknown density u ∈ H^q(Γ), that is, v = Pu, where P is a linear and continuous boundary potential operator from H^q(Γ) to H¹(Ω). By taking the trace of the equation v = Pu, we arrive at a boundary integral equation for the density u. Especially, in the case of the single layer and the double layer potential, the kernels are given by k_s(x, y) = 1/(4π|x − y|) and k_d(x, y) = ⟨x − y, n(y)⟩/(4π|x − y|³), where n(y) denotes the outward unit normal at y ∈ Γ. It is easily seen that k_s and k_d are asymptotically smooth kernels of the orders 2q = −1 and 2q = 0, respectively.
2.3. Galerkin scheme. By multiplying (2) with a test function φ ∈ H^q(Γ), we derive the variational formulation of the boundary integral equation under consideration: find u ∈ H^q(Γ) such that ⟨Au, φ⟩ = ⟨f, φ⟩ for all φ ∈ H^q(Γ). Similar to [8], we consider a sequence of nested trial spaces V_J ⊆ V_{J+1} ⊆ . . . ⊆ H^q(Γ), which is asymptotically dense in H^q(Γ). For any fixed level J (reflecting a mesh width of size ∼ 2^{−J}), we restrict the variational formulation (6) to V_J. If V_J = span{ψ_1, . . ., ψ_{N_J}} ⊆ H^q(Γ), the Galerkin problem (7) is equivalent to a linear system of equations A_J u_J = f_J. Especially, by means of Céa's lemma, the solution u_J ∈ V_J satisfies a quasi-optimality estimate of the form ∥u − u_J∥_{H^q(Γ)} ≲ inf_{v_J ∈ V_J} ∥u − v_J∥_{H^q(Γ)}. Herein, the right-hand side can be estimated further by imposing more knowledge on the trial spaces V_J.
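To make the structure of the Galerkin system A_J u_J = f_J concrete, the following minimal Python sketch discretizes a toy one-dimensional integral equation of the second kind, u(x) + ∫₀¹ K(x, y)u(y) dy = f(x), with piecewise constant trial functions and midpoint quadrature. The kernel K, the data f, and the quadrature are illustrative assumptions and not the paper's setting; the sketch merely exhibits the densely populated system matrix produced by a nonlocal kernel.

```python
import numpy as np

# Illustrative sketch only (not the paper's discretization): a Galerkin
# scheme for u(x) + \int_0^1 K(x, y) u(y) dy = f(x) with a smooth toy
# kernel, piecewise-constant trial functions, and midpoint quadrature.

def galerkin_solve(N, K, f):
    h = 1.0 / N
    x = (np.arange(N) + 0.5) * h                 # panel midpoints
    # Galerkin matrix: identity (mass) part plus integral-operator part.
    A = h * np.eye(N) + h * h * K(x[:, None], x[None, :])
    b = h * f(x)                                 # load vector
    return np.linalg.solve(A, b), A

K = lambda x, y: np.exp(-(x - y) ** 2)           # smooth, nonlocal toy kernel
f = lambda x: np.ones_like(x)
u, A = galerkin_solve(200, K, f)
dense_fraction = np.count_nonzero(A) / A.size    # every entry is nonzero
```

Because K is strictly positive everywhere, every entry of A is nonzero, which is precisely the drawback that motivates the compression scheme of this article.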

Discretization
3.1. Single-scale bases. A natural choice of trial functions are piecewise polynomial functions, defined on the unit interval, tensorized, and then transported onto a surface patch Γ_i. We postpone this transportation to Section 7 and consider the unit square first. To this end, we first have to consider the unit interval I = [0, 1]. Given a level j, we want to construct a space V_j with dim V_j ∼ 2^j, which consists of piecewise polynomial functions on the dyadic intervals [k2^{−j}, (k + 1)2^{−j}], k = 0, 1, . . ., 2^j − 1. This is possible by choosing a suitable function φ and rescaling it according to φ_{j,k}(x) := 2^{j/2} φ(2^j x − k), k ∈ Δ_j, where Δ_j is a suitable index set. We remark that this scaling implies ∥φ_{j,k}∥_{L²([0,1])} ∼ 1.
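As a quick illustration of the normalization ∥φ_{j,k}∥_{L²([0,1])} ∼ 1, the following sketch implements the L²-normalized piecewise constant scaling functions; the specific choice φ = 1_{[0,1]} is an assumption matching the simplest case and not the paper's general φ.

```python
import numpy as np

# Minimal sketch of the L^2-normalized piecewise-constant scaling functions
# phi_{j,k}(x) = 2^{j/2} * phi(2^j x - k) with phi = 1_{[0,1]}, so that
# ||phi_{j,k}||_{L^2([0,1])} = 1 and dim V_j = 2^j.

def phi_jk(j, k):
    def f(x):
        x = np.asarray(x, dtype=float)
        return 2.0 ** (j / 2) * ((k * 2.0**-j <= x) & (x < (k + 1) * 2.0**-j))
    return f

# L^2 norm computed via a fine midpoint rule on [0, 1].
x = (np.arange(1 << 12) + 0.5) / (1 << 12)
norm = np.sqrt(np.mean(phi_jk(3, 5)(x) ** 2))   # = 1 by construction
```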
Outgoing from this construction, we can define anisotropic tensor product functions and the corresponding tensor product spaces. In particular, for j = (j1, j2) ∈ N_0^2 and k = (k1, k2) ∈ Δ_{j1} × Δ_{j2}, we set φ_{j,k} := φ_{j1,k1} ⊗ φ_{j2,k2}. With these functions, we define the trial space V_j := span{φ_{j,k} : k ∈ Δ_{j1} × Δ_{j2}}. The spaces V_j are characterized by their approximation order d ∈ N and their regularity γ > 0.

In the simplest case, we take the piecewise constant scaling function φ := 1_{[0,1]}. Then, for any fixed j ∈ N, we define the local trial functions φ_{j,k} as above. This yields the well-known approximation spaces V_j := span{φ_{j,k} : k ∈ Δ_j}, having the parameters γ = 1/2 and d = 1. In general, piecewise polynomial functions of order r result in an approximation order d = r. The regularity γ is, however, limited by the global smoothness of the trial functions: there holds γ = 1/2 if they are discontinuous, while γ = 3/2 if they are continuous.

3.2. Wavelet bases. Although the above method is very intuitive, it entails several difficulties. As the boundary integral operators under consideration are not local, the Galerkin discretization results in fully populated matrices. This drawback can, up to logarithmic terms, be overcome with fast boundary element methods like the fast multipole method [12]. An alternative approach is to consider specific linear combinations of piecewise polynomial trial functions, which are called wavelets. For a full introduction to this topic, see for example [10,16,24].
The general idea is to discretize the complement of V_{j−1} in V_j. Roughly speaking, given a function u_j in V_j, the projection P_{j−1}u_j ∈ V_{j−1} is a good approximation of u_j, and the difference u_j − P_{j−1}u_j can be expressed in terms of complementary basis functions. To this end, we fix a minimal level j_0 and introduce complement spaces W_j for all j > j_0, satisfying V_j = V_{j−1} ⊕ W_j. Similar as before, W_j is spanned by basis functions of the form ψ_{j,k}(x) := 2^{j/2} ψ(2^j x − k), k ∈ ∇_j. The function ψ is the so-called mother wavelet. Also here, we note that ∥ψ_{j,k}∥_{L²} ∼ 1. Moreover, it can be shown that {ψ_{j,k}}_{j,k} forms a Riesz basis of multiple Sobolev spaces if properly scaled, compare e.g. [24]. We remark that the identity V_j = V_{j−1} ⊕ W_j implies that V_J = W_{j_0} ⊕ W_{j_0+1} ⊕ · · · ⊕ W_J. For the sake of notational convenience, we set W_{j_0} := V_{j_0} and denote ψ_{j_0,k} := φ_{j_0,k} for all k ∈ ∇_{j_0} := Δ_{j_0}. By tensorizing (10) with itself, we arrive at a multiscale decomposition of the tensor product space. Hence, in view of (9), we can write V_J in terms of the tensor product wavelets ψ_{j,k} = ψ_{j1,k1} ⊗ ψ_{j2,k2} with k ∈ ∇_j := ∇_{j1} × ∇_{j2}.

3.3. Notation. Let us define some notation which we will use throughout the remainder of this article. First, we define the support of a univariate wavelet as Ω_{j,k} := supp ψ_{j,k} and, accordingly, for the tensor product wavelets, Ω_{j,k} := supp ψ_{j,k}.
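For the piecewise constant case, the splitting V_j = V_{j−1} ⊕ W_j can be made concrete with the orthonormal Haar filters; the following sketch is an illustration under this assumption, not the paper's general biorthogonal construction.

```python
import numpy as np

# Sketch of the two-scale split V_j = V_{j-1} (+) W_j for the Haar system:
# coefficients of u_j in V_j are mapped to coarse averages (the projection
# onto V_{j-1}) and details (the coordinates w.r.t. the wavelets of W_j).

def haar_split(c):
    s = np.sqrt(0.5)
    coarse = s * (c[0::2] + c[1::2])   # P_{j-1} u_j
    detail = s * (c[0::2] - c[1::2])   # u_j - P_{j-1} u_j in the W_j basis
    return coarse, detail

def haar_merge(coarse, detail):
    s = np.sqrt(0.5)
    c = np.empty(2 * len(coarse))
    c[0::2] = s * (coarse + detail)
    c[1::2] = s * (coarse - detail)
    return c

c = np.array([4.0, 2.0, -1.0, 3.0])
coarse, detail = haar_split(c)
rec = haar_merge(coarse, detail)       # exact reconstruction
```

Since the Haar filters are orthonormal, the split preserves the ℓ²-norm of the coefficient sequence, a toy instance of the Riesz basis property discussed below.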
Similarly, we define the singular support, i.e., the set of points at which a wavelet is not smooth, as Ω^σ_{j,k} := singsupp ψ_{j,k} and Ω^σ_{j,k} := singsupp ψ_{j,k}. For a pair of wavelets ψ_{j,k} and ψ_{j′,k′}, we let δ_tot := dist(Ω_{j,k}, Ω_{j′,k′}) and denote by δ_{x_i} the distance of the supports in the i-th coordinate direction. Moreover, we also define σ_{x_i} := dist(Ω_{j_i,k_i}, Ω^σ_{j′_i,k′_i}) if j_i ≤ j′_i, and σ_{x_i} := dist(Ω^σ_{j_i,k_i}, Ω_{j′_i,k′_i}) otherwise. Finally, given a wavelet ψ_{j,k}, we say that ψ_{j′,k′} is located in the far-field of ψ_{j,k} if there holds dist(Ω_{j,k}, Ω_{j′,k′}) ≳ 2^{−min{j,j′}}; otherwise, we say that ψ_{j′,k′} is located in the near-field of ψ_{j,k}. For the tensorized wavelets, this threshold is the maximal support length, which amounts to 2^{−min{j1,j2,j1′,j2′}}.

3.4. Some important wavelet properties. Wavelet functions have several very useful properties; see e.g. [7,10,24] for a full account. In this section, we restrict ourselves to the most important ones, which are needed in the remainder of this article.
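The near-field/far-field dichotomy above can be sketched in a few lines; the supports are modeled as dyadic rectangles of one cell (as for the Haar system), which is an assumption made for illustration, since the actual support width depends on the chosen wavelet.

```python
import numpy as np

# Hedged sketch: classifying a pair of tensor-product wavelet supports as
# near- or far-field. Supports are modeled as single dyadic cells
# [k_i 2^{-j_i}, (k_i + 1) 2^{-j_i}] in each coordinate direction.

def interval(j, k):
    return (k * 2.0**-j, (k + 1) * 2.0**-j)

def dist_1d(a, b):
    # distance of two intervals (0 if they touch or overlap)
    return max(0.0, max(a[0], b[0]) - min(a[1], b[1]))

def is_far_field(j, k, jp, kp):
    d = np.hypot(*(dist_1d(interval(j[i], k[i]), interval(jp[i], kp[i]))
                   for i in (0, 1)))
    threshold = 2.0 ** -min(*j, *jp)    # maximal support side length
    return bool(d > threshold)
```

For instance, on level j = j′ = (3, 3), the cells k = (0, 0) and k′ = (6, 6) lie in each other's far-field, while the touching cells k = (0, 0) and k′ = (1, 1) do not.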
First, it is well known (see e.g. [7,10]) that a one-dimensional wavelet basis on the interval possesses a unique, biorthogonal dual basis. By tensorizing this dual basis with itself, we obtain a biorthogonal dual basis on □, which we denote by ψ̃. This dual basis then provides the approximation order d̃ and the regularity γ̃ > 0.
As already stated in [13,24], the set Ψ := {ψ_{j,k} : j ≥ j_0, k ∈ ∇_j} forms a Riesz basis of L²(□), meaning that the ℓ²-norm of the coefficient sequence of an expansion is uniformly equivalent to its L²-norm. Moreover, in accordance with [7,13], the wavelets satisfy norm equivalences for the spaces H^s_•, where H^s_• denotes H^s for s ≥ 0 and the corresponding dual-scale space for s < 0. For a multiindex j, we define the (non-orthogonal) projection P_j onto V_j, whereas for J ∈ N, we define the projection P_J onto the full tensor product space V_J. By using a tensor argument, the duality, and the biorthogonality, the one-dimensional approximation property, which is derived e.g. in [24], generalizes to the tensor product setting. Moreover, there holds Bernstein's inequality.

Perhaps the most important property of the wavelets for the present article is that they have vanishing moments, also called the cancellation property, which is induced by the approximation order d̃ of the dual basis. Namely, in accordance with [7], there holds |⟨v, ψ_{j,k}⟩| ≲ 2^{−j(d̃+1/2)} |v|_{W^{d̃,∞}(Ω_{j,k})}. By explicitly expanding the tensor product structure of the wavelet ψ_{j,k}, we can immediately deduce the corresponding estimate for the tensor product wavelets.

Remark 3.1. Due to the tensor product structure of the wavelets, we must tensorize scaling functions on the coarsest level with wavelets on a finer level. This means that we cannot use d̃ vanishing moments in both directions. However, if I ⊆ {j1, j2} denotes the subset of indices corresponding to univariate wavelets with d̃ vanishing moments, we obtain the analogous estimate with the cancellation property exploited only in the directions contained in I.
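The cancellation property can be observed numerically for the simplest example: the Haar mother wavelet (an illustrative assumption; the paper's wavelets may have more vanishing moments) annihilates constants but not linear polynomials, i.e. it has exactly d̃ = 1 vanishing moment.

```python
import numpy as np

# Sketch: the Haar mother wavelet psi = 1_{[0,1/2)} - 1_{[1/2,1)} has one
# vanishing moment: \int psi(x) dx = 0, while \int x psi(x) dx = -1/4 != 0.

x = (np.arange(1 << 14) + 0.5) / (1 << 14)   # midpoint grid on [0, 1]
psi = np.where(x < 0.5, 1.0, -1.0)
m0 = np.mean(psi)                            # zeroth moment: exactly 0
m1 = np.mean(x * psi)                        # first moment: -1/4
```

The midpoint rule is exact for piecewise linear integrands here, so both moments are reproduced up to rounding.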

Matrix entry estimates
In order to develop a compression scheme for the operator A with respect to the wavelet basis Ψ, we need to estimate the entries ⟨Aψ_{i′,j′,k′}, ψ_{i,j,k}⟩ of the Galerkin matrix, where the wavelet function ψ_{i,j,k} is the lifting of the function ψ_{j,k} onto the patch Γ_i, i.e., ψ_{i,j,k}(x) := ψ_{j,k}(γ_i^{−1}(x)). For now, let us consider the situation r = 1, where the only patch present is the unit square □, in which case we can assume that γ_i = id. The discussion of the situation on a Lipschitz manifold is postponed to Section 7.
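The lifting ψ_{i,j,k} = ψ_{j,k} ∘ γ_i^{−1} can be sketched generically; the affine chart γ_i and the reference function below are purely hypothetical choices for illustration, since the paper's γ_i are general smooth diffeomorphisms.

```python
import numpy as np

# Sketch of the lifting psi_{i,j,k}(x) = psi_{j,k}(gamma_i^{-1}(x)): a
# reference function on the unit square is transported onto a patch
# Gamma_i via the inverse of a (here: toy affine) chart gamma_i.

def lift(psi_ref, gamma_inv):
    return lambda x: psi_ref(gamma_inv(x))

# hypothetical patch Gamma_i = [2, 4] x [1, 2] with
# gamma_i(s1, s2) = (2 + 2 s1, 1 + s2)
gamma_inv = lambda x: np.array([(x[0] - 2.0) / 2.0, x[1] - 1.0])
# toy reference function with a checkerboard sign pattern
psi_ref = lambda s: 1.0 if (s[0] < 0.5) ^ (s[1] < 0.5) else -1.0
psi_lift = lift(psi_ref, gamma_inv)
val = psi_lift(np.array([2.5, 1.75]))   # evaluates psi_ref at (0.25, 0.75)
```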

4.1. Far-field estimates. For the remainder of Section 4.1, we assume that δ_tot > 0, which means that the first compression [8,16,24] applies. There exist estimates for the entries by Reich [20,21], which make use of the vanishing moments of the one-dimensional wavelets with the smallest corresponding support length. This is especially useful when considering thin, long wavelets with a small distance. We quote the following result.

Theorem 4.1 ([20, Theorem 2.1.9]). For j, j′ ≥ j_0 + 1, there holds the corresponding far-field estimate, in which j^(1), j^(2) ∈ {j1, j1′, j2, j2′} ∩ [j_0 + 1, ∞) can be any two distinct indices; the best behaviour is obtained by choosing the two largest indices.
Let us next derive an estimate which makes use of the vanishing moments of every one-dimensional wavelet, which is beneficial if the supports of the wavelets ψ_{j,k} and ψ_{j′,k′} are small.

Theorem 4.2. For j, j′ ≥ j_0 + 1, there holds the estimate (18).

Proof. By explicitly expanding the tensor product structure of the wavelets, we can write the matrix entry as an iterated integral. We then use the vanishing moments of ψ_{j2′,k2′} to deduce the first bound.
Note that we can differentiate under the integral because the kernel k is smooth and bounded on Ω_{j,k} × Ω_{j′,k′}. The vanishing moments of ψ_{j1′,k1′} then allow us to proceed with the estimate.
By subsequently using the vanishing moments of ψ_{j2,k2} and ψ_{j1,k1} as well, we finally arrive at the stated bound.
Recalling that the kernel k is asymptotically smooth of order 2q, we can deduce (18). □

4.2. Near-field estimates. As we will see in Section 5, we may use the previous estimates only if a wavelet pair is in the far-field, meaning that the supports are sufficiently far apart. For the near-field, we need to derive different estimates. In this case, we explicitly expand the tensor product structure of the wavelets again. We will use an approach which is similar to the one developed in [20].
To this end, we define the dimensionally reduced kernel k1(x, x′) := ∫₀¹∫₀¹ k((x, y), (x′, y′)) ψ_{j2,k2}(y) ψ_{j2′,k2′}(y′) dy dy′ (19) and the operator A1 as the integral operator with the kernel k1. By definition, the kernel k1 depends on the wavelets ψ_{j2,k2} and ψ_{j2′,k2′}, but the context will always clarify this relation. Due to the tensor product structure, the dimensionally reduced operator obviously satisfies ⟨Aψ_{j′,k′}, ψ_{j,k}⟩ = ⟨A1ψ_{j1′,k1′}, ψ_{j1,k1}⟩. However, there are also vanishing moments of the wavelets hidden in the kernel k1, which can be used to improve the estimate (20) and hence also Theorem 2.1.7 in [20]:

Theorem 4.3. Assume that σ_{x1} > 0 and max{j1, j1′}, max{j2, j2′} > j_0. Then, we have the improved near-field estimate employed below.
Proof. We simply derive the appropriate estimate for the kernel k1, similar to (20). Then, the rest of the proof may be completed by following the arguments of [20].
If x ≠ x′, then the integrand in (19) is bounded, so we may directly differentiate under the integral. Moreover, let us without loss of generality assume that j2 > j2′. Then, the vanishing moments hidden in k1 yield the desired decay. As the remainder of the proof is based on the ideas of [8, Section 6], we just sketch it. Without loss of generality, we may assume that j1 ≤ j1′. In this case, ψ_{j1′,k1′} is located on a smooth part of the wavelet ψ_{j1,k1}, so we may decompose the corresponding integrand with the help of a smooth function f. This can be realized by Calderón's extension theorem [25] with ∥f∥_{H^s([0,1])} ≲ 2^{sj1}. For the function f, we define the operator by employing a smooth cutoff function χ satisfying χ|_{[0,1]} ≡ 1. Then, A1 is a pseudo-differential operator of order m = 1 + 2q + d̃, cf. [18]. Remarking that, in our case, we have c(j2, j2′) = 2^{−(j2+j2′)/2} 2^{−d̃j2}, we may apply [20, Lemma 2.1.4] and the fact that σ_{x1} ≲ 2^{−j1} to conclude. □

Remark 4.4. Up to now, we have only considered a reduction to the first coordinate direction. Nevertheless, as also done in [20], a reduction to the second coordinate direction is possible by a similar definition of the operator A2, and the same estimates hold with exchanged indices.

Matrix compression scheme
To keep the computational cost small, we need to introduce a compression scheme, according to which many matrix entries do not have to be calculated, whilst convergence with the full rate offered by the underlying Galerkin scheme is retained. We distinguish between the first compression and the second compression, but in either case, we require a matrix block error which is controlled by a level-dependent parameter σ_{j,j′}. Here, d̃ > d and κ > 0 are fixed parameters, κ sufficiently small, which are introduced in order to avoid logarithmic terms in the consistency estimates.
5.1. Far-field: First compression. In the case of the first compression, we consider a pair of wavelets ψ_{j,k} and ψ_{j′,k′} whose supports are located sufficiently far away from each other. As we will see, we need to estimate a sum of matrix coefficients by an integral, which requires that the minimal distance between the respective wavelets' supports is large enough. In two dimensions, we must have a minimal distance which is at least as wide as the largest side of the involved supports, namely 2^{−min{j1,j2,j1′,j2′}}. If this is not the case, however, we can make use of the tensor product structure and estimate the sum only in the coordinate direction x_i, which results in a minimal distance of 2^{−min{j_i,j_i′}}. This procedure basically follows [20], but is adapted here to the setting of the full tensor product space.

5.1.1. Compression in the x- and y-coordinate. For a fixed maximal level J, we define the compressed matrix A^{c1,1}_J of the first compression by discarding all entries with δ_tot > B_{j,j′}, for k ∈ ∇_j, k′ ∈ ∇_{j′}, and |j|_∞, |j′|_∞ ≤ J. Herein, the cutoff parameter B_{j,j′} is given by (23), where a > 0 is a fixed real number.
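The a-priori nature of the first compression can be sketched structurally: given the support distances δ_tot for one matrix block and the scalar cutoff for that level pair (whose concrete formula (23) is assumed given), entries beyond the cutoff are simply never computed. The distance model below is a hypothetical stand-in for illustration.

```python
import numpy as np

# Structural sketch of the first compression on one block: entries with
# delta_tot > B are discarded a priori; only the remaining ones would
# ever be assembled in an actual implementation.

def compress_block(A_block, delta_tot, B):
    # A_block, delta_tot: arrays indexed by (k, k'); B: scalar cutoff
    # for this level pair (j, j').
    kept = delta_tot <= B
    return np.where(kept, A_block, 0.0), int(kept.sum())

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 8))                       # toy block
delta = np.abs(np.subtract.outer(np.arange(8), np.arange(8))) / 8.0
Ac, nnz = compress_block(A, delta, B=0.25)            # keep a narrow band
```

With this toy distance model only the band |k − k′| ≤ 2 survives, i.e. 34 of the 64 entries.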
Theorem 5.1. Let R_J := A_J − A^{c1,1}_J. Then, for the matrix block R_{j,j′}, we have the estimate ∥R_{j,j′}∥₂ ≲ a^{−(2q+4d̃)} 2^{−σ_{j,j′}} with a generic constant that is independent of the refinement level J.
Proof. We proceed similarly to [8]. First, we define the set ∇^B_{j′} := {k′ ∈ ∇_{j′} : δ_tot > B_{j,j′}}. Then, we estimate the column sums of the block R_{j,j′}, where the last inequality is due to Theorem 4.2. By the compression rule (22), we have the relation δ_tot ≥ B_{j,j′}, and, since also B_{j,j′} ≳ 2^{−j1}, 2^{−j2}, we can estimate the sum by an integral.
As we also have B_{j,j′} ≥ a2^{−min{j1,j2,j1′,j2′}}, we obtain the bound (24). Using exactly the same arguments, we can likewise derive the estimate (25) for the row sums. Similar to [8], we now use the estimate ∥R_{j,j′}∥₂ ≤ (∥R_{j,j′}∥₁ ∥R_{j,j′}∥_∞)^{1/2} for the operator norm of a matrix, which gives us, together with (24) and (25), the desired result. □

Remark 5.2. Similar to [20], using Theorem 4.1 and a cutoff parameter B̃_{j,j′}, where j^(1), j^(2) ∈ {j1, j2, j1′, j2′}, we obtain a compression scheme which discards an entry if j^(1), j^(2) > j_0 and δ_tot > B̃_{j,j′}, and keeps the entry otherwise. The requirement j^(1), j^(2) > j_0 is necessary to ensure the validity of Theorem 4.1.
By modifying the appropriate calculations, we find that the corresponding difference matrix satisfies an estimate of the same type. This will be important when we consider the complexity, since we can then also compress matrix blocks in which scaling functions are involved in at most two coordinate directions.

5.1.2. Compression in only one coordinate direction. As remarked earlier, we need at least δ_tot ≳ 2^{−min{j1,j2,j1′,j2′}} in the proof of Theorem 5.1 to estimate the row and column sums of the matrix blocks by an integral. If this is not the case, we may estimate the sums by an integral in just one coordinate direction x_i. This leads to restrictions on the distance in only this coordinate direction. Especially when the term 2^{−min{j1,j2,j1′,j2′}} in (23) is too large, this approach is beneficial. As all the derivations can be found in [20], we just quote the results. We also remark that we have exchanged the |·|₁-norms from [20] with |·|_∞-norms in (21), since we are not working on a sparse tensor product space but on the full tensor product space.
Let us define the parameters B^{x_i}_{j,j′}. We can then define the compressed values by discarding an entry whenever δ_{x_i} exceeds this cutoff and keeping it otherwise, and subsequently the compressed matrix A^{c1,2}_J by applying this rule if j^(1), j^(2) > j_0 and keeping the entry otherwise. The latter definition is just a restriction of the compression to the matrix blocks A_{j,j′} for which j^(1), j^(2) > j_0, meaning that we can use Theorem 4.1 to estimate the corresponding matrix entries. With these definitions, the proof of Theorem 2.3.1 in [20] immediately yields the following result:

Theorem 5.3. Let R_J := A_J − A^{c1,2}_J. Then, the compressed matrix blocks satisfy the estimate ∥R_{j,j′}∥₂ ≲ 2^{−σ_{j,j′}} with a generic constant that is independent of the refinement level J.
By combining (22), (27), and (28), we can define the first compression A^{c1}_J of the matrix by (29). This compression affects the far-field of the system matrix in wavelet coordinates.

5.2. Near-field: Second compression. Up to now, we have considered wavelets with disjoint and distant supports. As we will see, we can also discard many entries if the supports of the wavelet pairs are close or even overlap, where the strict requirement is that the distance of the support of the smaller wavelet to the singular support of the larger wavelet is sufficiently big.
We will use only one direction for the second compression, as done by Reich in [20,21], but with improved parameters. We define the cutoff parameters F^{x_i}_{j,j′}. The compressed values are then obtained by discarding an entry whenever σ_{x_i} > F^{x_i}_{j,j′}. Similar as in the first compression, we define the corresponding compressed matrices A^{c2,1}_J and A^{c2,2}_J. Combining these two compression schemes leads to the second compressed matrix A^{c2}_J.

Remark 5.4. It suffices to compress only the entries with a2^{−min{j1,j2,j1′,j2′}} ≥ δ_{x1}, δ_{x2}. Otherwise, we have δ_tot > a2^{−min{j1,j2,j1′,j2′}} and the first compression applies, meaning that either the entries are zero or that, as we will see, there are only O(4^J) such entries.
For the remainder of this section, let us without loss of generality assume that j1 ≤ j1′. The following estimate holds:

Theorem 5.5. The matrix blocks of the perturbation matrix R_J := A_J − A^{c2}_J satisfy the estimate ∥R_{j,j′}∥ ≲ a^{−(1+2q+2d̃)} 2^{−σ_{j,j′}} with a generic constant that is independent of the refinement level J.
Proof. Let us first consider the case where j1 = min{j1, j2, j1′, j2′}. For convenience, let us define the appropriate index sets and, likewise, their counterparts with the roles of the wavelets exchanged. As one readily verifies, the cardinality of these sets can be bounded. Next, we recall that, according to Theorem 4.3, we have the corresponding estimate for the matrix entries.
This allows us to estimate the column sums Σ_{k∈∇_j} r_{(j,k),(j′,k′)}. Similarly, we may estimate the row sums. Hence, we can argue in complete analogy to the proof of Theorem 5.1.
By using exactly the same arguments, but with the coordinate directions interchanged, we may also control the compression error of the entries for which δ_{x1} > a2^{−min{j1,j1′}} and δ_{x2} ≤ a2^{−min{j2,j2′}}. This implies the control of the error for the whole matrix A^{c2,1}_J. For the matrix A^{c2,2}_J, we may use exactly the same arguments as in the proof of Theorem 2.3.2 in [20], with the only adaptation being that we have to use Theorem 4.3 to estimate the matrix entries instead of Theorem 2.1.7 in [20]. □
Finally, by using an additive argument, we can state the main theorem of this section.
Theorem 5.6. Consider the compressed matrix A^c_J resulting from the combination of the first and the second compression. Then, the block error is controlled as in Theorems 5.1, 5.3, and 5.5.

Remark 5.7. We have improved both the parameters E_{j,j′} and F^{x_i}_{j,j′} in comparison to [20]: for both E_{j,j′} and F^{x_i}_{j,j′}, we gain an additional factor 2^{−d̃ max{j_ℓ,j_ℓ′}}, ℓ ≠ i, in the above error estimates, but we have to pay another d̃ in the denominator. This is not mandatory to ensure linear complexity, but it reduces the number of required vanishing moments, for example in the case of piecewise constant wavelets for the single layer operator to as few as three, as is the case if an isotropic wavelet basis is used. Much more importantly, we gain the additional factor for E_{j,j′} in the above error estimates. This is strictly necessary to ensure linear complexity.

Complexity
We are now going to count the number of nonzero matrix coefficients of the compressed matrix A^c_J and show that this number is asymptotically bounded by N_J = 4^J. In the arguments below, it is crucial that the exponents are positive in order to estimate the sums asymptotically by their largest terms. To this end, for the sake of simplicity, we will for the remainder of this section assume that κ > 0 is sufficiently small. This is not problematic, since we only have a bounded number of restrictions on κ, depending only on the uniform constants d̃, d, and the order 2q of the operator. Moreover, as we will see, we need to require the inequalities (34). Since these inequalities are strict, there will always be room for a small κ > 0, which can be inserted between d and the minimum over all these terms. First of all, we note that the restriction of the compression to the appropriate matrix blocks in (27), (30), and (31) never causes a problem. Indeed, if we cannot compress a matrix block, then at least two indices in the set {j1, j2, j1′, j2′} are equal to j_0. In particular, as dim V_{j_0} ∼ 2^{j_0}, there are only O(2^{2j_0}) = O(1) rows and columns corresponding to such a situation. As every row and column contains at most O(N_J) entries, these are at most O(N_J) entries in total.
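The counting argument can be mimicked numerically in a simplified one-dimensional model: keep a pair (k, k′) on levels (j, j′) whenever the dyadic supports are closer than a hypothetical cutoff a·2^{−min{j,j′}}. Under this toy rule, each block keeps O(2^{max{j,j′}}) entries, so the total is roughly O(N log N) out of N² possible entries; the sharper level-dependent cutoffs of the actual scheme are what remove the remaining logarithmic factor. All parameters below are assumptions for illustration.

```python
# Toy 1D count of kept entries under the hypothetical distance rule
# "keep (k, k') iff the supports are closer than a * 2^{-min(j, j')}".

def count_kept(J, a=2.0):
    kept = 0
    for j in range(J + 1):
        for jp in range(J + 1):
            B = a * 2.0 ** (-min(j, jp))
            for k in range(2**j):
                lo1, hi1 = k * 2.0**-j, (k + 1) * 2.0**-j
                for kp in range(2**jp):
                    lo2, hi2 = kp * 2.0**-jp, (kp + 1) * 2.0**-jp
                    # interval gap (negative or zero if overlapping)
                    if max(lo1, lo2) - min(hi1, hi2) <= B:
                        kept += 1
    return kept

k5, k6 = count_kept(5), count_kept(6)   # kept entries for J = 5, 6
```

For J = 6 the full matrix has 127² entries, of which the toy rule keeps only a small, slowly growing fraction.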
For the compressed blocks, we organize the proof in the following steps. First, we split the unit square into at most nine regions, compare Figure 2, corresponding to the distance in each coordinate direction being either big or small. These nine regions correspond to the four possible cases (I)–(IV). In Section 6.1, we will show that already the first compression gives a linear complexity in the far-field cases. Then, in Section 6.2, we consider the wavelet pairs whose supports are closer together than a2^{−min{j1,j2,j1′,j2′}}, and we will show the linear complexity for those regions as well.
6.1. Complexity of the first compression. We proceed by the type of compression which is performed on the matrix entries. First, we count all the nontrivial entries remaining from the compression scheme (22).

Theorem 6.1. Assume that we set all matrix entries to zero where the underlying wavelets satisfy δ_tot > B_{j,j′} with B_{j,j′} given by (36). Then, only O(4^J) nontrivial entries remain.

Proof. In any column of a block A_{j,j′}, we find O(2^{|j′|₁} B²_{j,j′}) entries for which the distance of the supports is bounded by B_{j,j′}. Since there are O(2^{|j|₁}) columns in such a block, there are at most O(2^{|j|₁+|j′|₁} B²_{j,j′}) nonzero entries per block. Hence, the total complexity for the whole matrix is given by the sum over all blocks. To calculate the sum, we explicitly expand the indices j1 and j2. Then, after calculating the two sums over j2 and shifting the index in the second sum, we obtain the assertion. □

The remaining far-field situations are treated by distinguishing the four possible orderings of the indices j1, j1′, j2, j2′; in each case, choosing j^(1) and j^(2) as the two largest indices allows us to sum up directly and again yields at most O(4^J) nontrivial entries.

Let us now consider the last possible situation, that is, we suppose that the supports are close in both coordinate directions, which describes the region (IV) in (35), or the near-field region in Figure 2. In this case, the compression scheme (31) applies. Remember that we still assume that j1 ≤ j1′.

Lemma 6.5. Consider all matrix entries in A^c_J such that the underlying wavelet pairs ψ_{j,k} and ψ_{j′,k′} satisfy δ_{x1}, δ_{x2} ≤ a2^{−min{j1,j2,j1′,j2′}}. Then, after the compression scheme (31), there remain at most O(4^J) nontrivial entries.
Proof. It suffices to count the respective matrix entries of A^{c2,2}_J. Suppose first that j2 ≥ j2′. In this case, the number of nontrivial entries in the matrix block [A^{c2,2}_J]_{j,j′} can be estimated accordingly. Remarking that, for κ > 0 sufficiently small, we have d̃ − 2κ > 0, the properties |j|_∞ ≥ j1 and |j′|_∞ ≥ j2′ imply a corresponding bound on the number of entries. As the indices can be exchanged, the two resulting sums are equal and can be estimated in the same way.
Putting everything together, we obtain that C^(1) ≲ 4^J. In the case where j2′ ≥ j2, we need to argue in a slightly different way. By the assumption j1 ≤ j1′, we have at least O(2^{|j′|₁}) nontrivial entries in the matrix block [A^{c2,2}_J]_{j,j′}, as at every point at which the singular supports Ω^σ_{j1′,k1′} and Ω^σ_{j2′,k2′} intersect, there is at least one smaller wavelet ψ_{j,k} touching it. On the other hand, every nontrivial entry satisfies σ_{x1} ≤ F^{x1}_{j,j′} and σ_{x2} ≤ F^{x2}_{j,j′}, so the number of nontrivial entries in a block [A^{c2,2}_J]_{j,j′} can be estimated by (41). If the maximum in (41) is equal to 1, then it is easy to see that the count is at most O(4^J). If, however, the maximum in (41) is not equal to 1, we split the count into two sums. The first sum can be treated as in the previous case, whereas the second sum can be bounded directly. Hence, we can conclude that also C^(2) ≲ 4^J. □
With the preceding three lemmata and Theorem 6.1 at hand, we conclude the main result of this section.

Theorem 6.6. Assume that −q < d ≤ d̃ and that (34) holds. Then, the compressed matrix A^c_J arising from (33) contains at most O(4^J) nontrivial entries.
Remark 6.7. For the piecewise constant (d = 1) and piecewise bilinear (d = 2) wavelets used most often in practice, one needs at least the vanishing moments listed in the corresponding table. If 2q = 1, using piecewise constant wavelets is mathematically not meaningful, because the energy space H^{1/2}(Γ) cannot be discretized by discontinuous trial functions. Note that these are the same values as in the setting of isotropic wavelet bases, compare [8].

The situation on a Lipschitz manifold
Up to now, we have only considered the situation on the unit square. As stated in Section 2, we are, however, interested in the solution of a boundary integral equation posed on the boundary $\Gamma$ of a Lipschitz domain $D \subseteq \mathbb{R}^3$. Recall that the boundary $\Gamma$ is given as the union of the patches $\Gamma_i$, which can be smoothly parametrized by $\gamma_i : \square \to \Gamma_i$, where $\square := [0,1]^2$ denotes the unit square.

7.1. Sobolev spaces on manifolds. To properly define the Sobolev space $H^s(\Gamma)$, we choose for each patch $\Gamma_i$ an extension $\tilde\Gamma_i$ such that $\Gamma_i \subsetneq \tilde\Gamma_i \subseteq \Gamma$. Then, the family $\{\tilde\Gamma_i\}_i$ is an overlapping decomposition of $\Gamma$ and, by the Lipschitz continuity of $\Gamma$, every parametrization $\gamma_i$ can be used to construct a piecewise smooth, globally Lipschitz continuous parametrization $\tilde\gamma_i : \tilde\square \to \tilde\Gamma_i$, where $\tilde\square$ is a suitable superset of $\square$.
Next, we choose a partition of unity $\{\chi_i\}_{i=1}^r$, consisting of nonnegative, smooth functions such that $\operatorname{supp}\chi_i \subseteq \tilde\Gamma_i$ and $\sum_{i=1}^r \chi_i \equiv 1$ on $\Gamma$. We then define the Sobolev space $H^s(\Gamma)$ for $0 \le s \le 1$ as the space of functions $u : \Gamma \to \mathbb{R}$ for which the corresponding patchwise norm is finite. Obviously, this norm is induced by an inner product. For $-1 \le s < 0$, we define $H^s(\Gamma)$ as usual as the dual of $H^{-s}(\Gamma)$ with respect to the $L^2(\Gamma)$-inner product.
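Spelled out, a patchwise norm and inner product of the kind described here take the following standard form, stated with the extended parametrizations $\tilde\gamma_i$ and parameter domains $\tilde\square$; the precise scaling may differ from the authors' choice:

```latex
\|u\|_{H^s(\Gamma)}^2
  := \sum_{i=1}^{r} \big\| (\chi_i u) \circ \tilde\gamma_i \big\|_{H^s(\tilde\square)}^2,
\qquad
\langle u, v \rangle_{H^s(\Gamma)}
  := \sum_{i=1}^{r} \big\langle (\chi_i u) \circ \tilde\gamma_i,\,
                                 (\chi_i v) \circ \tilde\gamma_i \big\rangle_{H^s(\tilde\square)}.
```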
Remark 7.1. The above definition depends on the parametrizations $\gamma_i$ and the chosen partition of unity. However, it can be shown that the space $H^s(\Gamma)$ consists of the same functions, regardless of which parametrizations and which partition of unity are chosen [30]. Moreover, the requirement $|s| \le 1$ stems from the Lipschitz continuity of the $\gamma_i$. For a $C^{k,\alpha}$-surface, it is possible to define these spaces up to $|s| \le k + \alpha$.

7.2. Patchwise smooth wavelets. Similar to the existing literature, see e.g. [8,24], we discretize the energy space $H^q(\Gamma)$ with $q \le 1$ by transporting the wavelet functions from the unit square onto $\Gamma$. Precisely, we define the lifted basis functions patchwise and then consider the resulting basis set. To construct a trial space on the level $J$, we proceed as on the unit square, meaning that we cut the basis off at the level $J$. Note that the above wavelets are supported on a single patch $\Gamma_i$ only. In general, they are not continuous over the patch boundaries, so they only attain a regularity of $\gamma = \frac{1}{2}$, regardless of how smooth the wavelets are piecewise. It is possible to construct wavelets which are continuous over the patch boundaries, see e.g. [24], but $\gamma = \frac{1}{2}$ is sufficient if we want to discretize an operator of nonpositive order. Because the wavelets are supported on a single patch, and each parametrization and cutoff function is smooth, one can generalize all the wavelet properties stated in Section 3.4. For the same reason, the Lipschitz continuity of the surface is no restriction in using $\tilde d > 1$ vanishing moments for the cancellation property (17), since we do not have to consider the behaviour of the test function over the patch boundaries, and the expressions $|\cdot|_{W^{\tilde d,\infty}(\Omega_{i,\mathbf{j},\mathbf{k}})}$ are well-defined.
Finally, as the spaces $V_J$ coincide with the spaces spanned by all lifted dyadic indicator functions on the level $J$, we can directly quote the following lemma, compare [18].

Lemma 7.2. For a continuous, strongly elliptic, and injective operator $A : H^q(\Gamma) \to H^{-q}(\Gamma)$, the Galerkin discretization is stable for any sufficiently large $J$. Furthermore, let $u$ be the solution of (5) and $u_J$ the solution of (7). Then, the Galerkin solution converges, provided that $\Gamma$ is sufficiently regular.
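Written out, the estimates in Lemma 7.2 presumably take the standard Galerkin form; the convergence rate below is the one that reappears in Theorem 8.2, while the stability bound is the usual inf-sup condition:

```latex
\sup_{v_J \in V_J \setminus \{0\}}
  \frac{\langle A u_J, v_J \rangle_\Gamma}{\|v_J\|_{H^q(\Gamma)}}
  \gtrsim \|u_J\|_{H^q(\Gamma)}
  \quad \text{for all } u_J \in V_J,
\qquad
\|u - u_J\|_{H^q(\Gamma)} \lesssim 2^{J(q-d)} \|u\|_{H^d(\Gamma)}.
```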
Remark 7.3. The condition that $A$ is injective is, as already stated, not strictly necessary. It suffices if the kernel is finite-dimensional and known in advance.
7.3. Matrix estimates. As we will see, all the matrix estimates on a given surface can be concluded from the matrix estimates on the unit square. Depending on the situation of the two patches on which the wavelets are supported, we need to distinguish between several cases. To this end, let
$$\tilde K_{i,i'}(\mathbf{x}, \mathbf{x}') := K\big(\gamma_i(\mathbf{x}), \gamma_{i'}(\mathbf{x}')\big) \sqrt{\det\big(D\gamma_i^\top D\gamma_i\big)(\mathbf{x})} \sqrt{\det\big(D\gamma_{i'}^\top D\gamma_{i'}\big)(\mathbf{x}')} \tag{42}$$
with $1 \le i, i' \le r$ denote the transported kernel function. With the transported kernel function at hand, we find the representation (43), where we define $\hat A_{i,i'}$ as the integral operator with the transported kernel $\tilde K_{i,i'}$. Since the local parametrizations $\gamma_i$ and $\gamma_{i'}$ are smooth, the transported kernel functions also satisfy the decay property (44), provided that $2 + 2q + |\alpha| + |\alpha'| > 0$. Therefore, in view of (43) and (44), the far-field estimates of Section 4.1 hold true also in the case of piecewise smooth Lipschitz manifolds.
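The decay property in question is a Calderón–Zygmund type estimate for the transported kernel; its exponent can be read off from the stated condition $2 + 2q + |\alpha| + |\alpha'| > 0$:

```latex
\big| \partial_{\mathbf{x}}^{\alpha} \partial_{\mathbf{x}'}^{\alpha'}
      \tilde K_{i,i'}(\mathbf{x}, \mathbf{x}') \big|
\lesssim \big\| \gamma_i(\mathbf{x}) - \gamma_{i'}(\mathbf{x}') \big\|^{-(2 + 2q + |\alpha| + |\alpha'|)}.
```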
7.3.1. Wavelets supported on the same patch. Let us look at the easiest situation first. If we consider the interaction between $\psi_{i,\mathbf{j},\mathbf{k}}$ and $\psi_{i,\mathbf{j}',\mathbf{k}'}$, we can use the relations (43) and (44) together with $\|\gamma_i(\mathbf{x}) - \gamma_i(\mathbf{x}')\| \sim \|\mathbf{x} - \mathbf{x}'\|$ to conclude that the situation on a single patch is equivalent to that on the unit square. Therefore, the near-field estimates are also valid one-to-one, and we find at most $\mathcal{O}(4^J)$ nontrivial entries in the matrix block associated with $\Gamma_i \times \Gamma_i$. In particular, the compression error in each matrix block satisfies the same estimates as in Section 5.
7.3.2. Wavelets supported on patches with a common edge. Let us now assume that $\Gamma_i$ and $\Gamma_{i'}$ share a common edge $\Sigma$. For the sake of simplicity, we assume that there holds $\gamma_i(1, x_2) = \gamma_{i'}(0, x_2)$ for all $x_2 \in [0, 1]$; otherwise, we can apply suitable rotations such that this assumption holds. We refer to Figure 3 for a graphical illustration. For the near-field estimates, we need to interpret the coordinate directions in a meaningful way. This is quite intuitive in Figure 3, since we can simply define the $x$-direction as the direction across the edge, while the $y$-direction can be interpreted as the direction parallel to the edge. For Theorem 4.3, we followed the arguments in [20], where the smooth part of the bigger wavelet $\psi_{j_1,k_1}$ is extended to a smooth, compactly supported function. Then, also the operator $A_1$ is extended to a classical pseudo-differential operator, compare [18]. This is not so easy to do in the current situation, as the kernel $\tilde K_{i,i'}$ is no longer asymptotically smooth of the order $2q$, since it is only continuous over the edge $\Sigma$. Nevertheless, this difficulty can be overcome. For the sake of simplicity, let us denote the supports under consideration by $\tilde\Omega_{j_1,k_1} \times \tilde\Omega_{j_2,k_2}$ and $\tilde\Omega_{j_1',k_1'} \times \tilde\Omega_{j_2',k_2'}$. Then, there holds $\tilde\Omega_{j_1,k_1} \subseteq [0, 1]$ and $\tilde\Omega_{j_1',k_1'} \subseteq [1, 2]$ in (43). Hence, we either have $\sigma_{x_1} = 0$, in which case we cannot compress in the direction of $x_1$ at all, or the smooth extension function $f$ from the proof of Theorem 4.3 is equal to $0$, since the two wavelets are located on different patches. Hence, we can ignore everything about the pseudo-differential operator and only consider the complement function.
In the second coordinate direction, the situation looks more complicated, as $\tilde\Omega_{j_2,k_2}$ and $\tilde\Omega_{j_2',k_2'}$ may not be disjoint. In this case, we still need to argue with $\hat A_2$, which corresponds to the transported kernel. Although $\gamma$ is overall only Lipschitz continuous, it is indeed smooth in the second coordinate direction, as both $\gamma_i$ and $\gamma_{i'}$ are smooth and coincide on the common edge $\Sigma$. Thus, the arguments of [20] work in this case as well.
In the proof of Theorem 4.3, we have also made use of vanishing moments hidden in the kernel. This is possible here as well. Indeed, if e.g. $j_1 \ge j_1'$, then we only have to consider a term in the form of an essential supremum of the transported kernel, where the asymptotic estimate for $\tilde K_{i,i'}$ holds since $\gamma_i$ is smooth on $\Omega_{j_1,k_1} \times [0, 1]$ in view of (42). Hence, the same arguments as in the proof of Theorem 4.3 apply.

7.3.3. Patches with a common vertex. If the patches $\Gamma_i$ and $\Gamma_{i'}$ have a common vertex, we can use a similar argument as before. We assume, possibly after the application of suitable rotations and translations, that the common vertex $\mathbf{v}$ is the image of a corner of the parameter domain under both $\gamma_i$ and $\gamma_{i'}$. Concerning the first compression, the far-field arguments carry over in view of (43) and (44).

8. Consistency and convergence
In this section, we are going to show that the Galerkin scheme for the compressed operator converges as well as the Galerkin scheme for the uncompressed operator. This means that the wavelet matrix compression under consideration realizes the discretization error accuracy offered by the underlying Galerkin scheme.
Similar to [8], we define the compressed boundary integral operator $A^\varepsilon_J$ in accordance with the compressed matrix, which defines a continuous operator $H^s(\Gamma) \to H^{s-2q}(\Gamma)$ for all $-\gamma + 2q < s < \gamma$. In particular, this operator represents the compressed matrix $\mathbf{A}^\varepsilon_J$ with respect to the wavelet basis.

Theorem 8.1 (Consistency). Let $\mathbf{A}^\varepsilon_J$ denote the compressed matrix of the level $J$ with a parameter $a$ such that the compression estimates of Section 5 apply. Then, for $q \le s, t \le d$, the associated compressed operator $A^\varepsilon_J$ satisfies the estimate
$$\big| \langle (A - A^\varepsilon_J) u, v \rangle_\Gamma \big| \lesssim \varepsilon\, 2^{J(2q-s-t)} \|u\|_{H^s(\Gamma)} \|v\|_{H^t(\Gamma)},$$
which holds uniformly in $J$.
Proof. By the definition of the operators $A$ and $A^\varepsilon_J$, the biorthogonality, and the representation $u = \sum_{\mathbf{j},\mathbf{k}} u_{\mathbf{j},\mathbf{k}} \psi_{\mathbf{j},\mathbf{k}}$, together with all the block error estimates and the definition of $\sigma_{\mathbf{j},\mathbf{j}'}$ in (21), we obtain a bound consisting of two sums. To each of the two sums, we may apply the Cauchy–Schwarz inequality. Herein, the first sum can be estimated directly. Moreover, if $s \ge 0$, the second sum can be treated by the approximation property (15) of the spaces $V_n$, which yields a factor $2^{-sn} \|u\|_{H^s(\Gamma)}$.
Whereas, if $s < 0$, we can use the Bernstein inequality (16) instead. After applying the same procedure to $v$, we finally arrive at the bound $\varepsilon\, 2^{J(2q-s-t)} \|u\|_{H^s(\Gamma)} \|v\|_{H^t(\Gamma)}$, where the geometric sums converge since $q \le s, t \le d < d'$.
Our next goal is to show that the compressed wavelet scheme converges to the original solution. First, similar to [8], we keep in mind that Theorem 8.1 implies
$$\big| \langle (A - A^\varepsilon_J) u_J, v_J \rangle_\Gamma \big| \lesssim \varepsilon \|u_J\|_{H^q(\Gamma)} \|v_J\|_{H^q(\Gamma)}, \qquad u_J, v_J \in V_J.$$
In view of the strong ellipticity (4), we can conclude that
$$\big\langle u, \big(A^\varepsilon_J + (A^\varepsilon_J)^\star\big) u \big\rangle_\Gamma \ge (c - 2\varepsilon) \|u\|^2_{H^q(\Gamma)},$$
so the operator $A^\varepsilon_J$ is strongly elliptic for $\varepsilon$ sufficiently small, too. These two properties then imply that the operator $A^\varepsilon_J$ is stable in the sense that $\|A^\varepsilon_J u_J\|_{H^{-q}(\Gamma)} \sim \|u_J\|_{H^q(\Gamma)}$. With these two results at hand, we may deduce the following two theorems using the arguments of [8].

Theorem 8.2 (Convergence). Let $\varepsilon$ be sufficiently small such that $A^\varepsilon_J$ is strongly elliptic. Then, the solution $u_J = \sum_{|\mathbf{j}|_\infty \le J} \sum_{\mathbf{k} \in \nabla_{\mathbf{j}}} u_{\mathbf{j},\mathbf{k}} \psi_{\mathbf{j},\mathbf{k}}$ of the compressed matrix equation, where the coefficient vector $\mathbf{u}_J$ satisfies $\mathbf{A}^\varepsilon_J \mathbf{u}_J = \mathbf{g}_J$ with $[\mathbf{g}_J]_{\mathbf{j},\mathbf{k}} = \langle g, \psi_{\mathbf{j},\mathbf{k}} \rangle_\Gamma$, converges to the solution $u$ of (5) in $H^q(\Gamma)$, and the estimate
$$\|u - u_J\|_{H^q(\Gamma)} \lesssim 2^{J(q-d)} \|u\|_{H^d(\Gamma)}$$
holds.
Theorem 8.3 (Aubin–Nitsche). Let all the assumptions of Theorem 8.2 hold and moreover assume that $A : H^{t+q}(\Gamma) \to H^{t-q}(\Gamma)$ is an isomorphism for any $0 \le t \le d - q$. Then, we have the error estimate
$$\|u - u_J\|_{H^{q-t}(\Gamma)} \lesssim 2^{J(q-d-t)} \|u\|_{H^d(\Gamma)}.$$
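For the concrete setting of the numerical experiments below (single layer operator, $2q = -1$, piecewise constant wavelets, $d = 1$), the rates predicted by Theorems 8.2 and 8.3 can be evaluated mechanically; the helper below merely restates the stated bounds:

```python
q, d = -0.5, 1.0  # single layer operator (order 2q = -1), piecewise constants

def rate_exponent(t):
    """Exponent alpha in the predicted error bound
    ||u - u_J||_{H^{q-t}(Gamma)} <~ 2^{J*alpha} * ||u||_{H^d(Gamma)},
    valid for 0 <= t <= d - q by Theorems 8.2 and 8.3."""
    assert 0.0 <= t <= d - q
    return q - d - t

print(rate_exponent(0.0))    # energy norm H^{-1/2}(Gamma): rate 2^{-3J/2}
print(rate_exponent(d - q))  # weakest norm H^{-2}(Gamma): doubled rate 2^{-3J}
```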
These two theorems can be shown by using only the consistency, the ellipticity, the stability, and the approximation property, cf. [8].

9. Numerical computations
In this section, we present numerical experiments to validate the theoretical findings. We use piecewise constant wavelets with three vanishing moments and consider the single layer operator on the unit square. We first compute the full wavelet Galerkin matrix with $A$ being the single layer operator. Then, the obsolete entries are removed according to the compression scheme from Section 5. The numbers of nonzero entries are calculated and plotted (blue line) on the left-hand side of Figure 6. To this end, we have chosen the parameters $\tilde d = 3$, $d' = 1.1$, and $\kappa = 10^{-3}$, while for the bandwidth parameter $a$ the different values $a = 0.5, 1.0, 2.0$ have been considered. The system matrix for the anisotropic tensor product wavelet basis and its compressed counterpart with $4^7 = 16\,384$ rows and columns and $a = 1.0$ can be found in Figure 5. From a theoretical point of view, the number of nontrivial entries in each block can be bounded by
$$N_{\mathbf{j},\mathbf{j}'} \lesssim 2^{|\mathbf{j}|_1 + |\mathbf{j}'|_1} B^2_{\mathbf{j},\mathbf{j}'} + \min\big\{ 2^{-\min\{j_1, j_2, j_1', j_2'\}},\, E_{\mathbf{j},\mathbf{j}'} \big\} + F^{x_1}_{\mathbf{j},\mathbf{j}'} F^{x_2}_{\mathbf{j},\mathbf{j}'}.$$
Moreover, in accordance with (40) and (41), an additional summand of the form $2^{\min\{|\mathbf{j}|_1, |\mathbf{j}'|_1\}}$ is added if $\mathbf{j} \le \mathbf{j}'$ or $\mathbf{j}' \le \mathbf{j}$. The number of nontrivial entries in the whole compressed matrix can therefore also be bounded by summing these block bounds. This estimate is also shown (red line) in the plot on the left-hand side of Figure 6.
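The qualitative behaviour measured here can be reproduced with a toy model: the entries of a compressed wavelet Galerkin matrix decay polynomially away from the "diagonal" blocks, so thresholding a model matrix with such decay leaves $\mathcal{O}(N)$ nonzeros. The decay exponent, threshold, and model matrix below are our own stand-ins, not the actual compression scheme from Section 5:

```python
def nnz_after_threshold(N, eps=2e-3, p=3.0):
    """Count the entries of the model matrix a[i][j] = (1 + |i-j|)**(-p)
    that survive the cutoff |a[i][j]| >= eps. The polynomial decay mimics
    the size of compressed wavelet Galerkin entries."""
    return sum(1 for i in range(N) for j in range(N)
               if (1.0 + abs(i - j)) ** (-p) >= eps)

# The count grows linearly in N (a fixed band survives), not like N**2:
for N in (64, 128, 256):
    print(N, nnz_after_threshold(N))
```

Doubling $N$ roughly doubles the count, in contrast to the quadrupling of the total number of entries $N^2$.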

10. Conclusion
We have developed a matrix compression scheme for the boundary element method using anisotropic tensor product wavelets. In the end, we get a quasi-sparse matrix containing only $\mathcal{O}(N)$ nontrivial entries, whilst the approximate solution converges to the exact solution at the rate of the discretization error. This applies to every integral operator of arbitrary order on the unit square. On a Lipschitz geometry, however, the order of the integral operator is bounded by $2q < 1$ for a conforming method, since the underlying wavelet construction yields ansatz functions which are discontinuous across patch boundaries.
Similarly to [20,21], our compression scheme may be generalized to arbitrary spatial dimension on the unit cube. One would have to choose the compression parameters according to the location of the two tensor product wavelets relative to each other. Then, one should combine the first compression in all directions in which the two wavelets are in the far-field with a second compression for all directions in which the wavelets are in the near-field. In particular, optimal compression estimates in sparse tensor product spaces are possible.
In contrast, it is not yet known whether the anisotropic tensor product wavelet basis is $s^\star$-compressible, a property which was established for the isotropic wavelet basis in [27]. With the $s^\star$-compressibility at hand, it was shown in [9,11] that the adaptive wavelet boundary element method converges quasi-optimally.

Figure 1. A parametrization of Fichera's vertex. The different shadings represent the $r = 24$ different patches $\Gamma_j$.

Figure 2. Graphical illustration of the regions described in (35). Note that there are at most nine different regions.

Figure 3. Graphical illustration of the parametrization $\gamma$ in case of a common edge.

Figure 4. A possible Lipschitz continuous extension of the maps $\gamma_i$ and $\gamma_{i'}$ in case of a common vertex.

Figure 5. The wavelet Galerkin matrix (left) and its compressed version (right) for $J = 7$. We have used the parameters $a = 1.0$, $\tilde d = 3$, $d' = 1.1$, and $\kappa = 10^{-3}$. The colour indicates the absolute values of the matrix entries on a logarithmic scale.