1 Introduction

Many problems in engineering lead to partial differential equations, which can often be solved efficiently by the finite element method [3, 4]. However, if the load of the equation is zero, one may, under some circumstances, transform the partial differential equation in a domain into an integral equation on its boundary. Such an integral equation can then be solved using the boundary element method [28, 32]. Many practical problems can be treated with the boundary element method, such as the Laplace or linear elasticity problems [26, 32], scattering problems [7], and homogenization problems [1, 6, 23].

A huge advantage of the boundary element method is that the integration domain under consideration is reduced from \(n\) spatial dimensions to an \( (n-1) \)-dimensional surface. This brings a significant reduction in the number of degrees of freedom but, since the integral kernels are in general nonlocal, also densely populated matrices.

To overcome the dense matrices, fast boundary element methods have been developed, such as the adaptive cross approximation [2], the fast multipole method [16], or the wavelet matrix compression [9, 30]. A comparison of the advantages and disadvantages of these methods can be found in [18], for example. The computational cost of all these methods scales linearly or log-linearly in the number of degrees of freedom. Indeed, the wavelet matrix compression has been shown to have linear cost complexity, compare [9]. Moreover, the boundary integral operator is \(s^\star \)-compressible [33], with the consequence of quasi-optimal convergence for the adaptive wavelet boundary element method [10, 15, 21, 34].

However, all these works consider isotropic wavelets, meaning that the mesh of the underlying multiscale hierarchy consists of isotropic elements. Therefore, anisotropic singularities cannot be resolved properly. This leads to a loss in the convergence rate if the solution of the boundary integral equation exhibits such singularities. Anisotropic singularities, however, appear if the boundary under consideration contains edges, as is the case, for example, for Fichera’s corner [21]. This gives rise to considering anisotropic tensor product wavelets, which are allowed to refine in one coordinate direction whilst staying coarse in the other. With such wavelet functions, the disadvantage of the isotropic wavelets might be overcome.

Anisotropic tensor wavelets for boundary integral equations were first considered in [17] in the context of sparse tensor product approximations. In [24], a compression scheme has been developed for both isotropic and nonisotropic boundary integral operators which are discretized with respect to sparse tensor product spaces. This scheme leads to an essentially linear number of degrees of freedom, provided the underlying integro-differential operator is of the order \(2q> \frac{1}{2}(\sqrt{5}-1) > 0\). We, on the other hand, will construct in the present article a linearly scaling compression scheme for integral operators of arbitrary order which are discretized with respect to the full tensor product space. In particular, our compression estimates improve the results from [24]. Note, however, that the computation of the matrix entries of the compressed system matrix is not a topic of the present article. It can be done by using the techniques and results of [19, 36], exploiting the exponentially convergent quadrature scheme proposed in [27].

The rest of the article is structured as follows. In Sect. 2, we introduce the boundary integral equation to be solved. Then, in Sect. 3, we define the anisotropic wavelet basis we shall use for the discretization on the unit square. Estimates on the size of the entries of the respective Galerkin matrix with respect to the unit square are derived in Sect. 4. The wavelet matrix compression is proposed in Sect. 5. The number of the remaining nonzero matrix entries is counted in Sect. 6. In Sect. 7, we generalize the wavelet matrix compression to the boundary of a Lipschitz domain. Consistency and convergence of the wavelet matrix compression is proven in Sect. 8. In Sect. 9, we provide numerical experiments to validate the results derived. Finally, in Sect. 10, we state concluding remarks.

Throughout the article, we avoid generic constants by using the notation \(A \lesssim B\), which means that \(A\) is bounded by a constant multiple of \(B\); similarly, we define \(A \gtrsim B\) if and only if \(B \lesssim A\). If \(A \lesssim B\) and \(B \lesssim A\), we write \(A \sim B\). Moreover, if \({\textbf {j}} , {\textbf {j}} '\in \mathbb {N}_0^2\) are given multiindices, the inequality \({\textbf {j}} \le {\textbf {j}} '\) is understood componentwise. In particular, the notation \({\textbf {j}} <{\textbf {j}} '\) means that \({\textbf {j}} \le {\textbf {j}} '\) and \({\textbf {j}} \not ={\textbf {j}} '\). Finally, we set \(\textbf{1} \mathrel {\mathrel {\mathop :}=}(1,1)\).

2 Problem formulation

2.1 Parametrization

Throughout this article, we consider a bounded, piecewise smooth domain \(\varOmega \subset \mathbb {R}^3\) with Lipschitz boundary \(\varGamma \mathrel {\mathrel {\mathop :}=}\partial \varOmega \). We assume that \(\varGamma \) can be decomposed into \(r\) four-sided, smooth patches \(\varGamma _i\), \(i = 1, \ldots , r\), such that

$$\begin{aligned} \varGamma = \bigcup _{i =1}^r \varGamma _i. \end{aligned}$$

This decomposition needs to be admissible, meaning that the intersection \(\varGamma _i \cap \varGamma _j\) is for \(i \ne j\) either empty, a common vertex, or a common edge of both \(\varGamma _i\) and \(\varGamma _j\), cf. Fig. 1. We next choose smooth diffeomorphisms \(\varvec{\gamma }_i: \square \mathrel {\mathrel {\mathop :}=}[0,1]^2\rightarrow \varGamma _i\) such that there exist constants \(c_i\) and \(C_i\) with

$$\begin{aligned} 0< c_i \le \kappa _i(\hat{\varvec{x}}) \mathrel {\mathrel {\mathop :}=}\sqrt{ \det \big ( {\textbf {D}} \varvec{\gamma }_i(\hat{\varvec{x}})^\intercal {\textbf {D}} \varvec{\gamma }_i(\hat{\varvec{x}}) \big )} \le C_i < \infty ,\quad \hat{\varvec{x}} \in \square . \end{aligned}$$
(1)

This parametrization should fulfil the matching condition that \(\varvec{\gamma }_i\) and \(\varvec{\gamma }_j\) coincide up to orientation at a common edge of two neighbouring patches \(\varGamma _i\) and \(\varGamma _j\).

Fig. 1: A parametrization of Fichera’s vertex. The different shadings represent the different \(r = 24\) patches \(\varGamma _j\)

For the present type of surfaces, Sobolev spaces \(H^s(\varGamma )\) can be introduced along the lines of [28, 32] for all \(-s_\varGamma \le s\le s_\varGamma \), where \(s_\varGamma \) depends on the global smoothness of the surface. For example, we have \(s_\varGamma = 1\) in the case of a Lipschitz domain \(\varOmega \). Moreover, the inner product

$$\begin{aligned} \langle u, v \rangle _\varGamma := \int _\varGamma u(\varvec{x}) v(\varvec{x}) \; {\text {d}}S_{\varvec{x}} = \sum _{i=1}^r \int _{\square } u(\varvec{\gamma }_i({\hat{\varvec{x}}})) v(\varvec{\gamma }_i({\hat{\varvec{x}}})) \kappa _i({\hat{\varvec{x}}})\; {\text {d}}{\hat{\varvec{x}}} \end{aligned}$$

can be continuously extended to the duality pairing \(H^{-s}(\varGamma )\times H^s(\varGamma )\).

2.2 Boundary integral equation

Let \(\mathcal {A}: H^q(\varGamma )\rightarrow H^{-q}(\varGamma )\) be a boundary integral operator which, for sufficiently smooth \(u\), is pointwise defined by

$$\begin{aligned} \mathcal {A}u(\varvec{x}) \mathrel {\mathrel {\mathop :}=}\int _\varGamma K(\varvec{x}, \varvec{y}) u(\varvec{y}) \; {\text {d}}S_{\varvec{y}}, \quad \varvec{x}\in \varGamma . \end{aligned}$$

In the following, we intend to calculate the solution \(u\in H^q(\varGamma )\) of the boundary integral equation

$$\begin{aligned} \mathcal {A}u = g\quad \text {in}\ H^{-q}(\varGamma ). \end{aligned}$$
(2)

Typically, the kernel \(K\) is asymptotically smooth of the order \(2q\), that is, \(K\) is singular only on the diagonal \( \{ (\varvec{x}, \varvec{y}) \in \varGamma \times \varGamma : \varvec{x}= \varvec{y}\}\) and smooth apart from it in terms of

$$\begin{aligned} \big | \partial _{\varvec{x}}^{\varvec{\upalpha }} \partial _{\varvec{y}}^{\varvec{\upbeta }} K (\varvec{x}, \varvec{y}) \big | \le C_{\varvec{\upalpha }, \varvec{\upbeta }} \Vert \varvec{x}- \varvec{y}\Vert ^{-(2 + 2q + |\varvec{\upalpha }| + |\varvec{\upbeta }|)}, \quad \varvec{\upalpha }, \varvec{\upbeta }\in \mathbb {N}_0^2, \end{aligned}$$
(3)

provided that \(2 + 2q + |\varvec{\upalpha }| + |\varvec{\upbeta }| > 0\). We assume that \(\mathcal {A}\) is bounded and also strongly elliptic on \(H^q(\varGamma )\), meaning that there exists a uniform constant \(c > 0\), such that for any \(u \in H^q(\varGamma )\), we have

$$\begin{aligned} \big \langle (\mathcal {A}+ \mathcal {A}') u, u \big \rangle _\varGamma \ge c \Vert u\Vert _{H^q(\varGamma )}^2, \end{aligned}$$
(4)

where \(\mathcal {A}':H^q(\varGamma )\rightarrow H^{-q}(\varGamma )\) is the dual operator of \(\mathcal {A}\). Furthermore, for the sake of convenience, the operator \(\mathcal {A}\) is assumed to be injective. If this is not the case, but if its kernel is finite-dimensional and known in advance, then one can consider \(\mathcal {A}\) as an operator

$$\begin{aligned} \mathcal {A}: \quad H^q(\varGamma ) / \ker \mathcal {A}\quad \rightarrow \quad \big ( H^q(\varGamma ) / \ker \mathcal {A}\big )' \end{aligned}$$

and the presented approach remains valid. This is the case, for example, for any interior Neumann problem, where the kernel consists of all constant functions.

A practical example, which can be written as a boundary integral equation, is the homogeneous Laplace problem in three spatial dimensions with given Dirichlet boundary data \(g \in H^{\frac{1}{2}}(\varGamma )\), i.e.,

$$\begin{aligned} \varDelta v = 0\ \text {in } \varOmega , \qquad v = g\ \text {on } \varGamma . \end{aligned}$$

It is well known that this problem is uniquely solvable. As described in detail in [28, 32], for example, we may write \(v\in H^1(\varOmega )\) as a layer potential of an unknown density \(u \in H^q(\varGamma )\), that is \(v = \mathcal {P}u\), where \(\mathcal {P}\) is a linear and continuous boundary potential operator from \(H^{q}(\varGamma )\) to \(H^1(\varOmega )\). By applying the trace operator to both sides of the equation \(v = \mathcal {P}u\), we arrive at the boundary integral equation

$$\begin{aligned} \mathcal {A}u \mathrel {\mathrel {\mathop :}=}{\text {tr}} \big (\mathcal {P}u \big ) = g. \end{aligned}$$
(5)

In particular, in the case of the single layer and the double layer potential, the kernels are given by

$$\begin{aligned} K_s(\varvec{x}, \varvec{y}) = \frac{1}{4 \pi \Vert \varvec{x}- \varvec{y}\Vert }, \ \ K_d(\varvec{x}, \varvec{y}) = \frac{\langle \varvec{x}- \varvec{y}, \varvec{n}_{\varvec{y}} \rangle }{4 \pi \Vert \varvec{x}- \varvec{y}\Vert ^3} \ \ \text {for a.e. }(\varvec{x},\varvec{y}) \in \overline{\varOmega }\times \varGamma , \end{aligned}$$

where \(\varvec{n}_{\varvec{y}}\) denotes the outward-directed normal at \(\varvec{y}\). It can easily be seen that \(K_s\) and \(K_d\) are asymptotically smooth kernels of the orders \(2q=-1\) and \(2q=0\), respectively. Operators of positive order arise for example from the hypersingular integral equation or in case of nonlocal diffusion, compare [14].
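For illustration, both kernels translate directly into code. The following minimal Python sketch (the function names `single_layer_kernel` and `double_layer_kernel` are ours and not part of any library) evaluates them at given points:

```python
import numpy as np

def single_layer_kernel(x, y):
    """K_s(x, y) = 1 / (4*pi*||x - y||), an operator of order 2q = -1."""
    return 1.0 / (4.0 * np.pi * np.linalg.norm(x - y))

def double_layer_kernel(x, y, n_y):
    """K_d(x, y) = <x - y, n_y> / (4*pi*||x - y||^3), of order 2q = 0."""
    d = x - y
    return np.dot(d, n_y) / (4.0 * np.pi * np.linalg.norm(d) ** 3)

# Example: two points on the unit sphere with outward normal n_y = y.
x = np.array([1.0, 0.0, 0.0])
y = np.array([0.0, 1.0, 0.0])
print(single_layer_kernel(x, y), double_layer_kernel(x, y, y))
```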

2.3 Galerkin scheme

By testing (2) with \(\phi \in H^{q}(\varGamma )\) with respect to the duality pairing on \(H^{-q}(\varGamma )\times H^q(\varGamma )\), we derive the variational formulation of the boundary integral equation under consideration:

$$\begin{aligned} \text {find } u \in H^q(\varGamma ) \text { such that } \langle \mathcal {A}u,\phi \rangle _\varGamma = \langle g,\phi \rangle _\varGamma \ \text { for any } \phi \in H^{q}(\varGamma ). \end{aligned}$$
(6)

Similar to [9], we are considering a sequence of nested trial spaces \(V_{{\textbf {J}} } \subset V_{{\textbf {J}} +{\textbf {1}} } \subset \dots \subset H^q(\varGamma )\), which is asymptotically dense in \(H^q(\varGamma )\). Here and in the remainder of this article, for \({\textbf {J}} \mathrel {\mathrel {\mathop :}=}(J,J)\), the space \(V_{{\textbf {J}} }\) denotes a trial space of piecewise polynomials on an isotropic mesh with mesh width of size \(\sim 2^{-J}\). For any fixed level J, we then restrict the variational formulation (6) to \(V_{{\textbf {J}} }\) to obtain the Galerkin problem

$$\begin{aligned} \text {find } u_J \in V_{{\textbf {J}} } \text { such that } \langle \mathcal {A}u_J,\phi \rangle _\varGamma = \langle g,\phi \rangle _\varGamma \ \text { for any } \phi \in V_{{\textbf {J}} }. \end{aligned}$$
(7)

If \(V_{{\textbf {J}} } = {{\,\textrm{span}\,}}\{ \phi _1, \ldots , \phi _{N_J} \} \subset H^{q}(\varGamma )\), the Galerkin problem (7) is equivalent to the linear system of equations

$$\begin{aligned} {\textbf {A}} _J {\textbf {u}} _J = {\textbf {g}} _J, \ {\textbf {A}} _J = \big [ \big \langle \mathcal {A}\phi _\ell , \phi _k \big \rangle _\varGamma \big ]_{k, \ell = 1}^{N_J}, \ {\textbf {g}} _J = \big [ \big \langle g,\phi _k \big \rangle _\varGamma \big ]_{k = 1}^{N_J}, \ {\textbf {u}} _J = \big [ c_k \big ]_{k = 1}^{N_J}, \end{aligned}$$

where \(u_J = \sum _{k=1}^{N_J} c_k \phi _k\). In particular, by means of Céa’s lemma, the solution \(u_J\in V_{{\textbf {J}} }\) satisfies an estimate of the form

$$\begin{aligned} \Vert u-u_J\Vert _{H^q(\varGamma )} \lesssim \inf _{v_J\in V_{{\textbf {J}} }}\Vert u-v_J\Vert _{H^q(\varGamma )}. \end{aligned}$$

Herein, the right-hand side can be estimated further by exploiting additional properties of the trial spaces \(V_{{\textbf {J}} }\).

3 Discretization

3.1 Single-scale bases

A natural choice of trial functions are piecewise polynomials, defined on the unit interval, tensorized, and then transported onto a surface patch \(\varGamma _i\). We postpone this transportation to Sect. 7 and consider the unit square first. To this end, we have to start with the unit interval \(I = [0, 1]\). Given a level \(j\), we want to construct a space \(V_j\) with \(\dim V_j\sim 2^j\), which consists of piecewise polynomial functions on the dyadic intervals \([2^{-j}k, 2^{-j}(k+1)]\), \(k = 0, 1, \ldots , 2^{j}-1\). This is achieved by choosing a suitable function \(\phi \) and rescaling it according to

$$\begin{aligned} \phi _{j, k}(x) \mathrel {\mathrel {\mathop :}=}2^{\frac{j}{2}} \phi \big ( 2^{j}x - k \big ), \quad k \in \varDelta _j, \quad x \in [0, 1], \end{aligned}$$

where \(\varDelta _j\) is a suitable index set. Note that this scaling implies \(\Vert \phi _{j, k}\Vert _{L^2([0, 1])}\sim 1\).

Starting from this construction, we can define anisotropic tensor product functions and the corresponding tensor product spaces. In particular, for \({\textbf {j}} = (j_1, j_2) \in \mathbb {N}_0^2\) and \({\textbf {k}} = (k_1, k_2) \in \varDelta _{{\textbf {j}} } \mathrel {\mathrel {\mathop :}=}\varDelta _{j_1} \times \varDelta _{j_2}\), we define the tensor product function

$$\begin{aligned} \phi _{{\textbf {j}} , {\textbf {k}} }(\varvec{x}) \mathrel {\mathrel {\mathop :}=}\big ( \phi _{j_1, k_1} \otimes \phi _{j_2, k_2} \big ) (\varvec{x}) = \phi _{j_1, k_1}(x_1) \phi _{j_2, k_2}(x_2), \quad \varvec{x}\in \square . \end{aligned}$$

With these functions, we define for \({\textbf {J}} =(J,J)\) the trial space

$$\begin{aligned} V_{{\textbf {J}} } \mathrel {\mathrel {\mathop :}=}{{\,\textrm{span}\,}}\big \lbrace \phi _{{\textbf {J}} , {\textbf {k}} }: \,{\textbf {k}} \in \varDelta _{{\textbf {J}} } \big \rbrace . \end{aligned}$$
(8)

The spaces \(V_{{\textbf {J}} }\) are said to have the approximation order \(d \in \mathbb {N}\) given by

$$\begin{aligned} d = \sup \left\{ s \in \mathbb {R}:\inf _{v_{J} \in V_{{\textbf {J}} }} \big \Vert v - v_{J}\big \Vert _{L^2(\square )} \lesssim 2^{-Js} \Vert v\Vert _{H^s(\square )} \ \text { for any } v \in H^s(\square ) \right\} \end{aligned}$$

and the regularity \(\gamma \) given by

$$\begin{aligned} \gamma = \sup \left\{ s \in \mathbb {R}:V_{{\textbf {J}} } \subset H^s(\square ) \right\} . \end{aligned}$$

In the simplest case, we take piecewise constant scaling functions which are defined by the indicator function \(\phi \mathrel {\mathrel {\mathop :}=}\mathbb {1}_{[0, 1]}\). Then, for any fixed \(j \in \mathbb {N}\), we define the local trial functions

$$\begin{aligned} \phi _{j, k} = 2^{\frac{j}{2}} \mathbb {1}_{[2^{-j}k,2^{-j}(k+1)]} = 2^{\frac{j}{2}} \phi (2^j \cdot - k),\quad k \in \varDelta _j\mathrel {\mathrel {\mathop :}=}\{ 0, 1, \ldots , 2^{j} - 1\}. \end{aligned}$$

This yields the well-known approximation spaces \(V_j \mathrel {\mathrel {\mathop :}=}{{\,\textrm{span}\,}}\{ \phi _{j, k}: k \in \varDelta _j\}\), having the parameters \(\gamma = \frac{1}{2}\) and \(d = 1\).
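As a quick sanity check, the dyadic rescaling and the normalization \(\Vert \phi _{j, k}\Vert _{L^2([0, 1])}\sim 1\) can be verified numerically. The following Python sketch (all names ours) implements the piecewise constant scaling functions from above:

```python
import numpy as np

def phi(x):
    """Indicator function of [0, 1], the piecewise constant scaling function."""
    return np.where((x >= 0.0) & (x <= 1.0), 1.0, 0.0)

def phi_jk(j, k, x):
    """Rescaled scaling function phi_{j,k}(x) = 2^{j/2} phi(2^j x - k)."""
    return 2.0 ** (j / 2.0) * phi(2.0 ** j * x - k)

# The L2 norm of every phi_{j,k} is ~1, as claimed after the rescaling rule.
x = np.linspace(0.0, 1.0, 2 ** 16, endpoint=False)
h = x[1] - x[0]
for j in range(5):
    for k in range(2 ** j):
        norm = np.sqrt(np.sum(phi_jk(j, k, x) ** 2) * h)
        assert abs(norm - 1.0) < 1e-2
```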

In general, piecewise polynomial functions of order \(r\) result in an approximation order \(d = r\). The regularity \(\gamma \) is, however, limited by the global smoothness of the trial functions. There holds \(\gamma = \frac{1}{2}\) if they are discontinuous, whereas \(\gamma = \frac{3}{2}\) holds if they are continuous.

We note that we could also define trial spaces \(V_{{\textbf {j}} }\) for any anisotropic level \({\textbf {j}} =(j_1, j_2)\) with \(j_1\ne j_2\). This would, however, also require adaptations in the definitions of \(d\) and \(\gamma \).

3.2 Wavelet bases

Although the above construction is very intuitive, it entails serious difficulties. As the boundary integral operators under consideration are not local, the Galerkin discretization results in fully populated matrices. This drawback can, up to logarithmic terms, be overcome with fast boundary element methods like the fast multipole method [16]. An alternative approach is to consider specific linear combinations of piecewise polynomial trial functions, which are called wavelets. For a full introduction to this topic, see for example [13, 20, 30].

The general idea is to discretize the complement of \(V_{j-1}\) in \(V_j\). Roughly speaking, given a function \(u_j\) in \(V_j\), the projection \(Q_{j-1}u_j \in V_{j-1}\) is a good approximation of \(u_j\), and the difference \(u_j - Q_{j-1} u_j\) can be expressed in terms of complementary basis functions. To this end, we fix a minimal level \(j_0 \in \mathbb {N}_0\) and introduce complement spaces \(W_j\) for all \(j > j_0\), satisfying

$$\begin{aligned} V_j = V_{j-1} \oplus W_j. \end{aligned}$$
(9)

Similar as before, \(W_j\) is spanned by basis functions of the form

$$\begin{aligned} \psi _{j, k}\mathrel {\mathrel {\mathop :}=}2^{\frac{j-1}{2}} \psi \big (2^{j-1}\cdot - k\big ), \quad k \in \nabla _j, \quad j > j_0, \end{aligned}$$
(10)

where \(\psi \) is a so-called mother wavelet. Note that by construction, we have \(|\nabla _j|\sim 2^{j-1}\). Also here, there holds \(\Vert \psi _{j, k}\Vert _{L^2([0, 1])} \sim 1\). We remark that the identity (9) implies that

$$\begin{aligned} V_j = V_{j_0} \oplus W_{j_0 + 1} \oplus W_{j_0 + 2} \oplus \cdots \oplus W_j. \end{aligned}$$
(11)

For the sake of notational convenience, we set \(W_{j_0} \mathrel {\mathrel {\mathop :}=}V_{j_0}\) and denote \(\psi _{j_0, k} \mathrel {\mathrel {\mathop :}=}\phi _{j_0, k}\) for all \(k \in \nabla _{j_0}\mathrel {\mathrel {\mathop :}=}\varDelta _{j_0}\).
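In the piecewise constant case, the complement spaces \(W_j\) are spanned by the classical Haar wavelets. A minimal Python sketch of the rescaling (10), assuming the Haar mother wavelet (all names ours), reads:

```python
import numpy as np

def haar_mother(x):
    """Haar mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    return np.where((x >= 0.0) & (x < 0.5), 1.0, 0.0) \
         - np.where((x >= 0.5) & (x < 1.0), 1.0, 0.0)

def psi_jk(j, k, x):
    """Wavelet psi_{j,k}(x) = 2^{(j-1)/2} psi(2^{j-1} x - k), cf. (10)."""
    return 2.0 ** ((j - 1) / 2.0) * haar_mother(2.0 ** (j - 1) * x - k)

# Each Haar wavelet has a vanishing mean on its dyadic support interval,
# so it is orthogonal to the piecewise constants of the previous level.
x = np.linspace(0.0, 1.0, 2 ** 16, endpoint=False)
h = x[1] - x[0]
print(np.sum(psi_jk(3, 1, x)) * h)  # ~0
```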

By tensorizing (11) with itself, for \({\textbf {J}} =(J,J)\), we arrive at

$$\begin{aligned} V_{{\textbf {J}} } = V_J \otimes V_J = \bigoplus _{|{\textbf {j}} |_\infty \le J} W_{{\textbf {j}} }, \quad W_{{\textbf {j}} } \mathrel {\mathrel {\mathop :}=}W_{j_1} \otimes W_{j_2}. \end{aligned}$$

Note that this space coincides with the space \(V_{{\textbf {J}} }\) from (8); in particular, it is the full trial space which is used to discretize the energy space.

In view of (10), we can write

$$\begin{aligned} V_{{\textbf {J}} } = V_J \otimes V_J = {{\,\textrm{span}\,}}(\varPsi _J \otimes \varPsi _J) = {{\,\textrm{span}\,}}\left\{ \psi _{{\textbf {j}} , {\textbf {k}} } :|{\textbf {j}} |_\infty \le J,\ {\textbf {k}} \in \nabla _{\textbf {j}} \right\} \end{aligned}$$

with the tensor product wavelets \(\psi _{{\textbf {j}} , {\textbf {k}} } =\psi _{j_1,k_1}\otimes \psi _{j_2,k_2}\) and \(\nabla _{{\textbf {j}} } \mathrel {\mathrel {\mathop :}=}\nabla _{j_1} \times \nabla _{j_2}\).
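Counting the tensor product wavelets level by level confirms that the full tensor product space has the expected dimension. The following Python sketch (our own naming, for the piecewise constant case with \(j_0 = 0\)) verifies \(\dim V_{{\textbf {J}} } = 4^J\):

```python
def nabla_size(j, j0=0):
    """|nabla_j|: the scaling functions on the coarsest level j_0, and
    2^{j-1} wavelets on every level j > j_0 (piecewise constant case)."""
    return 2 ** j0 if j == j0 else 2 ** (j - 1)

def dim_full_tensor_space(J, j0=0):
    """dim V_J = sum over all level pairs j with |j|_inf <= J."""
    return sum(nabla_size(j1, j0) * nabla_size(j2, j0)
               for j1 in range(j0, J + 1)
               for j2 in range(j0, J + 1))

for J in range(6):
    assert dim_full_tensor_space(J) == 4 ** J  # = dim(V_J x V_J) = (2^J)^2
```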

3.3 Notation

Let us define some notation which we will use throughout the remainder of this article. First, for the sake of simplicity, by \({\textbf {j}} _0\) we always mean the multiindex \({\textbf {j}} _0 \mathrel {\mathrel {\mathop :}=}(j_0,j_0)\), and by a capital \({\textbf {J}} \), we always mean a multiindex \({\textbf {J}} = (J,J)\). Of course, \(J\) is variable, but \(j_0\) depends only on the chosen wavelet basis.

Next, we define the support of a wavelet as

$$\begin{aligned} \varOmega _{j, k} \mathrel {\mathrel {\mathop :}=}{{\,\textrm{supp}\,}}\psi _{j, k} \end{aligned}$$

and, accordingly,

$$\begin{aligned} \varOmega _{{\textbf {j}} , {\textbf {k}} } \mathrel {\mathrel {\mathop :}=}{{\,\textrm{supp}\,}}\psi _{{\textbf {j}} , {\textbf {k}} }. \end{aligned}$$

Similarly, we define the singular support, i.e., the points at which a wavelet is not smooth, as

$$\begin{aligned} \varOmega _{j, k}^\sigma \mathrel {\mathrel {\mathop :}=}{{\,\textrm{singsupp}\,}}\psi _{j, k}, \quad \varOmega _{{\textbf {j}} , {\textbf {k}} }^\sigma \mathrel {\mathrel {\mathop :}=}{{\,\textrm{singsupp}\,}}\psi _{{\textbf {j}} , {\textbf {k}} }. \end{aligned}$$

For a pair of wavelets \(\psi _{{\textbf {j}} , {\textbf {k}} }\) and \(\psi _{{\textbf {j}} ', {\textbf {k}} '}\), we let

$$\begin{aligned} \delta _{x_i} \mathrel {\mathrel {\mathop :}=}{{\,\textrm{dist}\,}}\big ( \varOmega _{j_i, k_i},\varOmega _{j_i', k_i'} \big ), \quad \delta _ {{\text {tot}}} \mathrel {\mathrel {\mathop :}=}{{\,\textrm{dist}\,}}\big ( \varOmega _{{\textbf {j}} , {\textbf {k}} },\varOmega _{{\textbf {j}} ', {\textbf {k}} '} \big ). \end{aligned}$$

Moreover, we also define

$$\begin{aligned} \sigma _{x_i} \mathrel {\mathrel {\mathop :}=}{\left\{ \begin{array}{ll} {{\,\textrm{dist}\,}}\big ( \varOmega _{j_i, k_i},\varOmega _{j_i', k_i'}^\sigma \big ), &{}j_i \ge j_i', \\ {{\,\textrm{dist}\,}}\big ( \varOmega _{j_i, k_i}^\sigma ,\varOmega _{j_i', k_i'} \big ), &{}\text {otherwise}. \end{array}\right. } \end{aligned}$$

Finally, given a wavelet \(\psi _{j, k}\), we say that \(\psi _{j', k'}\) is located in the far-field of \(\psi _{j, k}\) if there holds \({{\,\textrm{dist}\,}}( \varOmega _{j, k}, \varOmega _{j', k'} ) \gtrsim 2^{- \min \{j, j' \}}\), otherwise, we say that \(\psi _{j', k'}\) is located in the near-field of \(\psi _{j, k}\). For the tensorized wavelets, this threshold is the maximal support length, which amounts to \(2^{- \min \{j_1, j_2, j_1', j_2'\}}\).
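For dyadic interval supports, these distances are straightforward to compute. The following Python sketch (names ours, using the supports of the piecewise constant wavelets from (10)) evaluates \(\delta _{x_i}\), \(\delta _{{\text {tot}}}\), and the far-field predicate with the implied constant set to one:

```python
import math

def interval_dist(a, b):
    """Distance between two closed intervals a = (a0, a1) and b = (b0, b1)."""
    return max(0.0, a[0] - b[1], b[0] - a[1])

def support(j, k):
    """Support of the piecewise constant wavelet psi_{j,k} from (10)."""
    h = 2.0 ** -(j - 1)
    return (k * h, (k + 1) * h)

def distances(j, k, jp, kp):
    """Componentwise distances delta_{x_1}, delta_{x_2} and delta_tot."""
    d = [interval_dist(support(j[i], k[i]), support(jp[i], kp[i]))
         for i in range(2)]
    return d[0], d[1], math.hypot(d[0], d[1])

def is_far_field(j, jp, d_tot):
    """Far field: dist >= maximal support length 2^{-min{j1,j2,j1',j2'}}."""
    return d_tot >= 2.0 ** -min(min(j), min(jp))

d1, d2, d_tot = distances((3, 4), (0, 0), (3, 4), (3, 7))
print(d1, d2, d_tot, is_far_field((3, 4), (3, 4), d_tot))
```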

3.4 Some important wavelet properties

Wavelet functions have some very useful properties; see e.g. [8, 13, 30] for a comprehensive exposition. In this section, we restrict ourselves to the most important ones, which are needed in the remainder of this article.

First, as already stated in [17], the set \(\varPsi \mathrel {\mathrel {\mathop :}=}\{ \psi _{{\textbf {j}} , {\textbf {k}} } :{\textbf {j}} \ge {\textbf {j}} _0, \ {\textbf {k}} \in \nabla _{{\textbf {j}} } \}\) forms a Riesz basis of \(L^2(\square )\), meaning that

$$\begin{aligned} \left\| \sum _{{\textbf {j}} , {\textbf {k}} } c_{{\textbf {j}} , {\textbf {k}} } \psi _{{\textbf {j}} , {\textbf {k}} } \right\| _{L^2(\square )}^2 \sim \sum _{{\textbf {j}} , {\textbf {k}} } \big | c_{{\textbf {j}} , {\textbf {k}} } \big |^2, \end{aligned}$$

provided that the one-dimensional wavelet basis forms a Riesz basis of \(L^2([0,1])\). Moreover, it is well-known (see e.g. [8, 13]) that a one-dimensional Riesz basis of \(L^2([0, 1])\) possesses a unique, dual basis which is biorthogonal to the primal one. By tensorizing this dual basis with itself, we get the dual basis of \(\varPsi \) in \(L^2(\square )\), which we denote by \(\widetilde{\varPsi }\). This dual basis then provides the approximation order \(\widetilde{d}\) and the regularity \(\widetilde{\gamma }>0\).

The primal and the dual basis satisfy the following norm equivalences:

Theorem 1

Let us denote \(H^s_{\circ } \mathrel {\mathrel {\mathop :}=}H^s\) for \(s \ge 0\), and \(H^s_{\circ } = \widetilde{H}^s\) for \(s < 0\), which is the dual space of \(H^{-s}\). Then, there holds

$$\begin{aligned}&\left\| u \right\| _{H_\circ ^s(\square )}^2 \sim \sum _{{\textbf {j}} , {\textbf {k}} } 2^{2s |{\textbf {j}} |_\infty } \big | \langle \widetilde{\psi }_{{\textbf {j}} , {\textbf {k}} }, \, u \rangle _\square \big |^2, \quad -\widetilde{\gamma }< s < \gamma , \quad u \in H^s_{\circ }(\square ), \end{aligned}$$
(12)
$$\begin{aligned}&\left\| u \right\| _{H_\circ ^t(\square )}^2 \sim \sum _{{\textbf {j}} , {\textbf {k}} } 2^{2t |{\textbf {j}} |_\infty } \big | \langle u, \, \psi _{{\textbf {j}} , {\textbf {k}} } \rangle _\square \big |^2, \quad -\gamma< t < \widetilde{\gamma }, \quad u \in H^t_{\circ }(\square ). \end{aligned}$$
(13)

Proof

In accordance with e.g. [9, 30], the univariate wavelet basis satisfies

$$\begin{aligned}&\left\| u \right\| _{H_\circ ^s([0, 1])}^2 \sim \sum _{j, k} 2^{2sj} \big | \langle \widetilde{\psi }_{j, k}, u \rangle _{[0, 1]} \big |^2, \quad -\widetilde{\gamma }< s< \gamma , \quad u \in H^s_{\circ }([0, 1]),\\&\left\| u \right\| _{H_\circ ^t([0, 1])}^2 \sim \sum _{j, k} 2^{2tj} \big | \langle u, \psi _{j, k} \rangle _{[0, 1]} \big |^2, \quad -\gamma< t < \widetilde{\gamma }, \quad u \in H^t_{\circ }([0, 1]). \end{aligned}$$

Thus, when \(0 \le s < \gamma \), we can use the identity \(H^s(\square ) = \big (H^s([0,1])\otimes L^2([0,1])\big )\cap \big (L^2([0,1])\otimes H^s([0,1])\big )\) and standard tensor product arguments to obtain

$$\begin{aligned} \Vert u\Vert _{H^s([0,1])\otimes L^2([0,1])}^2 \sim \sum _{{\textbf {j}} ,{\textbf {k}} } 2^{2sj_1} \big |\langle \widetilde{\psi }_{{\textbf {j}} ,{\textbf {k}} }, u\rangle _\square \big |^2 \end{aligned}$$

and likewise

$$\begin{aligned} \Vert u\Vert _{L^2([0,1])\otimes H^s([0,1])}^2 \sim \sum _{{\textbf {j}} ,{\textbf {k}} } 2^{2sj_2} \big |\langle \widetilde{\psi }_{{\textbf {j}} ,{\textbf {k}} }, u\rangle _\square \big |^2. \end{aligned}$$

Therefore, as \(2^{2sj_1}+2^{2sj_2} \sim 2^{2s|{\textbf {j}} |_\infty }\), we arrive at

$$\begin{aligned} \Vert u\Vert _{H^s(\square )}^2&\sim \Vert u\Vert _{H^s([0,1])\otimes L^2([0,1])}^2 + \Vert u\Vert _{L^2([0,1])\otimes H^s([0,1])}^2\\ {}&\sim \sum _{{\textbf {j}} ,{\textbf {k}} } 2^{2s|{\textbf {j}} |_\infty } \big |\langle \widetilde{\psi }_{{\textbf {j}} ,{\textbf {k}} },u\rangle _\square \big |^2. \end{aligned}$$

The same arguments also allow us to show (13) for \(0 \le t< \widetilde{\gamma }\).

In case of \(-\widetilde{\gamma }< s < 0\), we use a duality argument. We find that

$$\begin{aligned} \Vert u\Vert _{\widetilde{H}^s(\square )}&= \sup _{\Vert v\Vert _{H^{-s}(\square )} = 1} \langle v, u \rangle _\square \\&\le \sup _{\Vert v\Vert _{H^{-s}(\square )} = 1} \bigg ( \sum _{{\textbf {j}} , {\textbf {k}} } 2^{2s|{\textbf {j}} |_\infty } \big | \langle \widetilde{\psi }_{{\textbf {j}} , {\textbf {k}} }, u \rangle _\square \big |^2 \bigg )^{\frac{1}{2}}\\&\quad \cdot \bigg ( \sum _{{\textbf {j}} , {\textbf {k}} } 2^{-2s|{\textbf {j}} |_\infty } \big | \langle v, \psi _{{\textbf {j}} , {\textbf {k}} } \rangle _\square \big |^2 \bigg )^{\frac{1}{2}} \\&\sim \bigg ( \sum _{{\textbf {j}} , {\textbf {k}} } 2^{2s|{\textbf {j}} |_\infty } \big | \langle \widetilde{\psi }_{{\textbf {j}} , {\textbf {k}} }, u \rangle _\square \big |^2 \bigg )^{\frac{1}{2}}, \end{aligned}$$

where we have used (13) for \(0 \le -s < \widetilde{\gamma }\). This shows the lower bound in (12). For the upper bound, we remark that

$$\begin{aligned} \bigg ( \sum _{{\textbf {j}} , {\textbf {k}} } 2^{2s|{\textbf {j}} |_\infty } \big | \langle \widetilde{\psi }_{{\textbf {j}} , {\textbf {k}} }, u \rangle _\square \big |^2 \bigg )^{\frac{1}{2}}&= \sup _{\Vert {\textbf {d}} \Vert _{\ell ^2} = 1} \bigg | \sum _{{\textbf {j}} , {\textbf {k}} } 2^{s|{\textbf {j}} |_\infty } \langle \widetilde{\psi }_{{\textbf {j}} , {\textbf {k}} }, u \rangle _\square d_{{\textbf {j}} , {\textbf {k}} } \bigg | \\&= \sup _{\Vert {\textbf {d}} \Vert _{\ell ^2} = 1} \bigg | \bigg \langle \sum _{{\textbf {j}} , {\textbf {k}} } 2^{s|{\textbf {j}} |_\infty } d_{{\textbf {j}} , {\textbf {k}} } \widetilde{\psi }_{{\textbf {j}} , {\textbf {k}} }, \ u \bigg \rangle _\square \bigg | \\&\le \sup _{\Vert {\textbf {d}} \Vert _{\ell ^2} = 1} \Vert u\Vert _{\widetilde{H}^s(\square )} \bigg \Vert \sum _{{\textbf {j}} , {\textbf {k}} } 2^{s|{\textbf {j}} |_\infty } d_{{\textbf {j}} , {\textbf {k}} } \widetilde{\psi }_{{\textbf {j}} , {\textbf {k}} } \bigg \Vert _{H^{-s}(\square )}, \end{aligned}$$

and

$$\begin{aligned} \bigg \Vert \sum _{{\textbf {j}} , {\textbf {k}} } 2^{s|{\textbf {j}} |_\infty } d_{{\textbf {j}} , {\textbf {k}} } \widetilde{\psi }_{{\textbf {j}} , {\textbf {k}} } \bigg \Vert _{H^{-s}(\square )}^2 \sim \sum _{{\textbf {j}} , {\textbf {k}} } 2^{-2s|{\textbf {j}} |_\infty } \big | 2^{s|{\textbf {j}} |_\infty } d_{{\textbf {j}} , {\textbf {k}} } \big |^2 = \Vert {\textbf {d}} \Vert _{\ell ^2}^2. \end{aligned}$$

This proves (12). With the same arguments, we can also show (13) for \(- \gamma< t < 0\).\(\square \)

Remark 1

For a smaller range of parameters, this has already been shown in [17], while similar norm equivalences have been derived in [5, 29] for wavelets which are at least continuous. Moreover, we should mention that the upper bounds in (12) and (13) can be extended to \(-\widetilde{d}< s < \gamma \), and \(- d< t < \widetilde{\gamma }\), respectively, whereas the lower bounds can be extended to \(- \widetilde{\gamma }< s < d\), and \(-\gamma< t < \widetilde{d}\), respectively, see e.g. [30] for the details.

For a multiindex \({\textbf {j}} \), let us define the (non-orthogonal) projections

$$\begin{aligned} Q_{{\textbf {j}} } u&\mathrel {\mathrel {\mathop :}=}\sum _{{\textbf {k}} \in \nabla _{\textbf {j}} } \big \langle \widetilde{\psi }_{{\textbf {j}} , {\textbf {k}} }, u \big \rangle _\square \psi _{{\textbf {j}} , {\textbf {k}} }, \\ \quad Q_{J} u&\mathrel {\mathrel {\mathop :}=}\sum _{|{\textbf {j}} |_\infty \le J} Q_{{\textbf {j}} } u = \sum _{|{\textbf {j}} |_\infty \le J} \sum _{{\textbf {k}} \in \nabla _{{\textbf {j}} }} \big \langle \widetilde{\psi }_{{\textbf {j}} , {\textbf {k}} }, u \big \rangle _\square \psi _{{\textbf {j}} , {\textbf {k}} }, \end{aligned}$$

provided that \(u \in H^s_ {\circ }(\square )\) for some \(s > -\widetilde{\gamma }\). By using a tensor product argument, the duality and the biorthogonality, the univariate approximation property, which is derived e.g. in [30], generalizes to

$$\begin{aligned} \Vert u - Q_J u\Vert _{H^s_{\circ }(\square )} \lesssim 2^{J(s-t)} \Vert u\Vert _{H^t_{\circ }(\square )}, \ \ s \le t, \ \ - \widetilde{d}\le s< \gamma , \ \ - \widetilde{\gamma } < t \le d. \end{aligned}$$
(14)

Moreover, there holds Bernstein’s inequality

$$\begin{aligned} \Vert Q_{{\textbf {j}} } u\Vert _{H^s_{\circ }(\square )} \lesssim 2^{|{\textbf {j}} |_\infty (s-t)} \Vert Q_{{\textbf {j}} } u\Vert _{H^t_{\circ }(\square )}, \quad t \le s < \gamma , \end{aligned}$$

and, regarding \(Q_J u \in V_{\textbf {J}} \), also

$$\begin{aligned} \Vert Q_J u\Vert _{H^s_{\circ }(\square )} \lesssim 2^{J(s-t)} \Vert Q_{J}u\Vert _{H^t_{\circ }(\square )}, \quad t \le s < \gamma . \end{aligned}$$
(15)

Perhaps the most important property of wavelets for the present article is that they have vanishing moments, also called the cancellation property, which is induced by the approximation order of the dual basis. Namely, in accordance with [8], there holds

$$\begin{aligned} \big | \big \langle \psi _{j, k}, u \big \rangle _{[0,1]} \big | \lesssim 2^{- (\widetilde{d} + \frac{1}{2}) j} |u|_{W^{\widetilde{d}, \infty }(\varOmega _{j, k})}, \quad u \in W^{\widetilde{d}, \infty }(\varOmega _{j, k}). \end{aligned}$$

By explicitly exploiting the tensor product structure of the wavelet \(\psi _{{\textbf {j}} , {\textbf {k}} }\), we can immediately deduce that

$$\begin{aligned} \big | \big \langle \psi _{{\textbf {j}} , {\textbf {k}} }, u \big \rangle _\square \big | \lesssim 2^{-(\widetilde{d} + \frac{1}{2}) |{\textbf {j}} |_1} |u|_{W^{2 \widetilde{d}, \infty }(\varOmega _{{\textbf {j}} , {\textbf {k}} })}, \quad u \in W^{2 \widetilde{d}, \infty }(\varOmega _{{\textbf {j}} , {\textbf {k}} }). \end{aligned}$$
(16)

Herein, for \(k\in \mathbb {N}_0\), the seminorm \(|u|_{W^{k,\infty }(\varXi )} = \max _{|\varvec{\upalpha }|= k} \Vert \partial ^{\varvec{\upalpha }}u\Vert _{L^{\infty }(\varXi )}\) denotes the \(W^{k,\infty }\)-seminorm on \(\varXi \).
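The decay predicted by the cancellation property is easy to observe numerically. A small Python sketch (ours, using the Haar wavelet, for which \(\widetilde{d} = 1\)) exhibits the rate \(2^{-(\widetilde{d} + \frac{1}{2})j} = 2^{-\frac{3}{2}j}\) for a smooth function:

```python
import numpy as np

def psi_jk(j, k, x):
    """Haar wavelet psi_{j,k} = 2^{(j-1)/2} psi(2^{j-1} x - k), with d~ = 1."""
    t = 2.0 ** (j - 1) * x - k
    return 2.0 ** ((j - 1) / 2.0) * (np.where((t >= 0) & (t < 0.5), 1.0, 0.0)
                                     - np.where((t >= 0.5) & (t < 1.0), 1.0, 0.0))

x = np.linspace(0.0, 1.0, 2 ** 20, endpoint=False)
h = x[1] - x[0]
u = np.sin(2.0 * np.pi * x)  # a smooth test function

# The coefficients <psi_{j,0}, u> decay like 2^{-3j/2}; the rescaled values
# in the last column approach a constant.
for j in range(2, 10):
    c = np.sum(psi_jk(j, 0, x) * u) * h
    print(j, abs(c), abs(c) * 2.0 ** (1.5 * j))
```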

Remark 2

Due to the tensor product structure of the wavelets, we must tensorize scaling functions on the coarsest level with wavelets on a finer level. This means that we cannot use \(\widetilde{d}\) vanishing moments in both directions. However, if \(\mathcal {I}\subset \{j_1, j_2\}\) denotes the subset of indices corresponding to univariate wavelets with \(\widetilde{d}\) vanishing moments, we have the estimate

$$\begin{aligned} \big | \big \langle \psi _{{\textbf {j}} , {\textbf {k}} }, u \big \rangle _\square \big | \lesssim 2^{- \frac{1}{2} |{\textbf {j}} |_1 - \widetilde{d} \sum _{j \in \mathcal {I}} j} |u|_{W^{|\mathcal {I}|\widetilde{d}, \infty }(\varOmega _{{\textbf {j}} , {\textbf {k}} })}, \quad u \in W^{|\mathcal {I}| \widetilde{d}, \infty }(\varOmega _{{\textbf {j}} , {\textbf {k}} }). \end{aligned}$$

4 Matrix entry estimates

In order to develop a compression scheme for the operator \(\mathcal {A}\) with respect to the wavelet basis \(\varPsi \), we need to estimate the entries of the Galerkin matrix. For now, let us consider the situation \(r = 1\), where the only patch present is the unit square \(\square \), in which case we can assume that \(\varvec{\gamma }_1 = {\text {id}}\). The discussion of the situation on a Lipschitz manifold is postponed to Sect. 7.

4.1 Far-field estimates

For the remainder of Sect. 4.1, we assume that \(\delta _ {{\text {tot}}} > 0\), which means that the first compression [9, 20, 30] applies. There exist estimates for the entries by Reich [24, 25], which make use of the vanishing moments of the one-dimensional wavelets with the smallest corresponding support lengths. This is especially useful when considering thin, long wavelets whose supports are a small distance apart. We quote the following result.

Theorem 2

([24, Theorem 2.1.9]) For \({\textbf {j}} , {\textbf {j}} '\ge {\textbf {j}} _0\), there holds

$$\begin{aligned} \big | \big \langle \mathcal {A}\psi _{{\textbf {j}} ', {\textbf {k}} '}, \, \psi _{{\textbf {j}} , {\textbf {k}} } \big \rangle _\square \big | \lesssim 2^{-\frac{1}{2}(|{\textbf {j}} |_1 + |{\textbf {j}} '|_1)} 2^{-\widetilde{d}(j^{(1)} + j^{(2)})} {{\,\textrm{dist}\,}}\big ( \varOmega _{{\textbf {j}} , {\textbf {k}} },\varOmega _{{\textbf {j}} ', {\textbf {k}} '} \big )^{-(2 + 2q + 2 \widetilde{d})}. \end{aligned}$$

Here, \(\big \{j^{(1)}, j^{(2)}\big \}\subset \{ j_1, j_1', j_2, j_2'\} \cap [j_0+1, \, \infty )\) can be any two distinct indices; the best behaviour is obtained by choosing the two largest indices.

Let us next derive an estimate which makes use of the vanishing moments in every one-dimensional wavelet, which is beneficial if the supports of the wavelets \(\psi _{{\textbf {j}} , {\textbf {k}} }\) and \(\psi _{{\textbf {j}} ', {\textbf {k}} '}\) are small.

Theorem 3

For \({\textbf {j}} , {\textbf {j}} ' \ge {\textbf {j}} _0+\textbf{1}\), there holds

$$\begin{aligned} \big | \big \langle \mathcal {A}\psi _{{\textbf {j}} ', {\textbf {k}} '}, \, \psi _{{\textbf {j}} , {\textbf {k}} }\big \rangle _\square \big | \lesssim 2^{-(\widetilde{d} + \frac{1}{2}) (|{\textbf {j}} |_1 + |{\textbf {j}} '|_1)} \delta _{{\text {tot}}}^{-(2 + 2q + 4 \widetilde{d})}. \end{aligned}$$
(17)

Proof

By explicitly exploiting the tensor product structure of the wavelets, we can write

$$\begin{aligned} \big \langle \mathcal {A}\psi _{{\textbf {j}} ', {\textbf {k}} '}, \, \psi _{{\textbf {j}} , {\textbf {k}} } \big \rangle _\square \sim \int _0^1\!\int _0^1\!\int _\square \! K(\varvec{x}, \varvec{x}') \psi _{{\textbf {j}} , {\textbf {k}} }(\varvec{x}) \; {\text {d}}\varvec{x}\;\psi _{j_1', k_1'} (x_1') \; {\text {d}}x_1' \;\psi _{j_2', k_2'} (x_2') \; {\text {d}}x_2'. \end{aligned}$$

We can use the vanishing moments of \(\psi _{j_2', k_2'}\) to deduce that

$$\begin{aligned}&\big | \big \langle \mathcal {A}\psi _{{\textbf {j}} ', {\textbf {k}} '}, \, \psi _{{\textbf {j}} , {\textbf {k}} } \big \rangle _\square \big | \lesssim 2^{-(\widetilde{d} + \frac{1}{2})j_2'}\\&\quad \cdot \bigg \Vert \int _0^1 \int _\square \partial _{x_2'}^{\widetilde{d}} K \left( \varvec{x},\big [{\begin{matrix} x_1' \\ \cdot \end{matrix}}\big ] \right) \psi _{{\textbf {j}} , {\textbf {k}} }(\varvec{x}) \; {\text {d}}\varvec{x}\,\psi _{j_1', k_1'}(x_1') \; {\text {d}}x_1' \bigg \Vert _{L^\infty ( \varOmega _{j_2, k_2} )}. \end{aligned}$$

Note that we can differentiate under the integral because the kernel \(K\) is smooth and bounded on \(\varOmega _{{\textbf {j}} , {\textbf {k}} } \times \varOmega _{{\textbf {j}} ', {\textbf {k}} '}\). The vanishing moments of \(\psi _{j_1', k_1'}\) then allow us to proceed with the estimate to

$$\begin{aligned}&\big | \big \langle \mathcal {A}\psi _{{\textbf {j}} ', {\textbf {k}} '}, \, \psi _{{\textbf {j}} , {\textbf {k}} } \big \rangle _\square \big | \lesssim 2^{-(\widetilde{d} + \frac{1}{2})|{\textbf {j}} '|_1}\\&\quad \cdot \bigg \Vert \int _0^1 \int _0^1 \partial _{x_1'}^{\widetilde{d}} \partial _{x_2'}^{\widetilde{d}} K \left( \big [{\begin{matrix} x_1 \\ x_2 \end{matrix}}\big ],\cdot \right) \psi _{j_1, k_1}(x_1) \; {\text {d}}x_1\,\psi _{j_2, k_2}(x_2) \; {\text {d}}x_2 \bigg \Vert _{L^\infty (\varOmega _{{\textbf {j}} ', {\textbf {k}} '})}. \end{aligned}$$

By subsequently using the vanishing moments of \(\psi _{j_2, k_2}\), and \(\psi _{j_1, k_1}\) as well, we finally arrive at

$$\begin{aligned} \big | \big \langle \mathcal {A}\psi _{{\textbf {j}} ', {\textbf {k}} '}, \, \psi _{{\textbf {j}} , {\textbf {k}} } \big \rangle _\square \big |&\lesssim 2^{-(\widetilde{d} + \frac{1}{2}) (|{\textbf {j}} |_1 + |{\textbf {j}} '|_1)} \big \Vert \partial _{x_1}^{\widetilde{d}} \partial _{x_2}^{\widetilde{d}} \partial _{x_1'}^{\widetilde{d}} \partial _{x_2'}^{\widetilde{d}} K\big \Vert _{L^\infty ( \varOmega _{{\textbf {j}} , {\textbf {k}} } \times \varOmega _{{\textbf {j}} ', {\textbf {k}} '} )}. \end{aligned}$$

Recalling that the kernel \(K\) is asymptotically smooth of order \(2q\), compare (3), we deduce (17).\(\square \)

4.2 Near-field estimates

As we will see in Sect. 5, we may use the previous estimates only if a wavelet pair is in the far-field, meaning that the supports are sufficiently far away from each other. For the near-field, we need to derive different estimates. In this case, we explicitly exploit the tensor product structure of the wavelets again. We will use an approach which is similar to the one developed in [24].

To this end, we define the dimensionally reduced kernel

$$\begin{aligned} \begin{aligned}&K_1(x, x') \mathrel {\mathrel {\mathop :}=}\int _0^1 \int _0^1 K \left( \big [{\begin{matrix} x \\ y \end{matrix}}\big ], \big [{\begin{matrix} x' \\ y' \end{matrix}}\big ] \right) \psi _{j_2, k_2}(y) \psi _{j_2', k_2'}(y') \; {\text {d}}y \; {\text {d}}y',\\&\hspace{5cm} x, x' \in [0, 1], \end{aligned} \end{aligned}$$
(18)

and the operator \( \mathcal {A}_1\) as the integral operator with the kernel \(K_1\). By definition, the kernel \(K_1\) depends on the wavelets \(\psi _{j_2, k_2}\) and \(\psi _{j_2', k_2'}\), but the context will always clarify this relation.

Due to the tensor product structure, the dimensionally reduced operator obviously satisfies \(\langle \mathcal {A}\psi _{{\textbf {j}} ', {\textbf {k}} '}, \psi _{{\textbf {j}} , {\textbf {k}} } \rangle _{\square } = \langle \mathcal {A}_1 \psi _{j_1', k_1'}, \psi _{j_1, k_1} \rangle _{[0,1]}\). Moreover, due to [24, Lemma 2.1.5], the estimate

$$\begin{aligned} \left| \partial _x^\alpha \partial _{x'}^{\alpha '} K_1(x, x') \right| \lesssim 2^{- \frac{1}{2}(j_2 + j_2')} |x - x'|^{-(2 + 2q + \alpha + \alpha ')} \end{aligned}$$
(19)

holds for any \(x,x' \in [0, 1]\) with \(x\ne x'\). However, there are also vanishing moments of the wavelets hidden in the kernel \(K_1\), which can be used to improve the estimate (19) and hence also Theorem 2.1.7 in [24]:

Theorem 4

Assume that \(0 < \sigma _{x_1} \lesssim 2^{- \min \{j_1, j_1'\}}\), and \(\max \{j_1, j_1'\}, \max \{j_2, j_2'\} > j_0\). Then, we have

$$\begin{aligned}&\left. \begin{array}{c} \big | \big \langle \mathcal {A}_1 \psi _{j_1', k_1'}, \psi _{j_1, k_1} \big \rangle _{[0,1]} \big | \\ \big | \big \langle \mathcal {A}_1 \psi _{j_1, k_1}, \psi _{j_1', k_1'} \big \rangle _{[0,1]} \big | \end{array}\right\} \\&\quad \lesssim 2^{- \frac{1}{2} (j_2 + j_2')} 2^{-\widetilde{d} \max \{j_2, j_2' \}}2^{- \frac{1}{2} |j_1 - j_1'| } 2^{- \widetilde{d} \max \{j_1, j_1' \}} \sigma _{x_1}^{-(1 + 2q + 2 \widetilde{d})}. \end{aligned}$$

Proof

We first derive the appropriate estimate for the kernel \(K_1\) similar to (19). The rest of the proof may then be completed by following the arguments of [24].

If \(x \ne x'\), then the function under the integral in (18) is bounded, so we may directly differentiate under the integral. Moreover, let us without loss of generality assume that \(j_2 > j_2'\). Then, for \(x,x' \in [0, 1]\), with \(x\ne x'\),

$$\begin{aligned}&\big | \partial _x^\alpha \partial _{x'}^{\alpha '} K_1(x, x') \big | \\&\quad = \left| \int _{\varOmega _{j_2, k_2}} \psi _{j_2, k_2}(y) \int _{\varOmega _{j_2', k_2'}} \psi _{j_2', k_2'} (y') \, \partial _x^\alpha \partial _{x'}^{\alpha '} \, K \left( \big [{\begin{matrix} x \\ y \end{matrix}}\big ],\big [{\begin{matrix} x' \\ y' \end{matrix}}\big ] \right) \; {\text {d}}y' \; {\text {d}}y \right| \\&\quad \lesssim 2^{-(\widetilde{d} + \frac{1}{2})j_2} \sup _{y \in \varOmega _{j_2, k_2}} \left| \int _{\varOmega _{j_2', k_2'}} \, \psi _{j_2', k_2'} (y') \partial _y^{\widetilde{d}} \partial _x^\alpha \partial _{x'}^{\alpha '} \, K \left( \big [{\begin{matrix} x \\ y \end{matrix}}\big ],\big [{\begin{matrix} x' \\ y' \end{matrix}}\big ] \right) \; {\text {d}}y' \right| \\&\quad \lesssim 2^{-(\widetilde{d} + \frac{1}{2})j_2} \sup _{y \in \varOmega _{j_2, k_2}} \int _{\varOmega _{j_2', k_2'}} \underbrace{\big | \psi _{j_2', k_2'}(y') \big |}_{\lesssim 2^{j_2'/2}} \underbrace{\left\| \big [{\begin{matrix} x \\ y \end{matrix}}\big ] - \big [{\begin{matrix} x' \\ y' \end{matrix}}\big ] \right\| ^{-(2 + 2q + \widetilde{d} + \alpha + \alpha ')}}_{\le |x - x'|^{-(2 + 2q + \widetilde{d} + \alpha + \alpha ')}} \; {\text {d}}y' \\&\quad \lesssim 2^{- \frac{1}{2}(j_2 + j_2')} 2^{- \widetilde{d} j_2} |x - x'|^{-(2 + 2q + \widetilde{d} + \alpha + \alpha ')}, \end{aligned}$$

where for the last line, we have used the property \(|\varOmega _{j_2', k_2'}| \lesssim 2^{-j_2'}\).

As the remainder of the proof is based on the ideas of [9, Section 6], we just sketch it. Without loss of generality, we may assume that \(j_1' \le j_1\). In this case, \(\psi _{j_1, k_1}\) is located on a smooth part of the wavelet \(\psi _{j_1', k_1'}\), so we may decompose \(\psi _{j_1', k_1'} = \tilde{f} + \bar{f}\) such that \(\tilde{f}\) is a smooth function satisfying

$$\begin{aligned} \tilde{f} \big |_{\varOmega _{j_1, k_1}} = \psi _{j_1', k_1'}\big |_{\varOmega _{j_1, k_1}}. \end{aligned}$$

This can be realized by Calderón’s extension theorem [31] with \(\Vert \tilde{f}\Vert _{H^s([0, 1])}\lesssim 2^{s j_1'}\). Hence, we have

$$\begin{aligned} \big | \big \langle \mathcal {A}_1 \psi _{j_1', k_1'}, \psi _{j_1, k_1} \big \rangle _{[0,1]} \big | \le \big | \big \langle \mathcal {A}_1 \tilde{f}, \psi _{j_1, k_1} \big \rangle _{[0,1]} \big | + \big | \big \langle \mathcal {A}_1 \bar{f}, \psi _{j_1, k_1} \big \rangle _{[0,1]} \big |. \end{aligned}$$

The estimate for \(\bar{f}\) follows directly from [24, Lemma 2.1.1].

For the function \(\tilde{f}\), we define the operator

$$\begin{aligned} \mathcal {A}_1^\sharp \tilde{f} (x) \mathrel {\mathrel {\mathop :}=}\int _\mathbb {R}\chi (x) \chi (x') K_1(x, x') \tilde{f}(x') \; {\text {d}}x', \quad x \in \mathbb {R}, \end{aligned}$$

where \(\chi \in C^{\infty }_c(\mathbb {R})\) is a smooth cutoff function satisfying \(\chi |_{[0,1]} = 1\). Then, \(\mathcal {A}_1^\sharp \) is a pseudo-differential operator of the order \(m = 1 + 2q + \widetilde{d}\), cf. [22]. Remarking that in our case, we have \(c(j_2, j_2') = 2^{- \frac{1}{2}(j_2 + j_2')} 2^{-\widetilde{d}j_2}\), we may apply [24, Lemma 2.1.4] and the fact that \(\sigma _{x_1} \lesssim 2^{- j_1'}\) to conclude.\(\square \)

Remark 3

Up to now, we have just considered a reduction to the first coordinate direction. Nevertheless, as also done in [24], a reduction to the second coordinate direction is possible by using a similar definition for the operator \(\mathcal {A}_2\), and the same estimates hold with exchanged indices.

5 Matrix compression scheme

To keep the number of nonzero matrix entries small, we need to introduce a compression scheme, according to which many matrix entries do not have to be calculated, whilst the convergence with the full rate offered by the underlying Galerkin scheme is maintained. We distinguish between the first compression and the second compression, but in either case, we require a matrix block error which is controlled by a level-dependent parameter \(\sigma _{{\textbf {j}} , {\textbf {j}} '}\) given by

$$\begin{aligned} \sigma _{{\textbf {j}} , {\textbf {j}} '} \mathrel {\mathrel {\mathop :}=}2J(d'-q) - d'(|{\textbf {j}} |_\infty + |{\textbf {j}} '|_\infty ) + \kappa (2|{\textbf {j}} |_\infty + 2 |{\textbf {j}} '|_\infty - |{\textbf {j}} |_1 - |{\textbf {j}} '|_1). \end{aligned}$$
(20)

Here, \(d' > d\) and \(\kappa > 0\) are sufficiently small, but fixed real numbers, which are introduced in order to avoid logarithmic terms in the consistency estimates.
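Since (20) will be referred to repeatedly, we record a direct Python transcription (the function name and the example values of \(d'\), \(q\), and \(\kappa \) are ours, chosen purely for illustration):

```python
def sigma(j, jp, J, d_prime, q, kappa):
    """Level-dependent compression parameter sigma_{j,j'} from (20)."""
    j_inf, jp_inf = max(j), max(jp)  # |j|_inf and |j'|_inf
    j_1, jp_1 = sum(j), sum(jp)      # |j|_1 and |j'|_1
    return (2 * J * (d_prime - q) - d_prime * (j_inf + jp_inf)
            + kappa * (2 * j_inf + 2 * jp_inf - j_1 - jp_1))

# Illustrative values: d' = 1.25 > d = 1, q = -1/2 (single layer), kappa = 0.1.
print(sigma((3, 5), (2, 2), J=6, d_prime=1.25, q=-0.5, kappa=0.1))  # 12.45
```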

5.1 Far-field: first compression

In the case of the first compression, we consider a pair of wavelets \(\psi _{{\textbf {j}} , {\textbf {k}} }\) and \(\psi _{{\textbf {j}} ', {\textbf {k}} '}\) whose supports are located sufficiently far away from each other. As we will see, we need to estimate a sum of matrix coefficients by an integral, which requires that the minimal distance between the respective wavelets’ supports is large enough. In two dimensions, the minimal distance must be at least as wide as the largest edge of the supports involved, namely \(2^{- \min \{j_1, j_2, j_1', j_2'\}}\).

If this is not the case, however, we can make use of the tensor product structure and estimate the sum only in the coordinate direction of \(x_i\), which results in a minimal distance of \(2^{- \min \{j_i, j_i'\}}\) in this direction. This procedure basically follows [24], but is adapted here to the setting on the full tensor product space.

5.1.1 Compression in both coordinate directions

For a fixed maximal level \(J\), we define the compressed matrix for the first compression \({\textbf {A}} _J^{c_1,1}\) as

$$\begin{aligned} \big [{\textbf {A}} _J^{c_1,1} \big ]_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')} \mathrel {\mathrel {\mathop :}=}\left\{ \begin{aligned} 0,&\quad {\textbf {j}} , {\textbf {j}} ' \ge {\textbf {j}} _0+\textbf{1},\quad \delta _{{\text {tot}}}>\mathcal {B}_{{\textbf {j}} , {\textbf {j}} '}, \\ \big [{\textbf {A}} _J\big ]_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')},&\quad \text {otherwise}, \end{aligned}\right. \end{aligned}$$
(21)

for \({\textbf {k}} \in \nabla _{{\textbf {j}} }\), \({\textbf {k}} ' \in \nabla _{{\textbf {j}} '}\), and \(|{\textbf {j}} |_\infty , |{\textbf {j}} '|_\infty \le J\). Herein, the cutoff parameter \(\mathcal {B}_{{\textbf {j}} , {\textbf {j}} '}\) is given as

$$\begin{aligned} \mathcal {B}_{{\textbf {j}} , {\textbf {j}} '} \mathrel {\mathrel {\mathop :}=}a \max \Bigg \{ 2^{- \min \{ j_1, j_2, j_1', j_2' \}},\, 2^{\frac{\sigma _{{\textbf {j}} , {\textbf {j}} '} - \widetilde{d} (|{\textbf {j}} |_1 + |{\textbf {j}} '|_1)}{2q + 4 \widetilde{d}}} \Bigg \}, \end{aligned}$$
(22)

where \(a > 0\) is a fixed real number.
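In code, the first compression amounts to a simple predicate per matrix entry. The following Python sketch (names ours; `sig` denotes a precomputed value of \(\sigma _{{\textbf {j}} , {\textbf {j}} '}\) as in (20)) implements the cutoff (22) and the rule (21):

```python
def cutoff_B(j, jp, sig, a, d_tilde, q):
    """Cutoff parameter B_{j,j'} from (22); assumes 2q + 4*d_tilde > 0."""
    j_1, jp_1 = sum(j), sum(jp)  # |j|_1 and |j'|_1
    return a * max(2.0 ** -min(min(j), min(jp)),
                   2.0 ** ((sig - d_tilde * (j_1 + jp_1)) / (2 * q + 4 * d_tilde)))

def drop_first_compression(j, jp, j0, delta_tot, B):
    """Rule (21): the entry is set to zero iff j, j' >= j_0 + 1 componentwise
    and the supports are farther apart than the cutoff."""
    return all(ji > j0 for ji in j + jp) and delta_tot > B

# Example with the illustrative values sig = 12.45, d~ = 2, q = -1/2, a = 1:
B = cutoff_B((3, 5), (2, 2), sig=12.45, a=1.0, d_tilde=2.0, q=-0.5)
print(B, drop_first_compression((3, 5), (2, 2), 0, 0.5, B))
```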

Theorem 5

Let \({\textbf {R}} _J \mathrel {\mathrel {\mathop :}=}{\textbf {A}} _J - {\textbf {A}} _J^{c_1,1}\). Then, for the matrix block \({\textbf {R}} _{{\textbf {j}} , {\textbf {j}} '}\), we have the estimate

$$\begin{aligned} \Vert {\textbf {R}} _{{\textbf {j}} , {\textbf {j}} '}\Vert _2 \lesssim a^{-(2q + 4 \widetilde{d})} 2^{-\sigma _{{\textbf {j}} , {\textbf {j}} '}} \end{aligned}$$

with a generic constant that is independent of the refinement level \(J\).

Proof

We proceed similarly to [9] and recall that \({\textbf {j}} , {\textbf {j}} ' \ge {\textbf {j}} _0 + {\textbf {1}} \). First, we define the set

$$\begin{aligned} \nabla _{\textbf {j}} ^{\mathcal {B}} \mathrel {\mathrel {\mathop :}=}\{ {\textbf {k}} \in \nabla _{{\textbf {j}} } :\delta _{{\text {tot}}} > \mathcal {B}_{{\textbf {j}} , {\textbf {j}} '} \}. \end{aligned}$$

Then, we estimate the column sum of the block \({\textbf {R}} _{{\textbf {j}} , {\textbf {j}} '}\) by

$$\begin{aligned} \sum _{{\textbf {k}} \in \nabla _{{\textbf {j}} }} \big | r_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')} \big |&= \sum _{{\textbf {k}} \in \nabla _{{\textbf {j}} }^\mathcal {B}} \big | \big \langle \mathcal {A}\psi _{{\textbf {j}} ', {\textbf {k}} '}, \psi _{{\textbf {j}} , {\textbf {k}} } \big \rangle _\square \big |\\&\lesssim 2^{-(\widetilde{d} + \frac{1}{2}) (|{\textbf {j}} |_1 + |{\textbf {j}} '|_1)} \sum _{{\textbf {k}} \in \nabla _{{\textbf {j}} }^\mathcal {B}} \delta _{{\text {tot}}}^{-(2 + 2q + 4 \widetilde{d})}, \end{aligned}$$

where the last inequality is due to Theorem 3. By the compression rule (21), we have the relation \(\delta _{{\text {tot}}} \ge \mathcal {B}_{{\textbf {j}} , {\textbf {j}} '}\), and, since also \(\mathcal {B}_{{\textbf {j}} , {\textbf {j}} '} \gtrsim 2^{- \min \{j_1, j_2\}}\), we can estimate the sum by an integral, yielding

$$\begin{aligned} \sum _{{\textbf {k}} \in \nabla _{{\textbf {j}} }} \big | r_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')} \big |&\lesssim 2^{-(\widetilde{d} + \frac{1}{2})(|{\textbf {j}} |_1 + |{\textbf {j}} '|_1)} 2^{|{\textbf {j}} |_1} \int _{\Vert \varvec{x}\Vert \ge \mathcal {B}_{{\textbf {j}} , {\textbf {j}} '}} \Vert \varvec{x}\Vert ^{-(2 + 2q + 4 \widetilde{d})} \; {\text {d}}\varvec{x}\\&\quad \lesssim 2^{-(\widetilde{d} + \frac{1}{2})(|{\textbf {j}} |_1 + |{\textbf {j}} '|_1)} 2^{|{\textbf {j}} |_1} \mathcal {B}_{{\textbf {j}} , {\textbf {j}} '}^{-(2q + 4 \widetilde{d})}. \end{aligned}$$

As we also have \(\mathcal {B}_{{\textbf {j}} , {\textbf {j}} '} \ge a 2^{\frac{\sigma _{{\textbf {j}} , {\textbf {j}} '} - \widetilde{d}(|{\textbf {j}} |_1 + |{\textbf {j}} '|_1)}{2q + 4 \widetilde{d}}}\), we obtain

$$\begin{aligned} \sum _{{\textbf {k}} \in \nabla _{{\textbf {j}} }} \big | r_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')} \big | \lesssim a^{-(2q + 4 \widetilde{d})} 2^{\frac{1}{2}(|{\textbf {j}} |_1 - |{\textbf {j}} '|_1)} 2^{- \sigma _{{\textbf {j}} , {\textbf {j}} '}}. \end{aligned}$$
(23)

Using exactly the same arguments, we can likewise derive the estimate for the row sums

$$\begin{aligned} \sum _{{\textbf {k}} ' \in \nabla _{{\textbf {j}} '}} \big | r_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')} \big | \lesssim a^{-(2q + 4 \widetilde{d})} 2^{\frac{1}{2}(|{\textbf {j}} '|_1 - |{\textbf {j}} |_1)} 2^{- \sigma _{{\textbf {j}} , {\textbf {j}} '}}. \end{aligned}$$
(24)

Similar to [9], we now use the estimate for the operator norm of a matrix

$$\begin{aligned} \Vert {\textbf {R}} _{{\textbf {j}} , {\textbf {j}} '}\Vert _2^2 \le \Vert {\textbf {R}} _{{\textbf {j}} , {\textbf {j}} '}\Vert _1 \Vert {\textbf {R}} _{{\textbf {j}} , {\textbf {j}} '}\Vert _\infty = \Vert c {\textbf {R}} _{{\textbf {j}} , {\textbf {j}} '}\Vert _1 \Vert c^{-1} {\textbf {R}} _{{\textbf {j}} , {\textbf {j}} '}\Vert _\infty , \quad c > 0, \end{aligned}$$

which gives us, together with (23), (24), the desired result

$$\begin{aligned} \big \Vert {\textbf {R}} _{{\textbf {j}} , {\textbf {j}} '}\big \Vert _2^2&\le \left( \max _{{\textbf {k}} ' \in \nabla _{{\textbf {j}} '}} \sum _{{\textbf {k}} \in \nabla _{{\textbf {j}} }} 2^{\frac{|{\textbf {j}} '|_1 - |{\textbf {j}} |_1}{2}} \big | r_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')} \big | \right) \\&\quad \cdot \left( \max _{{\textbf {k}} \in \nabla _{{\textbf {j}} }} \sum _{{\textbf {k}} ' \in \nabla _{{\textbf {j}} '}} 2^{\frac{|{\textbf {j}} |_1 - |{\textbf {j}} '|_1}{2}} \big | r_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')} \big | \right) \\&\lesssim a^{-2(2q + 4 \widetilde{d})} 2^{- 2\sigma _{{\textbf {j}} , {\textbf {j}} '}}. \end{aligned}$$ \(\square \)
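The matrix norm inequality used in the proof is elementary and quickly checked numerically; a small Python sketch (ours) for a random matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
R = rng.standard_normal((40, 25))

norm_2 = np.linalg.norm(R, 2)          # spectral norm
norm_1 = np.linalg.norm(R, 1)          # maximal absolute column sum
norm_inf = np.linalg.norm(R, np.inf)   # maximal absolute row sum

# ||R||_2^2 <= ||R||_1 ||R||_inf; the scaling c > 0 drops out of the product.
assert norm_2 ** 2 <= norm_1 * norm_inf + 1e-12
print(norm_2 ** 2, norm_1 * norm_inf)
```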

Remark 4

Similar to [24], using Theorem 2 and the cutoff parameter

$$\begin{aligned} \widetilde{\mathcal {B}}_{{\textbf {j}} , {\textbf {j}} '} \mathrel {\mathrel {\mathop :}=}a\max \left\{ 2^{- \min \{j_1, j_2, j_1', j_2' \}}, \ 2^{\frac{\sigma _{{\textbf {j}} , {\textbf {j}} '} - \widetilde{d}(j^{(1)} + j^{(2)})}{2q + 2 \widetilde{d}}} \right\} , \end{aligned}$$

where \(\big \{j^{(1)},j^{(2)}\big \}\subset \{j_1,j_2,j_1',j_2'\}\), we have a compression scheme

$$\begin{aligned} \big [ {\textbf {A}} _J^{c_1, 2} \big ]_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')} \mathrel {\mathrel {\mathop :}=}{\left\{ \begin{array}{ll} 0, &{}j^{(1)}, j^{(2)}> j_0, \quad \delta _{\text {tot}} > \widetilde{\mathcal {B}}_{{\textbf {j}} , {\textbf {j}} '}, \\ \big [ {\textbf {A}} _J \big ]_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')}, &{}\text {otherwise}. \end{array}\right. } \end{aligned}$$
(25)

The requirement \(j^{(1)}, j^{(2)} > j_0\) is necessary to ensure the validity of Theorem 2.

By modifying the appropriate calculations, we get that the corresponding difference matrix \({\textbf {R}} _J = {\textbf {A}} _J - {\textbf {A}} _J^{c_1, 2}\) satisfies

$$\begin{aligned} \Vert {\textbf {R}} _{{\textbf {j}} , {\textbf {j}} '}\Vert _2 \lesssim a^{-(2q + 2 \widetilde{d})} 2^{- \sigma _{{\textbf {j}} , {\textbf {j}} '}}. \end{aligned}$$

This will be important when we consider the complexity since we can also compress matrix blocks where scaling functions are involved in at most two coordinate directions.

5.1.2 Compression in only one coordinate direction

As remarked earlier, in the proof of Theorem 5 we need at least \(\delta _{{\text {tot}}} \gtrsim 2^{- \min \{j_1, j_2, j_1', j_2'\}}\) to estimate the row and column sums of the matrix blocks by an integral. If this is not the case, we may estimate the sum by an integral in just one coordinate direction \(x_i\). This leads to restrictions on the distance in this coordinate direction only. This approach is beneficial especially when the term \(2^{-\min \{j_1, j_2, j_1', j_2'\}}\) in (22) is too large. As all the derivations can be found in [24], we just quote the results. We also remark that we have exchanged the \(|\cdot |_1\)-norms from [24] with \(|\cdot |_\infty \)-norms in (20), since we are working not on a sparse tensor product space but on the full tensor product space.

Let us define the parameters

$$\begin{aligned} \mathcal {D}_{{\textbf {j}} , {\textbf {j}} '}^{x_1}&\mathrel {\mathrel {\mathop :}=}a \max \left\{ 2^{- \min \{j_1, j_1'\}},2^{\frac{\sigma _{{\textbf {j}} , {\textbf {j}} '} - \widetilde{d}(j^{(1)} + j^{(2)}) - \min \{j_2, j_2'\}}{1 + 2q + 2 \widetilde{d}}} \right\} , \\ \mathcal {D}_{{\textbf {j}} , {\textbf {j}} '}^{x_2}&\mathrel {\mathrel {\mathop :}=}a \max \left\{ 2^{- \min \{j_2, j_2'\}},2^{\frac{\sigma _{{\textbf {j}} , {\textbf {j}} '} - \widetilde{d}(j^{(1)} + j^{(2)}) - \min \{j_1, j_1'\}}{1 + 2q + 2 \widetilde{d}}} \right\} . \end{aligned}$$

We can then define the compressed value

$$\begin{aligned} v_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')} \mathrel {\mathrel {\mathop :}=}\left\{ \begin{aligned} 0,&\quad \text {if } {\left\{ \begin{array}{ll} \delta _{x_1}> \mathcal {D}_{{\textbf {j}} , {\textbf {j}} '}^{x_1}, \\ \delta _{x_2} \le a2^{-\min \{j_2, j_2'\}}, \end{array}\right. } \\ 0,&\quad \text {if } {\left\{ \begin{array}{ll} \delta _{x_2} > \mathcal {D}_{{\textbf {j}} , {\textbf {j}} '}^{x_2}, \\ \delta _{x_1} \le a2^{-\min \{j_1, j_1'\}}, \end{array}\right. } \\ \big [{\textbf {A}} _J\big ]_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')},&\quad \text {otherwise}, \end{aligned} \right. \end{aligned}$$

and then the compressed matrix by the rule

$$\begin{aligned} \big [ {\textbf {A}} _J^{c_1, 3} \big ]_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')} \mathrel {\mathrel {\mathop :}=}{\left\{ \begin{array}{ll} v_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')}, &{}\text {if } j^{(1)}, j^{(2)} > j_0,\\ \big [ {\textbf {A}} _J \big ]_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')}, &{}\text {otherwise}. \end{array}\right. } \end{aligned}$$
(26)

The latter definition just restricts the compression to the matrix blocks \({\textbf {A}} _{{\textbf {j}} , {\textbf {j}} '}\) for which \(j^{(1)}, j^{(2)} > j_0\), meaning that we can use Theorem 2 to estimate the corresponding matrix entries.

With these definitions, the proof of Theorem 2.3.1 in [24] immediately yields the following result:

Theorem 6

Let \({\textbf {R}} _J \mathrel {\mathrel {\mathop :}=}{\textbf {A}} _J - {\textbf {A}} _J^{c_1,3}\). Then, the compressed matrix blocks satisfy the estimate

$$\begin{aligned} \big \Vert {\textbf {R}} _{{\textbf {j}} , {\textbf {j}} '}\big \Vert _2 \lesssim a^{-(1 + 2q + 2 \widetilde{d})} 2^{- \sigma _{{\textbf {j}} , {\textbf {j}} '}} \end{aligned}$$

with a generic constant that is independent of the refinement level \(J\).

By combining (21), (25), and (26), we can define the first compression of the matrix by

$$\begin{aligned} \begin{aligned}&\big [ {\textbf {A}} _J^{c_1} \big ]_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')} \\&\quad \mathrel {\mathrel {\mathop :}=}{\left\{ \begin{array}{ll} 0, &{}[{\textbf {A}} _J^{c_1, \ell }]_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')} = 0 \text { for some } \ell \in \{1, 2, 3\}, \\ {[}{\textbf {A}} _J]_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')}, &{}\text {otherwise.} \end{array}\right. } \end{aligned} \end{aligned}$$
(27)

This compression affects the far-field of the system matrix in wavelet coordinates.
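To make the combined rule (27) concrete, the following minimal sketch expresses it as a predicate on a single wavelet pair. All names are illustrative rather than the paper's notation, and the cutoffs \(\mathcal {B}_{{\textbf {j}} , {\textbf {j}} '}\), \(\widetilde{\mathcal {B}}_{{\textbf {j}} , {\textbf {j}} '}\), and \(\mathcal {D}_{{\textbf {j}} , {\textbf {j}} '}^{x_i}\) are assumed to be precomputed from their definitions above.

from dataclasses import dataclass

@dataclass
class PairData:
    # distances and cutoffs of one wavelet pair; illustrative names only
    delta_tot: float          # total distance of the supports
    delta_x1: float           # distance of the supports, x1-direction
    delta_x2: float           # distance of the supports, x2-direction
    B: float                  # cutoff of scheme (21)
    B_tilde: float            # cutoff of scheme (25)
    D_x1: float               # cutoff of scheme (26), x1-direction
    D_x2: float               # cutoff of scheme (26), x2-direction
    a_min_j1: float           # a * 2^{-min{j1, j1'}}
    a_min_j2: float           # a * 2^{-min{j2, j2'}}
    admissible: bool          # j(1), j(2) > j0, so Theorem 2 applies

def survives_first_compression(p: PairData) -> bool:
    """True iff the entry is kept by the combined first compression (27)."""
    if p.delta_tot > p.B:                                      # scheme (21)
        return False
    if p.admissible:
        if p.delta_tot > p.B_tilde:                            # scheme (25)
            return False
        if p.delta_x1 > p.D_x1 and p.delta_x2 <= p.a_min_j2:   # scheme (26)
            return False
        if p.delta_x2 > p.D_x2 and p.delta_x1 <= p.a_min_j1:   # scheme (26)
            return False
    return True

An assembly routine would evaluate this predicate entry by entry and skip the quadrature whenever it returns False.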

5.2 Near-field: second compression

Up to now, we have considered wavelets with disjoint and distant supports. As we will see, we can also discard many entries if the supports of the wavelet pairs are close or even overlap, provided that the distance between the support of the smaller wavelet and the singular support of the bigger wavelet is sufficiently large.

We will only use one direction for the second compression as done by Reich in [24, 25], but with improved parameters. We define

$$\begin{aligned} \mathcal {E}_{{\textbf {j}} , {\textbf {j}} '}&\mathrel {\mathrel {\mathop :}=}a 2^{\frac{\sigma _{{\textbf {j}} , {\textbf {j}} '} - \widetilde{d} ( \max \{j_1, j_1' \} + \max \{j_2, j_2'\}) - \min \{j_1, j_1', j_2, j_2' \}}{1 + 2q + 2 \widetilde{d}}}, \\ \mathcal {F}_{{\textbf {j}} , {\textbf {j}} '}^{x_1}&\mathrel {\mathrel {\mathop :}=}a 2^{\frac{\sigma _{{\textbf {j}} , {\textbf {j}} '} - \widetilde{d} (\max \{j_1, j_1' \} + \max \{j_2, j_2' \}) - \min \{j_2, j_2'\}}{1 + 2q + 2 \widetilde{d}}}, \\ \mathcal {F}_{{\textbf {j}} , {\textbf {j}} '}^{x_2}&\mathrel {\mathrel {\mathop :}=}a 2^{\frac{\sigma _{{\textbf {j}} , {\textbf {j}} '} - \widetilde{d} (\max \{j_1, j_1' \} + \max \{j_2, j_2' \}) - \min \{j_1, j_1'\}}{1 + 2q + 2 \widetilde{d}}}. \end{aligned}$$

Then, the compressed values are given by

$$\begin{aligned}{} & {} w^{(1)}_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')} \mathrel {\mathrel {\mathop :}=}\left\{ \begin{aligned} 0,&\ \text {if } {\left\{ \begin{array}{ll} \sigma _{x_1}> \mathcal {E}_{{\textbf {j}} , {\textbf {j}} '}, \\ \delta _{x_1} \le a2^{-\min \{j_1, j_1'\}}, \\ a2^{- \min \{j_1, j_2, j_1', j_2'\}}> \delta _{x_2}> a2^{-\min \{j_2, j_2'\}}, \end{array}\right. } \\ 0,&\ \text {if } {\left\{ \begin{array}{ll} \sigma _{x_2}> \mathcal {E}_{{\textbf {j}} , {\textbf {j}} '}, \\ \delta _{x_2} \le a2^{- \min \{j_2, j_2'\}}, \\ a2^{- \min \{j_1, j_2, j_1', j_2'\}}> \delta _{x_1}> a2^{-\min \{j_1, j_1'\}}, \end{array}\right. } \\ \big [{\textbf {A}} _J\big ]_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')},&\ \text {otherwise}, \end{aligned} \right. \\{} & {} w^{(2)}_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')} \mathrel {\mathrel {\mathop :}=}\left\{ \begin{aligned} 0,&\ \text {if } {\left\{ \begin{array}{ll} \sigma _{x_1}> \mathcal {F}_{{\textbf {j}} , {\textbf {j}} '}^{x_1}, \\ \delta _{x_1} \le a2^{-\min \{j_1, j_1'\}}, \\ \delta _{x_2} \le a2^{-\min \{j_2, j_2'\}}, \end{array}\right. } \\ 0,&\ \text {if } {\left\{ \begin{array}{ll} \sigma _{x_2} > \mathcal {F}_{{\textbf {j}} , {\textbf {j}} '}^{x_2}, \\ \delta _{x_2} \le a2^{- \min \{j_2, j_2'\}}, \\ \delta _{x_1} \le a2^{-\min \{j_1, j_1'\}}, \end{array}\right. } \\ \big [{\textbf {A}} _J\big ]_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')},&\ \text {otherwise}. \end{aligned} \right. \end{aligned}$$

Similarly to the first compression, we define the corresponding compressed matrices as

$$\begin{aligned} \big [ {\textbf {A}} ^{c_2, 1}_J \big ]_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')}&\mathrel {\mathrel {\mathop :}=}{\left\{ \begin{array}{ll} w^{(1)}_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')}, &{}\!\!\!\text {if } \max \{j_1, j_1'\}, \max \{j_2, j_2' \} > j_0, \\ \big [{\textbf {A}} _J\big ]_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')}, &{}\!\!\!\text {otherwise}, \end{array}\right. } \end{aligned}$$
(28)
$$\begin{aligned} \big [ {\textbf {A}} ^{c_2, 2}_J \big ]_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')}&\mathrel {\mathrel {\mathop :}=}{\left\{ \begin{array}{ll} w^{(2)}_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')}, &{}\!\!\!\text {if } \max \{j_1, j_1'\}, \max \{j_2, j_2' \} > j_0, \\ \big [{\textbf {A}} _J\big ]_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')}, &{}\!\!\!\text {otherwise}. \end{array}\right. } \end{aligned}$$
(29)

Combining these two compression schemes leads to the second compressed matrix

$$\begin{aligned} \big [ {\textbf {A}} _J^{c_2} \big ]_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')} \mathrel {\mathrel {\mathop :}=}{\left\{ \begin{array}{ll} 0, &{}\!\!\!\big [{\textbf {A}} _J^{c_2,\ell } \big ]_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')} = 0 \text { for some } \ell \in \{1, 2\}, \\ \big [ {\textbf {A}} _J \big ]_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')}, &{}\!\!\!\text {otherwise}. \end{array}\right. } \end{aligned}$$
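The second compression admits an analogous predicate. Again, this is only a sketch with illustrative names, where the cutoffs \(\mathcal {E}_{{\textbf {j}} , {\textbf {j}} '}\) and \(\mathcal {F}_{{\textbf {j}} , {\textbf {j}} '}^{x_i}\) are assumed to be precomputed from the displays above; the two branches mirror \(w^{(1)}\) and \(w^{(2)}\).

from dataclasses import dataclass

@dataclass
class PairData2:
    # distances and cutoffs of one wavelet pair; illustrative names only
    sigma_x1: float        # distance to the singular support, x1-direction
    sigma_x2: float        # distance to the singular support, x2-direction
    delta_x1: float        # distance of the supports, x1-direction
    delta_x2: float        # distance of the supports, x2-direction
    E: float               # cutoff E_{j,j'}
    F_x1: float            # cutoff F_{j,j'}^{x1}
    F_x2: float            # cutoff F_{j,j'}^{x2}
    a_min_all: float       # a * 2^{-min{j1, j2, j1', j2'}}
    a_min_j1: float        # a * 2^{-min{j1, j1'}}
    a_min_j2: float        # a * 2^{-min{j2, j2'}}
    admissible: bool       # max{j1, j1'} > j0 and max{j2, j2'} > j0

def survives_second_compression(p: PairData2) -> bool:
    """True iff the entry is kept by the combined second compression."""
    if not p.admissible:                         # cf. (28) and (29)
        return True
    w1_zero = ((p.sigma_x1 > p.E and p.delta_x1 <= p.a_min_j1
                and p.a_min_j2 < p.delta_x2 < p.a_min_all)
               or (p.sigma_x2 > p.E and p.delta_x2 <= p.a_min_j2
                   and p.a_min_j1 < p.delta_x1 < p.a_min_all))
    w2_zero = ((p.sigma_x1 > p.F_x1 and p.delta_x1 <= p.a_min_j1
                and p.delta_x2 <= p.a_min_j2)
               or (p.sigma_x2 > p.F_x2 and p.delta_x2 <= p.a_min_j2
                   and p.delta_x1 <= p.a_min_j1))
    return not (w1_zero or w2_zero)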

Remark 5

It suffices to compress only the entries with \(a2^{- \min \{j_1, j_2, j_1', j_2' \}} \ge \delta _{x_1}, \delta _{x_2}\). Otherwise, we have

$$\begin{aligned} \delta _{{\text {tot}}} \ge \max \{ \delta _{x_1}, \, \delta _{x_2} \} > a 2^{- \min \{j_1, j_2, j_1', j_2' \}} \end{aligned}$$

and the first compression applies, meaning that the entries are either zero or, as we will see, that there are only \(\mathcal {O}(2^{2J})\) such entries.

For the remainder of this section, let us without loss of generality assume that \(j_1' \le j_1\). The following estimate holds:

Theorem 7

The matrix blocks of the perturbed matrix \({\textbf {R}} _J \mathrel {\mathrel {\mathop :}=}{\textbf {A}} _J - {\textbf {A}} _J^{c_2}\) satisfy the estimate

$$\begin{aligned} \big \Vert {\textbf {R}} _{{\textbf {j}} , {\textbf {j}} '}\big \Vert _2 \lesssim a^{-(1 + 2q + 2 \widetilde{d})} 2^{-\sigma _{{\textbf {j}} , {\textbf {j}} '}} \end{aligned}$$

with a generic constant that is independent of the refinement level \(J\).

Proof

It suffices to consider the two coordinate directions separately. For \({\textbf {A}} _J^{c_2, 1}\), we first consider the case where \(\delta _{x_1} \le a2^{- \min \{j_1, j_1' \}}\) and \(\delta _{x_2} > a2^{- \min \{j_2, j_2' \}}\).

First, assume that \(j_1' \ne \min \{ j_1, j_2, j_1', j_2'\}\). Since we assumed that \(j_1' \le j_1\), this means that either \(j_2 < j_1'\), or \(j_2' < j_1'\), so we have \( \min \{j_2, j_2'\} = \min \{ j_1, j_2, j_1', j_2'\}, \) resulting in

$$\begin{aligned} \delta _{x_2} > a 2^{- \min \{j_2, j_2'\}} = a2^{- \min \{j_1, j_2, j_1', j_2'\}}. \end{aligned}$$

Hence, according to the definition of \(w_{({\textbf {j}} ,{\textbf {k}} ),({\textbf {j}} ',{\textbf {k}} ')}^{(1)}\), we do not compress such entries here and thus they do not contribute to the block error.

Let us therefore consider the case where \(j_1' = \min \{j_1, j_2, j_1', j_2' \}\). For convenience, we define the index sets

$$\begin{aligned} \mathcal {I}_{{\textbf {j}} , {\textbf {k}} } \mathrel {\mathrel {\mathop :}=}\left\{ {\textbf {k}} ' \in \nabla _{{\textbf {j}} '}: \begin{array}{c} \sigma _{x_1} > \mathcal {E}_{{\textbf {j}} , {\textbf {j}} '} \\ \delta _{x_1} \le a 2^{-j_1'} \\ a2^{-j_1'} \ge \delta _{x_2} \ge a2^{- \min \{j_2, j_2' \}} \end{array} \right\} , \end{aligned}$$

and likewise,

$$\begin{aligned} \mathcal {I}_{{\textbf {j}} ', {\textbf {k}} '} \mathrel {\mathrel {\mathop :}=}\left\{ {\textbf {k}} \in \nabla _{{\textbf {j}} }: \begin{array}{c} \sigma _{x_1} > \mathcal {E}_{{\textbf {j}} , {\textbf {j}} '} \\ \delta _{x_1} \le a 2^{-j_1'} \\ a2^{-j_1'} \ge \delta _{x_2} \ge a2^{- \min \{j_2, j_2' \}} \end{array} \right\} . \end{aligned}$$

As one readily verifies, the cardinality of these sets is bounded by

$$\begin{aligned} \big | \mathcal {I}_{{\textbf {j}} , {\textbf {k}} } \big | \lesssim 2^{j_2' - j_1'}, \qquad \big | \mathcal {I}_{{\textbf {j}} ', {\textbf {k}} '} \big | \lesssim 2^{j_1 - j_1'} 2^{j_2 - j_1'}. \end{aligned}$$

Next, we recall that, according to Theorem 4, we have the estimate

$$\begin{aligned} \big | r_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')} \big | \lesssim 2^{\frac{1}{2}(j_1' - j_1)} 2^{- \frac{1}{2}(j_2 + j_2')} 2^{- \widetilde{d} (j_1 + \max \{j_2, j_2'\})} \sigma _{x_1}^{-(1 + 2q + 2 \widetilde{d})}. \end{aligned}$$

This allows us to estimate the column sums by

$$\begin{aligned}&\sum _{{\textbf {k}} \in \nabla _{{\textbf {j}} }} \big | r_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')} \big |\\&\quad \lesssim \sum _{{\textbf {k}} \in \mathcal {I}_{{\textbf {j}} ', {\textbf {k}} '}} 2^{\frac{1}{2}(j_1' - j_1)} 2^{- \frac{1}{2}(j_2 + j_2')} 2^{- \widetilde{d}( j_1 + \max \{j_2, j_2'\})} \sigma _{x_1}^{-(1 + 2q + 2 \widetilde{d})} \\&\quad \lesssim 2^{j_1 - j_1'} 2^{j_2 - j_1'} 2^{\frac{1}{2}(j_1' - j_1)} 2^{- \frac{1}{2}(j_2 + j_2')} 2^{- \widetilde{d} (j_1 + \max \{j_2, j_2'\})} \mathcal {E}_{{\textbf {j}} , {\textbf {j}} '}^{-(1 + 2q + 2 \widetilde{d})} \\&\quad \lesssim a^{-(1 + 2q + 2 \widetilde{d})} 2^{\frac{1}{2}(|{\textbf {j}} |_1 - |{\textbf {j}} '|_1)} 2^{-\sigma _{{\textbf {j}} , {\textbf {j}} '}}. \end{aligned}$$

Similarly, we may estimate the row sums by

$$\begin{aligned}&\sum _{{\textbf {k}} ' \in \nabla _{{\textbf {j}} '}} \big | r_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')} \big |\\&\quad \lesssim \sum _{{\textbf {k}} ' \in \mathcal {I}_{{\textbf {j}} , {\textbf {k}} }} 2^{\frac{1}{2}(j_1' - j_1)} 2^{- \frac{1}{2}(j_2 + j_2')} 2^{- \widetilde{d}(j_1 + \max \{j_2, j_2'\})} \sigma _{x_1}^{-(1 + 2q + 2 \widetilde{d})} \\&\quad \lesssim 2^{j_2' - j_1'} 2^{\frac{1}{2}(j_1' - j_1)} 2^{- \frac{1}{2}(j_2 + j_2')} 2^{- \widetilde{d} (j_1 + \max \{j_2, j_2'\})} \mathcal {E}_{{\textbf {j}} , {\textbf {j}} '}^{-(1 + 2q + 2 \widetilde{d})} \\&\quad \lesssim a^{-(1 + 2q + 2 \widetilde{d})} 2^{\frac{1}{2}(|{\textbf {j}} '|_1 - |{\textbf {j}} |_1)} 2^{- \sigma _{{\textbf {j}} , {\textbf {j}} '}}. \end{aligned}$$

Hence, we can argue in complete analogy to the proof of Theorem 5.
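Concretely, the last step is the Schur test: the spectral norm is bounded by the geometric mean of the maximal column and row sums, in which the level-difference factors cancel,

$$\begin{aligned} \big \Vert {\textbf {R}} _{{\textbf {j}} , {\textbf {j}} '}\big \Vert _2 \le \left( \max _{{\textbf {k}} '} \sum _{{\textbf {k}} \in \nabla _{{\textbf {j}} }} \big | r_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')} \big | \right) ^{\frac{1}{2}} \left( \max _{{\textbf {k}} } \sum _{{\textbf {k}} ' \in \nabla _{{\textbf {j}} '}} \big | r_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')} \big | \right) ^{\frac{1}{2}} \lesssim a^{-(1 + 2q + 2 \widetilde{d})} 2^{- \sigma _{{\textbf {j}} , {\textbf {j}} '}}, \end{aligned}$$

since \(2^{\frac{1}{2}(|{\textbf {j}} |_1 - |{\textbf {j}} '|_1)} \, 2^{\frac{1}{2}(|{\textbf {j}} '|_1 - |{\textbf {j}} |_1)} = 1\).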

By using exactly the same arguments, but with the coordinate directions interchanged, we may also control the compression error of the entries for which \(\delta _{x_1} > a2^{- \min \{j_1, j_1'\}}\) and \(\delta _{x_2} \le a2^{- \min \{j_2, j_2'\}}\). This implies the control of the error for the whole matrix \({\textbf {A}} _J^{c_2, 1}\).

For the matrix \({\textbf {A}} _J^{c_2, 2}\), we may use exactly the same arguments as in the proof of Theorem 2.3.2 in [24], with the only adaptation being that we have to use Theorem 4 to estimate the matrix entries instead of Theorem 2.1.7 in [24].\(\square \)

Finally, an additive argument yields the main theorem of this section.

Theorem 8

Consider the compressed matrix

$$\begin{aligned} \big [{\textbf {A}} _J^c \big ]_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')} \mathrel {\mathrel {\mathop :}=}{\left\{ \begin{array}{ll} 0, &{}\!\!\![{\textbf {A}} _J^{c_\ell }]_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')} = 0 \text { for some } \ell \in \{1, 2\}, \\ {[}{\textbf {A}} _J]_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')}, &{}\!\!\!\text {otherwise}. \end{array}\right. } \end{aligned}$$
(30)

Then, the block error is controlled by

$$\begin{aligned} \big \Vert [{\textbf {A}} _J]_{{\textbf {j}} , {\textbf {j}} '} - [{\textbf {A}} _J^c]_{{\textbf {j}} , {\textbf {j}} '}\big \Vert _2 \lesssim \varepsilon \, 2^{- \sigma _{{\textbf {j}} , {\textbf {j}} '}}, \end{aligned}$$

with

$$\begin{aligned} \varepsilon \mathrel {\mathrel {\mathop :}=}\max \left\{ a^{-(2q + 4 \widetilde{d})}, \, a^{-(2q + 2 \widetilde{d})}, \, a^{-(1 + 2q + 2 \widetilde{d})} \right\} . \end{aligned}$$

Remark 6

We have improved both the parameters \(\mathcal {E}_{{\textbf {j}} , {\textbf {j}} '}\) and \(\mathcal {F}_{{\textbf {j}} , {\textbf {j}} '}^{x_i}\) in contrast to [24]: For both \(\mathcal {E}_{{\textbf {j}} , {\textbf {j}} '}\) and \(\mathcal {F}_{{\textbf {j}} , {\textbf {j}} '}^{x_i}\), we gain an additional factor

$$\begin{aligned} 2^{-\widetilde{d} \max \{j_\ell , j_\ell ' \}}, \quad \ell \ne i, \end{aligned}$$

in the above error estimates, but we have to pay another \(\widetilde{d}\) in the denominator. This is not mandatory to ensure linear complexity, but it reduces the number of required vanishing moments: for piecewise constant wavelets and the single layer operator, for example, as few as three vanishing moments suffice, just as in the case of an isotropic wavelet basis.

More importantly, we gain the additional factor

$$\begin{aligned} 2^{-\frac{\min \{j_1, j_2, j_1', j_2'\}}{1 + 2q + 2 \widetilde{d}}} \end{aligned}$$

for \(\mathcal {E}_{{\textbf {j}} , {\textbf {j}} '}\) in the above error estimates. This is strictly necessary to ensure a linear complexity.

6 Complexity

We are now going to count the number of nonzero matrix coefficients of the compressed matrix \({\textbf {A}} _J^c\) and show that this number is asymptotically bounded by \(N_J = 2^{2J}\). In the arguments below, it is crucial that the exponents are positive to estimate the sum asymptotically by the largest term. To this end, for the sake of simplicity, we will for the remainder of this section assume that \(\kappa > 0\) is sufficiently small. This is not problematic since we only have a bounded number of restrictions on \(\kappa \), depending only on the uniform constants \(d'\), \(\widetilde{d}\), and the order of the operator \(2q\). Moreover, as we will see, we need to require the inequalities

$$\begin{aligned} d< d' < \min \lbrace q + \widetilde{d},\ 2q + \widetilde{d} \rbrace . \end{aligned}$$
(31)

Since these inequalities are strict, there is always room for a small \(\kappa > 0\) between \(d'\) and the minimum of the two terms.
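For example, for the single layer operator (\(2q = -1\)) and piecewise constant wavelets (\(d = 1\)), condition (31) reads

$$\begin{aligned} 1< d' < \min \lbrace \widetilde{d} - \tfrac{1}{2},\ \widetilde{d} - 1 \rbrace = \widetilde{d} - 1, \end{aligned}$$

which is satisfiable as soon as \(\widetilde{d} \ge 3\), say with \(d' = \frac{3}{2}\); this is consistent with the count of vanishing moments in Remark 6.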

First of all, we note that the restriction of the compression to the appropriate matrix blocks in (25), (28), and (29) never causes a problem. Indeed, if we cannot compress a matrix block, then at least two indices in the set \(\{j_1, j_2, j_1', j_2'\}\) are equal to \(j_0\). In particular, as \(\dim V_{{\textbf {j}} _0} \sim 2^{2 j_0}\), there are only \(\mathcal {O}(2^{2j_0}) = \mathcal {O}(1)\) rows and columns corresponding to such situations. As every row and column contains at most \(\mathcal {O}(N_J)\) entries, this amounts to at most \(\mathcal {O}(N_J)\) entries in total.

For the compressed blocks, we organize the proof in the following steps. First, we split up the unit square into at most nine regions, compare Fig. 2, corresponding to the distance in each coordinate direction being either big or small. These nine regions correspond to the four possible cases

$$\begin{aligned} \begin{aligned} \text {(I)}&{\left\{ \begin{array}{ll} \delta _{x_1}> a2^{-\min \{j_1, j_1'\}}, \\ \delta _{x_2}> a 2^{- \min \{j_2, j_2'\}}, \end{array}\right. }{} & {} {}&\text {(I}\!\text {I)} {\left\{ \begin{array}{ll} \delta _{x_1} \le a 2^{- \min \{j_1, j_1'\}}, \\ \delta _{x_2}> a2^{- \min \{j_2, j_2'\}}, \end{array}\right. } \\ \text {(I}\!\text {I}\!\text {I)}&{\left\{ \begin{array}{ll} \delta _{x_1} > a 2^{- \min \{j_1, j_1'\}}, \\ \delta _{x_2} \le a 2^{- \min \{j_2, j_2'\}}, \end{array}\right. }{} & {} {}&\text {(I}\!\text {V)} {\left\{ \begin{array}{ll} \delta _{x_1} \le a 2^{- \min \{j_1, j_1'\}}, \\ \delta _{x_2} \le a 2^{- \min \{j_2, j_2'\}}. \end{array}\right. } \end{aligned} \end{aligned}$$
(32)

In Sect. 6.1, we will show that already the first compression gives a linear complexity when there holds

$$\begin{aligned} \delta _{{\text {tot}}} > a 2^{- \min \{j_1, j_2, j_1', j_2'\}}. \end{aligned}$$

Then, in Sect. 6.2, we consider the wavelet pairs whose supports are closer together than \(a 2^{- \min \{j_1, j_2, j_1', j_2' \}}\) and we will show the linear complexity for those regions as well.

Fig. 2 Graphical illustration of the regions described in (32). Note that there are at most nine different regions
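For illustration, the case distinction (32) can be cast as a small classifier; this is only a sketch, with \(a\) and the level pairs as in the text.

def region(delta_x1, delta_x2, j1, j2, j1p, j2p, a=1.0):
    """Classify a wavelet pair into the regions (I)-(IV) of (32)."""
    large_x1 = delta_x1 > a * 2.0 ** (-min(j1, j1p))   # large distance in x1?
    large_x2 = delta_x2 > a * 2.0 ** (-min(j2, j2p))   # large distance in x2?
    return {(True, True): "I", (False, True): "II",
            (True, False): "III", (False, False): "IV"}[(large_x1, large_x2)]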

6.1 Complexity of the first compression

We proceed according to the type of compression performed on the matrix entries. First, we count all the nontrivial entries remaining from the compression scheme (21) in the case when

$$\begin{aligned} \mathcal {B}_{{\textbf {j}} , {\textbf {j}} '} \sim 2^{\frac{\sigma _{{\textbf {j}} , {\textbf {j}} '} - \widetilde{d}(|{\textbf {j}} |_1 + |{\textbf {j}} '|_1)}{2q + 4 \widetilde{d}}}. \end{aligned}$$
(33)

Theorem 9

Assume that we set all matrix entries to zero where the underlying wavelets satisfy \(\delta _{{\text {tot}}} > \mathcal {B}_{{\textbf {j}} , {\textbf {j}} '}\) with \(\mathcal {B}_{{\textbf {j}} , {\textbf {j}} '}\) given by (33). Then, only \(\mathcal {O}(2^{2J})\) nontrivial entries remain.

Proof

In any column of a block \({\textbf {A}} _{{\textbf {j}} , {\textbf {j}} '}\), we find \(\mathcal {O}(2^{|{\textbf {j}} |_1} \mathcal {B}_{{\textbf {j}} , {\textbf {j}} '}^2)\) entries for which the distance of the supports is bounded by \(\mathcal {B}_{{\textbf {j}} , {\textbf {j}} '}\). Since there are \(\mathcal {O}(2^{|{\textbf {j}} '|_1})\) columns in such a block, there are at most \(\mathcal {O}(2^{|{\textbf {j}} |_1 + |{\textbf {j}} '|_1} \mathcal {B}_{{\textbf {j}} , {\textbf {j}} '}^2)\) nonzero entries per block. Hence, the total complexity for the whole matrix is given by

$$\begin{aligned} \mathcal {C}&\lesssim \sum _{\begin{array}{c} |{\textbf {j}} |_\infty \le J \\ |{\textbf {j}} '|_\infty \le J \end{array}} 2^{|{\textbf {j}} |_1 + |{\textbf {j}} '|_1} 2^{\frac{\sigma _{{\textbf {j}} , {\textbf {j}} '} - \widetilde{d}(|{\textbf {j}} |_1 + |{\textbf {j}} '|_1)}{q + 2 \widetilde{d}}} \nonumber \\&\lesssim \sum _{\begin{array}{c} |{\textbf {j}} |_\infty \le J \\ |{\textbf {j}} '|_\infty \le J \end{array}} 2^{\frac{2J(d'-q) - (d' - 2 \kappa )(|{\textbf {j}} |_\infty + |{\textbf {j}} '|_\infty ) + (\widetilde{d} + q - \kappa )(|{\textbf {j}} |_1 + |{\textbf {j}} '|_1)}{q + 2 \widetilde{d}}} \nonumber \\&= 2^{2J \frac{d'-q}{q + 2 \widetilde{d}}} \left( \sum _{|{\textbf {j}} |_\infty \le J} 2^{\frac{-(d' - 2 \kappa )|{\textbf {j}} |_\infty + (\widetilde{d} + q - \kappa )|{\textbf {j}} |_1}{q + 2 \widetilde{d}}} \right) ^2. \end{aligned}$$
(34)

To calculate the sum, we write out the indices \(j_1\) and \(j_2\) explicitly. Then, after evaluating the two sums over \(j_2\) and shifting the index in the second sum, we obtain

$$\begin{aligned}&\sum _{j_1=0}^J \left( \sum _{j_2 = 0}^{j_1-1} 2^{\frac{-(d' - 2 \kappa )j_1 + (\widetilde{d} + q - \kappa )(j_1 + j_2)}{q + 2 \widetilde{d}}} + \sum _{j_2 = j_1}^J 2^{\frac{-(d' - 2 \kappa )j_2 + (\widetilde{d} + q - \kappa )(j_1 + j_2)}{q + 2 \widetilde{d}}} \right) \nonumber \\&\quad \lesssim \sum _{j_1 = 0}^J \left( 2^{\frac{\widetilde{d} + q + \kappa - d'}{q + 2 \widetilde{d}}j_1} 2^{\frac{\widetilde{d} + q - \kappa }{q + 2 \widetilde{d}} j_1} + 2^{\frac{\widetilde{d} + q - \kappa }{q + 2 \widetilde{d}} j_1} 2^{\frac{\widetilde{d} + q + \kappa - d'}{q + 2 \widetilde{d}}j_1} \sum _{j_2 = 0}^{J - j_1} 2^{\frac{\widetilde{d} + q + \kappa - d'}{q + 2 \widetilde{d}}j_2} \right) \nonumber \\&\quad \lesssim \sum _{j_1 = 0}^J \left( 2^{\frac{2 \widetilde{d} + 2q - d'}{q + 2 \widetilde{d}} j_1} + 2^{\frac{\widetilde{d} + q + \kappa - d'}{q + 2 \widetilde{d}}J} 2^{\frac{\widetilde{d} + q - \kappa }{q + 2 \widetilde{d}}j_1}\right) \nonumber \\&\quad \lesssim 2^{J \frac{2 \widetilde{d} + 2q - d'}{q + 2 \widetilde{d}}}. \end{aligned}$$
(35)

Inserting this result back into (34), we obtain

$$\begin{aligned} \mathcal {C}\lesssim 2^{2J \frac{q + 2 \widetilde{d}}{q + 2 \widetilde{d}}} = \mathcal {O}(2^{2J}), \end{aligned}$$

which is what we wanted to show.\(\square \)
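The estimate of Theorem 9 can also be checked numerically. The following sketch evaluates the bound (34) directly, assuming that \(\sigma _{{\textbf {j}} , {\textbf {j}} '} = 2J(d'-q) - (d'-2\kappa )(|{\textbf {j}} |_\infty + |{\textbf {j}} '|_\infty ) - \kappa (|{\textbf {j}} |_1 + |{\textbf {j}} '|_1)\), which is the form implied by comparing the two lines of (34); the parameter values (single layer operator, piecewise constant wavelets) are chosen for illustration only.

import itertools

# parameters q, d-tilde, d', kappa; illustrative values satisfying (31)
q, dt, dp, kappa = -0.5, 3.0, 1.5, 0.05

def bound(J):
    """Evaluate the complexity bound (34) for refinement level J."""
    total = 0.0
    for j1, j2, j1p, j2p in itertools.product(range(J + 1), repeat=4):
        one = j1 + j2 + j1p + j2p                # |j|_1 + |j'|_1
        inf = max(j1, j2) + max(j1p, j2p)        # |j|_inf + |j'|_inf
        sigma = 2 * J * (dp - q) - (dp - 2 * kappa) * inf - kappa * one
        # per-block count 2^{|j|_1 + |j'|_1} B_{j,j'}^2, cf. the proof
        total += 2.0 ** (one + (sigma - dt * one) / (q + 2 * dt))
    return total

for J in range(2, 11):
    print(J, bound(J) / 4.0 ** J)   # the ratio should remain bounded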

Remark 7

If we tensorize scaling functions on the coarsest level with wavelets, we must use the cutoff parameter from Remark 4. However, by using very similar arguments as in the proof above, one concludes that there are only \(\mathcal {O}(2^{2J})\) nontrivial entries in this case. If we write out the expression explicitly, we may assume that \(j^{(1)} = |{\textbf {j}} |_\infty \) and \(j^{(2)} = |{\textbf {j}} '|_\infty \). Since \(q< d < d'\), the exponent in the second sum in (35) then becomes negative, so that the sum can be estimated by the term corresponding to the lower summation limit, which is \(j_1\).

Remark 8

Theorem 9 and Remark 7 imply that there are only \(\mathcal {O}(2^{2J})\) entries in the region \(\text {(I)}\). Indeed, in view of (22), if \(\mathcal {B}_{{\textbf {j}} , {\textbf {j}} '}= a 2^{- \min \{j_1, j_2, j_1', j_2'\}}\), then the entries in the region \(\text {(I)}\) are all set to zero. Otherwise, there are at most \(\mathcal {O}(2^{2J})\) nontrivial entries by Theorem 9 or Remark 7, respectively.

6.2 Complexity of the second compression

In the next step, we want to cover the regions \(\text {(I}\!\text {I})\) and \((\text {I}\!\text {I}\!\text {I})\) in (32). Due to the symmetry of the problem, these two regions yield the same complexity, so it suffices to consider the region \((\text {I}\!\text {I})\), that is, \(\delta _{x_1} \lesssim 2^{- \min \{j_1, j_1' \}}\) and \(\delta _{x_2} \gtrsim 2^{- \min \{j_2, j_2' \}}\). For the sake of simplicity, let us for the remainder of this section assume that \(j_1' \le j_1\), as the case \(j_1' > j_1\) follows directly by exchanging \({\textbf {j}} \) and \({\textbf {j}} '\).

Lemma 1

Consider all matrix entries in \({\textbf {A}} _{J}^{c}\) such that the underlying wavelet pairs \(\psi _{{\textbf {j}} , {\textbf {k}} }\) and \(\psi _{{\textbf {j}} ', {\textbf {k}} '}\) satisfy

$$\begin{aligned} \delta _{x_1} \le a 2^{-j_1'}, \quad \delta _{x_2} > a 2^{-\min \{j_2, j_2' \}}. \end{aligned}$$
(36)

Then, after the combination of the compression schemes (27) and (28), at most \(\mathcal {O}(2^{2J})\) nontrivial entries remain.

Proof

We may without loss of generality assume that \(j_1' = \min \{j_1, j_2, j_1', j_2'\}\). If not, then we must have either \(j_1' > j_2\) or \(j_1' > j_2'\) since we assumed \(j_1' \le j_1\). Therefore, we would have

$$\begin{aligned} \delta _{{\text {tot}}} \ge \delta _{x_2} > a2^{ -\min \{j_2, j_2'\}} = a2^{- \min \{j_1, j_2, j_1', j_2'\}}. \end{aligned}$$

Hence, these entries are either trivial or there are only \(\mathcal {O}(2^{2J})\) of them due to Sect. 6.1. As a consequence, we may in particular assume in the following that \(j_1' \le \min \{j_2, j_2'\}\).

We remark that we only have to consider the situation

$$\begin{aligned} \mathcal {D}_{{\textbf {j}} , {\textbf {j}} '}^{x_2} \mathcal {E}_{{\textbf {j}} , {\textbf {j}} '} \sim 2^{\frac{2 \sigma _{{\textbf {j}} , {\textbf {j}} '} - \widetilde{d} \left( j^{(1)} + j^{(2)} + j_1 + \max \{j_2, j_2' \}\right) - 2 j_1'}{1 + 2q + 2 \widetilde{d}}}, \end{aligned}$$

since if \(\mathcal {D}_{{\textbf {j}} , {\textbf {j}} '}^{x_2} = a 2^{- \min \{j_2, j_2'\}}\), in view of (36), all entries are compressed.

In order to estimate the nontrivial entries, we need to consider four different cases: First, if \(j_2 \le j_2'\) and \(|{\textbf {j}} |_\infty = j_1\), we have \(j_1'\le j_2 \le j_1, j_2'\). Hence, by using \(j^{(1)} \mathrel {\mathrel {\mathop :}=}j_1\) and \(j^{(2)} \mathrel {\mathrel {\mathop :}=}j_2'\), we conclude

$$\begin{aligned} \mathcal {C}&\lesssim \sum _{\begin{array}{c} j_0< j_1' \le j_1 \le J \\ j_0< j_2 \le j_2' \le J \end{array}} 2^{|{\textbf {j}} |_1 + |{\textbf {j}} '|_1} 2^{\frac{2 \sigma _{{\textbf {j}} , {\textbf {j}} '} - \widetilde{d}(j_1 + j_2' + j_1 + j_2') - 2 j_1'}{1 + 2q + 2 \widetilde{d}}} \\&\lesssim 2^{J \frac{4d' - 4q}{1 + 2q + 2 \widetilde{d}}} \sum _{\begin{array}{c} j_0< j_1' \le j_1 \le J \\ j_0< j_2 \le j_2' \le J \end{array}} 2^{j_1 \big ( 1 - \frac{2d' + 2 \widetilde{d} - 2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )} 2^{j_2 \big (1 - \frac{2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )}\\&\qquad \cdot 2^{j_1' \big ( 1 - \frac{2 + 2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )} 2^{j_2' \big ( 1 - \frac{2d' + 2 \widetilde{d} - 2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )} \\&\lesssim 2^{J \frac{4d' - 4q}{1 + 2q + 2 \widetilde{d}}} \sum _{j_0 < j_1, j_2' \le J} 2^{j_1 \big ( 2 - \frac{2 + 2d' + 2 \widetilde{d}}{1 + 2q + 2 \widetilde{d}} \big )} 2^{j_2' \big (2 - \frac{2d' + 2 \widetilde{d}}{1 + 2q + 2 \widetilde{d}} \big )} \\&\lesssim 2^{J \big ( \frac{4d' - 4q}{1 + 2q + 2 \widetilde{d}} + 4 - \frac{2 + 4d' + 4 \widetilde{d}}{1 + 2q + 2 \widetilde{d}} \big )} \\&= 2^{2J}. \end{aligned}$$

Second, if \(j_2 \le j_2'\) and \(|{\textbf {j}} |_\infty = j_2\), then we have the order \(j_1' \le j_1 \le j_2 \le j_2'\), so with \(j^{(1)} \mathrel {\mathrel {\mathop :}=}j_2'\), \(j^{(2)} \mathrel {\mathrel {\mathop :}=}j_2\), we may directly sum up

$$\begin{aligned} \mathcal {C}&\lesssim 2^{J \frac{4d' - 4q}{1 + 2q + 2 \widetilde{d}}} \sum _{j_0< j_1' \le j_1 \le j_2 \le j_2' \le J} 2^{j_1 \big ( 1 - \frac{\widetilde{d} + 2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )} 2^{j_2 \big ( 1 - \frac{2d' + \widetilde{d} - 2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )}\\&\qquad \cdot 2^{j_1' \big ( 1 - \frac{2 + 2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )} 2^{j_2' \big ( 1 - \frac{2d' + 2 \widetilde{d} - 2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )} \\&\lesssim 2^{J \frac{4d' - 4q}{1 + 2q + 2 \widetilde{d}}} \sum _{j_0< j_1 \le j_2 \le j_2' \le J} 2^{j_1 \big (2 - \frac{2 + \widetilde{d} + 4 \kappa }{1 + 2q + 2 \widetilde{d}} \big )} 2^{j_2 \big ( 1 - \frac{2d' + \widetilde{d} - 2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )} 2^{j_2' \big ( 1 - \frac{2d' + 2 \widetilde{d} - 2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )} \\&\lesssim 2^{J \frac{4d' - 4q}{1 + 2q + 2 \widetilde{d}}} \sum _{j_0< j_2 \le j_2' \le J} 2^{j_2 \big ( 3 - \frac{2 + 2d' + 2 \widetilde{d} + 2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )} 2^{j_2' \big ( 1 - \frac{2d' + 2 \widetilde{d} - 2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )} \\&\lesssim 2^{J \frac{4d' - 4q }{1 + 2q + 2 \widetilde{d}}} \sum _{j_0 < j_2' \le J} 2^{j_2' \big ( 4 - \frac{2 + 4d' + 4 \widetilde{d}}{1 + 2q + 2 \widetilde{d}} \big )} \\&\lesssim 2^{2J}. \end{aligned}$$

Third, if \(j_2' \le j_2\) and \(|{\textbf {j}} |_\infty = j_1\), we have \(j_1' \le j_2' \le j_2 \le j_1\), therefore the choice \(j^{(1)} \mathrel {\mathrel {\mathop :}=}j_1\), \(j^{(2)} \mathrel {\mathrel {\mathop :}=}j_2\) yields

$$\begin{aligned} \mathcal {C}&\lesssim 2^{J \frac{4d' - 4q}{1 + 2q + 2 \widetilde{d}}} \sum _{j_0< j_1' \le j_2' \le j_2 \le j_1 \le J} 2^{j_1 \big ( 1 - \frac{2d' + 2 \widetilde{d} - 2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )} 2^{j_2 \big ( 1 - \frac{2 \widetilde{d} + 2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )}\\&\qquad \cdot 2^{j_1' \big (1 - \frac{2 + 2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )} 2^{j_2' \big ( 1 - \frac{2d' - 2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )} \\&\lesssim 2^{J \frac{4d' - 4q}{1 + 2q + 2 \widetilde{d}}} \sum _{j_0< j_2' \le j_2 \le j_1 \le J} 2^{j_1 \big ( 1 - \frac{2d' + 2 \widetilde{d} - 2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )} 2^{j_2 \big ( 1 - \frac{2 \widetilde{d} + 2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )} 2^{j_2' \big ( 2 - \frac{2 + 2d'}{1 + 2q + 2 \widetilde{d}} \big )} \\&\lesssim 2^{J \frac{4d' - 4q}{1 + 2q + 2 \widetilde{d}}} \sum _{j_0< j_2 \le j_1 \le J} 2^{j_1 \big ( 1 - \frac{2d' + 2 \widetilde{d} - 2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )} 2^{j_2 \big ( 3 - \frac{2 + 2d' + 2 \widetilde{d} + 2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )} \\&\lesssim 2^{J \frac{4d' - 4q}{1 + 2q + 2 \widetilde{d}}} \sum _{j_0 < j_1 \le J} 2^{j_1 \big ( 4 - \frac{2 + 4d' + 4 \widetilde{d}}{1 + 2q + 2 \widetilde{d}} \big )} \\&\lesssim 2^{2J}. \end{aligned}$$

Finally, if \(j_2' \le j_2\) and \(|{\textbf {j}} |_\infty = j_2\), we also have \(j_1' \le j_1 \le j_2\). If we choose \(j^{(1)}\mathrel {\mathrel {\mathop :}=}j_2\) and \(j^{(2)}\mathrel {\mathrel {\mathop :}=}j_1\), then the complexity reads as

$$\begin{aligned} \mathcal {C}&\lesssim 2^{J \frac{4d' - 4q}{1 + 2q + 2 \widetilde{d}}} \sum _{\begin{array}{c} j_0< j_2' \le j_2 \le J \\ j_0< j_1' \le j_1 \le j_2 \le J \end{array}} 2^{j_1 \big ( 1 - \frac{2 \widetilde{d} + 2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )} 2^{j_2 \big ( 1 - \frac{2d' + 2 \widetilde{d} - 2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )}\\&\qquad \cdot 2^{j_1' \big ( 1 - \frac{2 + 2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )} 2^{j_2' \big ( 1 - \frac{2d' - 2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )} \\&\lesssim 2^{J \frac{4d' - 4q}{1 + 2q + 2 \widetilde{d}}} \sum _{j_0< j_1 \le j_2 \le J} 2^{j_1 \big ( 2 - \frac{2 + 2 \widetilde{d} + 4 \kappa }{1 + 2q + 2 \widetilde{d}} \big )} 2^{j_2 \big (2 - \frac{4d' + 2 \widetilde{d} - 4 \kappa }{1 + 2q + 2 \widetilde{d}} \big )} \\&\lesssim 2^{J \frac{4d' - 4q}{1 + 2q + 2 \widetilde{d}}} \sum _{j_0 < j_2 \le J} 2^{j_2 \big (4 - \frac{2 + 4d' + 4 \widetilde{d}}{1 + 2q + 2 \widetilde{d}} \big )} \\&\lesssim 2^{2J}. \end{aligned}$$

\(\square \)

Let us now consider the last possible situation, that is, we suppose that

$$\begin{aligned} \delta _{x_1} \lesssim 2^{- \min \{j_1, j_1'\}}, \quad \delta _{x_2} \lesssim 2^{- \min \{j_2, j_2'\}}, \end{aligned}$$

which describes the region \((\text {I}\!\text {V})\) in (32), i.e., the near-field region in Fig. 2. In this case, the compression scheme (29) applies. Remember that we still assume that \(j_1' \le j_1\).

Lemma 2

Consider all matrix entries in \({\textbf {A}} _{J}^{c}\) such that the underlying wavelet pairs \(\psi _{{\textbf {j}} , {\textbf {k}} }\) and \(\psi _{{\textbf {j}} ', {\textbf {k}} '}\) satisfy

$$\begin{aligned} \delta _{x_1} \le a 2^{-j_1'}, \qquad \delta _{x_2} \le a 2^{- \min \{j_2, j_2'\}}. \end{aligned}$$

Then, after the compression scheme (29), at most \(\mathcal {O}(2^{2J})\) nontrivial entries remain.

Proof

It suffices to count the respective matrix entries of \({\textbf {A}} _J^{c_2, 2}\). Suppose first that \(j_2' \ge j_2\). In this case, the number of nontrivial entries in the matrix block \([{\textbf {A}} _J^{c_2, 2}]_{{\textbf {j}} , {\textbf {j}} '}\) can be estimated by

$$\begin{aligned} N_{{\textbf {j}} , {\textbf {j}} '} \lesssim 2^{|{\textbf {j}} |_1 + |{\textbf {j}} '|_1} \mathcal {F}_{{\textbf {j}} , {\textbf {j}} '}^{x_1} \mathcal {F}_{{\textbf {j}} , {\textbf {j}} '}^{x_2} \sim 2^{|{\textbf {j}} |_1 + |{\textbf {j}} '|_1} 2^{\frac{2 \sigma _{{\textbf {j}} , {\textbf {j}} '} - 2 \widetilde{d}(j_1 + j_2') - (j_1' + j_2)}{1 + 2q + 2 \widetilde{d}}}. \end{aligned}$$
(37)

Noting that \(d' - 2 \kappa > 0\) for \(\kappa > 0\) sufficiently small, the properties \(|{\textbf {j}} |_\infty \ge j_1\) and \(|{\textbf {j}} '|_\infty \ge j_2'\) imply that the number of entries can be estimated by

$$\begin{aligned} \mathcal {C}^{(1)}&\lesssim \sum _{\begin{array}{c} j_0< j_1 \le J \\ j_0< j_1' \le j_1 \end{array}} \sum _{\begin{array}{c} j_0< j_2' \le J \\ j_0< j_2 \le j_2' \end{array}} N_{{\textbf {j}} , {\textbf {j}} '}\\&\lesssim 2^{J \frac{4d' - 4q}{1 + 2q + 2 \widetilde{d}}} \left( \sum _{\begin{array}{c} j_0< j_1 \le J \\ j_0< j_1' \le j_1 \end{array}} 2^{j_1 \big (1 - \frac{2d' - 2 \kappa + 2 \widetilde{d}}{1 + 2q + 2 \widetilde{d}} \big )} 2^{j_1' \big (1 - \frac{1 + 2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )} \right) \\&\qquad \cdot \left( \sum _{\begin{array}{c} j_0< j_2' \le J \\ j_0 < j_2 \le j_2' \end{array}} 2^{j_2 \big (1 - \frac{1 + 2 \kappa }{1 + 2q + 2 \widetilde{d}}\big )} 2^{j_2'\big (1 - \frac{2d' - 2 \kappa + 2 \widetilde{d}}{1 + 2q + 2 \widetilde{d}} \big )} \right) . \end{aligned}$$

As the indices can be exchanged, the two sums are equal and can be estimated by

$$\begin{aligned}&\sum _{j_1 = j_0 + 1}^J 2^{j_1 \big (1 - \frac{2d' - 2 \kappa + 2 \widetilde{d}}{1 + 2q + 2 \widetilde{d}} \big )} \sum _{j_1' = j_0 + 1}^{j_1} 2^{j_1' \big (1 - \frac{1 + 2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )}\\&\quad \lesssim \sum _{j_1 = j_0 + 1}^J 2^{j_1\big (2 - \frac{1 + 2d' + 2 \widetilde{d}}{1 + 2q + 2 \widetilde{d}} \big )} \lesssim 2^{J\big (2 - \frac{1 + 2d' + 2 \widetilde{d}}{1 + 2q + 2 \widetilde{d}} \big )}. \end{aligned}$$

Putting everything together, we obtain that

$$\begin{aligned} \mathcal {C}^{(1)} \lesssim 2^{J \big ( \frac{4d'- 4q}{1 + 2q + 2 \widetilde{d}} + 4 - \frac{2 + 4d' + 4 \widetilde{d}}{1 + 2q + 2 \widetilde{d}} \big )} = 2^{2J}. \end{aligned}$$

In the case where \(j_2 \ge j_2'\), we need to argue in a slightly different way. By the assumption \(j_1' \le j_1\), there are at least of the order \(2^{|{\textbf {j}} '|_1}\) nontrivial entries in the matrix block \([{\textbf {A}} _J^{c_2, 2}]_{{\textbf {j}} , {\textbf {j}} '}\), since at every point at which the singular supports \(\varOmega _{j_1', k_1'}^\sigma \) and \(\varOmega _{j_2', k_2'}^\sigma \) intersect, there is at least one smaller wavelet \(\psi _{{\textbf {j}} , {\textbf {k}} }\) touching it. On the other hand, every nontrivial entry satisfies \(\sigma _{x_1}\le a \mathcal {F}_{{\textbf {j}} , {\textbf {j}} '}^{x_1}\) and \(\sigma _{x_2} \le a\mathcal {F}_{{\textbf {j}} , {\textbf {j}} '}^{x_2}\), so the number of nontrivial entries in a block \([{\textbf {A}} _J^{c_2, 2} ]_{{\textbf {j}} , {\textbf {j}} '}\) can be estimated by

$$\begin{aligned} N_{{\textbf {j}} , {\textbf {j}} '} \lesssim 2^{ |{\textbf {j}} '|_1 } \max \big \{ 1, \ 2^{ |{\textbf {j}} |_1 } \mathcal {F}_{{\textbf {j}} , {\textbf {j}} '}^{x_1} \mathcal {F}_{{\textbf {j}} , {\textbf {j}} '}^{x_2} \big \}. \end{aligned}$$
(38)

If the maximum in (38) is equal to \(1\), then it is easy to see that

$$\begin{aligned} \sum _{{\textbf {j}} _0 + {\textbf {1}} \le {\textbf {j}} ' \le {\textbf {j}} \le {\textbf {J}} } N_{{\textbf {j}} , {\textbf {j}} '}&\lesssim \sum _{j_0< j_1' \le j_1 \le J} \sum _{j_0< j_2' \le j_2 \le J} 2^{j_1' + j_2'} \\&\lesssim \sum _{j_0< j_1 \le J} \sum _{j_0 < j_2 \le J} 2^{j_1 + j_2} \lesssim 2^{2J}. \end{aligned}$$

If, however, the maximum in (38) is not equal to \(1\), we have

$$\begin{aligned} N_{{\textbf {j}} , {\textbf {j}} '} \lesssim 2^{|{\textbf {j}} |_1 + |{\textbf {j}} '|_1} 2^{\frac{2 \sigma _{{\textbf {j}} , {\textbf {j}} '} - 2 \widetilde{d}(j_1 + j_2) - (j_1' + j_2')}{1 + 2q + 2 \widetilde{d}}}. \end{aligned}$$

Using again that \(d' - 2 \kappa > 0\), \(|{\textbf {j}} |_\infty \ge j_1\), \(|{\textbf {j}} '|_\infty \ge j_2'\), one obtains

$$\begin{aligned} \mathcal {C}^{(2)}&\lesssim 2^{J \frac{4d' - 4q}{1 + 2q + 2 \widetilde{d}}} \left( \sum _{\begin{array}{c} j_0< j_1 \le J \\ j_0< j_1' \le j_1 \end{array}} 2^{j_1 \big (1 - \frac{2d' - 2 \kappa + 2 \widetilde{d}}{1 + 2q + 2 \widetilde{d}} \big )} 2^{j_1' \big (1 - \frac{1 + 2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )} \right) \\&\qquad \cdot \left( \sum _{\begin{array}{c} j_0< j_2 \le J \\ j_0 < j_2' \le j_2 \end{array}} 2^{j_2 \big (1 - \frac{2 \widetilde{d} + 2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )} 2^{j_2' \big (1 - \frac{2d' - 2 \kappa + 1}{1 + 2q + 2 \widetilde{d}} \big )} \right) . \end{aligned}$$

The first sum can be treated as in the previous case, whereas for the second sum, there holds

$$\begin{aligned}&\sum _{\begin{array}{c} j_0< j_2 \le J \\ j_0< j_2' \le j_2 \end{array}} 2^{j_2 \big (1 - \frac{2 \widetilde{d} + 2 \kappa }{1 + 2q + 2 \widetilde{d}} \big )} 2^{j_2' \big (1 - \frac{2d' - 2 \kappa + 1}{1 + 2q + 2 \widetilde{d}} \big )}\\&\qquad \lesssim \sum _{j_0 < j_2 \le J} 2^{j_2 \big (2 - \frac{1 + 2d' + 2 \widetilde{d}}{1 + 2q + 2 \widetilde{d}} \big )} \lesssim 2^{J\big (2 - \frac{1 + 2d' + 2 \widetilde{d}}{1 + 2q + 2 \widetilde{d}}\big )}. \end{aligned}$$

Hence, we can conclude that also \(\mathcal {C}^{(2)} \lesssim 2^{2J}\).\(\square \)

With the preceding lemmata and Theorem 9 at hand, we conclude the main result of this section.

Theorem 10

Assume that (31) holds. Then, the compressed matrix \({\textbf {A}} _J^c\) arising from (30) contains at most \(\mathcal {O}(2^{2J})\) nontrivial entries.

Remark 9

For the piecewise constant (\(d=1\)) and piecewise bilinear (\(d=2\)) wavelets, used most often in practice, condition (31) dictates at least the following number of vanishing moments: for the single layer operator (\(2q = -1\)), \(\widetilde{d} = 3\) if \(d = 1\) and \(\widetilde{d} = 4\) if \(d = 2\); for the hypersingular operator (\(2q = 1\)), \(\widetilde{d} = 2\) if \(d = 2\).

If \(2q = 1\), then using piecewise constant wavelets is mathematically not meaningful because the energy space \(H^{\frac{1}{2}}(\varGamma )\) cannot be discretized by discontinuous trial functions. Note that these are the same values as in the setting of isotropic wavelet bases, compare [9].

7 The situation on a Lipschitz manifold

Up to now, we have only considered the situation on the unit square. As stated in Sect. 2, we are particularly interested in the solution of a boundary integral equation posed on the boundary \(\varGamma \) of a Lipschitz domain \(\varOmega \subset \mathbb {R}^3\). Recall that the boundary \(\varGamma \) is given as the union of the patches \(\varGamma _i\), which can be smoothly parametrized by \(\varvec{\gamma }_i: \square \rightarrow \varGamma _i\). Moreover, we assume that the parametrizations \(\varvec{\gamma }_i\) and \(\varvec{\gamma }_j\) coincide up to the orientation on a common edge of \(\varGamma _i\) and \(\varGamma _j\).

We define the Sobolev spaces \(H^s(\varGamma )\) according to [32]. We remark that the norm \(\Vert \cdot \Vert _{H^s(\varGamma )}\) depends on the parametrizations \(\varvec{\gamma }_i\). However, it can be shown that these norms are equivalent, regardless of which parametrizations are chosen [32, 37]. Recall that, on a surface which is merely Lipschitz, these spaces are only defined for \(|s| \le 1\). For a \(C^{k, \alpha }\)-surface, it is possible to define them up to \(|s| \le k + \alpha \), compare [37] for example.

7.1 Globally discontinuous wavelets

Similarly to the existing literature, see e.g. [9, 30], we discretize the energy space \(H^q(\varGamma )\) with \(2q<1\) by transporting the wavelet functions from the unit square onto \(\varGamma \). Precisely, we define a basis function as

$$\begin{aligned} \psi _{i, {\textbf {j}} , {\textbf {k}} } \mathrel {\mathrel {\mathop :}=}\psi _{{\textbf {j}} , {\textbf {k}} } \circ \varvec{\gamma }_i^{-1}, \end{aligned}$$

and then consider the basis set

$$\begin{aligned} \varPsi \mathrel {\mathrel {\mathop :}=}\big \lbrace \psi _{i, {\textbf {j}} , {\textbf {k}} }: 1 \le i \le r,\,{\textbf {j}} \ge {{\textbf {j}} _0},\,{\textbf {k}} \in \nabla _{{\textbf {j}} } \big \rbrace . \end{aligned}$$
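As a small illustration of this lifting, consider the following sketch; the reference wavelet and the inverse parametrization are assumed to be given as callables, and all names are hypothetical.

def lift(psi_ref, gamma_inv):
    """Lift a reference wavelet from the unit square to the patch Gamma_i.

    psi_ref   : callable on the unit square, psi_ref(xhat) with xhat in R^2
    gamma_inv : callable realizing gamma_i^{-1}: Gamma_i -> unit square
    """
    def psi_lifted(x):            # x is a point on Gamma_i, embedded in R^3
        return psi_ref(gamma_inv(x))
    return psi_lifted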

To construct a trial space on the level \(J\), we proceed as on the unit square, meaning that we cut the basis off at the level \(J\), obtaining

$$\begin{aligned} V_{{\textbf {J}} } \mathrel {\mathrel {\mathop :}=}{{\,\textrm{span}\,}}\varPsi _{{\textbf {J}} }, \quad \varPsi _{{\textbf {J}} } \mathrel {\mathrel {\mathop :}=}\big \{ \psi _{i, {\textbf {j}} , {\textbf {k}} }: 1 \le i \le r,\,|{\textbf {j}} |_\infty \le J,\,{\textbf {k}} \in \nabla _{{\textbf {j}} } \big \}. \end{aligned}$$

Note that the above wavelets are supported on a single patch \(\varGamma _i\) only. In general, they are not continuous across the patch boundaries, so they only attain a regularity of \(\gamma = \frac{1}{2}\), regardless of how smooth the wavelets are piecewise. It is possible to construct wavelets which are continuous across the patch boundaries, see e.g. [12], but \(\gamma = \frac{1}{2}\) is sufficient if we want to discretize an operator of nonpositive order.

Because the wavelets are supported on a single patch, and each parametrization and cutoff function is smooth, one can generalize all the wavelet properties stated in Sect. 3.4. For the same reason, the mere Lipschitz continuity of the surface does not prevent us from using \(\widetilde{d} > 1\) vanishing moments for the cancellation property (16): we never have to consider the behaviour of the test function across the patch boundaries, and the expressions \(|\cdot |_{W^{\widetilde{d}, \infty }(\varOmega _{i, {\textbf {j}} , {\textbf {k}} })}\) are well-defined.

Finally, as the spaces \(V_{{\textbf {J}} }\) coincide with the space spanned by all lifted, isotropic scaling functions on the level \(J\), we can directly quote the following lemma, compare [35].

Lemma 3

For a continuous, strongly elliptic and injective operator \(\mathcal {A}: H^q(\varGamma ) \rightarrow H^{-q}(\varGamma )\), the Galerkin discretization is stable, meaning that

$$\begin{aligned} \big \langle (\mathcal {A}+ \mathcal {A}') v_J, v_J \big \rangle _\varGamma \gtrsim \Vert v_J\Vert _{H^q(\varGamma )}^2, \quad v_J \in V_{{\textbf {J}} }, \end{aligned}$$

for any sufficiently large \(J\), and

$$\begin{aligned} \big | \big \langle \mathcal {A}v_J, w_J \big \rangle _\varGamma \big | \lesssim \Vert v_J\Vert _{H^q(\varGamma )} \Vert w_J\Vert _{H^q(\varGamma )}, \quad v_J, w_J \in V_{{\textbf {J}} }. \end{aligned}$$

Furthermore, let \(u\) be the solution of (5), and \(u_J\) the solution of (7). Then, we have the convergence

$$\begin{aligned} \Vert u - u_J\Vert _{H^t(\varGamma )} \lesssim 2^{-J(s-t)} \Vert u\Vert _{H^s(\varGamma )}, \ \ 2q - d \le t < \gamma , \ \ t \le s, \ \ q \le s \le d, \end{aligned}$$

provided that \(u \in H^s(\varGamma )\) and that \(\varGamma \) is sufficiently regular to ensure that the involved Sobolev spaces are well-defined.

Remark 10

The condition that \(\mathcal {A}\) is injective is, as already stated, not strictly necessary. It suffices if the kernel is finite-dimensional and known in advance.

7.2 Matrix estimates for discontinuous wavelets

As we will see, all the matrix estimates on a given surface can be deduced from the matrix estimates on the unit square. Depending on the relative position of the two patches on which the wavelets are supported, we need to distinguish several cases. To this end, for \(\hat{\varvec{x}}, \hat{\varvec{x}}' \in \square \), let

$$\begin{aligned} \begin{aligned} {\hat{K}}_{i,i'}({\hat{\varvec{x}}},{\hat{\varvec{x}}}')&\mathrel {\mathrel {\mathop :}=}K \big ( \varvec{\gamma }_i({\hat{\varvec{x}}}), \varvec{\gamma }_{i'}({\hat{\varvec{x}}}') \big )\\&\qquad \qquad \cdot \sqrt{\det \big ({\textbf {D}} \varvec{\gamma }_i ^\intercal {\textbf {D}} \varvec{\gamma }_i \big ) ( {\hat{\varvec{x}}})} \sqrt{\det \big ({\textbf {D}} \varvec{\gamma }_{i'}^\intercal {\textbf {D}} \varvec{\gamma }_{i'} \big ) ( {\hat{\varvec{x}}}')} \end{aligned} \end{aligned}$$
(39)

with \(1\le i,i'\le r\) denote the transported kernel function. With the transported kernel function at hand, we find that

$$\begin{aligned} \begin{aligned}&\big \langle \mathcal {A}\psi _{i', {\textbf {j}} ', {\textbf {k}} '}, \psi _{i, {\textbf {j}} , {\textbf {k}} } \big \rangle _\varGamma \\&\quad = \int _\varGamma \int _\varGamma K(\varvec{x}, \varvec{x}') \psi _{{\textbf {j}} , {\textbf {k}} } \big ( \varvec{\gamma }_i^{-1}(\varvec{x}) \big ) \psi _{{\textbf {j}} ', {\textbf {k}} '} \big ( \varvec{\gamma }_{i'}^{-1}(\varvec{x}') \big ) \; {\text {d}}S_{\varvec{x}'} \; {\text {d}}S_{\varvec{x}} \\&\quad = \int _\square \int _\square {\hat{K}}_{i,i'}(\hat{\varvec{x}}, \hat{\varvec{x}}') \psi _{{\textbf {j}} , {\textbf {k}} } ({\hat{\varvec{x}}}) \psi _{{\textbf {j}} ', {\textbf {k}} '}( {\hat{\varvec{x}}}') \; {\text {d}}\hat{\varvec{x}}' \; {\text {d}}{\hat{\varvec{x}}} \\&\quad = \big \langle \hat{\mathcal {A}}_{i,i'} \psi _{{\textbf {j}} ', {\textbf {k}} '}, \psi _{{\textbf {j}} , {\textbf {k}} } \big \rangle _\square , \end{aligned} \end{aligned}$$
(40)

where we define \({\hat{\mathcal {A}}}_{i,i'}\) as the integral operator with the transported kernel \({\hat{K}}_{i,i'}\). Since the local parametrizations \(\varvec{\gamma }_i\) and \(\varvec{\gamma }_{i'}\) are smooth, the transported kernel functions also satisfy the decay property

$$\begin{aligned} \big | \partial _{\hat{\varvec{x}}}^{\varvec{\upalpha }} \partial _{\hat{\varvec{x}}'}^{\varvec{\upalpha }'} {\hat{K}}_{i,i'} (\hat{\varvec{x}}, \hat{\varvec{x}}') \big |\le C_{\varvec{\upalpha }, \varvec{\upalpha }', K, i, i'} \Vert \varvec{\gamma }_i({\hat{\varvec{x}}}) - \varvec{\gamma }_{i'}({\hat{\varvec{x}}}')\Vert ^{-(2 + 2q + |\varvec{\upalpha }| + |\varvec{\upalpha }'|)} \end{aligned}$$
(41)

for almost all \( (\hat{\varvec{x}}, \hat{\varvec{x}'}) \in \square \times \square \), provided that \(2 + 2q + |\varvec{\upalpha }| + |\varvec{\upalpha }'| > 0\). Therefore, in view of (40) and (41), the far-field estimates of Sect. 4.1 hold true also in the case of piecewise smooth Lipschitz manifolds.
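As an illustration of (39), the transported kernel can be assembled from callables for the kernel and the two parametrizations. The following sketch approximates the Jacobians by central finite differences; all names are hypothetical.

import numpy as np

def surface_measure(gamma, xhat, h=1e-6):
    """sqrt(det(D gamma^T D gamma)) at xhat, via central finite differences."""
    e1, e2 = np.array([h, 0.0]), np.array([0.0, h])
    d1 = (gamma(xhat + e1) - gamma(xhat - e1)) / (2 * h)
    d2 = (gamma(xhat + e2) - gamma(xhat - e2)) / (2 * h)
    D = np.column_stack([d1, d2])             # 3 x 2 Jacobian of the patch
    return np.sqrt(np.linalg.det(D.T @ D))    # Gram determinant

def transported_kernel(K, gamma_i, gamma_ip):
    """The kernel K-hat_{i,i'} from (39), pulled back to the unit square."""
    def K_hat(xhat, xhat_p):
        return (K(gamma_i(xhat), gamma_ip(xhat_p))
                * surface_measure(gamma_i, xhat)
                * surface_measure(gamma_ip, xhat_p))
    return K_hat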

7.2.1 Wavelets supported on the same patch

Let us look at the easiest situation first. If we consider the interaction between \(\psi _{i,{\textbf {j}} ,{\textbf {k}} }\) and \(\psi _{i,{\textbf {j}} ',{\textbf {k}} '}\), we can use the relations (40) and (41) together with

$$\begin{aligned} \Vert \varvec{\gamma }_i({\hat{\varvec{x}}})-\varvec{\gamma }_i({\hat{\varvec{x}}}')\Vert \sim \Vert {\hat{\varvec{x}}}-{\hat{\varvec{x}}}'\Vert , \quad \hat{\varvec{x}}, \hat{\varvec{x}}' \in \square , \end{aligned}$$

to conclude that the situation on a single patch is equivalent to that on the unit square. Therefore, the near-field estimates also hold verbatim, and we find at most \(\mathcal {O}(2^{2J})\) nontrivial entries in the matrix block associated with \(\varGamma _i \times \varGamma _i\). In particular, the compression error in each matrix block satisfies the same estimates as in Sect. 5.

7.2.2 Patches with a common edge

Let us now assume that \(\varGamma _i\) and \(\varGamma _{i'}\) share a common edge. For the sake of simplicity, we assume that the common edge \(\varSigma \) satisfies

$$\begin{aligned} \varvec{\gamma }_i\big ( \{1\} \times [0, 1] \big ) = \varSigma = \varvec{\gamma }_{i'}\big ( \{0\} \times [0, 1] \big ), \end{aligned}$$

in particular, that \(\varvec{\gamma }_i(1,x_2) = \varvec{\gamma }_{i'}(0,x_2)\) for all \(x_2\in [0,1]\). Otherwise, we can apply suitable rotations such that this assumption holds.

Fig. 3 Graphical illustration of the parametrization \(\varvec{\gamma }\) in case of a common edge

By gluing the two parametrizations together, we obtain a Lipschitz continuous parametrization \(\varvec{\gamma }: [0,2]\times [0,1]\rightarrow \varGamma _i \cup \varGamma _{i'}\) such that

$$\begin{aligned} \varvec{\gamma }(\varvec{x}) = {\left\{ \begin{array}{ll} \varvec{\gamma }_i(\varvec{x}), &{}\varvec{x}\in \left[ 0,1 \right] \times [0, 1], \\ \varvec{\gamma }_{i'}(x_1-1,x_2), &{}\varvec{x}\in \left[ 1,2\right] \times [0, 1], \end{array}\right. } \end{aligned}$$

compare Fig. 3 for a graphical illustration. For the near-field estimates, we need to interpret the coordinate directions in a meaningful way. This is quite intuitive in Fig. 3: we can simply define the \(x\)-direction as the direction across the edge, while the \(y\)-direction can be interpreted as the direction parallel to the edge. In particular, we find

$$\begin{aligned} \Vert \varvec{\gamma }_i({\hat{\varvec{x}}})-\varvec{\gamma }_{i'}({\hat{\varvec{x}}}')\Vert \sim \big \Vert {\hat{\varvec{x}}}-{\hat{\varvec{x}}}'-\big [{\begin{matrix} 1 \\ 0 \end{matrix}}\big ]\big \Vert , \quad \hat{\varvec{x}}, \hat{\varvec{x}'} \in \square , \end{aligned}$$

where \(\Vert \cdot \Vert \) denotes the Euclidean norm on \(\mathbb {R}^3\) and \(\mathbb {R}^2\), respectively.
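A minimal sketch of the glued parametrization defined above; the two patch maps are assumed to be given as callables on the unit square.

def glued_gamma(gamma_i, gamma_ip):
    """The glued map on [0,2] x [0,1] in the common-edge case.

    Continuity across x1 = 1 holds because gamma_i(1, x2) = gamma_i'(0, x2).
    """
    def gamma(x1, x2):
        if x1 <= 1.0:
            return gamma_i(x1, x2)        # left patch Gamma_i
        return gamma_ip(x1 - 1.0, x2)     # right patch Gamma_{i'}
    return gamma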

For Theorem 4, we followed the arguments in [24], where the smooth part of the bigger wavelet \(\psi _{j_1', k_1'}\) is extended to a smooth, compactly supported function. Then, the operator \(\mathcal {A}_1\) is likewise extended to a classical pseudo-differential operator \(\mathcal {A}_1^\sharp \), compare [22]. This is not as straightforward in the current situation, since the kernel \(\hat{K}_{i, i'}\) is no longer asymptotically smooth of order \(2q\); it is only continuous across the edge \(\varSigma \). Nevertheless, this difficulty can be overcome. For the sake of simplicity, let us define

$$\begin{aligned} \hat{\varOmega }_{j_1, k_1} \times \hat{\varOmega }_{j_2, k_2} \mathrel {\mathrel {\mathop :}=}{{\,\textrm{supp}\,}}\big ( \psi _{i, {\textbf {j}} , {\textbf {k}} } \circ \varvec{\gamma }\big ), \quad \hat{\varOmega }_{j_1', k_1'} \times \hat{\varOmega }_{j_2', k_2'} \mathrel {\mathrel {\mathop :}=}{{\,\textrm{supp}\,}}\big ( \psi _{i', {\textbf {j}} ', {\textbf {k}} '} \circ \varvec{\gamma }\big ). \end{aligned}$$

Then, there holds \(\hat{\varOmega }_{j_1, k_1} \subset [0, 1]\) and \(\hat{\varOmega }_{j_1', k_1'} \subset [1, 2]\) in (40). Hence, we either have \(\sigma _{x_1} = 0\), in which case we cannot compress in the direction of \(x_1\) at all, or \(\sigma _{x_1} > 0\).

If we are considering discontinuous wavelets, the smooth extension function \(\tilde{f}\) from the proof of Theorem 4 is equal to 0, since the two wavelets are located on different patches. Hence, we can disregard the pseudo-differential operator entirely and only have to consider the complement function.

In the second coordinate direction, the situation looks more complicated, as \(\hat{\varOmega }_{j_2, k_2}\) and \(\hat{\varOmega }_{j_2', k_2'}\) may not be disjoint. In this case, we still need to argue with \(\hat{\mathcal {A}}_2^\sharp \), which corresponds to the kernel

$$\begin{aligned} \hat{K}^{(2)}_{i, i'} (\hat{x}_2, \hat{x}_2')&\mathrel {\mathrel {\mathop :}=}\int _{0}^1 \int _{0}^1 \hat{K}_{i, i'} ( {\hat{\varvec{x}}}, {\hat{\varvec{x}}}' ) \psi _{j_1, k_1}({\hat{x}} _1) \psi _{j_1', k_1'}({\hat{x}} _1') \; {\text {d}}\hat{x}_1' \; {\text {d}}\hat{x}_1,\\&\qquad \qquad \hat{x}_2, \hat{x}_2' \in [0, 1]. \end{aligned}$$

Although \(\varvec{\gamma }\) is overall only Lipschitz continuous, the kernel \(\hat{K}_{i, i'}^{(2)}\) is smooth away from the diagonal. Indeed, by (1) and the smoothness of \(\varvec{\gamma }_i\) and \(\varvec{\gamma }_{i'}\), for any fixed \(\hat{x}_2,\hat{x}_2' \in [0, 1]\) with \(\hat{x}_2 \ne \hat{x}_2'\), it is not hard to see that all derivatives \(\partial _{x_2}^{\beta } \partial _{x_2'}^{\beta '} \hat{K}_{i, i'}\) of the kernel from (39) exist and are bounded independently of \(x_1, x_1' \in (0, 1)\). Therefore, we may differentiate \(\hat{K}_{i, i'}^{(2)}\) under the integral to conclude that \(\hat{K}_{i, i'}^{(2)}\) is indeed smooth apart from the diagonal. By arguing as in Sect. 4, additionally using the bi-Lipschitz continuity of the parametrization \(\varvec{\gamma }\), we may derive a Calderón-Zygmund estimate for \(\hat{K}_{i, i'}^{(2)}\), i.e.,

$$\begin{aligned} \big | \partial _{x_2}^{\beta } \partial _{x_2'}^{\beta '} \hat{K}_{i, i'}^{(2)} (\hat{x}_2, \hat{x}_2') \big | \le C_{\beta , \beta ', K, i, i'} |\hat{x}_2 - \hat{x}_2'|^{-(2 + 2q + \beta + \beta ')}, \quad \hat{x}_2, \hat{x}_2' \in [0, 1]. \end{aligned}$$

Therefore, the arguments of [24] work in this case as well.

In the proof of Theorem 4, we have also made use of vanishing moments hidden in the kernel. This is possible here as well. Indeed, if e.g. \(j_1' \ge j_1\), then we only have to consider a term of the form

$$\begin{aligned} \mathop {\text {ess}\,\text {sup}}\limits _{{\hat{x}}_1' \in \hat{\varOmega }_{j_1', k_1'}} \left| \int _{\hat{\varOmega }_{j_1, k_1}} \psi _{j_1, k_1}({\hat{x}}_1) \ \partial _{{\hat{x}}_1'}^{\widetilde{d} }{\hat{K}}_{i, i'}({\hat{\varvec{x}}}, {\hat{\varvec{x}}}') \; {\text {d}}{\hat{x}}_1 \right| , \end{aligned}$$

where the asymptotic estimate for \({\hat{K}}_{i, i'}\) holds since \(\varvec{\gamma }_{i'}\) is smooth on \(\varOmega _{j_1', k_1'} \times [0, 1]\) in view of (39). Hence, the same arguments as in the proof of Theorem 4 apply.

7.2.3 Patches with a common vertex

If the patches \(\varGamma _i\) and \(\varGamma _{i'}\) have a common vertex, we can argue similarly as before. We assume that the common vertex \(\varvec{v}\) satisfies, possibly after applying suitable rotations and translations,

$$\begin{aligned} \varvec{\gamma }_i\big (\big [{\begin{matrix} 1 \\ 1 \end{matrix}}\big ]\big ) = \varvec{v}= \varvec{\gamma }_{i'}\big (\big [{\begin{matrix} 0 \\ 0 \end{matrix}}\big ]\big ). \end{aligned}$$

Hence, we may find a Lipschitz continuous parametrization \(\varvec{\gamma }: [0, 2]^2 \rightarrow \varGamma \) such that

$$\begin{aligned} \varvec{\gamma }\big |_{[0, 1]^2} = \varvec{\gamma }_i, \quad \varvec{\gamma }\big |_{[1, 2]^2} = \varvec{\gamma }_{i'}. \end{aligned}$$
Fig. 4 A possible Lipschitz continuous extension of the maps \(\varvec{\gamma }_i\) and \(\varvec{\gamma }_{i'}\) in case of a common vertex

Concerning the first compression, in view of (40), (41), and

$$\begin{aligned} \Vert \varvec{\gamma }_i({\hat{\varvec{x}}})-\varvec{\gamma }_{i'}({\hat{\varvec{x}}}')\Vert \sim \big \Vert {\hat{\varvec{x}}}-{\hat{\varvec{x}}}'-\big [{\begin{matrix} 1 \\ 1 \end{matrix}}\big ]\big \Vert , \quad \hat{\varvec{x}}, \hat{\varvec{x}}' \in \square , \end{aligned}$$
(42)

we can obviously use the estimates of Sect. 4.1 here, too.

To proceed with the second compression, we need an estimate like the one in Theorem 4, for which we define

$$\begin{aligned} \hat{K}_{i, i'}^{(1)} ({\hat{x}}_1, {\hat{x}}_1') \mathrel {\mathrel {\mathop :}=}\int _0^1 \int _0^1 \hat{K}_{i, i'} ({\hat{\varvec{x}}}, {\hat{\varvec{x}}}') \psi _{j_2, k_2}({\hat{x}}_2) \psi _{j_2', k_2'}({\hat{x}}_2') \; {\text {d}}\hat{x}_2' \; {\text {d}}\hat{x}_2, \ \ \hat{x}_1, \hat{x}_1' \in [0, 1]. \end{aligned}$$

In view of (42) and the fact that the restrictions of \(\varvec{\gamma }\) to \([0, 1]^2\) and \([1, 2]^2\) are smooth, we deduce for \(\varvec{x}=\varvec{\gamma }_i({\hat{\varvec{x}}})\) and \(\varvec{x}'=\varvec{\gamma }_{i'}({\hat{\varvec{x}}}')\) that

$$\begin{aligned} \left| \partial _x^\alpha \partial _{x'}^{\alpha '}\hat{K}_{i, i'}^{(1)} ({\hat{x}}_1,{\hat{x}}_1') \right|&\lesssim 2^{- \frac{1}{2}(j_2 + j_2')} 2^{- \widetilde{d} \max \{j_2, j_2' \}} \Vert \varvec{x}-\varvec{x}'\Vert ^{-(2 + 2q + \widetilde{d} + \alpha + \alpha ')} \\&\lesssim 2^{- \frac{1}{2}(j_2 + j_2')} 2^{- \widetilde{d} \max \{j_2, j_2' \}} \big \Vert {\hat{\varvec{x}}}-{\hat{\varvec{x}}}'-\big [{\begin{matrix} 1 \\ 1 \end{matrix}}\big ]\big \Vert ^{-(2 + 2q + \widetilde{d} + \alpha + \alpha ')} \\&\lesssim 2^{- \frac{1}{2}(j_2 + j_2')} 2^{- \widetilde{d} \max \{j_2, j_2' \}} |{\hat{x}}_1 - {\hat{x}}_1'-1|^{-(2 + 2q + \widetilde{d} + \alpha + \alpha ')}. \end{aligned}$$

This is exactly the estimate needed for Theorem 4 when considered on the interval \([0,2]\) with \({\hat{x}}_1\in [0,1]\) and \({\widetilde{x}}_1 \mathrel {\mathrel {\mathop :}=}{\hat{x}}_1'+1\in [1,2]\). Similarly, we can derive such an estimate for the second coordinate direction. As there holds

$$\begin{aligned} \big \langle \mathcal {A}\psi _{i', {\textbf {j}} ', {\textbf {k}} '}, \psi _{i, {\textbf {j}} , {\textbf {k}} } \big \rangle _{\varGamma }&= \int _0^1 \int _0^1 \hat{K}_{i, i'}^{(1)} (\hat{x}_1, \hat{x}_1') \psi _{j_1, k_1}(\hat{x}_1) \psi _{j_1', k_1'} (\hat{x}_1') \; {\text {d}}\hat{x}_1' \; {\text {d}}\hat{x}_1 \\&= \int _0^1 \int _0^1 \hat{K}_{i, i'}^{(2)} (\hat{x}_2, \hat{x}_2') \psi _{j_2, k_2} (\hat{x}_2) \psi _{j_2', k_2'} (\hat{x}_2') \; {\text {d}}\hat{x}_2' \; {\text {d}}\hat{x}_2, \end{aligned}$$

it suffices if one of the two realizations of the entry can be compressed.

Moreover, on the parameter domain \([0, 2]^2\), cf. Fig. 4, the preimages of \(\varOmega _{i, {\textbf {j}} , {\textbf {k}} }\) and \(\varOmega _{i', {\textbf {j}} ', {\textbf {k}} '}\) either touch each other, in which case we cannot compress, or they are well-separated in at least one coordinate direction. Hence, in this direction, the smooth extension \(\tilde{f}\) of the larger wavelet is 0 as well, so we only have to consider the complement function. The resulting estimate depends only on the distance between the supports, which is, at least in this coordinate direction, equivalent to the distance in \(\mathbb {R}^3\), since

$$\begin{aligned} \Vert \varvec{\gamma }_i({\hat{\varvec{x}}}) - \varvec{\gamma }_{i'}({\hat{\varvec{x}}}')\Vert _2 \sim \big \Vert {\hat{\varvec{x}}} - {\hat{\varvec{x}}}'-\big [{\begin{matrix} 1 \\ 1 \end{matrix}}\big ]\big \Vert _2 \sim \big \Vert {\hat{\varvec{x}}} - {\hat{\varvec{x}}}'-\big [{\begin{matrix} 1 \\ 1 \end{matrix}}\big ]\big \Vert _\infty = |{\hat{x}}_\ell - {\hat{x}}_\ell '-1| \end{aligned}$$

for some \(\ell \in \{1, 2\}\).

7.2.4 Well-separated patches

If \(\varGamma _i\) and \(\varGamma _{i'}\) share neither a common vertex nor a common edge, then, since the domain under consideration has at least a Lipschitz boundary, there holds \({{\,\textrm{dist}\,}}( \varGamma _i,\varGamma _{i'} ) \gtrsim 1\ge 2^{- \min \{j_1, j_2, j_1', j_2' \}}\). Therefore, the first compression is possible for all such entries. We also note that in this case only the spatial distance enters, so we do not have to identify appropriate coordinate directions here.

In the first compression, there are two different possibilities: First, if

$$\begin{aligned} \mathcal {B}_{{\textbf {j}} , {\textbf {j}} '} \sim 2^{- \min \{j_1, j_2, j_1', j_2'\}}, \end{aligned}$$

and \(h \mathrel {\mathrel {\mathop :}=}\min \{ 1,{{\,\textrm{dist}\,}}(\varGamma _i,\varGamma _{i'}) \} > 0\), then we can compress all entries for which the maximal support length is smaller than \(h\), that is, if

$$\begin{aligned} \min \{j_1, j_2, j_1', j_2'\} > \lceil - \log _2 h \rceil \mathrel {=\mathrel {\mathop :}}C, \end{aligned}$$

which leaves us with

$$\begin{aligned} \sum _{\begin{array}{c} j_0< j_1, j_2 \le C \\ j_0 < j_1', j_2' \le C \end{array}} 2^{|{\textbf {j}} |_1 + |{\textbf {j}} '|_1 - \min \{j_1, j_2, j_1', j_2'\}} \lesssim C 2^{3C} = \mathcal {O}(1) \end{aligned}$$

entries. Note that this bound is entirely theoretical; clearly, the constant deteriorates if \(h > 0\) is small. In the second case, there are only \(\mathcal {O}(2^{2J})\) entries per patch-patch interaction, as implied by Sect. 6.1.
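The counting bound above can be verified by brute force. The following sketch (with a hypothetical coarsest level \(j_0 = 2\)) evaluates the sum directly and compares it with \(C \, 2^{3C}\); the printed ratio stays bounded in \(C\).

```python
# Brute-force check of the counting bound: the sum over all level pairs with
# j0 < j_1, j_2 <= C and j0 < j_1', j_2' <= C of
# 2^{|j|_1 + |j'|_1 - min{j_1, j_2, j_1', j_2'}} is at most const * C * 2^{3C}.
j0 = 2
for C in range(3, 10):
    total = 0
    levels = [(a, b) for a in range(j0 + 1, C + 1) for b in range(j0 + 1, C + 1)]
    for (j1, j2) in levels:
        for (j1p, j2p) in levels:
            total += 2 ** (j1 + j2 + j1p + j2p - min(j1, j2, j1p, j2p))
    print(C, total / (C * 2 ** (3 * C)))  # ratio stays bounded in C
```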

In summary, we arrive at the following theorem:

Theorem 11

Consider a patchwise smooth and globally Lipschitz continuous manifold, and a boundary integral operator \(\mathcal {A}\) of order \(2q < 1\). By using the liftings of wavelet functions which satisfy the conditions on \(\widetilde{d}\) in (31), we can compress the Galerkin matrix \({\textbf {A}} _J\) associated with the operator \(\mathcal {A}\) into a matrix \({\textbf {A}} _J^c\), which contains only \(\mathcal {O}(2^{2J})\) nontrivial entries. Moreover, each block satisfies the error estimate

$$\begin{aligned} \big \Vert [{\textbf {A}} _J]_{{\textbf {j}} ,{\textbf {j}} '} - [{\textbf {A}} _J^c]_{{\textbf {j}} ,{\textbf {j}} '}\big \Vert _2 \lesssim 2^{-\sigma _{{\textbf {j}} ,{\textbf {j}} '}}. \end{aligned}$$

7.3 Globally continuous wavelets

In order to discretize \(H^q(\varGamma )\) for \(q\ge \frac{1}{2}\), we need to construct a wavelet basis of globally continuous functions. To this end, let us consider the wavelet bases introduced in [11]. Because \(\widetilde{d}=2\) vanishing moments are enough to optimally compress the hypersingular operator, cf. Remark 9, we restrict ourselves to this specific case. However, the subsequent construction also works in the case of more vanishing moments. We shall first briefly illustrate what the associated piecewise linear wavelet basis on the unit interval looks like.

For a fixed \(j \ge j_0 \mathrel {\mathrel {\mathop :}=}2\), we consider the space \(V_j = {{\,\textrm{span}\,}}\{\phi _{j,0}, \ldots , \phi _{j,2^{j}}\}\), where each \(\phi _{j,k}\) is a properly scaled and piecewise linear hat function. In particular, for \(k=1,\ldots ,2^{j}-1\), we have \(\phi _{j,k}(2^{-j}\ell ) = 2^{j/2}\delta _{k,\ell }\), for \(\ell =0,\ldots ,2^{j}\). We note that \(\phi _{j,0}\) and \(\phi _{j,2^{j}}\) are boundary functions, satisfying \(\phi _{j,0}(0)=2^{j/2}\), and \(\phi _{j,2^j}(1)=2^{j/2}\), respectively.

As wavelet spaces, we define \(W_j\mathrel {\mathrel {\mathop :}=}{{\,\textrm{span}\,}}\{\psi _{j,0},\ldots ,\psi _{j,2^{j-1}-1}\}\), where

$$\begin{aligned} \psi _{j,0}&= \frac{5}{8}\phi _{j,1} -\frac{3}{4}\phi _{j,2} -\frac{1}{4}\phi _{j,3} +\frac{1}{4}\phi _{j,4} +\frac{1}{8}\phi _{j,5},\\ \psi _{j,2^{j-1}-1}&= \frac{1}{8}\phi _{j,2^{j}-5} +\frac{1}{4}\phi _{j,2^{j}-4} -\frac{1}{4}\phi _{j,2^{j}-3} -\frac{3}{4}\phi _{j,2^{j}-2} +\frac{5}{8}\phi _{j,2^j-1}, \end{aligned}$$

and for \(k=1,\ldots ,2^{j-1}-2\)

$$\begin{aligned} \psi _{j,k} = -\frac{1}{8}\phi _{j,2k-1} -\frac{1}{4}\phi _{j,2k} +\frac{3}{4}\phi _{j,2k+1} -\frac{1}{4}\phi _{j,2k+2} -\frac{1}{8}\phi _{j,2k+3}, \end{aligned}$$

compare Fig. 5. By construction, all wavelets satisfy homogeneous Dirichlet boundary conditions. Therefore, the tensor product basis on the unit square also satisfies homogeneous Dirichlet boundary conditions, except for the tensor products which involve the coarse grid scaling functions at the interval’s boundary. More precisely, the only functions associated with an edge are tensor products of the respective coarse grid scaling function perpendicular to the edge with wavelets parallel to the edge, while the only functions associated with a vertex consist of the tensor product of the respective coarse grid scaling functions. Hence, in order to glue wavelets together across an edge, we only need to glue scaling functions, which is straightforward. Likewise, the only functions which must be glued together across a vertex are tensor products of scaling functions, which is an easy task, too.
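To make the construction concrete, the following Python sketch assembles the stencils above and checks the two vanishing moments numerically. It is an illustration under stated assumptions, not the implementation used later; the fine-grid trapezoidal quadrature is our own choice.

```python
import numpy as np

# Hat functions phi_{j,k}(2^{-j} l) = 2^{j/2} delta_{k,l} and the wavelet
# stencils from the construction above (d~ = 2 vanishing moments).
def hat(j, k, x):
    return 2.0 ** (j / 2) * np.maximum(0.0, 1.0 - np.abs(2.0 ** j * x - k))

def wavelet(j, k, x):
    if k == 0:                                   # left boundary wavelet
        stencil = {1: 5/8, 2: -3/4, 3: -1/4, 4: 1/4, 5: 1/8}
    elif k == 2 ** (j - 1) - 1:                  # right boundary wavelet
        n = 2 ** j
        stencil = {n - 5: 1/8, n - 4: 1/4, n - 3: -1/4, n - 2: -3/4, n - 1: 5/8}
    else:                                        # interior wavelets
        stencil = {2*k - 1: -1/8, 2*k: -1/4, 2*k + 1: 3/4,
                   2*k + 2: -1/4, 2*k + 3: -1/8}
    return sum(c * hat(j, m, x) for m, c in stencil.items())

def integrate(f, x):                             # trapezoidal rule
    return float(np.sum((f[:-1] + f[1:]) * np.diff(x)) / 2.0)

# Verify the vanishing moments int psi dx = int x psi dx = 0 numerically.
j, x = 4, np.linspace(0.0, 1.0, 2 ** 14 + 1)
for k in range(2 ** (j - 1)):
    psi = wavelet(j, k, x)
    print(k, integrate(psi, x), integrate(x * psi, x))   # both ~ 0
```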

Fig. 5: The spaces \({V}_2\) and \({W}_3\) with piecewise linear functions on the unit interval

7.4 Matrix estimates for continuous wavelets

From the arguments in Sect. 7.2, we can directly conclude the same properties for all wavelets which are supported on a single patch. Hence, the only remaining wavelets are those which are supported across an edge or a vertex.

We first note that there is only a single function which is supported on a particular vertex. It is composed of scaling functions. Therefore, there are at most \(\mathcal {O}(1)\) such matrix rows and columns, so this case is trivial.

Next, we consider wavelets which are supported across an edge. Let us use the notation of Sect. 7.2, i.e., we call the edge \(\varSigma \) and assume without loss of generality that it belongs to the patches \(\varGamma _i\) and \(\varGamma _{i'}\) in the direction \(x_1\) as in Fig. 3. Let the wavelet \(\psi _{\varSigma ,{\textbf {j}} ,{\textbf {k}} }\) on the level \({\textbf {j}} \) with \(j_1=j_0\) be supported on \(\varSigma \), while the test wavelet on level \({\textbf {j}} '\) is arbitrary. We distinguish two cases:

First, if \(j_1'=j_0\), then the test wavelet is supported on an edge, too, and we do not need to compress such entries, because in total, there are at most \(\mathcal {O}(2^{2J})\) such entries.

On the other hand, if \(j_1'>j_0\), we know that the test wavelet is supported either on \(\varGamma _i\) or on \(\varGamma _{i'}\), but not on both of them. We thus assume without loss of generality that \(\psi _{i',{\textbf {j}} ',{\textbf {k}} '}\) is supported on \(\varGamma _{i'}\). Hence, the corresponding matrix entry is given by

$$\begin{aligned} \langle \mathcal {A}\psi _{i',{\textbf {j}} ',{\textbf {k}} '}, \psi _{\varSigma ,{\textbf {j}} ,{\textbf {k}} }\rangle _{\varGamma }&= \int _{\varGamma _{i'}} \int _{\varGamma _{i'}} K(\varvec{x},\varvec{x}') \psi _{\varSigma ,{\textbf {j}} ,{\textbf {k}} }(\varvec{x}) \psi _{i',{\textbf {j}} ',{\textbf {k}} '}(\varvec{x}') \; {\text {d}}\varvec{x}'\; {\text {d}}\varvec{x}\\&\quad +\int _{\varGamma _{i}} \int _{\varGamma _{i'}} K(\varvec{x},\varvec{x}') \psi _{\varSigma ,{\textbf {j}} ,{\textbf {k}} }(\varvec{x}) \psi _{i',{\textbf {j}} ',{\textbf {k}} '}(\varvec{x}') \; {\text {d}}\varvec{x}'\; {\text {d}}\varvec{x}. \end{aligned}$$

We shall consider these two integrals separately.

For the first one, we remark that \(\psi _{\varSigma ,{\textbf {j}} ,{\textbf {k}} }\big |_{\varGamma _{i'}}\) is nothing but the tensor product of a scaling function on the coarsest level with a wavelet. As we integrate over the same patch \(\varGamma _{i'}\) only, this is equivalent to the situation on the unit square.

The second integral, on the other hand, can be treated as in Sect. 7.2. Indeed, in this case, we can consider \(\psi _{\varSigma ,{\textbf {j}} ,{\textbf {k}} }\big |_{\varGamma _i}\) and \(\psi _{i',{\textbf {j}} ',{\textbf {k}} '}\) as a pair of globally discontinuous, patchwise supported wavelets.

Therefore, we can state the following theorem:

Theorem 12

Consider a patchwise smooth, globally Lipschitz continuous manifold and a continuous wavelet basis which satisfies the following conditions:

  1. There are no more than \(\mathcal {O}(1)\) wavelets supported on each vertex.

  2. On each fixed level \({\textbf {j}} \) and each edge \(\varSigma \), there are only \(\mathcal {O}(2^{|{\textbf {j}} |_\infty })\) wavelets \(\psi _{\varSigma ,{\textbf {j}} ,{\textbf {k}} }\) supported on \(\varSigma \).

  3. All wavelets, except for the scaling functions, have patchwise vanishing moments.

  4. The number of vanishing moments fulfils the requirements of (31).

Then, we can compress the Galerkin matrix \({\textbf {A}} _J\) of a boundary integral operator \(\mathcal {A}\) of order \(2q < 3\) into a matrix \({\textbf {A}} _J^c\) which contains at most \(\mathcal {O}(2^{2J})\) nontrivial entries. Moreover, each matrix block satisfies the error estimate

$$\begin{aligned} \big \Vert [{\textbf {A}} _J]_{{\textbf {j}} ,{\textbf {j}} '}-[{\textbf {A}} _J^c]_{{\textbf {j}} ,{\textbf {j}} '}\big \Vert _2 \lesssim 2^{-\sigma _{{\textbf {j}} ,{\textbf {j}} '}}. \end{aligned}$$

8 Consistency and convergence

In this section, we show that the Galerkin scheme for the compressed operator converges at the same rate as the Galerkin scheme for the uncompressed operator. This means that the wavelet matrix compression under consideration realizes the discretization error accuracy offered by the underlying Galerkin scheme, provided that the wavelet basis satisfies the requirements of Sect. 7.

Similar to [9], we define the compressed boundary integral operator \(\mathcal {A}_J^c: H^{s}(\varGamma ) \rightarrow H^{s - 2q}(\varGamma )\) in accordance with

$$\begin{aligned} \mathcal {A}_J^c u \mathrel {\mathrel {\mathop :}=}\sum _{\begin{array}{c} |{\textbf {j}} |_\infty \le J \\ |{\textbf {j}} '|_\infty \le J \end{array}} \sum _{\begin{array}{c} {\textbf {k}} \in \nabla _{\textbf {j}} \\ {\textbf {k}} ' \in \nabla _{{\textbf {j}} '} \end{array}} \big [ {\textbf {A}} _J^c \big ]_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')} \big \langle \widetilde{\psi }_{{\textbf {j}} ', {\textbf {k}} '}, u \big \rangle _\varGamma \widetilde{\psi }_{{\textbf {j}} , {\textbf {k}} }, \quad u \in H^s(\varGamma ), \end{aligned}$$

which defines a continuous operator for all \(- \widetilde{\gamma }< s < \widetilde{\gamma } + 2q\). In particular, this operator represents the compressed matrix \({\textbf {A}} _J^c\) in the sense that

$$\begin{aligned} \big \langle \mathcal {A}_J^c \psi _{{\textbf {j}} ', {\textbf {k}} '}, \psi _{{\textbf {j}} , {\textbf {k}} } \big \rangle _\varGamma = \big [ {\textbf {A}} _J^c \big ]_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')}, \quad |{\textbf {j}} |_\infty , |{\textbf {j}} '|_\infty \le J, \quad {\textbf {k}} \in \nabla _{{\textbf {j}} },\ {\textbf {k}} ' \in \nabla _{{\textbf {j}} '}. \end{aligned}$$

Theorem 13

(Consistency) Let \(\varepsilon > 0\) and let \({\textbf {A}} _J^c\) denote the compressed matrix on the level \(J\) with a parameter \(a\), such that

$$\begin{aligned} \max \big \{ a^{-(2q + 4 \widetilde{d})}, \, a^{-(2q + 2 \widetilde{d})}, \, a^{-(1 + 2q + 2 \widetilde{d})} \big \} \le \varepsilon . \end{aligned}$$

Then, for \(q \le s, t \le d\), the associated compressed operator \(\mathcal {A}_J^c\) satisfies the estimate

$$\begin{aligned} \big | \big \langle (\mathcal {A}- \mathcal {A}_J^c ) Q_J u, \, Q_J v \big \rangle _\varGamma \big | \lesssim \varepsilon \, 2^{J(2q - s -t)} \Vert u\Vert _{H^s(\varGamma )} \Vert v\Vert _{H^t(\varGamma )} \end{aligned}$$
(43)

uniformly in \(J\), provided that \(\widetilde{\gamma }> -q\), \(u \in H^s(\varGamma )\) and \(v \in H^t(\varGamma )\).

Proof

We note that, since \(s\ge q > -\widetilde{\gamma }\), we have the representation formula

$$\begin{aligned} u = \sum _{{\textbf {j}} , {\textbf {k}} } \big \langle \widetilde{\psi }_{{\textbf {j}} , {\textbf {k}} }, u \big \rangle _\varGamma \psi _{{\textbf {j}} , {\textbf {k}} }, \quad u \in H^s(\varGamma ). \end{aligned}$$

By the definition of the operators \(\mathcal {A}\) and \(\mathcal {A}_J^c\), the biorthogonality, the blockwise error estimates, and the definition of \(\sigma _{{\textbf {j}} , {\textbf {j}} '}\) in (20), we obtain

$$\begin{aligned}&\bigg |\big \langle (\mathcal {A}- \mathcal {A}_J^c) Q_J u, \, Q_J v \big \rangle _\varGamma \bigg |\\&\quad = \Bigg | \sum _{\begin{array}{c} |{\textbf {j}} |_\infty \le J \\ |{\textbf {j}} '|_\infty \le J \end{array}} \sum _{\begin{array}{c} {\textbf {k}} \in \nabla _{\textbf {j}} \\ {\textbf {k}} ' \in \nabla _{{\textbf {j}} '} \end{array}} \big \langle \widetilde{\psi }_{{\textbf {j}} , {\textbf {k}} }, u \big \rangle _\varGamma \big [ {\textbf {A}} _J - {\textbf {A}} _J^c \big ]_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')} \big \langle \widetilde{\psi }_{{\textbf {j}} ', {\textbf {k}} '}, v \big \rangle _\varGamma \Bigg |\\&\quad \le \sum _{\begin{array}{c} |{\textbf {j}} |_\infty \le J \\ |{\textbf {j}} '|_\infty \le J \end{array}} \bigg | \big [ u_{{\textbf {j}} , {\textbf {k}} } \big ]_{\textbf {k}} ^\intercal \big [ {\textbf {A}} _J - {\textbf {A}} _J^c \big ]_{{\textbf {j}} , {\textbf {j}} '} \big [ v_{{\textbf {j}} ', {\textbf {k}} '} \big ]_{{\textbf {k}} '} \bigg | \\&\quad \le \sum _{\begin{array}{c} |{\textbf {j}} |_\infty \le J \\ |{\textbf {j}} '|_\infty \le J \end{array}} \big \Vert \big [ u_{{\textbf {j}} , {\textbf {k}} } \big ]_{\textbf {k}} \big \Vert _2 \ \big \Vert \big [ {\textbf {A}} _J - {\textbf {A}} _J^c\big ]_{{\textbf {j}} , {\textbf {j}} '}\big \Vert _2 \ \big \Vert \big [ v_{{\textbf {j}} ', {\textbf {k}} '} \big ]_{{\textbf {k}} '}\big \Vert _2 \\&\quad \lesssim \varepsilon \, 2^{2J(q - d')} \sum _{n, n' = 0}^J 2^{(d' - 2\kappa ) (n + n')} \left( \sum _{|{\textbf {j}} |_\infty = n} 2^{\kappa |{\textbf {j}} |_1}\big \Vert \big [ u_{{\textbf {j}} , {\textbf {k}} } \big ]_{\textbf {k}} \big \Vert _2 \right) \\&\qquad \quad \cdot \left( \sum _{|{\textbf {j}} '|_\infty = n'} 2^{\kappa |{\textbf {j}} '|_1}\big \Vert \big [ v_{{\textbf {j}} ', {\textbf {k}} '} \big ]_{{\textbf {k}} '}\big \Vert _2 \right) . \end{aligned}$$

To each of the two sums, we may apply the Cauchy-Schwarz inequality to obtain

$$\begin{aligned} \sum _{|{\textbf {j}} |_\infty = n} 2^{\kappa |{\textbf {j}} |_1}\big \Vert \big [ u_{{\textbf {j}} , {\textbf {k}} } \big ]_{\textbf {k}} \big \Vert _2&\le \left( \sum _{|{\textbf {j}} |_\infty = n} 2^{2 \kappa |{\textbf {j}} |_1} \right) ^{\frac{1}{2}} \left( \sum _{|{\textbf {j}} |_\infty = n} \big \Vert \big [ u_{{\textbf {j}} , {\textbf {k}} } \big ]_{\textbf {k}} \big \Vert _2^2 \right) ^{\frac{1}{2}}. \end{aligned}$$

Herein, the first sum can be estimated by

$$\begin{aligned} \sum _{|{\textbf {j}} |_\infty = n} 2^{2 \kappa |{\textbf {j}} |_1}&= 2^{4 \kappa n } + 2 \sum _{m = 0}^{n-1} 2^{2 \kappa (m + n)} \lesssim 2^{4 \kappa n}. \end{aligned}$$

Moreover, if \(s \ge 0\), the second sum can be treated by the approximation property (14) of the spaces \(V_n\), and we obtain that

$$\begin{aligned} \left( \sum _{|{\textbf {j}} |_\infty = n} \big \Vert \big [ u_{{\textbf {j}} , {\textbf {k}} } \big ]_{\textbf {k}} \big \Vert _2^2 \right) ^{\frac{1}{2}}&\sim \Vert (Q_n - Q_{n-1}) u\Vert _{L^2(\varGamma )} \\&\le \Vert Q_n u - u\Vert _{L^2(\varGamma )} + \Vert u - Q_{n-1} u\Vert _{L^2(\varGamma )} \\&\lesssim 2^{-sn} \Vert u\Vert _{H^s(\varGamma )}. \end{aligned}$$

If, on the other hand, \(s < 0\), we can use the Bernstein inequality (15) to get

$$\begin{aligned} \left( \sum _{|{\textbf {j}} |_\infty = n} \big \Vert \big [ u_{{\textbf {j}} , {\textbf {k}} } \big ]_{\textbf {k}} \big \Vert _2^2 \right) ^{\frac{1}{2}}&\sim \Vert (Q_n - Q_{n-1}) u \Vert _{L^2(\varGamma )} \le \Vert Q_n u \Vert _{L^2(\varGamma )} + \Vert Q_{n-1} u \Vert _{L^2(\varGamma )} \\&\lesssim 2^{-sn} \Vert u \Vert _{H^s(\varGamma )}. \end{aligned}$$

After applying the same procedure to \(v\), we finally arrive at

$$\begin{aligned}&\bigg |\big \langle (\mathcal {A}- \mathcal {A}_J^c) Q_J u,Q_J v \big \rangle _\varGamma \bigg |\\&\quad \lesssim \varepsilon 2^{2J(q-d')} \sum _{n, n' = 0}^J 2^{n(d'-s)} 2^{n'(d' - t)} \Vert u\Vert _{H^s(\varGamma )} \Vert v\Vert _{H^t(\varGamma )} \\&\quad \lesssim \varepsilon 2^{2J(q-d')} \Vert u\Vert _{H^s(\varGamma )} \Vert v\Vert _{H^t(\varGamma )} \sum _{n = 0}^J 2^{n(d'-s)} \sum _{n' = 0}^J 2^{n'(d'-t)} \\&\quad \lesssim \varepsilon 2^{J(2q - s- t)} \Vert u\Vert _{H^s(\varGamma )} \Vert v\Vert _{H^t(\varGamma )}, \end{aligned}$$

since \(q\le s, t \le d < d'\).\(\square \)

Our next goal is to show that the compressed wavelet scheme converges to the solution of the original problem. First, similar to [9], we note that Theorem 13 implies that

$$\begin{aligned} \big | \big \langle (\mathcal {A}- \mathcal {A}_J^{c}) u_J,v_J \big \rangle _\varGamma \big | \lesssim \varepsilon \, \Vert u_J\Vert _{H^q(\varGamma )} \Vert v_J\Vert _{H^q(\varGamma )}, \quad u_J, v_J \in V_{{\textbf {J}} }. \end{aligned}$$

In view of the strong ellipticity (4), we can conclude that

$$\begin{aligned} \big \langle \big (\mathcal {A}_J^c + (\mathcal {A}_J^c)' \big ) u_J, u_J \big \rangle _\varGamma \ge (c - 2 \varepsilon ) \Vert u_J\Vert _{H^q(\varGamma )}^2, \end{aligned}$$

so the operator \(\mathcal {A}_J^c\) is strongly elliptic for \(\varepsilon \) sufficiently small, too. These two properties then imply that the operator \(\mathcal {A}_J^c\) is stable in the sense that

$$\begin{aligned} \big \Vert \mathcal {A}_J^c u_J\big \Vert _{H^{-q}(\varGamma )} \sim \big \Vert u_J\big \Vert _{H^q(\varGamma )}. \end{aligned}$$
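In computations, the discrete strong ellipticity can be monitored directly. The following is a minimal sketch under stated assumptions: `Ac` is a hypothetical dense compressed Galerkin matrix, `levels` a hypothetical vector holding \(|{\textbf {j}} |_\infty \) for each degree of freedom, and the diagonal rescaling realizes the usual \(H^q\) norm equivalence for the wavelet coefficients.

```python
import numpy as np

# Sketch: estimate the ellipticity constant of the compressed matrix.
# `Ac` (dense compressed Galerkin matrix) and `levels` (the value |j|_inf
# per row/column) are hypothetical inputs.
def elliptic_constant(Ac, levels, q):
    d = 2.0 ** (-q * np.asarray(levels, dtype=float))  # H^q rescaling
    B = (d[:, None] * (Ac + Ac.T) * d[None, :]) / 2.0  # symmetric part
    return np.linalg.eigvalsh(B).min()                 # should be >= c - 2*eps
```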

With these results at hand, we may deduce the following two theorems by applying the arguments of [9]. We remark that the proofs of these theorems involve only the consistency, the ellipticity, the stability, and the approximation property, compare [9].

Theorem 14

(Convergence) Let \(\varepsilon \) be sufficiently small such that \(\mathcal {A}_J^c\) is strongly elliptic. Then, the compressed Galerkin solution

$$\begin{aligned} u_J = \sum _{|{\textbf {j}} |_\infty \le J} \sum _{{\textbf {k}} \in \nabla _{{\textbf {j}} }} u_{{\textbf {j}} , {\textbf {k}} } \psi _{{\textbf {j}} , {\textbf {k}} }, \end{aligned}$$

where the coefficient vector \({\textbf {u}} _J\) satisfies

$$\begin{aligned} {\textbf {A}} _J^c {\textbf {u}} _J = {\textbf {g}} _J, \quad \text {where}\ \big [{\textbf {g}} _J \big ]_{{\textbf {j}} , {\textbf {k}} } = \big \langle g, \psi _{{\textbf {j}} , {\textbf {k}} } \big \rangle _\varGamma , \end{aligned}$$

converges to the solution \(u\) of (5) in \(H^q(\varGamma )\), and we have the estimate

$$\begin{aligned} \Vert u - u_J\Vert _{H^q(\varGamma )} \lesssim 2^{J(q-d)} \Vert u\Vert _{H^d(\varGamma )}, \end{aligned}$$

provided that \(u \in H^d(\varGamma )\) and \(-\widetilde{\gamma }<q<\gamma \).
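In practice, the compressed system of Theorem 14 would be solved iteratively. The following is a minimal sketch, assuming a sparse matrix `A_c` and a load vector `g` (both hypothetical inputs produced by the compression scheme and the right-hand side \(g\)); the choice of GMRES with its default tolerances is ours and not prescribed by the theory.

```python
from scipy.sparse.linalg import gmres

# Sketch: solve the compressed system A_J^c u_J = g_J iteratively.
# `A_c` (sparse compressed matrix) and `g` (load vector) are hypothetical.
def solve_compressed(A_c, g):
    u, info = gmres(A_c, g)       # default tolerances, no preconditioner
    if info != 0:
        raise RuntimeError(f"GMRES did not converge (info={info})")
    return u
```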

Theorem 15

(Aubin-Nitsche) Let all the assumptions of Theorem 14 hold and moreover assume that \(\mathcal {A}': H^{t+q}(\varGamma ) \rightarrow H^{t-q}(\varGamma )\) is an isomorphism for any \(0 \le t \le d-q\). Then, we have the error estimate

$$\begin{aligned} \Vert u - u_J\Vert _{H^{q-t}(\varGamma )} \lesssim 2^{J(q-d-t)} \Vert u\Vert _{H^d(\varGamma )}, \end{aligned}$$

provided that \(u \in H^d(\varGamma )\) and \(-\widetilde{\gamma }<q<\gamma \).

9 Numerical computations

Fig. 6: Ordering of the matrix blocks for \(J = 3\). Note that the different colours correspond to the blocks with different maximal levels (color figure online)

In this section, we present numerical experiments to validate the theoretical findings. We use piecewise constant wavelets with three vanishing moments and consider the single layer operator on the unit square. We first compute the full wavelet Galerkin matrix

$$\begin{aligned} {\textbf {A}} _J = \big [ \big \langle \mathcal {A}\psi _{{\textbf {j}} ', {\textbf {k}} '}, \psi _{{\textbf {j}} , {\textbf {k}} } \big \rangle _\square \big ]_{({\textbf {j}} , {\textbf {k}} ), ({\textbf {j}} ', {\textbf {k}} ')}, \quad |{\textbf {j}} |_\infty , |{\textbf {j}} '|_\infty \le J, \ {\textbf {k}} \in \nabla _{{\textbf {j}} }, \ {\textbf {k}} ' \in \nabla _{{\textbf {j}} '}, \end{aligned}$$

with \(\mathcal {A}\) being the single layer operator. Then, the obsolete entries are removed according to the compression scheme from Sect. 5. The number of nonzero entries is calculated and plotted (blue line) on the left-hand side of Fig. 8. To this end, we have chosen the parameters \(\widetilde{d} = 3\), \(d' = 1.1\), and \(\kappa = 10^{-3}\), while for the bandwidth parameter \(a\) the different values \(a = 0.5, 1.0, 2.0\) have been considered. The system matrix for the anisotropic tensor product wavelet basis and its compressed counterpart with \(2^{2 \cdot 7} = 16\,384\) rows and columns and \(a=1.0\) can be found in Fig. 7. Note that the matrix blocks are lexicographically ordered as illustrated in Fig. 6.
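The block ordering can be reproduced in a few lines; the following sketch is a reconstruction from Fig. 6 and its caption (the grouping by maximal level is inferred, not taken from our implementation).

```python
# Sketch: level pairs j = (j1, j2) grouped by |j|_inf and ordered
# lexicographically within each group, as suggested by Fig. 6.
def block_order(J, j0=0):
    levels = [(j1, j2) for j1 in range(j0, J + 1) for j2 in range(j0, J + 1)]
    return sorted(levels, key=lambda j: (max(j), j))

print(block_order(2))
# [(0, 0), (0, 1), (1, 0), (1, 1), (0, 2), (1, 2), (2, 0), (2, 1), (2, 2)]
```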

Fig. 7: The wavelet Galerkin matrix (left) and its compressed version (right) for \(J = 7\). We have used the parameters \(a=1.0\), \(\widetilde{d} = 3\), \(d' = 1.1\), and \(\kappa = 10^{-3}\). The colour indicates the absolute values of the matrix entries in a logarithmic scale (color figure online)

From a theoretical point of view, the number of nontrivial entries in each block can be bounded by

$$\begin{aligned} \begin{aligned} N_{{\textbf {j}} , {\textbf {j}} '}&\lesssim 2^{|{\textbf {j}} |_1 + |{\textbf {j}} '|_1} \\&\ \ \cdot \Big ( \mathcal {B}_{{\textbf {j}} , {\textbf {j}} '}^2 + \min \big \{ 2^{- \min \{j_1, j_2, j_1', j_2'\}}, \, \mathcal {E}_{{\textbf {j}} , {\textbf {j}} '} \big \} \big (\mathcal {D}_{{\textbf {j}} ,{\textbf {j}} '}^{x_1}+\mathcal {D}_{{\textbf {j}} ,{\textbf {j}} '}^{x_2}\big ) + \mathcal {F}_{{\textbf {j}} , {\textbf {j}} '}^{x_1} \mathcal {F}_{{\textbf {j}} , {\textbf {j}} '}^{x_2} \Big ). \end{aligned} \end{aligned}$$
(44)

Moreover, in accordance with (37) and (38), an additional summand \(2^{\min \{|{\textbf {j}} |_1,|{\textbf {j}} '|_1\}}\) is added if \({\textbf {j}} \le {\textbf {j}} '\) or \({\textbf {j}} ' \le {\textbf {j}} \). The number of nontrivial entries in the whole compressed matrix can therefore also be bounded by

$$\begin{aligned} \mathcal {C}\lesssim \sum _{|{\textbf {j}} |_\infty \le J} \sum _{|{\textbf {j}} '|_\infty \le J} N_{{\textbf {j}} , {\textbf {j}} '}. \end{aligned}$$
(45)

This estimate is also shown (red line) in the plot on the left-hand side of Fig. 8.

On the right-hand side of Fig. 8, we have computed the consistency errors arising from the compression scheme. To this end, we have generated random vectors \({\textbf {u}} , {\textbf {v}} \) and scaled them such that they correspond to coefficient vectors of functions \(u,v \in H^1(\varGamma )\) with respect to the anisotropic tensor product wavelet basis. We have used 100 random samples to calculate the quantity \(\big | {\textbf {v}} ^\intercal ({\textbf {A}} _J - {\textbf {A}} _J^{c} ){\textbf {u}} \big |\), which is the discrete version of (43). We see that the calculations match the expected behaviour.
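A sketch of this sampling procedure is given below, assuming dense matrices `A` and `A_c` and a vector `levels` holding \(|{\textbf {j}} |_\infty \) per degree of freedom (all hypothetical inputs), together with the assumed norm equivalence \(\Vert u\Vert _{H^s(\varGamma )}^2 \sim \sum _{{\textbf {j}} ,{\textbf {k}} } 2^{2s|{\textbf {j}} |_\infty } |u_{{\textbf {j}} ,{\textbf {k}} }|^2\).

```python
import numpy as np

# Sketch of the consistency test: random coefficient vectors, rescaled to
# unit H^s norm via the assumed norm equivalence, probe |v^T (A - A_c) u|.
def consistency_error(A, A_c, levels, s=1.0, samples=100, seed=0):
    rng = np.random.default_rng(seed)
    w = 2.0 ** (s * np.asarray(levels, dtype=float))
    errs = []
    for _ in range(samples):
        u = rng.standard_normal(A.shape[0]) / w
        v = rng.standard_normal(A.shape[0]) / w
        u /= np.linalg.norm(w * u)        # ||u||_{H^s} = 1
        v /= np.linalg.norm(w * v)
        errs.append(abs(v @ (A - A_c) @ u))
    return max(errs)
```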

Fig. 8: Left: calculation (blue) and estimation (red) of the number of nonzero entries according to (44) and (45). Right: consistency error according to (43) for \(s = t = d\) (color figure online)

10 Conclusion

We have developed a matrix compression scheme for the boundary element method using anisotropic tensor product wavelets. In the end, we obtain a quasi-sparse matrix containing only \(\mathcal {O}(N)\) nontrivial entries, whilst the approximate solution converges to the exact solution at the rate of the discretization error. This applies to every integral operator of order \(2q<3\) on the unit square and a smooth boundary. On a Lipschitz geometry, however, the order of the integral operator is bounded by \(2|q| \le 2\), since the underlying Sobolev spaces \(H^s(\varGamma )\) are only defined for \(|s|\le 1\).

As in [24, 25], our compression scheme may be generalized to an arbitrary spatial dimension on the unit cube. One would have to choose the compression parameters according to the relative location of the two tensor product wavelets. Then, one should combine the first compression in all directions in which the two wavelets are in the far-field with a second compression in all directions in which the wavelets are in the near-field. In particular, our compression estimates considerably improve previous results for the compression of nonlocal operators in sparse tensor product spaces.

In contrast, it is not yet known whether the boundary integral operator is \(s^\star \)-compressible with respect to the anisotropic tensor product wavelet basis, as was established for the isotropic wavelet basis in [33]. With the \(s^\star \)-compressibility at hand, it was shown in [10, 15] that adaptive wavelet schemes can approximate the solution at the rate of the best \(N\)-term approximation with linear complexity.