$\mathcal{H}$-inverses for RBF interpolation

We consider the interpolation problem for a class of radial basis functions (RBFs) that includes the classical polyharmonic splines (PHS). We show that the inverse of the system matrix for this interpolation problem can be approximated at an exponential rate in the block rank in the $\mathcal{H}$-matrix format, if the block structure of the $\mathcal{H}$-matrix arises from a standard clustering algorithm.


Introduction
Radial basis functions (RBFs) have become an important tool in computational mathematics. Starting from the general question of interpolating scattered data, they found their way into statistics applications (see, e.g., [Wah90, GS94] or [BL88] for applications in machine learning) and, as a specific instance of meshfree methods, into the realm of numerical methods for partial differential equations [Isk04, Wen05].
The analysis of the approximation properties of a variety of RBFs is by now rather mature [Buh03, Isk04, Wen05]. The classical interpolation problem with RBFs leads to linear systems with fully populated system matrices, which brings their efficient solution to the fore. A basic question in this connection is that of an efficient representation of the system matrix. Many RBFs such as polyharmonic splines, multiquadrics, and Gaussians are "asymptotically smooth", so that the system matrix can be approximated very efficiently by blockwise low-rank matrices, [BN98, BPT07] and [ILBW17, LM17]. This observation opened the door for matrix-vector multiplication in log-linear complexity and a subsequent iterative solution. Nevertheless, good preconditioning strategies are necessary. Domain decomposition techniques [BLB00, LBW19a], possibly combined with multilevel techniques [Isk04, LBW20], are an option. A recent alternative is the use of the arithmetic that comes with rank-structured matrices. Here, we consider specifically H-matrices, [Hac15]. These are blockwise low-rank matrices endowed with a hierarchical structure that leads to an (approximate) arithmetic including addition, multiplication, inversion, and LU-factorization in logarithmic-linear complexity. This arithmetic can thus be used as a direct solver or be employed to create preconditioners. The arithmetic is only approximate, but the error can be controlled either by a priori chosen parameters [GH03] or adaptively [BR03, Gra05]. Yet, a basic question remains whether the target, i.e., the inverse of the system matrix or its LU-factorization, can be represented in the H-matrix format. This question has been answered for finite element discretizations of elliptic PDEs in [BH03], recently improved in [FMP15] (to arbitrary accuracy) and [AFM21] (locally refined meshes), as well as for boundary element discretizations, [FMP16, FMP17], and the coupling of finite elements and boundary elements, [FMP20].
It is the purpose of the present paper to provide such an approximation result in T.2.18 for RBFs that are fundamental solutions of certain partial differential operators with constant coefficients; in particular, the popular polyharmonic splines, introduced and analyzed in [Duc76], are covered by the present paper (see L.2.11). Consequently, see [Beb07, FMP15], one also obtains the existence of LU-decompositions in the H-matrix format, which gives mathematical underpinning to their observed good performance in black-box preconditioning, [LBW19b, LBW20].
The present paper is structured as follows: In Section 2 we formulate our model problem, the so-called interpolation problem, together with the corresponding native space (V, a(•, •)). We then reformulate the problem as a linear system of equations (LSE) and introduce the interpolation matrix. Moreover, some basic definitions concerning H-matrices are given and, finally, we state the main result, T.2.18. Section 3 is devoted to the proof of our main result: First, in Section 3.1, we investigate the native space V and introduce the homogeneous native space V_0 ⊆ V. In Section 3.2, we reformulate the original interpolation problem as an orthogonality condition in terms of a(•, •) and the space V_0. In Section 3.4, C.3.19, we derive a representation formula for the inverse system matrix to reduce the original "matrix approximation problem" to a "function approximation problem". The main step in the proof is then the construction of low-dimensional spaces from which solutions to the interpolation problem can be approximated at an exponential rate, Section 3.9. Finally, Section 4 provides some numerical examples.
Concerning notation: We write "a ≲ b" iff there exists a constant C > 0 such that a ≤ Cb. The constant might depend on the space dimension d, the orders k and k_min of the native space V, the coefficients σ_l of the bilinear form a(•, •), et cetera. However, it does not depend on critical parameters such as the problem size N. We write a ≂ b if both a ≲ b and a ≳ b are satisfied. Matrices and vectors in linear systems of equations are expressed in boldface letters, e.g., A ∈ R^{N×N} and f ∈ R^N. The only exception to this rule is the set C ⊆ R^N introduced in D.2.8, from which coefficient vectors c ∈ C are drawn. For all x ∈ R^d and r > 0, we write Ball(x, r) := {y ∈ R^d | ‖y − x‖_2 < r} for the Euclidean ball of radius r, centered at x. The diameter of a set Ω ⊆ R^d is denoted by diam_2(Ω) := sup_{x,y∈Ω} ‖x − y‖_2 and the distance between two sets Ω_1, Ω_2 ⊆ R^d is given by dist_2(Ω_1, Ω_2) := inf_{x_1∈Ω_1, x_2∈Ω_2} ‖x_2 − x_1‖_2. To treat derivatives in all space dimensions d ≥ 1 alike, we adopt the usual multi-index notation D^α := ∂_{x_1}^{α_1} ··· ∂_{x_d}^{α_d} and |α| := α_1 + ··· + α_d, where α = (α_1, …, α_d) ∈ N_0^d. We frequently drop the supplement "for all α ∈ N_0^d" and use shorthand phrases like "for all |α| = k" to describe the set {α ∈ N_0^d | |α| = k}. As for specific function spaces, we work with the following definitions (all function spaces are meant to be real-valued): Let d ≥ 1 and Ω ⊆ R^d be an open set. For all p ∈ N_0, we use the polynomial space P_p(Ω) := span{x^α | |α| ≤ p}, and we also write P_{−1}(Ω) := {0}. We write L^p(Ω), p ∈ [1, ∞], for the classical Lebesgue spaces. Moreover, a measurable function f : Ω → R belongs to the space L^p_loc(Ω) iff it satisfies f|_ω ∈ L^p(ω) for all bounded open sets ω ⊆ R^d with closure contained in Ω. For all k ∈ N_0 and p ∈ [1, ∞], the Sobolev space W^{k,p}(Ω) consists of all k-times weakly differentiable functions f ∈ L^1_loc(Ω) whose weak derivatives (D^α f)_{|α|≤k} lie in L^p(Ω). Similarly, a function f ∈ L^1_loc(Ω) belongs to the space W^{k,p}_loc(Ω) if it is k-times weakly differentiable and D^α f ∈ L^p_loc(Ω) for all |α| ≤ k. In the case p = 2, we write H^k(Ω) := W^{k,2}(Ω) and H^k_loc(Ω) := W^{k,2}_loc(Ω).
In other words, a function v ∈ L^1_loc(R^d) belongs to V if, for all l ∈ {k_min, …, k} and all |α| = l, the α-th weak derivative D^α v exists and lies in L^2(R^d).

Furthermore, for every open subset
Note that the assumption k > d/2 guarantees the Sobolev embedding H^k(Ω) ⊆ C^0(closure of Ω), so that ‖v‖_{C^0} ≲ ‖v‖_{H^k(Ω)} for all v ∈ H^k(Ω). We will make use of this fact on several occasions throughout this work.
For the basic properties of V, P, a(•, •) and |•|_a, we refer the reader to L.3.2 in the proof section below. In brief, a(•, •) defines a symmetric positive semi-definite bilinear form on V and |•|_a is a seminorm with kernel P. There holds V ⊆ C^0(R^d), so that every v ∈ V has well-defined point values v(x), x ∈ R^d. In the extreme case k_min = k, there holds V = BL^k(R^d), the well-known homogeneous Sobolev space or Beppo-Levi space (e.g., [DL95], [SSN98]). In the other extreme case k_min = 0, however, there holds V = H^k(R^d), the standard Sobolev space. The norms are equivalent and the polynomial space becomes trivial, P = {0}. In particular, (V, a(•, •)) is then a proper Hilbert space.
(2) Unisolvent subset: There exists a set of points {ξ_α | |α| < k_min} ⊆ R^d that is unisolvent for the space P. (3) Balance N ↔ h_min,N: There exist constants C, σ_card ≥ 1 such that the problem size N and the separation distance h_min,N are suitably balanced. During the proof of T.3.32, the assumption of boundedness will allow us to apply a Poincaré-type inequality on a fixed, bounded subset of R^d, which is independent of the problem size N. Furthermore, as a minor technical detail, it guarantees that h_min,N ≲ 1 for all N ≥ N_min.
The unisolvency assumption, on the other hand, allows us to make the following implication: If a polynomial p ∈ P satisfies p(ξ_α) = 0 for all |α| < k_min, then already p = 0. This argument will be used on numerous occasions. Note that all sets of interpolation points {x_1, …, x_N}, N ≥ N_min, must contain the same set of unisolvent points. This assumption is necessary to ensure that the aforementioned Poincaré constant in T.3.32 is independent of the problem size N. Clearly, if the family ({x_1, …, x_N})_{N≥N_min} is constructed by an algorithm that successively adds more points to some initial point set, never deleting or modifying existing ones, then this assumption is satisfied.
The asserted balance between the problem size N and the separation distance h_min,N is fulfilled for a wide variety of families of interpolation points. As an example, consider the case where a bounded domain Ω ⊆ R^d is given and where {x_1, …, x_N} ⊆ Ω for all N ≥ N_min. Denote by T_N := {T_1, …, T_N} the corresponding Voronoi decomposition of Ω, i.e., T_n = {x ∈ Ω | ‖x − x_n‖ = min_{m∈{1,…,N}} ‖x − x_m‖}, and by h_max,N := max_{n∈{1,…,N}} diam_2(T_n) the maximal cell diameter. Now suppose that h_max,N^{σ_card} ≤ C h_min,N, which is satisfied, e.g., for uniform and algebraically graded families of interpolation points. Then the asserted balance follows. Although D.2.2 is phrased in terms of a family of interpolation points, most of the upcoming results deal with a single set {x_1, …, x_N} ⊆ R^d of interpolation points. For the most part, one can therefore think of N as being "fixed" throughout this work.
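As a quick illustration of the balance between N and h_min,N, the following sketch computes the separation distance for uniform grids in the unit square, where N · h_min^d stays bounded above and below (i.e., σ_card = 1); the function names are illustrative, not taken from the paper.

```python
import numpy as np

def separation_distance(points):
    """Smallest pairwise distance h_min = min over m != n of ||x_n - x_m||_2."""
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    np.fill_diagonal(dists, np.inf)  # exclude the zero self-distances
    return dists.min()

# Uniform grid in [0,1]^2: N = M^2 points with spacing h_min = 1/(M-1),
# so the product N * h_min^d remains of order one as N grows.
for M in (4, 8, 16):
    x = np.linspace(0.0, 1.0, M)
    pts = np.stack(np.meshgrid(x, x), axis=-1).reshape(-1, 2)
    N, d = pts.shape
    h_min = separation_distance(pts)
    print(N, h_min, N * h_min**d)
```

For algebraically graded point families one would instead see N · h_min^{d·σ_card} bounded, with σ_card > 1 reflecting the grading strength.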
Definition 2.3. The evaluation operator E_N : C^0(R^d) → R^N maps a continuous function to its values at the interpolation points. Now that the native space V and the interpolation points x_1, …, x_N are fixed, let us pose the interpolation problem: Problem 2.4. Let f ∈ V. Find a function u ∈ V that satisfies the following interpolation and minimization properties: Clearly, the interpolation condition "E_N u = E_N f" can also be written as "∀n ∈ {1, …, N}: u(x_n) = f(x_n)". In other words, we are looking for a minimizer u of the seminorm |•|_a over the set of interpolants of f. While the interpolation conditions fix the values of u at the interpolation points x_1, …, x_N, the minimization property determines the behavior of u on the rest of R^d.
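In formulas, the two requirements of P.2.4 described above can be summarized as follows:

```latex
\textbf{Problem 2.4 (summary).} Given $f \in V$, find $u \in V$ such that
\begin{align*}
  E_N u &= E_N f, \quad\text{i.e., } u(x_n) = f(x_n) \text{ for all } n \in \{1,\dots,N\},\\
  |u|_a &\le |\tilde u|_a \quad\text{for all } \tilde u \in V \text{ with } E_N \tilde u = E_N f.
\end{align*}
```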
In Section 3.2 below we will see that, for every given f ∈ V, this interpolation problem has a unique solution u ∈ V. It turns out that the mapping f ↦ u is linear and that there holds the a priori bound |u|_a ≤ |f|_a, i.e., the problem is well-posed.
2.2. The fundamental solution. At first glance, for given f ∈ V, the solution u ∈ V of P.2.4 looks like an infinite-dimensional object. However, since the set of interpolants ũ ∈ V of f is so large, the minimization property contains a lot of information about u. In fact, we shall shortly see that u can be written as a finite linear combination of translates ϕ(• − x_n) plus a polynomial, where ϕ ∈ C^0(R^d) is a specific function that is intimately linked to the native space (V, a(•, •)) from D.2.1. More precisely, ϕ is a fundamental solution of the differential operator that is associated with the bilinear form a(•, •). This representation will then allow us to rephrase P.2.4 as a linear system of equations (LSE) for the coefficients c_n, d_α ∈ R.
Before we treat the general setting k_min ≥ 0, let us first have a look at the much simpler case k_min = 0. Recall that then V = H^k(R^d) with equivalent norms.
Lemma 2.5. If k_min = 0, then there exists a unique function ϕ ∈ V such that a(ϕ(• − x_0), v) = v(x_0) for all v ∈ V and all x_0 ∈ R^d. Proof. The continuous Sobolev embedding H^k(R^d) ⊆ C^0(R^d) shows that the point evaluation v ↦ v(0) is a continuous linear functional on V. Then, according to the Riesz-Fréchet Representation Theorem, there exists a unique function ϕ ∈ V such that a(ϕ, v) = v(0) for all v ∈ V. Since the coefficients σ_l l!/α! of a(•, •) are spatially constant, a simple integral transformation yields the desired formula for general x_0 ∈ R^d. Now, given data f ∈ V, we make the ansatz u := Σ_{n=1}^N c_n ϕ(• − x_n) for the solution of P.2.4. The coefficients c_n ∈ R can be chosen such that the interpolation conditions E_N u = E_N f are satisfied (cf. L.3.12). On the other hand, for all v ∈ V with E_N v = 0, the defining equation of ϕ tells us that a(u, v) = Σ_{n=1}^N c_n v(x_n) = 0. In L.3.7 further below, it will be argued in more detail that this orthogonality implies the required minimization property. We conclude that u = Σ_{n=1}^N c_n ϕ(• − x_n) is indeed the unique solution of P.2.4. Now let us return to the general case k_min ≥ 0. Since a(•, •) need not be strictly positive definite, the existence of a "basis function" ϕ is not so straightforward any more. Therefore, we have to take a different approach.
Definition 2.6. Denote by σ_{k_min}, …, σ_k ≥ 0 the coefficients of the bilinear form a(•, •) from D.2.1. We define the differential operator D^{2k} := Σ_{l=k_min}^{k} σ_l (−Δ)^l. The precise relationship between a(•, •) and D^{2k} is described in the subsequent lemma. Anticipating L.3.2, we will prove that every u ∈ V lies in H^k_loc(R^d), so that the integrals in a(u, v) are amenable to successive partial integrations over bounded subsets of R^d.
Lemma 2.7. For all u ∈ V and v ∈ C^∞_0(R^d), there holds the identity a(u, v) = ∫_{R^d} u D^{2k} v dx. Proof. The relation follows readily from successive partial integrations and the multinomial identity Σ_{|α|=l} (l!/α!) D^{2α} = Δ^l. The compact support of v and the fact that D^α u ∈ L^2_loc(R^d) for all |α| ≤ k guarantee that all integrals involved in the computation are well-defined.
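The computation behind L.2.7 can be sketched as follows, assuming (as in D.2.6) that D^{2k} = Σ_l σ_l (−Δ)^l and moving all derivatives onto the compactly supported test function v:

```latex
a(u,v)
 = \sum_{l=k_{\min}}^{k} \sigma_l \sum_{|\alpha|=l} \frac{l!}{\alpha!}
   \int_{\mathbb{R}^d} D^\alpha u \, D^\alpha v \,\mathrm{d}x
 = \sum_{l=k_{\min}}^{k} \sigma_l (-1)^l
   \int_{\mathbb{R}^d} u \sum_{|\alpha|=l} \frac{l!}{\alpha!} D^{2\alpha} v \,\mathrm{d}x
 = \int_{\mathbb{R}^d} u \, D^{2k} v \,\mathrm{d}x,
```

where each term is integrated by parts l times and the multinomial identity $\sum_{|\alpha|=l} (l!/\alpha!)\, D^{2\alpha} = \Delta^l$ collects the derivatives into $(-\Delta)^l v$.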
Definition 2.8. We define the set of coefficient vectors C ⊆ R^N. A careful analysis of the function ϕ from the case k_min = 0 above leads us to the following set of assumptions: Assumption 2.9. We assume that the coefficients σ_{k_min}, …, σ_k ≥ 0 from D.2.1 are such that there exists a function ϕ : R^d → R with the following properties: (1) Regularity: There holds ϕ ∈ C^0(R^d).
(2) Fundamental solution: ϕ is a fundamental solution of D^{2k}. (3) Conformity: For all c ∈ C, there holds Σ_{n=1}^N c_n ϕ(• − x_n) ∈ V. Note that we did not assume that ϕ itself lies in the native space V. Instead, we only require certain linear combinations of its translates to do so. Combining L.2.7 with (2) and (3), we regain the important orthogonality a(u, v) = Σ_{n=1}^N c_n v(x_n) for u = Σ_{n=1}^N c_n ϕ(• − x_n), c ∈ C, and v ∈ C^∞_0(R^d). Using a density argument, the orthogonality can then be extended to the space of functions v ∈ V with E_N v = 0. A detailed proof of these facts will be given later in L.3.10.
In contrast to the case k_min = 0, a function ϕ satisfying A.2.9 need not be unique if k_min > 0. In fact, if ϕ is a valid choice and if p ∈ P, then ϕ + p is valid as well.
According to the much-celebrated Malgrange-Ehrenpreis Theorem, [Mal56], [Ehr54], every non-trivial differential operator with constant coefficients has a fundamental solution in the distributional sense. Since D^{2k} = Σ_{l=k_min}^k σ_l (−Δ)^l falls into this category, this theorem provides us with a distribution ϕ ∈ (C^∞_0(R^d))' that satisfies assumption (2) in the distributional sense. However, to the best of our knowledge, there is no guarantee that this ϕ satisfies (1) and (3) if no further assumptions on the coefficients σ_l are made. L.2.5 is somewhat unsatisfactory, since it does not provide an explicit formula for the basis function ϕ. However, for a specific choice of the coefficients σ_0, …, σ_k, we have the following result, the proof of which we postpone to Section 3.11.
We mention that ϕ can also be written in a form that goes by the name of Matérn function, Sobolev spline, or Bessel potential in the literature (e.g., [AS61]).
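For orientation, the Matérn/Sobolev-spline form alluded to here is, up to a normalizing constant c_{k,d} and assuming the coefficients are chosen as σ_l = binom(k, l), i.e., D^{2k} = (1 − Δ)^k:

```latex
\varphi(x) \;=\; c_{k,d}\, \|x\|_2^{\,k-d/2}\, K_{k-d/2}\!\big(\|x\|_2\big),
\qquad x \in \mathbb{R}^d \setminus \{0\},
```

where $K_\nu$ denotes the modified Bessel function of the second kind. The specific choice of coefficients is an assumption made here for illustration; the paper's precise choice is stated in the postponed result.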
We finish this section with another example of a function ϕ satisfying A.2.9. This time, we look at the other extreme case, k_min = k. The proof will again be delayed to Section 3.11.
2.3. The LSE. We already suggested that the solution u ∈ V of P.2.4 can be written as a linear combination of translates of ϕ and a polynomial. Definition 2.12. Denote by x_1, …, x_N ∈ R^d the interpolation points from D.2.2 and by ϕ ∈ C^0(R^d) the fundamental solution from A.2.9. For every n ∈ {1, …, N}, we define the translate ϕ_n := ϕ(• − x_n). Furthermore, let {π_α | |α| < k_min} ⊆ P be the Lagrange basis associated with the unisolvent point set. Definition 2.13. We define the matrices A ∈ R^{N×N} and B ∈ R^{N_min×N}. Furthermore, we define the interpolation matrix ( A B^T ; B 0 ). As will be shown later in L.3.11, the interpolation matrix is invertible. Now, let f ∈ V and set f := E_N f ∈ R^N. Denote by c ∈ R^N and d ∈ R^{N_min} the unique solution of the following LSE: Then, the unique solution u ∈ V of P.2.4 can be written in the form u = Σ_{n=1}^N c_n ϕ_n + Σ_{|α|<k_min} d_α π_α. Once again, we postpone the derivation of this identity to the proof section below, L.3.12. The first row of the LSE encodes the interpolation conditions E_N u = E_N f, whereas the second row guarantees that the coefficient vector c ∈ R^N lies in the set C from D.2.8. Owing to A.2.9, this implies the conformity Σ_{n=1}^N c_n ϕ_n ∈ V. 2.4. Hierarchical matrices. An extensive discussion of hierarchical matrices can be found in the books [Hac15, Beb08, Bör10]. Here, we only provide the bare minimum of definitions that are needed for the subsequent analysis.
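As a concrete illustration (a sketch, not the paper's general setting), the following assembles and solves the saddle-point LSE for the thin-plate spline ϕ(r) = r² log r in d = 2, with the polynomial part P spanned by {1, x, y}; the block system ( A B^T ; B 0 ) and the side condition Bc = 0 mirror D.2.13. All function names are illustrative.

```python
import numpy as np

def tps(r):
    """Thin-plate spline phi(r) = r^2 log r (a polyharmonic spline in 2-D)."""
    with np.errstate(divide="ignore", invalid="ignore"):
        out = r**2 * np.log(r)
    return np.where(r > 0, out, 0.0)  # phi(0) = 0 by continuity

def interpolate(points, f_vals):
    """Solve [[A, B^T], [B, 0]] [c; d] = [f; 0] for the coefficients."""
    N = len(points)
    r = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    A = tps(r)                                  # A_mn = phi(||x_m - x_n||)
    B = np.hstack([np.ones((N, 1)), points]).T  # rows evaluate 1, x, y at the points
    M = np.block([[A, B.T], [B, np.zeros((3, 3))]])
    sol = np.linalg.solve(M, np.concatenate([f_vals, np.zeros(3)]))
    return sol[:N], sol[N:]                     # c (translates), d (polynomial)

def evaluate(x, points, c, d):
    r = np.linalg.norm(points - x, axis=-1)
    return tps(r) @ c + d[0] + d[1:] @ x

rng = np.random.default_rng(0)
pts = rng.random((30, 2))
f = np.sin(3 * pts[:, 0]) + pts[:, 1] ** 2
c, d = interpolate(pts, f)
# the interpolation conditions E_N u = E_N f hold at the data sites
print(max(abs(evaluate(p, pts, c, d) - fv) for p, fv in zip(pts, f)))
```

Note that the second block row enforces Σ_n c_n = 0 and Σ_n c_n x_n = 0, i.e., c lies in the set of admissible coefficient vectors, which is what makes the translate part conforming.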
(2) There holds P = P_small ∪ P_adm, where every (I, J) ∈ P_small is small and every (I, J) ∈ P_adm is admissible.
(3) There exists a constant C > 0 bounding the size of the partition P. While items (1) and (2) are more or less standard in the literature on hierarchical matrices, item (3) might seem odd to the informed reader. Usually, this inequality is proved, rather than assumed (e.g., [Hac15, Lemma 6.5.8]). However, its proof typically requires a rigorous introduction of (hierarchical) cluster trees and (hierarchical) block cluster trees, two types of data structures that are described in great detail in [Hac15, Chapter 5]. Here, we largely avoid this tedious task and only give a brief overview: In essence, a (hierarchical) cluster tree T_N is a system of index clusters I ⊆ {1, …, N} that contains the full index set {1, …, N} as its root. Furthermore, for every I ∈ T_N with #I > σ_small, the cluster tree also contains two non-empty, disjoint clusters I_1, I_2 with I = I_1 ∪ I_2. Starting at the tree root, these sons are typically generated by a predefined clustering strategy which takes into account the geometric positions of the interpolation points x_1, …, x_N ∈ R^d. An example of a geometrically balanced clustering strategy, which uses a hierarchy of axes-parallel boxes B_I ⊆ R^d, can be found in [GHLB04]. Finally, we denote by depth(T_N) ∈ N the tree's depth, and assume that the reader is familiar with this notion.
On the other hand, a (hierarchical) block cluster tree T_{N×N} consists of tuples (I, J), where I and J are clusters from a given cluster tree T_N. This time, the tuple ({1, …, N}, {1, …, N}) is the tree's root and, for every (I, J) ∈ T_{N×N} with diam_2(B_I) > σ_adm dist_2(B_I, B_J), all four pairs of sons (I_1, J_1), (I_1, J_2), (I_2, J_1), (I_2, J_2) lie in T_{N×N}. Here, the sets B_I, B_J ⊆ R^d are the boxes that were used to build the cluster tree T_N. Finally, the well-known sparsity constant C_sparse(T_{N×N}) ∈ N counts the maximum number of "partners" (I, J) ∈ T_{N×N} that each cluster I or J can have (see, e.g., [Hac15, Section 6.3]).
The geometrically balanced clustering strategy from [GHLB04] guarantees the bounds depth(T_N) ≲ ln(h_min^{-1}) and C_sparse(T_{N×N}) ≲ 1. We plug in the relation 1 ≲ N^{σ_card} h_min^d from D.2.2 and get depth(T_N) ≲ ln(N). Now, using [Hac15, Lemma 6.5.8], we obtain the desired bound on the number of blocks. These considerations justify the assumptions that we made in D.2.16.
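The clustering and admissibility machinery described above can be sketched as follows: clusters are split by geometric bisection of bounding boxes, and a block (I, J) is stored as a low-rank block once diam ≤ σ_adm · dist. The parameter values σ_small = 8 and σ_adm = 2 are illustrative assumptions, not taken from [GHLB04].

```python
import numpy as np

SIGMA_SMALL, SIGMA_ADM = 8, 2.0  # leaf-size and admissibility parameters (assumed values)

def _bisect(points, I):
    """Split cluster I along the longest edge of its bounding box."""
    coords = points[I]
    axis = np.ptp(coords, axis=0).argmax()
    mid = 0.5 * (coords[:, axis].min() + coords[:, axis].max())
    left, right = I[coords[:, axis] <= mid], I[coords[:, axis] > mid]
    return [K for K in (left, right) if len(K)]

def build_block_partition(points, I, J, partition):
    """Recursively split (I, J) until each block is admissible or small."""
    bi, bj = points[I], points[J]
    diam = max(np.linalg.norm(np.ptp(bi, axis=0)), np.linalg.norm(np.ptp(bj, axis=0)))
    gap = np.maximum(0.0, np.maximum(bi.min(0) - bj.max(0), bj.min(0) - bi.max(0)))
    dist = np.linalg.norm(gap)              # distance between the bounding boxes
    if diam <= SIGMA_ADM * dist:            # admissible: eligible for low-rank storage
        partition.append((I, J, "adm"))
    elif min(len(I), len(J)) <= SIGMA_SMALL:  # small: stored as a dense block
        partition.append((I, J, "small"))
    else:                                   # split both clusters and recurse
        for K in _bisect(points, I):
            for L in _bisect(points, J):
                build_block_partition(points, K, L, partition)

rng = np.random.default_rng(1)
pts = rng.random((200, 2))
blocks = []
I0 = np.arange(len(pts))
build_block_partition(pts, I0, I0, blocks)
covered = sum(len(I) * len(J) for I, J, _ in blocks)
print(len(blocks), covered == len(pts) ** 2)  # every matrix entry lies in exactly one block
```

This mirrors the recursion in the block cluster tree: inadmissible non-small blocks are refined into the four pairs of sons, so the leaves form a partition of the full index set product.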
Definition 2.17. Let P be a sparse hierarchical block partition and r ∈ N a given block rank bound. We define the set of H-matrices H(P, r). We mention that, according to [Hac15, Lemma 6.3.6], the memory requirements to store an H-matrix M ∈ H(P, r) can be bounded by C_sparse(T_{N×N})(σ_small + r) depth(T_N) N. Inserting the relations C_sparse(T_{N×N}) ≲ 1 and depth(T_N) ≲ ln(N) from before, we get an overall bound of O(r ln(N) N) for the storage complexity.
2.5. The main result. We are finally in a position to formulate our main result.
Theorem 2.18. Let a(•, •) be the bilinear form from D.2.1 and consider a set of interpolation points {x_1, …, x_N} ⊆ R^d satisfying D.2.2. Denote by ϕ ∈ C^0(R^d) the fundamental solution from A.2.9. Furthermore, let ( A B^T ; B 0 ) be the interpolation matrix from D.2.13. Finally, let P be a sparse hierarchical block partition as in D.2.16. Write the inverse of the interpolation matrix in the block form ( S_11 S_12 ; S_21 S_22 ). Then, there exists a constant σ_exp > 0 such that the following holds true: For every block rank bound r ∈ N, there exists an H-matrix M ∈ H(P, r) that approximates S_11 at an exponential rate in r.
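The mechanism behind such results can be observed numerically: for an admissible block of the system matrix (rows and columns indexed by two well-separated clusters), the singular values decay exponentially, so the best rank-r approximation error decays exponentially in r. A sketch with the kernel ϕ(r) = r² log r on one-dimensional clusters (an illustrative choice, not the theorem's general setting):

```python
import numpy as np

def tps(r):
    """phi(r) = r^2 log r, an asymptotically smooth kernel away from r = 0."""
    with np.errstate(divide="ignore", invalid="ignore"):
        out = r**2 * np.log(r)
    return np.where(r > 0, out, 0.0)

rng = np.random.default_rng(2)
x = rng.random(100)          # cluster of points in [0, 1]
y = rng.random(100) + 3.0    # cluster of points in [3, 4], well separated
block = tps(np.abs(x[:, None] - y[None, :]))  # admissible off-diagonal block

s = np.linalg.svd(block, compute_uv=False)
rel = s / s[0]               # relative singular values = best rank-r errors
for r in (1, 5, 10, 20):
    print(r, f"{rel[r]:.1e}")  # roughly exponential decay in r
```

By the Eckart-Young theorem, rel[r] equals the relative spectral-norm error of the best rank-r approximation of the block, so the printed decay is exactly the approximability that the H-matrix format exploits.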

Proof of main results
3.1. The native space V. We start the proof of the main result with a discussion of the native space V from D.2.1. Given a function v ∈ V and a multi-index α ∈ N_0^d with |α| ∈ {k_min, …, k}, we know that there exists an α-th weak derivative D^α v ∈ L^2(R^d). Naively, the definition of V does not tell us anything about the existence and regularity of lower-order derivatives D^α v ∈ L^1_loc(R^d), where |α| < k_min. However, as shown below, it turns out that these lower-order derivatives do exist and lie in L^2_loc(R^d). Our proof of this fact uses a Poincaré-type inequality on a ball Ω ⊆ R^d. The following, slightly more general result will cover all occurrences of Poincaré-type inequalities in the present work.
Lemma 3.1. Let Ω ⊆ R^d be a bounded Lipschitz domain. Let k ∈ N_0 and let Z be a normed vector space. Furthermore, let ι_Z : H^k(Ω) → Z be a linear continuous operator satisfying the implication (ι_Z p = 0 ⇒ p = 0) for every p ∈ P_{k−1}(Ω). Then, there holds a Poincaré-type inequality. Proof. This can be proved by contradiction using arguments similar to [Eva10, Section 5.8.1]. Now let us return to the question of existence of the lower-order derivatives D^α v, |α| < k_min, for given v ∈ V. Apart from a Poincaré inequality, the subsequent proof sports a standard mollification argument with C^∞_0(R^d)-functions, as can be found in any good textbook treating Sobolev spaces (e.g., [GT01], [Eva10]). In brief, the argument works with a standard mollifier. (1) The function a(•, •) defines a symmetric, positive semi-definite bilinear form on V and |•|_a defines a seminorm. (2) There holds P ⊆ V.
(3) Let u, v ∈ V such that u ∈ P or v ∈ P. Then, a(u, v) = 0. (4) For all v ∈ V, there holds |v|_a = 0 if and only if v ∈ P.
(5) For all u, v ∈ V, there holds the Cauchy-Schwarz inequality |a(u, v)| ≤ |u|_a |v|_a. (6) There hold the following inclusions (as sets): In particular, every v ∈ V is k-times weakly differentiable and there holds D^α v ∈ L^2_loc(R^d) for all |α| < k_min, as well as D^α v ∈ L^2(R^d) for all |α| ∈ {k_min, …, k}. (7) For all v ∈ V, there hold corresponding norm bounds. The proof of item (6) uses the Poincaré-type inequality from L.3.1 and a well-known Sobolev embedding. Item (7) follows from the well-known characterization of Sobolev spaces by Fourier transforms (e.g., [Eva10, Section 5.8.4]). Finally, let us sketch the proof of density, using the sequence of mollifiers from above and a sequence of bounded Lipschitz domains. The next lemma establishes the fact that C^∞_0(R^d) ⊆ V is dense. We remind the reader of D.2.8, where the set C was introduced. Let κ ∈ C^∞_0(R^d) be a cut-off function with supp(κ) ⊆ B_2 and κ ≡ 1 on B_1, where B_n := Ball(0, n) ⊆ R^d denotes the ball with integer radius n ∈ N, centered at the origin. Clearly, the scaled version κ_n := κ(•/n) satisfies ‖D^β κ_n‖_{L^∞} ≲ n^{−l} for all |β| = l and all l ∈ N_0. Next, denote by Â := B_2 \ B_1 the "reference annulus" and let Π : H^{k_min}(Â) → P_{k_min−1}(Â) be the corresponding orthogonal projection. Let F_n : Â → B_2n \ B_n, F_n(x) := nx, and consider the corresponding polynomial. Since we are in the case d ≥ 2 at the moment, the annulus Â is connected and we can apply the Poincaré-type inequality from L.3.1 to the bounded Lipschitz domain Ω = Â, the normed vector space Z = H^{k_min}(Â) and the continuous mapping ι_Z = Π : H^{k_min}(Â) → H^{k_min}(Â). Then, using a standard scaling argument, we obtain the corresponding bounds. Now, for every n ∈ N, we define the approximation v_n. The convergence of the last term follows from the preceding estimates. Clearly, there exists an index n_0 ∈ N such that {x_1, …, x_N} ⊆ B_n for all n ≥ n_0. Since κ_n ≡ 1 on B_n, we get the identity E_N v_n = E_N v. This settles the question of density in the case d ≥ 2.
Now we turn our attention to the case d = 1: Since the reference annulus Â = (−2, −1) ∪ (1, 2) is not connected, we do not have the Poincaré inequality at our disposal. Instead, we show the required inclusions directly. To see the latter, let v ∈ V ∩ C^∞(R) and consider Taylor-type approximants v_n, meaning that v_n lies in the supposedly dense subset. To bound the error |v − v_n|_a, we apply Leibniz' differentiation rule to the product κ_n v^{(k_min)} and exploit the support properties of κ_n. Skipping the steps that are similar to the multivariate case, we obtain the corresponding bound. On the other hand, since κ_n|_{B_n} ≡ 1, Taylor's Theorem tells us that v_n(x) = v(x) for all x ∈ B_n. Consequently, for sufficiently large n ∈ N, there must hold E_N v_n = E_N v. Clearly, there must hold v ∈ P_{k_min−1}((−∞, −R)) and v ∈ P_{k_min−1}((R, ∞)). In particular, for all integers n ≥ R and all j ∈ N_0, we can bound the corresponding derivatives. We then define the approximants v_n := κ_n v ∈ C^∞_0(R). Using Leibniz' product rule and the stability of the cut-off in L^2(R \ B_n), this finishes the proof.
Unless k_min = 0, it is now clear that (V, a(•, •)) is not an inner product space, let alone a Hilbert space. However, using the evaluation operator E_N, it is not difficult to find a useful subspace of V on which a(•, •) is strictly positive definite: Definition 3.4. We define the homogeneous native space V_0 := {v ∈ V | E_N v = 0}. Recall that "E_N v = 0" is equivalent to "∀n ∈ {1, …, N}: v(x_n) = 0". In particular, the space V_0 depends on the number N of interpolation points x_1, …, x_N, as well as on their positions.
Proof. The identity V_0 ∩ P = {0} follows from the unisolvency assumption in D.2.2.
Problem 3.6. Let f ∈ V. Find a function u ∈ V that satisfies the following interpolation and orthogonality conditions: E_N u = E_N f and a(u, v) = 0 for all v ∈ V_0. Using standard variational techniques, it is not difficult to show that P.2.4 and P.3.6 are indeed equivalent: Lemma 3.7. Let f ∈ V. A function u ∈ V solves P.2.4 if and only if it solves P.3.6.
The orthogonality relation in P.3.6 is more suitable for our upcoming error analysis, and we will not have to deal with the original minimization property in P.2.4 any further. The next lemma answers the question of solvability and uniqueness of solutions (effectively for both problems): Lemma 3.8. For every f ∈ V, P.3.6 has a unique solution u ∈ V. The mapping f ↦ u is linear and there holds the a priori bound |u|_a ≤ |f|_a.
Proof. According to the Riesz-Fréchet Representation Theorem, there exists a unique element with the asserted properties. 3.3. Properties of the LSE. In this section, we will show that the LSE from Section 2.3 is indeed regular. Furthermore, we prove the asserted representation u = Σ_{n=1}^N c_n ϕ_n + Σ_{|α|<k_min} d_α π_α of the solution u ∈ V of P.2.4 (or, equivalently, P.3.6). Definition 3.9. Denote by {ϕ_1, …, ϕ_N} ⊆ C^0(R^d) and {π_α | |α| < k_min} ⊆ P the ansatz functions from D.2.12. We define the corresponding coordinate mappings Φ and Π. Recalling from L.3.2 that P ⊆ V, it is clear that the operator Π always maps into the native space V, regardless of the input vector d ∈ R^{N_min}. As for the operator Φ, on the other hand, we remind the reader of A.2.9: The fundamental solution ϕ ∈ C^0(R^d) is not necessarily an element of V, so we cannot guarantee Φc ∈ V for all inputs c ∈ R^N. However, for coefficient vectors c from the subset C ⊆ R^N (cf. D.2.8), the conformity of Φc is provided by A.2.9.
Lemma 3.10. The operator Π : R^{N_min} → P is bijective. For all c ∈ C, there holds Φc ∈ V with |Φc|_a ≳ h_min^{k−d/2} ‖c‖_2. Finally, the operators Φ and E_N (cf. D.2.3) are transposed in the following sense: a(Φc, v) = ⟨c, E_N v⟩_2 for all c ∈ C and v ∈ V. Proof. The bijectivity of Π follows from the fact that {π_α | |α| < k_min} ⊆ P is a basis. Next, let us prove the transposition formula: For every c ∈ C, we know from A.2.9 that Φc ∈ V and that the formula holds for all v ∈ C^∞_0(R^d). To extend this identity from C^∞_0(R^d) to all of V, we use a simple density argument: Let c ∈ C and v ∈ V. We know from L.3.3 that there exists an approximating sequence in C^∞_0(R^d). Finally, let c ∈ C and denote by Λ : R^N → V the operator from D.3.16 further below. Anticipating the results from L.3.17, we know that E_N Λc = c and that |Λc|_a ≲ h_min^{d/2−k} ‖c‖_2. Then, using the transposition formula from above and the Cauchy-Schwarz inequality from L.3.2, we obtain the bound ‖c‖_2^2 = ⟨c, E_N Λc⟩_2 = a(Φc, Λc) ≤ |Φc|_a |Λc|_a ≲ |Φc|_a h_min^{d/2−k} ‖c‖_2. We divide both sides by h_min^{d/2−k} ‖c‖_2 and get the desired inequality.
In the case k_min = 0, the transposition formula in L.3.10 reduces to a(ϕ_n, v) = v(x_n) for all n ∈ {1, …, N} and all v ∈ V. This identity is reminiscent of L.2.5, where the identity a(ϕ, v) = v(0), v ∈ V, was used to define the basis function ϕ. Furthermore, let it be said that the transposition formula must be handled with care if k_min > 0: The left-hand side reads a(Φc, v) = a(Σ_{n=1}^N c_n ϕ_n, v), with the sum inside of a(•, •). This is a perfectly valid expression, because A.2.9 guarantees that Σ_{n=1}^N c_n ϕ_n lies in V whenever c ∈ C. However, we cannot pull the sum out of a(•, •). In fact, the expression Σ_{n=1}^N c_n a(ϕ_n, v) is not well-defined, since ϕ_n need not lie in V and may not be plugged into a(•, •) on its own. Next, let us have a look at the properties of the matrices A ∈ R^{N×N}, B ∈ R^{N_min×N} and ( A B^T ; B 0 ) ∈ R^{(N+N_min)×(N+N_min)} from D.2.13. Using the evaluation operator E_N : C^0(R^d) → R^N from D.2.3 once again, we see that A can be written in the form A = (E_N ϕ_1 | … | E_N ϕ_N) and that B^T can be written as B^T = (E_N π_1 | … | E_N π_{N_min}), if a linear ordering of the polynomial basis is assumed. The next lemma uses these representations.
(1) For all d ∈ R^{N_min}, there holds the identity B^T d = E_N Πd. In particular, B^T is injective and B is surjective.
The kernel of B is given by ker(B) = C.
(2) For all c ∈ R^N, there holds the identity Ac = E_N Φc. Proof. For all d ∈ R^{N_min}, we have B^T d = (E_N π_1 | … | E_N π_{N_min}) d = E_N Πd. In particular, using the bijectivity of Π and L.3.5, we find that ker(B^T) = ker(E_N ∘ Π) = V_0 ∩ P = {0}. This proves that B^T is injective and that B is surjective. Next, we have Ac = (E_N ϕ_1 | … | E_N ϕ_N) c = E_N Φc. At this point we have all the ingredients to show that the solution u ∈ V of P.3.6 can be represented by means of the ansatz functions ϕ_n and π_α from D.2.12. Lemma 3.12. Let f ∈ V and set f := E_N f ∈ R^N. Denote by c ∈ R^N and d ∈ R^{N_min} the unique solution of the following LSE: Then, the function u := Φc + Πd coincides with the unique solution of P.3.6.
Proof. The second row of the LSE tells us that Bc = 0, so that c ∈ ker(B) = C by L.3.11. Owing to L.3.2 and L.3.10, the function u := Φc + Πd then lies in V. Using L.3.11 again, the first row of the LSE yields E_N u = E_N Φc + E_N Πd = Ac + B^T d = f. Furthermore, for all v ∈ V_0, there holds the following orthogonality: a(u, v) = a(Φc + Πd, v) = ⟨c, 0⟩_2 = 0. Therefore, the function u is a solution of P.3.6 with respect to the data f. By uniqueness, u is the solution. This concludes the proof.
3.4. The representation formula for the inverse matrix. Earlier, in L.3.11, we developed the fact that the interpolation matrix ( A B^T ; B 0 ) from D.2.13 is invertible. Analogous to T.2.18, let us write its inverse in the form ( S_11 S_12 ; S_21 S_22 ) with matrices S_11 ∈ R^{N×N}, S_21 ∈ R^{N_min×N}, S_12 ∈ R^{N×N_min} and S_22 ∈ R^{N_min×N_min}. In this section, we develop a representation formula for the main block S_11. Recall from L.3.8 that the mapping f ↦ u of data f ∈ V to the solution u ∈ V of P.3.6 is a linear operator. Clearly, this abstract operator must play a prominent role in the sought representation formula. Definition 3.13. For every f ∈ V, denote by S_N f ∈ V the unique solution of P.3.6. The linear operator S_N : V → V is called the solution operator.
The basic properties of the dual functions λ m are summarized in the next lemma. Due to its simplicity, the proof is omitted. We remind the reader of D.2.14, where the bubbles Ω m ⊆ R d were introduced.
Recall from L.3.2 that V is not necessarily a Hilbert space, so that Λ T cannot be the proper Hilbert space transpose of Λ.However, as discussed below, the defining equation for transposed operators is still satisfied.
In the subsequent lemma, we use supp(f ) := {n ∈ {1, . . ., N } | f n ≠ 0} to denote the support of a vector f ∈ R N . Remember that Ω I = ∪ n∈I Ω n ⊆ R d denotes a union of bubbles (cf. D.2.14). Furthermore, we make use of the following lemma.

Lemma 3.17. The operators Λ and Λ T are transposed in the following sense: a(Λf , v) = ⟨f , Λ T v⟩ 2 for all f ∈ R N and v ∈ V . Both Λ and Λ T preserve locality: For all f ∈ R N , v ∈ V and I ⊆ {1, . . ., N }, there holds supp(Λf ) ⊆ Ω supp(f ) and ‖Λ T v‖ l 2 (I) ≲ |v| a,Ω I . Finally, for all f ∈ R N , there holds E N Λf = f . In particular, we have v − ΛE N v ∈ V 0 for all v ∈ V .
Proof. Let v ∈ V and f ∈ R N . The transposition formula is straightforward. Next, in order to determine the support of Λf , we remember from L.3.15 that supp(λ n ) ⊆ Ω n . Abbreviating I := supp(f ), we find that supp(Λf ) ⊆ Ω I . In order to see the upper bound for |Λf | a , we argue that the disjointness of the bubbles Ω n implies the orthogonality a(λ n , λ m ) = Σ l,α (σ l l!/α!) ⟨D α λ n , D α λ m ⟩ L 2 (Ωn ∩Ωm ) = 0 for all m ≠ n. Therefore, with L.3.15, we obtain the asserted bound. Next, consider an arbitrary index set I ⊆ {1, . . ., N }. To show the upper bound for ‖Λ T v‖ l 2 (I) , we use a localized version of the Cauchy–Schwarz inequality from L.3.2 and exploit the disjointness of the bubbles once again. Finally, for all f ∈ R N , the Kronecker-δ property from L.3.15 gives us the identity E N Λf = f .

In L.3.12 we demonstrated how to derive the solution u ∈ V of P.3.6 from the solution (c, d) ∈ R N × R Nmin of the LSE in Section 2.3. In the next lemma, we go in the opposite direction, i.e., we construct the coefficient vectors c and d from a given solution u. Once we have this result, the representation formula for the inverse interpolation matrix is merely a byproduct.
Lemma 3.18. Let f ∈ V and denote by u ∈ V the unique solution of P.3.6. Set c := Λ T u ∈ R N and d := Π −1 (id − ΦΛ T )u ∈ R Nmin . Then, the pair (c, d) solves the LSE from L.3.12 with right-hand side f := E N f .

Proof. First, let us show that indeed c ∈ C: For every q ∈ P , there holds q − ΛE N q ∈ V 0 (cf. L.3.17) and thus ⟨c, E N q⟩ 2 = ⟨Λ T u, E N q⟩ 2 = a(u, ΛE N q) = a(u, q) − a(u, q − ΛE N q) = 0, since q ∈ P and since u solves P.3.6. Due to L.3.2, this implies p := (id − ΦΛ T )u ∈ P , so that d = Π −1 p is well-defined. As for the first row of the LSE, we compute Ac + B T d = E N (Φc + Πd) = E N u = f . Finally, using L.3.11 again, we have c ∈ C = ker(B). In other words, Bc = 0, which is precisely the second row of the LSE.
We close this section with the promised representation formula. The formula establishes a relationship between the interpolation matrix ( A B T B 0 ) from D.2.13, the coordinate mappings Φ : R N −→ C 0 (R d ) and Π : R Nmin −→ P from D.3.9, the solution operator S N : V −→ V from D.3.13 and the operators Λ : R N −→ V and Λ T : V −→ R N from D.3.16.
Corollary 3.19. For all f ∈ R N , there holds the identity S 11 f = Λ T S N Λf . In particular, the main block S 11 ∈ R N ×N of the inverse interpolation matrix satisfies S 11 = Λ T S N Λ as a linear mapping.

Proof. Let f ∈ R N and set f := Λf ∈ V . According to D.3.13, the function u := S N f ∈ V is the unique solution of P.3.6. Now L.3.18 tells us that the coefficient vectors c := Λ T u = Λ T S N Λf ∈ C and d := Π −1 (id − ΦΛ T )u = Π −1 (id − ΦΛ T )S N Λf ∈ R Nmin solve the LSE from L.3.12, where the right-hand side is given by f := E N f ∈ R N . However, we know from L.3.17 that f = E N f = E N Λf = f . Hence c = S 11 f . This proves the asserted identity.

3.5. Reduction from matrix-level to function-level. In our main result, T.2.18, we claimed that the main diagonal block S 11 of the inverse interpolation matrix can be approximated well by H-matrices. The goal of the subsequent lemma is to reduce this "matrix-level" question to an analogous question on the "function-level". The majority of the work has already been done in achieving C.3.19, where the action of the matrix S 11 on any given vector f ∈ R N was characterized by the abstract solution operator S N : V −→ V from D.3.13. Furthermore, thanks to the asserted inequality (cf. D.2.16), it suffices to derive an error bound for each matrix block S 11 | I×J , where (I, J) is any given cluster tuple from the sparse hierarchical block partition P.
Lemma 3.20. Let S 11 ∈ R N ×N be the main diagonal block of the inverse interpolation matrix from D.2.13. Let I, J ⊆ {1, . . ., N } and let W ⊆ V be a finite-dimensional subspace. Then, there exist an integer r ≤ dim W and matrices X ∈ R I×r and Y ∈ R J×r such that there holds the following error bound.

Proof. Follows from L.3.17 and C.3.19. We omit a detailed proof and refer the reader to [AFM21, Lemma 3.13], where a similar bound was derived in terms of the L 2 -norm.
L.3.20 can be interpreted as follows: Given boxes B, D ⊆ R d (cf. D.2.15), a free parameter L ∈ N and a function f ∈ V with supp(f ) ⊆ D, can we construct a subspace V B,D,L ⊆ V such that the dimension dim(V B,D,L ) and the best-approximation error inf w∈V B,D,L |S N f − w| a,B are both "small" at the same time? More precisely, is it possible to build V B,D,L in a way such that dim(V B,D,L ) ≲ L c 1 and inf w∈V B,D,L |S N f − w| a,B ≲ 2 −c 2 L |f | a for some constants c 1 , c 2 > 0?
Over the course of the subsequent sections, we will show that the answer to this question is affirmative (up to a factor L k /h k min in the error bound), if the domains B, D satisfy the admissibility condition diam(B) ≤ σ adm dist(B, D) from D.2.16.
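The admissibility condition diam(B) ≤ σ adm dist(B, D) can be checked with elementary box geometry. A minimal Python sketch (helper names and the value σ adm = 2 are our illustrative choices, not taken from the paper):

```python
import numpy as np

def diam(lo, hi):
    # Euclidean diameter of an axis-parallel box [lo, hi]
    return float(np.linalg.norm(np.asarray(hi) - np.asarray(lo)))

def box_dist(loB, hiB, loD, hiD):
    # Euclidean distance between two axis-parallel boxes: componentwise gap,
    # clipped at zero where the intervals overlap
    gap = np.maximum(np.asarray(loD) - np.asarray(hiB),
                     np.asarray(loB) - np.asarray(hiD))
    return float(np.linalg.norm(np.maximum(gap, 0.0)))

def admissible(boxB, boxD, sigma_adm=2.0):
    return diam(*boxB) <= sigma_adm * box_dist(boxB[0], boxB[1], boxD[0], boxD[1])

B = ([0.0, 0.0], [1.0, 1.0])
D = ([3.0, 0.0], [4.0, 1.0])
print(admissible(B, D))                          # well-separated: admissible
print(admissible(B, ([1.0, 0.0], [2.0, 1.0])))   # touching boxes: dist = 0, not admissible
```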
Let us quickly give an overview of the construction of the space V B,D,L : (1) Due to the admissibility condition, we can safely "inflate" the box B in L increments of identical size before it touches the box D. This generates a family of L concentric boxes between B and D, where δ > 0 is a parameter proportional to L −1 and where σ sco > 0 is a certain constant. (2) Starting at the outermost layer B L , we set u L := S N f ∈ V . Exploiting the facts supp(f ) ⊆ D and B L ∩ D = ∅, one can prove that u L lies in a certain subspace V harm (B L ) ⊆ V associated with the box B L . The properties of this subspace allow us to construct a function u L−1 ∈ V with O(L d ) degrees of freedom that is a good approximation to u L on the box B L−1 . (3) Our construction of the approximant u L−1 will guarantee that u L−1 ∈ V harm (B L−1 ), and, since these subspaces are nested, we find that the error u L − u L−1 lies in V harm (B L−1 ) as well. Again, the properties of this subspace allow us to construct a low-dimensional approximation of this error, and we iterate the process down to B. From here, it is then not difficult to get an error estimate in the native space seminorm | • | a,B .
3.6. The cut-off operator. Let us now define precisely what we mean by an inflated box.

Definition 3.21. Let B = ∏ d i=1 [a i , b i ] with a i ≤ b i be a box as in D.2.15. For every δ ≥ 0, we introduce the inflated box B δ := ∏ d i=1 [a i − δ, b i + δ]. Note that B δ is again a box. In particular, we can iterate: (B δ ) δ = B 2δ , ((B δ ) δ ) δ = B 3δ , et cetera.
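The inflation operation and its semigroup property (B δ ) δ = B 2δ are easy to mirror in code; a small Python sketch (boxes stored as corner vectors, names hypothetical):

```python
import numpy as np

def inflate(lo, hi, delta):
    # B^delta: push every face of the axis-parallel box outward by delta
    return np.asarray(lo) - delta, np.asarray(hi) + delta

lo, hi = np.array([0.0, 0.0]), np.array([1.0, 2.0])
lo1, hi1 = inflate(lo, hi, 0.25)
lo2, hi2 = inflate(lo1, hi1, 0.25)   # (B^delta)^delta
lo3, hi3 = inflate(lo, hi, 0.5)      # B^{2 delta}
print(np.allclose(lo2, lo3) and np.allclose(hi2, hi3))  # the two agree
```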
In order for our construction of the space V B,D,L to work, we need a way to "restrict" the support of a given function v ∈ V to a box B ⊆ R d without destroying its global smoothness. This can be achieved by multiplying v with a smooth cut-off function.

Lemma 3.22. Let B ⊆ R d be a box and δ > 0. Then, there exists a smooth cut-off function κ δ B ∈ C ∞ (R d ) with 0 ≤ κ δ B ≤ 1, κ δ B ≡ 1 on B and supp(κ δ B ) ⊆ B δ .

It is convenient to introduce a name for the cut-off process described earlier.

Definition 3.23. Let κ δ B ∈ C ∞ (R d ) be the smooth cut-off function from L.3.22. We define the corresponding cut-off operator K δ B : v → κ δ B v.

Lemma 3.27. Let B ⊆ R d be a box and δ > 0 with δ ≲ 1. Then, there holds the Caccioppoli inequality.

Proof. Let u ∈ V harm (B δ ) and abbreviate κ := κ δ B . From D.3.25 we know that u(x n ) = 0 for all n ∈ {1, . . ., N } with x n ∈ B δ . In particular, the product v := κ 2 u ∈ V satisfies v(x n ) = 0 for all n ∈ {1, . . ., N }. In other words, v ∈ V 0 . Furthermore, using L.3.24, we have supp(v) ⊆ supp(κ) ⊆ B δ . This proves that v is an admissible test function for D.3.25, i.e., a(u, v) = 0. Plugging in D.2.1, we get an identity in which we transfer the summands with β < α to the other side of the equality. For the summands in the first sum, we use Young's inequality (with variable ε > 0). Note that, by choosing ε sufficiently small, we can absorb the O(ε)-term into the left-hand side of the overall inequality.
For the summands in the second sum, we pick an index i ∈ {1, . . ., d} with α i ≥ 1 (in the case α = 0, the sum is empty anyway) and perform partial integration with respect to the i-th coordinate. Finally, we put everything together (exploiting κ ≡ 1 on B). This concludes the proof.
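A cut-off function with the properties of L.3.22 can be built from the classical C ∞ transition profile t → e −1/t . The Python sketch below is one possible construction (not necessarily the one used in the proofs; the concrete transition profile and all names are our choice):

```python
import numpy as np

def smoothstep(t):
    # C-infinity transition: 0 for t <= 0, 1 for t >= 1, built from f(s) = e^{-1/s}
    def f(s):
        with np.errstate(over="ignore", divide="ignore"):
            return np.where(s > 0, np.exp(-1.0 / np.maximum(s, 1e-300)), 0.0)
    return f(t) / (f(t) + f(1.0 - t))

def kappa(x, lo, hi, delta):
    # cutoff for the box B = prod_i [lo_i, hi_i]: equals 1 on B, 0 outside B^delta
    x, lo, hi = np.atleast_2d(x), np.asarray(lo), np.asarray(hi)
    left = smoothstep((x - (lo - delta)) / delta)
    right = smoothstep(((hi + delta) - x) / delta)
    return np.prod(left * right, axis=1)

lo, hi, delta = np.array([0.0]), np.array([1.0]), 0.5
print(kappa([[0.5]], lo, hi, delta))   # 1 inside B
print(kappa([[1.6]], lo, hi, delta))   # 0 outside B^delta
```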
We end the discussion of the spaces V harm (B) with the following observation: If k min = 0, we remember from Section 2.1 that (V, a(•, •)) is a Hilbert space. One can then show that V harm (B) ⊆ V is a closed subspace with respect to the norm | • | a , so that the orthogonal projection from V to V harm (B) is well-defined. In the general case k min ≥ 0, however, this line of reasoning is no longer possible and we need to find a different projection P B,H : V −→ V harm (B) that is stable in a suitable sense. The idea is to replace a(•, •) by a strictly positive definite inner product b(•, •) and then use the corresponding orthogonal projection. The new inner product is weighted with a free parameter H > 0 that will be important for the definition of the low-rank approximation operator.

Lemma 3.28. Let B ⊆ R d be a box with |B| > 0 and let H > 0. There exists a linear operator P B,H : V −→ V harm (B) with the following properties: (1) Projection: For all v ∈ V harm (B), there holds P B,H v = v. (2) Stability: For all v ∈ V , there holds the following stability bound.

Proof. We equip the native space V from D.2.1 with a suitable bilinear form b(•, •). We remind the reader of L.3.2, where the inclusion V ⊆ H k loc (R d ) was derived. In particular, the local quantities |u| H l (B) are finite for all l ∈ {0, . . ., k}, so that b(•, •) is indeed well-defined. Note that the assumption |B| > 0 guarantees the strict positive definiteness of b(•, •) on all of V . The proof of completeness of (V, b(•, •)) is very similar to that of L.3.5 and is therefore omitted. Finally, using the continuous Sobolev embedding H k (B) ⊆ C 0 (B) and the Cauchy–Schwarz inequality for a(•, •) (cf. L.3.2), one can show that V harm (B) is a closed subspace of V with respect to b(•, •). Consequently, the b(•, •)-orthogonal projection P B,H : V −→ V harm (B) is well-defined. The asserted stability bound follows immediately from the fact that ‖P B,H v‖ b ≤ ‖v‖ b for all v ∈ V . This finishes the proof.
3.8. The low-rank approximation operator. We remind the reader of Section 3.5, where we argued that we need to construct a certain subspace V B,D,L ⊆ V of low dimension. More precisely, our goal was to achieve an algebraic dimension bound of the form dim(V B,D,L ) ≲ L c 1 , where c 1 > 0 is some constant. For this purpose, we use an approximation operator of moderate rank with good local approximation properties. Our construction is a "partition of unity" method (see, e.g., [MB96] and [BM97]) on a perfect tensor-product grid with mesh size H > 0 that spans all of R d . A pleasing side effect of this method is the fact that the rank of this operator varies "smoothly" with respect to the parameter H. Once again, we make use of axis-parallel boxes B ⊆ R d and their inflated relatives B δ ⊆ R d , δ > 0 (cf. D.2.15 and D.3.21).
Lemma 3.29. Let H > 0 be a free parameter. Then, there exists a linear operator Π H with the following properties: (1) Local rank: For every box B ⊆ R d , there holds a dimension bound in terms of diam(B) and H. (2) Stability/error bound: For all v ∈ H k (R d ), there hold the following stability and error estimates.

Proof. With the patches and bump functions at hand, we have everything we need to define the asserted operator. Note that we implicitly restrict v ◦ F m from R d to Ω and extend the polynomial Π(v ◦ F m ) back to R d . Now, for every v ∈ H k (R d ) with supp(v) ⊆ B, the support properties of the bump functions g m guarantee that the sum in Π H v only ranges over ms(B), rather than all of Z d . This gives the asserted local rank bound. Now on to the stability and error estimates. The derivation is based on the following facts: (1) the partition of unity property, (2) the local stability of the projection Π, and (3) the Bramble–Hilbert lemma. To get the desired global error bound, we exploit the covering property R d ⊆ ∪ n∈Z d Ω n and sum up the local error contributions from above. Additionally, we use #ms(n) = 3 d and the finite overlap of the patches. Finally, the stability bound can be shown via a simple triangle inequality and the previously established error bound. This concludes the proof.
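The partition-of-unity ingredient of the proof (translates of a mollified indicator of [0, 1) summing to one) can be reproduced numerically. A Python sketch, with a smooth step built from e −1/t used in place of an explicit mollifier convolution (an assumption on our part; any C ∞ transition of width 1/2 works):

```python
import numpy as np

def f(s):
    # classical C-infinity profile: e^{-1/s} for s > 0, else 0
    with np.errstate(over="ignore", divide="ignore"):
        return np.where(s > 0, np.exp(-1.0 / np.maximum(s, 1e-300)), 0.0)

def step(t):
    # smooth step: 0 for t <= -1/4, 1 for t >= 1/4
    u = (t + 0.25) / 0.5
    return f(u) / (f(u) + f(1.0 - u))

def g1(x):
    # smooth bump: a mollified indicator of [0, 1), supported in [-1/4, 5/4]
    return step(x) - step(x - 1.0)

x = np.linspace(-2.0, 3.0, 101)
pou = sum(g1(x + m) for m in range(-4, 6))   # integer translates
print(np.max(np.abs(pou - 1.0)))             # the translates sum to one
```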
3.9. The single- and multi-step coarsening operators. This section contains the heart of our proof. Using the subspaces V harm (B) ⊆ V from D.3.25, let us quickly recapitulate the proof outline from Section 3.5: Given a function u ∈ V harm (B δ ), our goal is to construct a function ũ ∈ V harm (B) of low "dimension" that is a good approximation to u on the box B ⊆ R d . To this end, we concatenate the cut-off operator K δ B , the low-rank approximation operator Π H and the projection P B,H . Now, according to D.3.13, the function f − S N f vanishes on all interpolation points {x 1 , . . ., x N }.
This concludes the proof.
Finally, we have everything we need for the proof of T.2.18.
Proof. Let S 11 ∈ R N ×N be the matrix from T.2.18 and r ∈ N a given block rank bound. We define the asserted H-matrix approximant M ∈ R N ×N in a block-wise fashion: First, consider an admissible block (I, J) ∈ P adm . From D.2.16 we know that there exist boxes B I , B J ⊆ R d with Ω I ⊆ B I , Ω J ⊆ B J and diam(B I ) ≤ σ adm dist(B I , B J ). In particular, B I and B J both contain at least one interpolation point and there holds diam(B I ) ≥ diam(Ω I ) ≥ h min , by D.2.14. This means that T.3.32 is applicable to B I and B J . Now, denote by C > 0 the constant from the dimension bound in T.3.32. We set σ exp := 1/(2C 1/(d+1) ) > 0 and L := (r/C) 1/(d+1) ∈ N. Then, T.3.32 provides a subspace V I,J,r ⊆ V . We apply L.3.20 to this subspace and get an integer r ≤ dim V I,J,r and matrices X I,J,r ∈ R I×r and Y I,J,r ∈ R J×r . We set M | I×J := X I,J,r (Y I,J,r ) T .
Second, for every small block (I, J) ∈ P small , we make the trivial choice M | I×J := S 11 | I×J . By D.2.17, we have M ∈ H(P, r) with the asserted block rank bound. For the error, we sum the blockwise contributions over all admissible blocks, which concludes the proof.

3.11. The fundamental solution (proofs). In this subsection we provide the proofs of L.2.10 and L.2.11.
Proof of L.2.10. Recall that ϕ was given as an integral expression. The trivial bound e −‖x‖ 2 /(4t) ≤ 1, the substitution t = s/b 2 and the assumption k > d/2 show that the integral is indeed well-defined with a uniform upper bound. In particular, Lebesgue's Dominated Convergence Theorem tells us that ϕ ∈ C 0 (R d ) as well.
In order to show that ϕ is a fundamental solution of D 2k = (b 2 − ∆) k , we use standard Fourier techniques. First, using Fubini's Theorem, a straightforward computation reveals ϕ ∈ L 1 (R d ). To compute the Fourier transform ϕ̂ of ϕ, recall that the Gaussian kernel e −‖•‖ 2 /2 is a fixed point of the Fourier transform and that there holds the relation F(e −‖•‖ 2 /(4t) )(y) = (2t) d/2 e −t‖y‖ 2 for all t > 0 and y ∈ R d . Using the substitution t = s/(b 2 + ‖y‖ 2 ), we obtain an explicit expression for ϕ̂, where F −1 denotes the inverse Fourier transform. Next, denote by S the usual Schwartz class of rapidly decreasing functions. Then, using the unit imaginary number i ∈ C, the usual Fourier computation rules and the duality formula, the fundamental solution property follows. Finally, let us prove the asserted conformity. Since k min = 0, we may use the well-known Fourier characterization of Sobolev spaces (e.g., [Eva10, Section 5.8.4]). In fact, the function ϕ itself lies in V . Furthermore, for all w ∈ C ∞ 0 (R d ), we have (−i •) α ŵ ∈ S 2k 0 . We use the standard Fourier computation rules to derive that the function v has an α-th weak derivative. This concludes the proof.
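The Fourier identity F(e −‖•‖ 2 /(4t) )(y) = (2t) d/2 e −t‖y‖ 2 used above can be sanity-checked by quadrature in d = 1 (convention F(f )(y) = (2π) −1/2 ∫ f (x)e −ixy dx; the grid parameters below are illustrative):

```python
import numpy as np

t, y = 0.7, 1.3
x = np.linspace(-30.0, 30.0, 200001)
dx = x[1] - x[0]
g = np.exp(-x**2 / (4.0 * t))

# Riemann-sum approximation of the Fourier transform at frequency y
lhs = np.sum(g * np.exp(-1j * x * y)) * dx / np.sqrt(2.0 * np.pi)
rhs = np.sqrt(2.0 * t) * np.exp(-t * y**2)   # (2t)^{d/2} e^{-t y^2} with d = 1
print(abs(lhs - rhs))                        # agreement up to quadrature error
```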

Numerical examples
We finish this paper with a few numerical examples that were performed in MATLAB and H2Lib, [Bör21].
Figure 1.Interpolation of smooth data on a non-uniform point distribution.
In Figure 1, we used roughly N ≈ 30,000 interpolation points in d = 2 space dimensions. The domain of interest was the exterior of the TU Wien logo, i.e., the complement of the letters in the unit square [0, 1] × [0, 1] ⊆ R 2 . The interpolation points were uniform in the interior and algebraically graded towards reentrant corners with a grading exponent σ card = 2 (cf. D.2.2). As for the basis function, we used the thin-plate spline ϕ(x) = ‖x‖ 2 ln ‖x‖ from L.2.11, i.e., k = k min = 2. The left image shows the positions x n of the interpolation points and the middle one depicts the pairs (x n , f (x n )), where the data function f ∈ V is a smooth indicator function of the letters. On the right-hand side, the solution u ∈ V of the interpolation problem P.2.4 is rendered.
Figure 2. A "typical" hierarchical block partition and a "typical" error plot in 2D.
The right-hand image is empirical evidence that the error bound in our main theorem, T.2.18, is correct. To generate this plot, the main block S 11 ∈ R N ×N of the inverse ( A B T B 0 ) −1 was computed exactly using MATLAB's built-in inversion routine inv(...). Next, for each admissible block S 11 | I×J , (I, J) ∈ P adm , we used svds(...) to compute the corresponding singular values σ r (S 11 | I×J ), r ∈ {1, 2, . . ., 60}. Truncating the blockwise SVDs at any given rank r ∈ N, we then assembled the H-matrix M r ∈ H(P, r) (cf. D.2.17), the best rank-r approximation to S 11 . As discussed in our previous work [AFM21, Section 4], this approach leads to a computable error bound, where depth(T N ) denotes the cluster tree depth previously mentioned in Section 2.4. The semi-logarithmic error plot depicts the computable error bound along with a dashed reference line. The apparent similarity suggests a relation of the form ‖S 11 − M r ‖ 2 ≲ C(N ) exp(−σ exp r), which is even better than our theoretical prediction C(N ) exp(−σ exp r 1/3 ).
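The blockwise truncation procedure rests on the Eckart–Young theorem: cutting the SVD at rank r is optimal in the spectral norm, with error σ r+1 . A self-contained Python sketch on a model kernel block (log-kernel between two well-separated clusters; all sizes are illustrative, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.random((200, 2))                  # cluster "I": points in the unit square
Y = rng.random((200, 2)) + [3.0, 0.0]     # cluster "J": a well-separated box

# model off-diagonal block of an asymptotically smooth (log) kernel matrix
K = np.log(np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=2))

U, s, Vt = np.linalg.svd(K)
r = 10
Kr = (U[:, :r] * s[:r]) @ Vt[:r]          # truncated SVD = best rank-r approximation
err = np.linalg.norm(K - Kr, 2)

print(abs(err - s[r]))   # Eckart-Young: spectral error equals sigma_{r+1}
print(s[r] / s[0])       # small: singular values of admissible blocks decay fast
```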
On a side note, we mention that the standard 16-digit precision arithmetic in MATLAB was not enough to generate a conclusive error plot. As is well-established in the literature (see, e.g., [Wen05, Chapter 12]), the condition number of the interpolation matrix ( A B T B 0 ) scales very poorly with respect to the separation distance h min introduced in D.2.2. To overcome this fundamental problem, we used MATLAB's variable-precision arithmetic vpa(...) with 32 digits. This brute-force approach allowed us to carry out the explicit matrix inversion with sufficient accuracy.

In Figure 3, we compare different problem sizes N for a uniform 3D grid. In accordance with T.2.18, the empirical decay rate seems to be independent of the problem size N . In Figure 4, we investigated the influence of the grading exponent σ card from D.2.2 on the error decay rate in d = 3 space dimensions. We set k = 2 and k min = 0 and used ϕ(x) = e −‖x‖ as the basis function. The error plot compares the cases σ card ∈ {1, 2, 3}, where σ card = 1 corresponds to a uniform grid and σ card = 3 is "strongly" graded towards the origin 0 ∈ R 3 . The problem size N ≈ 10,000 was held constant throughout all three runs. The plot suggests that the constant σ exp from the error bound exp(−σ exp r 1/4 ) in T.2.18 is independent of the grading parameter σ card ∈ {1, 2, 3}.
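The same extended-precision workaround is available in Python via mpmath. The sketch below uses a small Hilbert matrix as a stand-in for the ill-conditioned interpolation matrix (sizes and tolerances are illustrative):

```python
from mpmath import mp, matrix, lu_solve, norm

mp.dps = 32   # 32 significant digits, analogous to MATLAB's vpa(...)

# a small, notoriously ill-conditioned model matrix (Hilbert matrix)
n = 12
H = matrix([[mp.mpf(1) / (i + j + 1) for j in range(n)] for i in range(n)])
b = matrix([mp.mpf(1)] * n)

x = lu_solve(H, b)        # solve in extended precision
res = H * x - b
print(norm(res, mp.inf))  # tiny residual despite the huge condition number
```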
Finally, we perform experiments using the H-matrix arithmetic to approximate the inverse system matrix with the library H2Lib. As mentioned in the introduction, previous numerical results have established that the H-matrix arithmetic is a viable tool for solving RBF interpolation problems, [LM17, LBW20, LBW19b]. In the following, we compute a Cholesky decomposition in the H-matrix format by approximating the system matrix by an H-matrix with good accuracy and then performing the Cholesky algorithm in H-arithmetic as described, e.g., in [Beb07]. Afterwards, we project the matrix to a prescribed rank r. Finally, an approximate inverse can be obtained by inverting the Cholesky factors.
However, a complication arises in the conditionally positive definite case of polyharmonic splines: the main block of the system matrix is not invertible, so that the Cholesky decomposition cannot be computed. In fact, the system matrix has saddle point structure. Like Gaussian elimination, H-matrix inversion of such matrices would require pivoting. To avoid pivoting, it was advocated in [BLB12, LBW19b] to consider instead the augmented Lagrangian A + γB T B (γ > 0), which is SPD and therefore amenable to an H-matrix inversion. In Figure 5, we compute an approximate inverse using H-arithmetic as described above for the case of thin-plate splines in space dimension d = 2 (uniform point distribution in the unit square, N = 10,000, k = k min = 2, augmented Lagrangian approach with γ = 1) and the Bessel potential in space dimension d = 3 (uniform point distribution in the unit cube, N = 4,096, k = 2, k min = 0). In contrast to the previous examples performed in MATLAB, we do not compute the exact inverse matrix and perform an SVD to compute an upper bound of the absolute error, but rather use the error measure ‖I − (L H U H ) −1 A‖ 2 , which is an upper bound for the relative error.
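The augmented Lagrangian trick can be illustrated on a small thin-plate spline system. The Python sketch below (hypothetical point set; γ is increased until numerical positive definiteness, rather than fixed to γ = 1 as in the experiment) checks that A is positive definite on ker(B) and then produces an SPD matrix admitting a Cholesky factorization:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 40
X = rng.random((N, 2))
R = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
with np.errstate(divide="ignore", invalid="ignore"):
    A = np.where(R > 0, R**2 * np.log(R), 0.0)     # thin-plate spline block
B = np.vstack([np.ones(N), X[:, 0], X[:, 1]])      # polynomial constraint rows

# A is (conditionally) positive definite on C = ker(B):
Z = np.linalg.svd(B)[2][3:].T                      # orthonormal basis of ker(B)
print(np.linalg.eigvalsh(Z.T @ A @ Z).min())       # strictly positive

# for sufficiently large gamma, A + gamma B^T B is SPD; increase until it is
gamma = 1.0
while np.linalg.eigvalsh(A + gamma * B.T @ B).min() <= 0:
    gamma *= 10
L = np.linalg.cholesky(A + gamma * B.T @ B)        # Cholesky now exists
print(gamma)
```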
Once again, we observe exponential convergence, as predicted by our main result T.2.18. However, the error levels off before machine precision due to our method of computation: the accuracy of the initial H-matrix approximation of the system matrix determines the achievable accuracy.

u = Σ N n=1 c n ϕ n + Σ |α|<kmin d α π α , where c n , d α ∈ R are certain coefficients and where {π α | |α| < k min } ⊆ P is a polynomial basis. The functions ϕ n ∈ C 0 (R d ), on the other hand, have the particularly simple structure of translates, i.e., ϕ n = ϕ(• − x n ).

u = Σ N n=1 c n ϕ n + Σ |α|<kmin d α π α , where the coefficients c n , d α ∈ R satisfy a certain LSE. The corresponding interpolation matrix is the main object of interest of the present work.
|v − v n | a → 0 and a Cauchy sequence argument prove that v| Bi ∈ H k (B i ). One can then show that the global functions D α v : R d −→ R, |α| ≤ k, defined via (D α v)| Bi := D α (v| Bi ), lie in L 2 loc (R d ) and represent the α-th weak derivative of v. In other words, v must lie in H k loc (R d ).
Since P coincides with the kernel of | • | a (cf. L.3.2), the bilinear form a(•, •) is positive definite on V 0 . Finally, let us sketch the proof of completeness: Consider a Cauchy sequence (v n ) n∈N ⊆ V 0 with respect to | • | a and pick a sequence of bounded Lipschitz domains B 1 ⊆ B 2 ⊆ B 3 ⊆ • • • ⊆ R d , e.g., B i := Ball(0, i). For fixed i ∈ N, we use L.3.1 to obtain local limits and piece them together to define global functions D α v : R d −→ R, |α| ≤ k, by setting (D α v)| Bi := D α (v| Bi ). One can then prove that v lies in V 0 and that |v − v n | a → 0.

3.2. An equivalent interpolation problem. The homogeneous native space V 0 allows us to rewrite the minimization property in P.2.4 in terms of an orthogonality relation for a(•, •).

Now, for all c ∈ R N , there holds Ac = Σ N n=1 c n E N ϕ n = E N ( Σ N n=1 c n ϕ n ) = E N Φc. In particular, for all c ∈ C, ⟨Ac, c⟩ 2 = ⟨E N Φc, c⟩ 2 ≳ h 2k−d min ‖c‖ 2 2 . To see the invertibility, consider coefficient vectors c ∈ R N and d ∈ R Nmin such that ( A B T B 0 )( c d ) = ( 0 0 ). The second row tells us that Bc = 0, so that c ∈ ker(B) = C. Then, we multiply the first row with c and get 0 = ⟨Ac + B T d, c⟩ 2 = ⟨Ac, c⟩ 2 + ⟨d, Bc⟩ 2 = ⟨Ac, c⟩ 2 ≳ h 2k−d min ‖c‖ 2 2 , which implies c = 0. Now the first row simplifies to B T d = 0, from which we obtain d = 0.

Lemma 3.15. For all m, n ∈ {1, . . ., N }, there holds λ m ∈ C ∞ 0 (R d ) ⊆ V , supp(λ m ) ⊆ Ω m and λ m (x n ) = δ mn . Furthermore, we have the stability bound |λ m | a ≲ h d/2−k min .

Now that the dual functions λ m ∈ C ∞ 0 (R d ) are properly defined, let us introduce a name for the mapping f → Σ N n=1 f n λ n from before.

Definition 3.16. We define the operators Λ : R N −→ V , Λf := Σ N n=1 f n λ n , and Λ T : V −→ R N , (Λ T v) n := a(λ n , v).

= a(u, p − ΛE N p).

(2) Starting at the outermost layer B L , we set u L := S N f ∈ V . Exploiting the facts supp(f ) ⊆ D and B L ∩ D = ∅, one can then prove that u L lies in a certain subspace V harm (B L ) ⊆ V associated with the box B L . The properties of this subspace allow us to construct a function u L−1 ∈ V with O(L d ) degrees of freedom that is a good approximation to u L on the box B L−1 .

Let µ be a "mollifier" with supp(µ) ⊆ [−1/4, 1/4], µ ≥ 0 and ∫ R µ(x) dx = 1. Consider the mollified characteristic function of the interval [0, 1) ⊆ R, i.e., ĝ 1 (x) := (µ ∗ I [0,1) )(x) = ∫ R µ(y)I [0,1) (x − y) dy. There holds ĝ 1 ∈ C ∞ 0 (R), supp(ĝ 1 ) ⊆ [−1/4, 5/4], ĝ 1 ≥ 0 and Σ m∈Z ĝ 1 (x + m) = ∫ R µ(y) Σ m∈Z I y−x+[0,1) (m) dy = ∫ R µ(y) dy = 1 for all x ∈ R. (Note that, for every given x, the sum contains at most two non-zero summands.) Set Ω := [−1/4, 5/4] d ⊆ R d ("reference patch") and ĝ(x) := ∏ d i=1 ĝ 1 (x i ) for all x ∈ R d ("reference bump function"). There holds ĝ ∈ C ∞ 0 (R d ), supp(ĝ) ⊆ Ω, ĝ ≥ 0 and Σ m∈Z d ĝ(x + m) = 1 for all x ∈ R d (at most 2 d non-zero summands). Furthermore, denote by Π : H k ( Ω) −→ P k−1 ( Ω) the orthogonal projection with respect to ⟨•, •⟩ H k ( Ω) . Next, let H > 0 be a given parameter ("mesh width"). For every m ∈ Z d , consider the affine transformations F m , F −1 m : R d −→ R d , given by F m (x) := H(x + m) and F −1 m (x) := x/H − m. The m-th patch and m-th bump function are defined by Ω m := F m ( Ω) ⊆ R d and g m (x) := ĝ(F −1 m (x)) for all x ∈ R d . There holds g m ∈ C ∞ 0 (R d ), supp(g m ) ⊆ Ω m , g m ≥ 0 and Σ m∈Z d g m (x) = 1 for all x ∈ R d (at most 2 d non-zero summands). Furthermore, for all l ∈ N 0 , we have a corresponding scaling relation for the derivatives of g m ; the claim then follows from the stability bound below. As for the dimension bound, let B ⊆ R d be a box and denote by ms(B) := {m ∈ Z d | B ∩ Ω m ≠ ∅} the set of indices of the adjacent patches. We will need an upper bound for the cardinality of ms(B) in terms of diam(B) and H. To this end, we use appropriate sub- and supersets of ∪ m∈ms(B) Ω m and exploit the monotonicity of the d-dimensional Lebesgue measure, denoted by | • |. On the one hand, since every patch Ω m is itself a box of side length 3H/2, it is not surprising that ∪ m∈ms(B) Ω m ⊆ B 3H/2 , where B 3H/2 denotes the inflated box in the sense of D.3.21. On the other hand, for every m ∈ ms(B), consider the m-th subpatch ω m := F m ([0, 1] d ) ⊆ Ω m . Clearly, these subpatches are pairwise disjoint and fulfill ∪ m∈ms(B) Ω m ⊇ ∪ m∈ms(B) ω m . Combining both inclusions, we get the desired bound for #ms(B): #ms(B)H d = Σ m∈ms(B) H d = Σ m∈ms(B) |ω m | = |∪ m∈ms(B) ω m | ≤ |B 3H/2 |. Finally, let v ∈ H k (R d ) and n ∈ Z d . Again, we denote by ms(n) := {m ∈ Z d | ‖m − n‖ ∞ ≤ 1} the indices of the patches touching Ω n . Using (1), (2) and (3), we can establish a local error bound first.

The error ‖S 11 − M ‖ 2 can be bounded in terms of the blockwise errors ‖S 11 | I×J − X I,J,r (Y I,J,r ) T ‖ 2 , (I, J) ∈ P adm , which concludes the proof.

Proof of L.2.11. The continuity of the thin-plate spline ϕ is obvious and the fact that ϕ is a fundamental solution of (−∆) k was shown in [Wen05, Theorem 10.36]. It remains to show that the function v := Σ N n=1 c n ϕ(• − x n ) lies in the native space V whenever c ∈ C. To do so, we use Fourier techniques: According to [Wen05, Page 161], the function ϕ has a generalized Fourier transform, given by ϕ̂(y) := (2π) −d/2 ‖y‖ −2k . This means that ∫ R d ϕ ŵ dx = ∫ R d ϕ̂ w dy for all homogeneous Schwartz functions w ∈ S 2k 0 , where S 2k 0 := {w ∈ S | ∀|β| < 2k : (D β w)(0) = 0}. Now, consider the auxiliary function ĉ :=

Figure 3. A comparison of different problem sizes N for a uniform 3D grid.