Approximating inverse FEM matrices on non-uniform meshes with $\mathcal{H}$-matrices

We consider the approximation of the inverse of the finite element stiffness matrix in the data-sparse $\mathcal{H}$-matrix format. For a large class of shape-regular but possibly non-uniform meshes, including algebraically graded meshes, we prove that the inverse of the stiffness matrix can be approximated in the $\mathcal{H}$-matrix format at an exponential rate in the block rank. Since the storage complexity of the hierarchical matrix is logarithmic-linear and grows only linearly in the block rank, we obtain an efficient approximation that can be used, e.g., as an approximate direct solver or as a preconditioner for iterative solvers.


Introduction
Discretizations of elliptic partial differential equations on a domain Ω ⊆ R^d using the classic finite element method (FEM) usually produce sparse linear systems of equations Ax = b with storage requirements linear in the number of unknowns and linear complexity for the matrix-vector multiplication. However, the direct solution of these systems is computationally more expensive. Therefore, iterative solution methods (e.g., Krylov space methods) are popular in applications, since they only need matrix-vector multiplications, which can be done in linear complexity. A drawback of these methods is that convergence can be slow for matrices with large condition numbers unless a suitable preconditioner is employed. These preconditioners have to be tailored to the problem at hand, making black-box preconditioners that are based on (approximate) direct solvers particularly interesting. Moreover, if one is interested in solving the same problem with (many) different right-hand sides, a direct solver may be computationally advantageous.
Hierarchical matrices (H-matrices), introduced in [Hac99] and extensively studied in the monograph [Hac15], provide a different solution approach to this problem that does not suffer from the drawbacks of classic direct and iterative methods. H-matrices are blockwise low-rank matrices. For suitable block structures and block ranks, storing an H-matrix is of logarithmic-linear complexity. Approximating a given matrix in the H-matrix format thus effects a compression. A main difference to other compression methods such as multipole expansions [Rok85, GR97] or wavelet methods [vPSS97, Sch98, TW03] is that the H-matrix format allows for an approximate arithmetic. It is possible to add and multiply as well as to compute inverses and LU-decompositions efficiently in the format [Gra01, GH03, Hac15]. Therefore, using an H-matrix approximation to the inverse A^{-1} gives an approximate direct solution method of logarithmic-linear complexity that can be applied efficiently to multiple right-hand sides. Moreover, an LU-decomposition in the H-matrix format can be used as a black-box preconditioner in iterative solvers [Beb07, GHK08, GKLB08]. Nonetheless, we mention that the accuracy in terms of the maximal blockwise rank of the computed approximations to A^{-1} (or the LU-decomposition) using H-matrix arithmetic is not fully understood yet.
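To make the storage and matvec savings of the blockwise low-rank format concrete, the following minimal sketch (an illustration of the general idea, not the paper's construction; all names and sizes are chosen for demonstration) represents a single admissible m×n block in factored form XY^T of rank r:

```python
import numpy as np

# Illustrative sketch (not from the paper): a rank-r admissible block
# is stored as factors X (m x r) and Y (n x r) instead of the dense
# m x n block; the matvec then costs O(r*(m+n)) instead of O(m*n).
rng = np.random.default_rng(0)
m, n, r = 200, 150, 8
X = rng.standard_normal((m, r))
Y = rng.standard_normal((n, r))

def lowrank_matvec(X, Y, v):
    """Apply the block X @ Y.T to v without ever forming it densely."""
    return X @ (Y.T @ v)

v = rng.standard_normal(n)
dense_result = (X @ Y.T) @ v            # reference: dense application

storage_lowrank = r * (m + n)           # entries stored in factored form
storage_dense = m * n                   # entries of the dense block
```

For small ranks r the factored storage r(m+n) is far below mn; summing this over all blocks of an admissible partition is what yields the logarithmic-linear complexity quoted above.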
In order to explain the numerical success of these approximations, first observed in [Gra01], several works in the literature provide existence results for approximations of the inverse matrices in the H-matrix format. For the inverses of FEM matrices, see, e.g., [BH03, Beb05, Bör10, FMP15]; for inverse BEM matrices, see [FMP16, FMP17]. These analyses are restricted to the case of (quasi)uniform meshes, i.e., all mesh elements have comparable size. In a typical FEM scenario, however, locally refined meshes are employed, with mesh elements varying greatly in size in order to account for effects such as locally reduced regularity of the solution. A classic example are graded meshes for the solution of elliptic problems in corner domains [BKP79].
In this article, we generalize the results of [FMP15] for quasiuniform meshes to meshes of so-called locally bounded cardinality (cf. D.2.4), which includes both uniform meshes and algebraically graded meshes. Our main result states that the inverses of FEM matrices for such meshes can be approximated by hierarchical matrices such that the error converges exponentially in the H-matrix block rank r. Given a clustering strategy suitable for non-uniform grids, cf. [GHLB04], the storage complexity of the H-matrix approximant is logarithmic-linear, O(rN ln N). Moreover, we develop an abstract framework that allows for more general FEM basis functions that do not need to have local supports. In fact, locality is necessary only for a set of dual functions, which is a substantially weaker assumption. Finally, we streamline some of the arguments made in [FMP15]. While not repeated in this article, we mention that the (mostly algebraic) techniques of [FMP15, Section 5] can be employed in exactly the same way to derive exponentially convergent approximate LU-decompositions in the H-matrix format.
The present paper is structured as follows: In Section 2 we introduce all necessary definitions and concepts and state our main result, T.2.13. Section 3 is dedicated to the proof of the main result. The main technical contribution is the discrete Caccioppoli-type estimate presented in L.3.28, which is of independent interest. For a certain class of functions, it allows us to bound the H^1-seminorm on a given subdomain by the L^2-norm on a slightly larger subdomain. Finally, Section 4 provides a numerical example that illustrates our main result.
Concerning notation: We write "a ≲ b" iff there exists a constant C > 0 such that a ≤ Cb. The constant might depend on the space dimension d, the domain Ω, the coefficients of the PDE, the shape-regularity constant of the mesh, and the polynomial degree of the discrete spline space, but it is independent of all critical parameters such as the mesh width. We write a ≂ b if there hold both a ≲ b and a ≳ b. Matrices and vectors in linear systems of equations are expressed in boldface letters, e.g., A ∈ R^{N×N} and f ∈ R^N. For all x ∈ R^d and ε > 0, we write Ball_2(x, ε) := {y ∈ R^d | ‖y − x‖_2 < ε} for the Euclidean ball of radius ε centered at x. The norms of the sequence spaces l^1 and l^2 are denoted by ‖·‖_1 and ‖·‖_2. For k ≥ 0, q ∈ [1, ∞] and domains Ω ⊆ R^d, we denote the Sobolev space by W^{k,q}(Ω). For a given mesh T, we denote by W^{k,q}_pw(T) the broken Sobolev space consisting of elementwise functions from W^{k,q}; for all v ∈ W^{k,q}_pw(T) and B ⊆ T, the corresponding broken (semi)norms on B are defined elementwise. Similarly, C^0_pw(T) denotes the space of piecewise continuous functions. Finally, it will facilitate notation on numerous occasions to define the (discrete) support of a function v ∈ L^2(Ω) on a mesh T by supp_T(v) := {T ∈ T | v|_T ≢ 0}. In particular, we have supp_T(v) ⊆ T, which slightly differs from the usual definition of a support, namely the closed set supp(v) ⊆ R^d.

Main results
2.1. The model problem. We investigate the following model problem: Let d ≥ 1 and Ω ⊆ R^d be a bounded polyhedral Lipschitz domain. Furthermore, let a_1 ∈ L^∞(Ω, R^{d×d}), a_2 ∈ L^∞(Ω, R^d) and a_3 ∈ L^∞(Ω, R) be given coefficient functions and f ∈ L^2(Ω) be a given right-hand side. We seek a weak solution u ∈ H^1_0(Ω) to the following equations: In the present work, we restrict ourselves to homogeneous Dirichlet conditions. For the treatment of Neumann and Robin boundary conditions, the same arguments as in [FMP15] can be employed.
We assume that a_1 is coercive in the sense ⟨a_1(x)y, y⟩ ≥ α_1 ‖y‖_2^2 for all x ∈ Ω, y ∈ R^d and some constant α_1 > 0. Here, σ_Pcr > 0 denotes the constant in the Poincaré inequality. Definition 2.1. We introduce the bilinear form: The weak formulation of the model problem reads as follows: The assumptions on the PDE coefficients imply that the bilinear form a(·, ·) is continuous and coercive, cf. L.3.7.
In particular, the well-known Lax-Milgram Lemma yields the existence of a unique solution u ∈ H 1 0 (Ω).
2.2. The mesh. Throughout the text, we consider regular, affine meshes in the following sense: Definition 2.2 (Mesh). A finite set T ⊆ Pow(Ω) is a mesh if there exists an open simplex T̂ ⊆ R^d (the reference element) such that every element T ∈ T is of the form T = F_T(T̂), where F_T : R^d → R^d is an affine diffeomorphism. Furthermore, the elements must be pairwise disjoint, i.e., |T ∩ S| = 0 for all T ≠ S ∈ T, and constitute a partition of Ω, i.e., ⋃_{T∈T} T̄ = Ω̄. Finally, a mesh must be regular in the sense of [Cia78].
We call a collection of mesh elements B ⊆ T a cluster. In the literature on hierarchical matrices, the word cluster is typically reserved for collections of vector/matrix indices I ⊆ {1, . . ., N}. In the present work, however, we deal with collections of mesh elements B ⊆ T much more frequently. We also note that both concepts are intimately linked via D.2.8.
For every subset B ⊆ R^d, we call T(B) ⊆ T the set of neighboring mesh elements. Similarly, for every cluster B ⊆ T, we set T(B) := ⋃_{B∈B} T(B) ⊆ T.
To measure the size of an element T ∈ T, we introduce the local mesh width h_T := sup_{x,y∈T} ‖y − x‖_2. The corresponding aggregate mesh widths for a cluster B ⊆ T read h_B := h_{max,B} := max_{T∈B} h_T and h_{min,B} := min_{T∈B} h_T.
Finally, for every T ∈ T, we denote the center of the largest inscribable ball by x_T ∈ T (the incenter). We assume that T is part of a shape-regular family of meshes, i.e., there exists a constant σ_shp ≥ 1 such that the usual shape-regularity condition holds. Definition 2.3. We define the mesh metric dist_T(·, ·) as follows: For all clusters A, B ⊆ T, we denote the corresponding diameters and distances by diam_T(·) and dist_T(·, ·). If A or B contains only one element, e.g., A = {T}, we drop the enclosing braces and simply write dist_T(T, B) := dist_T({T}, B). Furthermore, diam_T(T) := diam_T({T}) = 0 by definition of the cluster diameter.
We refer to L.3.16 for some basic properties of the mesh metric.
Compared to [FMP15], we consider a more general class of meshes. Here, the crucial property is the so-called locally bounded cardinality defined in the following D.2.4. Note that both uniform and graded meshes have this property, cf. Section 3.2.
Definition 2.4. A mesh T ⊆ Pow(Ω) has locally bounded cardinality if there exists a constant σ_card ≥ 1 such that the corresponding cardinality bound holds.

2.3. The basis and dual functions.
Definition 2.5 (Spline spaces). Let k ≥ 0 and p ≥ 0. We introduce the finite-dimensional spline spaces, where P_p(T̂) := span{T̂ ∋ x ↦ x^q | ‖q‖_1 ≤ p} denotes the usual space of polynomials of (total) degree p on the reference element.
Remark 2.7. Note that we do not assume local basis functions ϕ_n, i.e., supp_T(ϕ_n) = T is allowed. On the other hand, the dual functions λ_n should have local supports in order to guarantee competitive memory requirements for the H-matrices (cf. R.2.12). Furthermore, the specific exponent of h_{min,T}^{−d/2} in the stability bound is not crucial, as it only affects the exponent of the prefactor N^{σ_card+2} in T.2.13.
The fundamental idea of the present work is to derive properties of matrices from properties of function spaces. Naturally, one has to think about the connection between abstract matrix indices n ∈ {1, . . ., N} and corresponding physical subdomains of Ω, which is captured in the following definition.
Definition 2.8 (Index patches). We define the index patches T(I) for all I ⊆ {1, . . ., N}. Recall from Section 2.2 that T(B) ⊆ T is the patch of a physical subdomain B ⊆ R^d and that T(B) ⊆ T is the patch of a cluster B ⊆ T. Now, we also have patches T(I) ⊆ T for collections of matrix indices I ⊆ {1, . . ., N}. Since all three types of patches follow a common idea, we chose the similarity in notation on purpose.
2.4. The system matrix. Let T ⊆ Pow(Ω) be a mesh and p ≥ 1 a fixed polynomial degree. Let S^{p,1}_0(T) ⊆ H^1_0(Ω) be the corresponding spline space. We discretize the model problem from Section 2.1 by means of the spline space and get the following discrete model problem: Again, existence and uniqueness of a solution u ∈ S^{p,1}_0(T) follow from L.3.7 and the Lax-Milgram Lemma. As usual, given a basis of the discrete space, the discrete model problem can be rephrased as an equivalent linear system of equations. The bilinear form a(·, ·) from D.2.1 and the basis functions ϕ_n ∈ S^{p,1}_0(T) from D.2.6 compose the governing system matrix. Definition 2.9. We define the system matrix A via the entries a(ϕ_n, ϕ_m). Note that the unique solvability of the discrete model problem already ensures that the matrix A is invertible.
Definition 2.10. Let σ_adm, σ_small > 0. A block partition P is called admissible if it can be split into admissible blocks P_adm and small blocks P_small. Typically, an admissible block partition P is constructed in two stages: First, the indices I_root := {1, . . ., N} are split up into a (hierarchical) cluster tree T_N: Starting from the first layer T_N^(1) := {I_root}, all clusters I on a layer T_N^(L) with #I > σ_small are split in the form I = I_1 ∪ I_2 with I_1 ≠ ∅ ≠ I_2 via a predefined clustering strategy I ↦ (I_1, I_2). (See, e.g., [Hac15] for some examples of such clustering strategies.) The combined set of all such children defines the next layer, T_N^(L+1). Clearly, this process stops after a finite number of layers denoted by depth(T_N).
Second, the matrix indices I_root × I_root are split up into a (hierarchical) block cluster tree T_{N×N}: Here, the first layer is T_{N×N}^(1) := {(I_root, I_root)}, and on each layer all blocks (I, J) with diam_T(T(I)) > σ_adm dist_T(T(I), T(J)) are split into the children (I_1, J_1), (I_1, J_2), (I_2, J_1), (I_2, J_2), where I = I_1 ∪ I_2 and J = J_1 ∪ J_2 as before. Again, all these children are collected in the layer T_{N×N}^(L+1). Finally, the block partition P is just the set of all leaves of T_{N×N}.
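The two-stage construction above can be sketched for indices sitting on a 1D grid. This is an illustrative toy (simple midpoint bisection, index positions as geometry), not the geometrically balanced strategy of [GHLB04]; the names and parameter values are assumptions for the demo:

```python
# Toy block partition for indices 0..N-1 on a 1D grid: bisect clusters,
# and refine a block (I, J) until the admissibility condition
#     diam(I) <= sigma_adm * dist(I, J)
# holds, or one of the clusters is small.
N, sigma_adm, sigma_small = 64, 1.0, 4

def diam(I):
    return I[-1] - I[0]

def dist(I, J):
    # distance of two index intervals, 0 if they touch/overlap
    return max(I[0] - J[-1], J[0] - I[-1], 0)

def build_partition(I, J, P):
    if diam(I) <= sigma_adm * dist(I, J):
        P.append((I, J, "adm"))            # admissible leaf: low-rank block
    elif len(I) <= sigma_small or len(J) <= sigma_small:
        P.append((I, J, "small"))          # small leaf: stored densely
    else:                                  # bisect both clusters, recurse
        I1, I2 = I[:len(I) // 2], I[len(I) // 2:]
        J1, J2 = J[:len(J) // 2], J[len(J) // 2:]
        for Ic in (I1, I2):
            for Jc in (J1, J2):
                build_partition(Ic, Jc, P)

idx = list(range(N))
P = []
build_partition(idx, idx, P)
covered = sum(len(I) * len(J) for I, J, _ in P)   # leaves tile the N x N index set
```

The leaves of the recursion are exactly the blocks of P: diagonal-near blocks end up in P_small, well-separated blocks in P_adm.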
Definition 2.11. Let P be an admissible block partition and r ∈ N a given block rank bound. We define the set of H-matrices H(P, r) as the matrices whose blocks on P_adm have rank at most r. Remark 2.12. By [Hac15, Lemma 6.13], the memory requirements to store an H-matrix B ∈ H(P, r) can be bounded by the quantity C_sparse(T_{N×N})(σ_small + r) depth(T_N) N, where C_sparse(T_{N×N}) > 0 denotes the so-called sparsity constant.
In [GHLB04], the authors present a geometrically balanced clustering strategy that guarantees the upper bounds C_sparse(T_{N×N}) ≲ 1 and depth(T_N) ≲ ln(h_{min,T}^{−1}). Using the relation h_{min,T} ≳ h_T^{σ_card} from D.2.4 for meshes with locally bounded cardinality, we can conclude depth(T_N) ≲ ln(N). In particular, we get an overall bound of O(rN ln N) for the memory requirements to store the matrix B.
Note that this line of reasoning implicitly assumes that the dual functions λ_n ∈ L^2(Ω) from D.2.6 have local supports. More precisely, we need supp_T(λ_n) ⊆ T(T_n) for some T_n ∈ T and have to ensure that these characteristic elements T_n do not coincide too frequently, i.e., #{n | T_n = T} ≲ 1 for all elements T ∈ T.
2.6. The main result. The following theorem is the main result of the present work. It states that inverses of FEM matrices on meshes of locally bounded cardinality can be approximated at an exponential rate by hierarchical matrices.
Theorem 2.13. Let T ⊆ Pow(Ω) be a mesh of locally bounded cardinality for some σ_card ≥ 1 in the sense of D.2.4 and {ϕ_1, . . ., ϕ_N} ⊆ S^{p,1}_0(T) a basis that has a system of local dual functions (see D.2.6). Let a(·, ·) be the elliptic bilinear form from D.2.1 and A ∈ R^{N×N} be the corresponding Galerkin stiffness matrix (D.2.9). Finally, let P be an admissible block partition as in D.2.10. Then there exists a constant σ_exp > 0 such that, for every block rank bound r ∈ N, there exists an H-matrix B ∈ H(P, r) whose error decays exponentially in r. Under additional assumptions on the block partition P, one can reduce the prefactor from N^{σ_card+2} to ln(N) N^{σ_card}, see R.3.13. As shown in Section 3.2, uniform and algebraically graded meshes have locally bounded cardinality. In particular, we immediately get the following corollary.
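The flavor of this approximability statement can be checked numerically in the simplest possible setting (an illustrative 1D toy, not the paper's Section 4 experiment): for the tridiagonal P1-FEM stiffness matrix of −u'' on (0,1), a well-separated block of A^{-1} has extremely fast singular value decay. In 1D the decay is degenerate in the best way: the inverse of an irreducible tridiagonal matrix is semiseparable, so its off-diagonal blocks have exact rank one.

```python
import numpy as np

# 1D P1-FEM stiffness matrix for -u'' = f on (0,1), uniform mesh:
# tridiagonal with 2/h on the diagonal and -1/h on the off-diagonals.
N = 128
h = 1.0 / (N + 1)
A = (np.diag(2.0 * np.ones(N))
     - np.diag(np.ones(N - 1), 1)
     - np.diag(np.ones(N - 1), -1)) / h
Ainv = np.linalg.inv(A)

# A well-separated ("admissible") block of the inverse and its
# singular values; for this 1D model the block is exactly rank one,
# so s[1]/s[0] is at the level of rounding error.
block = Ainv[:32, 96:]
s = np.linalg.svd(block, compute_uv=False)
decay = s[1] / s[0]
```

In d ≥ 2 the off-diagonal blocks of A^{-1} are no longer exactly low-rank; the content of Theorem 2.13 is precisely that their singular values still decay exponentially, uniformly over admissible blocks and non-uniform meshes.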

Proof of main result
3.1. Overview. The techniques employed in the proof of our main result are similar to those developed in [FMP15] for uniform meshes. However, some modifications are necessary to deal with the present case of non-uniform meshes T and (possibly) global basis functions ϕ_n ∈ S^{p,1}_0(T). Additionally, we simplify several parts of the previous proof considerably.
1) Before we begin the proof, we give a motivation for the assumptions made in D.2.4 and D.2.6. In Section 3.2, we present two types of meshes with locally bounded cardinality, namely uniform and graded meshes. The fact that every uniform mesh has locally bounded cardinality will be used during our proof in T.3.32. The locally bounded cardinality of graded meshes shows that T.2.13 is applicable for graded meshes in the sense of D.3.4. Then, in Section 3.3, we present a practical choice for the dual functions λ_n ∈ L^2(Ω) from D.2.6 for a common choice of basis functions ϕ_n ∈ S^{p,1}_0(T). The results from this section guarantee that T.2.13 can be used for many different types of finite element bases, including the classic hat functions.
2) The starting point for our proof is an explicit representation formula for A^{−1}. Since A^{−1} represents the act of solving the discretized model problem, it is only natural that the corresponding discrete solution operator S_T : L^2(Ω) → S^{p,1}_0(T) will be involved. Additionally, this endeavor requires the dual functions λ_n ∈ L^2(Ω) mentioned earlier. We present the explicit formula for A^{−1} at the end of Section 3.4.
3) In Section 3.5 we use this formula to go from the "matrix level" to the "function level": Initially, we reduce the problem of approximating A^{−1} as a whole to the problem of approximating A^{−1}|_{I×J} for each admissible block (I, J) ∈ P_adm. (The small blocks P_small are irrelevant in this matter.) As it turns out, this boils down to the following question: Given admissible clusters B, D ⊆ T and a free parameter L ∈ N, how can we construct a low-dimensional subspace V_{B,D,L} ⊆ L^2(Ω) that contains a good approximant of (S_T f)|_B for every f ∈ L^2(Ω) with supp_T(f) ⊆ D? More precisely, we want to achieve corresponding dimension and error bounds (for some fixed κ ≥ 1). The remaining sections will give an answer to this very question. Since the construction of V_{B,D,L} is fairly technical and by no means straightforward, the proof is split into further parts: 4) As the notation "V_{B,D,L}" already suggests, the notion of locality plays a prominent role in almost all parts of the proof. This is why we introduce so-called inflated clusters, discrete cut-off functions, and the discrete cut-off operator in Section 3.6. 5) In Section 3.7 we investigate an important class of functions for our analysis, the spaces of locally discrete harmonic functions S_harm(B) ⊆ S^{p,1}_0(T). These subspaces have three important properties: First, for certain f ∈ L^2(Ω), they contain the image S_T f. Second, they are invariant under the influence of their respective discrete cut-off operators. Third, they allow for the discrete Caccioppoli inequality, a key ingredient in deriving the asserted error bounds for V_{B,D,L}. 6) Finally, in Section 3.8 we construct the single- and multi-step coarsening operators. For any given u ∈ S_harm(B^δ) on the inflated cluster B^δ ⊇ B, the single-step coarsening operator Q_B^δ produces a "coarse" approximation Q_B^δ u ∈ S_harm(B) with a small approximation error on B.
This is by far the most intricate part of the proof and puts all the aforementioned concepts to use. Afterwards, the multi-step coarsening operator Q_B^{δ,L} is just a combination of L ∈ N single-step coarsening operators. 7) In Section 3.9 we merely put all the pieces together and finish the proof of T.2.13.

3.2. Examples of meshes with locally bounded cardinality.
In this subsection, we present two representatives of meshes with locally bounded cardinality (cf. D.2.4): uniform meshes and graded meshes. To verify the locally bounded cardinality property for a given mesh, the following lemma is helpful.
Lemma 3.1. Let T ⊆ Pow(Ω) be a shape-regular mesh as in D.2.2. Then, there hold the corresponding volume and cardinality bounds.

Definition 3.4 (Mesh graded towards Γ). Let T ⊆ Pow(Ω) be a mesh and Γ ⊆ R^d satisfy Γ ⊆ R^d \ T for all T ∈ T. Furthermore, let α ≥ 1 be a grading exponent and H > 0 a coarse mesh width. We say that T is graded towards Γ with parameters α, H, if there holds the grading condition h_T ≂ H · dist_2(x_T, Γ)^{1−1/α}. Here, x_T denotes the incenter of the element T and dist_2(x_T, Γ) = inf_{γ∈Γ} ‖x_T − γ‖_2 is the Euclidean distance between a point and a set.
The set Γ towards which the mesh is graded is usually determined by the given problem. For example, reentrant corners of the domain Ω or regions of non-smoothness of the data may entail a reduced regularity of the solution u to the model problem from Section 2.1. This usually leads to a reduced order of convergence of the finite element approximation on quasiuniform meshes. Choosing the set Γ to contain all singularities of the solution as well as choosing the parameter α correctly, one can regain the optimal order of convergence. To a large extent, the shape of Γ is irrelevant for our analysis. We only require that the mesh resolves Γ, i.e., the mesh can only be graded towards points/lines that are part of the mesh skeleton.
Lemma 3.5. Let T ⊆ Pow(Ω) be a mesh graded towards Γ with parameters α, H. Then, there hold the bounds H^α ≲ h_{min,T} and h_T ≲ H. Furthermore, T has locally bounded cardinality with σ_card = α.
Proof. We start with the bounds for h_T and h_{min,T}: For every T ∈ T, we know from D.2.2 that Ball_2(x_T, σ_shp^{−1} h_T) ⊆ T. Combined with the grading condition, this gives h_T ≲ H and ultimately h_{min,T} ≳ H^α; on the other hand, we obtain the matching upper bound. It remains to prove the locally bounded cardinality: Let B ⊆ T be arbitrary. We fix some element B ∈ B with b := dist_2(x_B, Γ) = min_{T∈B} dist_2(x_T, Γ) and abbreviate ∆b := diam_T(B). Note that there holds the bound h_B ≲ H(b + ∆b)^{1−1/α}. In the case b ≤ ∆b we obtain one lower bound for the element sizes in B, and in the remaining case b > ∆b we argue analogously. In particular, both cases lead to the asserted cardinality estimate, which concludes the proof.
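The bounds of Lemma 3.5 are easy to observe numerically. The following minimal 1D sketch (an illustration under the assumption that the grading condition takes the standard form h_T ≂ H · dist_2(x_T, Γ)^{1−1/α}; not the paper's example) builds a mesh on (0,1) graded towards Γ = {0} via the nodes x_i = (i/n)^α:

```python
import numpy as np

# 1D mesh on (0,1) graded towards Gamma = {0}: nodes x_i = (i/n)^alpha.
# Expected behaviour (Lemma 3.5): h_min ~ H^alpha and h_T <= C*H,
# and away from Gamma the widths follow h_T ~ H * dist(x_T, 0)^(1-1/alpha).
alpha, n = 2.0, 200
H = 1.0 / n                               # coarse mesh width
x = (np.arange(n + 1) / n) ** alpha       # graded nodes
widths = np.diff(x)                       # element widths h_T
h_min, h_max = widths.min(), widths.max()

# Check the grading relation at element midpoints (skip the first
# element, whose midpoint distance is on the scale of h_min itself):
mid = 0.5 * (x[1:] + x[:-1])
ratio = widths[1:] / (H * mid[1:] ** (1.0 - 1.0 / alpha))
```

Here h_min equals H^α exactly (the first element), h_max stays of order H, and the ratios are uniformly bounded above and below, i.e., the grading relation holds with mesh-independent constants.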

3) Local distinctness:
The basis functions are locally distinct in the following sense: For all n ≠ m ∈ {1, . . ., N} and all common T ∈ supp_T(ϕ_n) ∩ supp_T(ϕ_m), there holds l(n, T) ≠ l(m, T).
For each basis function ϕ_n we fix an element T_n ∈ T as in 1). Note that a standard scaling argument T ↔ T̂ readily provides, for all n ∈ {1, . . ., N}, a corresponding relation on T_n.

Next, recall that |T| ≂ h_T^d for every element T in a shape-regular mesh T. For all m ∈ {1, . . ., N}, we compute the analogous bound on T_m.
Finally, for every T ∈ T, we consider the indices ms(T) := {m | T_m = T}. Due to the duality formula from above, the system {λ_1, . . ., λ_N} ⊆ S^{p,0}(T) is linearly independent. In particular, there must hold #ms(T) ≲ 1. Now, for every x ∈ R^N and every T ∈ T, we obtain a local stability bound. Summing over all elements T ∈ T then gives the asserted global stability bound. This concludes the proof.

3.4.
A representation formula for the inverse system matrix. In this subsection, we develop a representation formula for A^{−1} in terms of three linear operators: Recall that A^{−1} represents the action of solving the discrete model problem, so there must be a fundamental connection to the discrete solution operator S_T : L^2(Ω) → S^{p,1}_0(T). Additionally, we need a way to turn coefficient vectors f ∈ R^N into functions f ∈ L^2(Ω) that can be plugged into S_T. For this purpose, we can use the dual functions λ_n ∈ L^2(Ω) from D.2.6 and the corresponding coordinate mapping Λ : R^N → L^2(Ω). Finally, the image S_T Λf ∈ S^{p,1}_0(T) must be converted back to a vector in R^N. A straightforward approach would be to use the inverse Φ^{−1} of the coordinate mapping Φ : R^N → S^{p,1}_0(T) associated with the basis functions ϕ_n ∈ S^{p,1}_0(T). But, as it turns out, it is advantageous to use the Hilbert space transpose Λ^T : L^2(Ω) → R^N instead. First, let us recall the following classic result: Lemma 3.7. The bilinear form a from D.2.1 is coercive and continuous. The precise definitions of S_T, Φ, and Λ are given in the following D.3.8.
Recall from Section 2.4 that existence and uniqueness of S_T f are provided by the Lax-Milgram Lemma. Additionally, there holds a corresponding a priori bound. Definition 3.9. Let {ϕ_1, . . ., ϕ_N} ⊆ S^{p,1}_0(T) be a basis and {λ_1, . . ., λ_N} ⊆ L^2(Ω) a dual system compliant with D.2.6. We denote the corresponding coordinate mappings by Φ and Λ. We summarize the most important properties of Φ and Λ in the following lemma. As usual, we use the notation supp(x) := {n ∈ {1, . . ., N} | x_n ≠ 0} for the support of a vector x ∈ R^N. Furthermore, recall from D.2.8 the notation T(I) ⊆ T for all abstract matrix index sets I ⊆ {1, . . ., N}.
Lemma 3.10. The Hilbert space transpose of Λ is given by the operator Λ^T : L^2(Ω) → R^N, v ↦ (⟨v, λ_n⟩_{L^2(Ω)})_{n=1}^N.
The restriction of Λ^T to the subspace S^{p,1}_0(T) ⊆ L^2(Ω) coincides with the inverse mapping Φ^{−1}. More precisely, for all x, y ∈ R^N and all v ∈ S^{p,1}_0(T), there hold the duality/inversion formulae ⟨Φx, Λy⟩_{L^2(Ω)} = ⟨x, y⟩_2, Λ^T Φx = x and Φ Λ^T v = v. Both Λ and Λ^T preserve locality: For all x ∈ R^N, v ∈ L^2(Ω) and I ⊆ {1, . . ., N}, we have the corresponding support inclusions. Proof. The operator Λ^T is indeed the Hilbert space transpose of Λ: For all v ∈ L^2(Ω) and x ∈ R^N, we compute ⟨Λ^T v, x⟩_2 = ⟨v, Λx⟩_{L^2(Ω)}. The duality formula is a direct consequence of the duality property ⟨ϕ_n, λ_m⟩_{L^2(Ω)} = δ_{nm} from D.2.6: For all x, y ∈ R^N, we have ⟨Φx, Λy⟩_{L^2(Ω)} = Σ_{n,m=1}^N x_n y_m ⟨ϕ_n, λ_m⟩_{L^2(Ω)} = Σ_{n=1}^N x_n y_n = ⟨x, y⟩_2.
From this, we immediately get the inversion formula Λ^T Φx = x as well. On the other hand, for every v ∈ S^{p,1}_0(T), there holds the converse identity. Next, we turn our attention to the preservation of locality by Λ. Finally, let v ∈ L^2(Ω) and I ⊆ {1, . . ., N}. Let κ ∈ L^∞(Ω) be a (discontinuous) cut-off function with κ ≡ 1 on T(I) and κ ≡ 0 on the remaining elements T \ T(I). Then the remaining locality statement follows, which finishes the proof.
Lemma 3.11. The system matrix A ∈ R^{N×N} from D.2.9, the discrete solution operator S_T : L^2(Ω) → S^{p,1}_0(T) from D.3.8, and the coordinate mapping Λ : R^N → L^2(Ω) from D.3.9 are related via the representation formula A^{−1} = Λ^T S_T Λ. Proof. First, we establish a relationship between A and a by means of the coordinate mapping Φ: ⟨Ax, y⟩_2 = Σ_{n,m=1}^N a(ϕ_n, ϕ_m) x_n y_m = a(Φx, Φy). Now, using the duality and inversion formulae from L.3.10, we get the required identity. This readily implies the stated representation formula.
3.5. Reduction from matrix level to function level. In this subsection, we rephrase the original matrix approximation problem as a function approximation problem. This will get rid of abstract matrix indices I ⊆ {1, . . ., N} in favor of element clusters B ⊆ T. The following lemma facilitates a reduction from the full matrix to the individual matrix blocks.
Remark 3.13. The constant O(N^2) in the upper bound is far from optimal. If one assumes a block partition P stemming from a hierarchical cluster tree T_N, then it can be reduced to O(ln N): In [Hac15, Lemma 6.5.8], the author showed the bound ‖B‖_2 ≤ C_sparse(T_{N×N}) depth(T_N) max_{(I,J)∈P} ‖B|_{I×J}‖_2 with the sparsity constant C_sparse(T_{N×N}) and the depth of the cluster tree depth(T_N). Again, due to [GHLB04], one can achieve C_sparse(T_{N×N}) ≲ 1 and depth(T_N) ≲ ln(h_{min,T}^{−1}) ≲ ln(N) with a geometrically balanced cluster tree on any mesh satisfying h_{min,T} ≳ h_T^{σ_card}.
The following lemma is the main step in shifting the original problem from matrices to function spaces. Note that the representation formula for A^{−1} from L.3.11 plays a crucial role in its proof.
Lemma 3.14. Let (I, J) ∈ P_adm and V ⊆ L^2(Ω) be a finite-dimensional subspace. Then, there exist matrices X ∈ R^{I×r} and Y ∈ R^{J×r} of size r ≤ dim V such that there holds the corresponding error bound. Proof. We use the transposed coordinate mapping Λ^T. Next, let the columns of the matrix X ∈ R^{I×r} be an l^2(I)-orthonormal basis, so that the product XX^T ∈ R^{I×I} represents the associated l^2(I)-orthogonal projection. Now, for every f ∈ R^N with supp(f) ⊆ J, we get the stated bound. We can divide both sides by ‖f‖_{l^2(J)}, take suprema and substitute f := Λf ∈ L^2(Ω). Finally, we use the support properties to get the desired result.
A thorough understanding of the preceding lemma is absolutely fundamental for the subsequent sections. Therefore, let us recall its interpretation from Section 3.1: How well can (S_T f)|_B be approximated from a low-dimensional subspace, uniformly for all source functions f ∈ L^2(Ω) with supp_T(f) ⊆ D?
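The mechanics behind Lemma 3.14 can be mimicked numerically. In the sketch below (an illustrative stand-in: the approximation subspace is taken from the SVD rather than from the coarsening construction of Section 3.8, and all names are hypothetical), an orthonormal basis X of an r-dimensional subspace yields the projection XX^T, and the factors (X, Y) give a rank-r approximation of a "matrix block" M:

```python
import numpy as np

# If an r-dimensional subspace captures the column space of a block M
# well, then X X^T (orthogonal projector onto it) and Y^T := X^T M give
# a rank-r factorization with error || M - X Y^T ||_2 = projection error.
rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.standard_normal((80, 80)))
W, _ = np.linalg.qr(rng.standard_normal((60, 60)))
s = 0.5 ** np.arange(60)                 # rapidly decaying singular values
M = U[:, :60] @ np.diag(s) @ W.T         # synthetic "admissible block"

r = 8
X = U[:, :r]                             # orthonormal basis of the subspace
Y = M.T @ X                              # so that X @ Y.T == (X X^T) M
err = np.linalg.norm(M - X @ Y.T, 2)     # here: up to rounding, s[r]
```

In the lemma, the subspace is V_{B,D,L} transported to the index level, and the exponential decay of the resulting error in the block rank r is exactly what the remaining sections establish.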
3.6. The discrete cut-off operator. The notion of cluster inflation provides a means of enlarging a given cluster by a predefined threshold with respect to the mesh metric dist_T(·, ·) from D.2.3. This is one of the core concepts in our proof and will be used extensively. We acknowledge this fact with tight notation: Definition 3.15. For every cluster B ⊆ T and every radius δ ≥ 0, we introduce the inflated cluster B^δ := {T ∈ T | dist_T(T, B) ≤ δ}. We summarize the most important facts about the mesh metric and inflated clusters in the subsequent lemma. We omit the elementary proofs, as they follow directly from the respective definitions.
Lemma 3.16. The mesh metric dist_T(·, ·) from D.2.3 defines a metric on T and satisfies a triangle-type inequality. For every element T ∈ T and every neighbor S ∈ T(T), the distance is bounded by dist_T(T, S) ≤ σ_shp h_T. On the other hand, for every S ∈ T \ {T}, we have the lower bound dist_T(T, S) ≥ σ_shp^{−1}(h_T + h_S). Additionally, for every cluster B ⊆ T, there holds h_B ≤ max{h_{min,B}, σ_shp diam_T(B)}.
When dealing with a second mesh S ⊆ Pow(Ω), cluster diameters are essentially equivalent. Lemma 3.18. The linear operator J_T has a local projection property: Given a cluster B ⊆ T and a function v ∈ L^2(Ω) with v|_{T(B)} ≡ const, there holds (J_T v)|_B = v|_B. Furthermore, J_T preserves discrete supports: For every q ≥ 0 and every v ∈ S^{q,0}(T), there holds supp_T(J_T v) ⊆ T(supp_T(v)). Moreover, J_T preserves ranges: For every v ∈ S^{1,0}(T) with 0 ≤ v ≤ 1 there also holds 0 ≤ J_T v ≤ 1. Finally, we have a corresponding stability bound. The discretized model problem a(u, v) = ⟨f, v⟩_{L^2(Ω)} was phrased in terms of global functions u, v ∈ S^{p,1}_0(T). But if we plug in a function v with local support, e.g., supp_T(v) ⊆ B for some prescribed cluster B ⊆ T, we can extract local information about u on B. This motivates the usage of discrete cut-off functions.
Given a cluster B ⊆ T and a distance δ > 0, the discrete cut-off function κ_B^δ allows us to "restrict" a function v ∈ S^{p,1}(T) to the subdomain B^δ ⊆ Ω while preserving continuity. This can be achieved by simply multiplying v with κ_B^δ. Note that the product κ_B^δ v has polynomial degree p + 1, rather than p. To mitigate this drawback, we can simply re-interpolate the result with an operator of order p. Definition 3.20. Let p ≥ 1 and denote by Î_p : C^0(T̂) → P_p(T̂) the (local) Lagrange interpolation operator on the reference element T̂. The (global) Lagrange interpolation operator I_T^p : C^0_pw(T) → S^{p,0}(T) is defined in a piecewise manner: For every v ∈ C^0_pw(T) and every T ∈ T, we set (I_T^p v)|_T via pull-back to the reference element. In order to derive a useful stability estimate for I_T^p, we use a standard inverse inequality (see, e.g., [DFG+01]). Lemma 3.21. Let k ≥ l ≥ 0, q ∈ [1, ∞] and p ≥ 0. Then, for all discrete functions v ∈ S^{p,0}(T) and all elements T ∈ T, there holds the inverse inequality. The properties of the Lagrange interpolation operator I_T^p are very similar to those of the Clément operator J_T from D.3.17. For the sake of completeness, we include them in the following lemma.
Moreover, I_T^p preserves discrete supports: For every q ≥ 0 and every v ∈ S^{q,0}(T), we have supp_T(I_T^p v) ⊆ supp_T(v). Finally, for all q ≥ 0, v ∈ S^{q,0}(T) and T ∈ T, there hold the following stability and error estimates (with constants depending on q) for all m ∈ {0, . . ., p + 1}. Proof. We briefly sketch the proof of the stability and error bounds: The mapping v ↦ ‖Î_p v‖_{L^2(T̂)} + |v|_{H^{p+1}(T̂)} defines a norm on the finite-dimensional space P_q(T̂). Hence, by norm equivalence, ‖v‖_{H^{p+1}(T̂)} ≲ ‖Î_p v‖_{L^2(T̂)} + |v|_{H^{p+1}(T̂)} for all v ∈ P_q(T̂). Inserting v := w − Î_p w for arbitrary w ∈ P_q(T̂) results in the bound ‖w − Î_p w‖_{H^{p+1}(T̂)} ≲ |w|_{H^{p+1}(T̂)}. Finally, a standard scaling argument T ↔ T̂ yields the desired error estimate on T. As for the stability bound, we perform a straightforward triangle inequality on T, reuse the already proven error bound and finish off with the inverse inequality from L.3.21.
Remark 3.23. The fact that $I^p_{\mathcal T}$ preserves global continuity and homogeneous boundary values hinges on an implicit assumption about the (local) interpolation points used by the local Lagrange interpolation operator $\widehat I_p$. Recall from D.2.2 that the reference element $\widehat T \subseteq \mathbb R^d$ is a simplex and thus delimited by $d+1$ hyperplanes. The interpolation points on each hyperplane $\widehat E$ must be unisolvent for the space $\mathcal P_p(\widehat E)$. Then, in particular, every polynomial $v \in \mathcal P_p(\widehat T)$ vanishing at the interpolation points in $\widehat E$ must already vanish everywhere on $\widehat E$. This property readily implies that homogeneous boundary values are preserved by the global operator $I^p_{\mathcal T}$. Finally, the distribution of interpolation points on each hyperplane $\widehat E$ must be "symmetric": more precisely, if two elements $T_1, T_2 \in \mathcal T$ share a common hyperplane, we require the corresponding interpolation points to align perfectly. In this case, the same argument as before shows that the operator $I^p_{\mathcal T}$ indeed preserves global continuity.
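The unisolvence argument of R.3.23 can be checked numerically in a small hypothetical example: for $p = 2$ on a reference triangle, the three interpolation points on the edge $\{y = 0\}$ (the two vertices and the edge midpoint) determine the quadratic trace uniquely, so any bivariate quadratic vanishing at these three points vanishes on the whole edge.

```python
import numpy as np

# degree-2 interpolation points on the edge y = 0 of the reference triangle:
# the vertices (0,0), (1,0) and the edge midpoint (0.5, 0)
edge_pts = np.array([0.0, 0.5, 1.0])

rng = np.random.default_rng(1)
# random bivariate quadratic v(x, y) = a0 + a1 x + a2 y + a3 x^2 + a4 x y + a5 y^2
a = rng.standard_normal(6)
v = lambda x, y: a[0] + a[1]*x + a[2]*y + a[3]*x**2 + a[4]*x*y + a[5]*y**2

# force v to vanish at the three edge points by subtracting the univariate
# quadratic interpolant of its trace t -> v(t, 0); w is still a quadratic
trace_coeffs = np.polyfit(edge_pts, v(edge_pts, 0.0), 2)
w = lambda x, y: v(x, y) - np.polyval(trace_coeffs, x)

# unisolvence on the edge: w vanishes at the 3 edge points, hence everywhere on it
t = np.linspace(0, 1, 7)
assert np.allclose(w(t, 0.0), 0.0)
```

This is precisely the mechanism by which homogeneous boundary values survive interpolation: the trace of the interpolation error on a boundary facet is itself a polynomial of degree $p$ vanishing at a unisolvent point set.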
Proof. The inclusion $S_{\mathrm{harm}}(B^+) \subseteq S_{\mathrm{harm}}(B)$ follows directly from the definition of the spaces, since every test function supported in $B$ is also supported in $B^+$. As for the mapping properties of $S_{\mathcal T}$, let $f \in L^2(\Omega)$ with $\mathrm{supp}_{\mathcal T}(f) \subseteq D$. Then, for every $v \in S^{p,1}_0(\mathcal T)$ with $\mathrm{supp}_{\mathcal T}(v) \subseteq B$, a direct computation shows that $S_{\mathcal T} f$ satisfies the defining orthogonality relation of $S_{\mathrm{harm}}(B)$. Finally, consider a function $u \in S_{\mathrm{harm}}(B)$ and an arbitrary $v \in S^{p,1}_0(\mathcal T)$ with $\mathrm{supp}_{\mathcal T}(v) \subseteq B$. The same computation gives $K^\delta_B u \in S_{\mathrm{harm}}(B)$, which concludes the proof.
Next, we turn our attention to the discrete Caccioppoli inequality. In a nutshell, it allows us to bound an $H^1$-norm on a cluster $B \subseteq \mathcal T$ by an $L^2$-norm on the slightly larger cluster $B^\delta$. Obviously, this can only be true on a certain subspace $V \subseteq S^{p,1}(\mathcal T)$; in our setting, this is the space of locally discrete harmonic functions $S_{\mathrm{harm}}(B^\delta)$ from D.3.26. We can interpret the discrete Caccioppoli inequality as an improved version of the inverse inequality from L.3.21, which bounds an $H^1$-seminorm by an $L^2$-norm, too. This time, however, the prefactor $h$ of the $H^1$-seminorm can be increased to a (possibly much) bigger parameter $\delta \geq h$.
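As a numerical plausibility check (a hypothetical 1D model problem, not part of the paper's setting), one can verify that for a locally discrete harmonic $P^1$ function the ratio $\lvert v \rvert_{H^1(B)} / (\delta^{-1} \lVert v \rVert_{L^2(B^\delta)})$ stays bounded under mesh refinement, whereas the plain inverse inequality would only give a bound deteriorating like $h^{-1}$. Here $B = [0.4, 0.6]$ and $\delta = 0.2$ are arbitrary illustrative choices.

```python
import numpy as np

def caccioppoli_ratio(n, delta=0.2):
    """Build a locally discrete harmonic P1 function on a uniform mesh of [0,1]
    and return |v|_{H1(B)} / ((1/delta) * ||v||_{L2(B^delta)}) with B = [0.4, 0.6]."""
    h = 1.0 / n
    x = np.linspace(0, 1, n + 1)
    rng = np.random.default_rng(0)
    v = rng.standard_normal(n + 1)                 # arbitrary values outside B^delta
    inner = np.where((x > 0.4 - delta) & (x < 0.6 + delta))[0]
    inner = inner[(inner > 0) & (inner < n)]
    # enforce the discrete Laplace equation -v[i-1] + 2 v[i] - v[i+1] = 0
    # at all nodes inside B^delta (dense solve of a small tridiagonal system)
    m = len(inner)
    A = 2*np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)
    b = np.zeros(m)
    b[0]  = v[inner[0] - 1]                        # "boundary" data just outside
    b[-1] = v[inner[-1] + 1]
    v[inner] = np.linalg.solve(A, b)
    # elementwise norms of the P1 function
    dv = np.diff(v) / h                            # constant gradient per element
    mid = 0.5 * (x[:-1] + x[1:])                   # element midpoints
    inB  = (mid > 0.4) & (mid < 0.6)
    inBd = (mid > 0.4 - delta) & (mid < 0.6 + delta)
    h1_B = np.sqrt(np.sum(dv[inB]**2 * h))
    a, c = v[:-1], v[1:]                           # exact L2 norm of a linear: h/3*(a^2+ac+c^2)
    l2_Bd = np.sqrt(np.sum((h/3*(a*a + a*c + c*c))[inBd]))
    return h1_B / (l2_Bd / delta)

# the ratio stays bounded as the mesh is refined, in contrast to the inverse
# inequality, whose prefactor would blow up like 1/h
ratios = [caccioppoli_ratio(n) for n in (50, 100, 200, 400)]
print(ratios)
```

Of course, this only illustrates the statement in the simplest possible setting; the actual proof has to handle arbitrary shape regular meshes and higher polynomial degrees.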
In Figure 1, we chose $N \approx 2{,}000$ degrees of freedom. The elements are graded towards the re-entrant corner with a grading exponent $\alpha = 5$. The cluster tree $\mathbb T_N$ is clearly deeper near the grading center. The block partition $P$ uses sorted indices internally. Only a few admissible blocks lie far away from the diagonal, while many small blocks agglomerate along the diagonal. The sparsity pattern becomes more pronounced as $N \to \infty$. In Figure 2, we chose $N \approx 72{,}000$ degrees of freedom. The computable error bound from above (for $r \in \{1, \dots, 50\}$) is depicted on a linear abscissa and a logarithmic ordinate. The values lie below a straight line with slope $-0.37$, indicating an exponential decay $\mathrm{error}(r) \lesssim 10^{-0.37 r}$. This is even better than the asserted bound from T.2.13. The allocated memory in MBytes is plotted on a linear abscissa and a linear ordinate. The values lie below a straight line with slope $103.57$, indicating at most linear growth $\mathrm{memory}(r) \lesssim r$. Choosing a rank bound $r = 37$, for example, gives an approximation error $\approx 10^{-14}$ and uses $\approx 4.2$ GBytes of memory, whereas the full system matrix takes $\approx 41.4$ GBytes.
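The exponential decay of the blockwise error can be reproduced qualitatively in a much smaller, hypothetical experiment: for a 2D finite-difference Laplacian (a stand-in for the FEM stiffness matrix, not the paper's graded-mesh setup), the singular values of an off-diagonal block of the inverse corresponding to two well-separated clusters decay rapidly in the rank, which is exactly the low-rank structure the $\mathcal H$-matrix format exploits.

```python
import numpy as np

# 2D Laplace (5-point stencil) on an n x n interior grid of the unit square
n = 24
h = 1.0 / (n + 1)
I = np.eye(n)
T = 2*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
A = (np.kron(I, T) + np.kron(T, I)) / h**2    # stiffness-like matrix, size n^2

Ainv = np.linalg.inv(A)

# two well-separated index clusters (an admissible block): unknowns in the
# left quarter vs unknowns in the right quarter of the square
xs = np.repeat(np.arange(1, n + 1) * h, n)    # x-coordinate of each unknown
tau   = np.where(xs < 0.25)[0]
sigma = np.where(xs > 0.75)[0]

# singular values of the off-diagonal block of the inverse
s = np.linalg.svd(Ainv[np.ix_(tau, sigma)], compute_uv=False)
rel = s / s[0]
print(rel[:8])    # relative singular values drop by orders of magnitude
```

The relative singular value $\mathrm{rel}[r]$ bounds the best rank-$r$ approximation error of the block in the spectral norm, so rapid decay here corresponds directly to the exponential rate observed in Figure 2.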
Both estimates follow from the relation $\sum_{T \in B} h_T^d \sim \sum_{T \in B} |T| = |\bigcup B|$ with an appropriate $B \subseteq \mathcal T$. Definition 3.2. A mesh $\mathcal T \subseteq \mathrm{Pow}(\Omega)$ is called uniform if there exists a constant $\sigma_{\mathrm{unif}} \geq 1$ such that $h_{\min,\mathcal T} \leq h_T \leq \sigma_{\mathrm{unif}}\, h_{\min,\mathcal T}$ for all $T \in \mathcal T$. Using L.3.1, we immediately get the following result: Lemma 3.3. Every uniform mesh $\mathcal T \subseteq \mathrm{Pow}(\Omega)$ has locally bounded cardinality with $\sigma_{\mathrm{card}} = 1$.
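The distinction between uniform and merely shape regular meshes is easy to see on an algebraically graded 1D mesh $x_i = (i/N)^\alpha$ (a standard textbook example, not taken from the paper's experiments): the global ratio $h_{\max}/h_{\min}$ grows like $N^{\alpha-1}$, so no $N$-independent $\sigma_{\mathrm{unif}}$ exists, while the ratio of neighbouring element sizes stays bounded.

```python
import numpy as np

def graded_mesh(N, alpha):
    """1D mesh of [0, 1] algebraically graded towards 0: x_i = (i/N)^alpha."""
    return (np.arange(N + 1) / N) ** alpha

for N in (100, 1000, 10000):
    h = np.diff(graded_mesh(N, alpha=2))
    sigma_unif = h.max() / h.min()        # global ratio: grows like N^(alpha-1)
    sigma_loc = np.max(h[1:] / h[:-1])    # neighbour ratio: stays bounded (~3 here)
    print(N, round(sigma_unif), round(sigma_loc, 3))
```

For $\alpha = 2$ one computes $h_i = (2i-1)/N^2$, so the global ratio is $2N - 1$ while the worst neighbour ratio is $3$, attained at the grading center; this is the regime covered by the paper's shape regularity assumptions but excluded by D.3.2.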

Finally, consider clusters $B \subseteq C \subseteq \mathcal T$ and inflation radii $\delta, \varepsilon \geq 0$. Then $B \subseteq B^\delta \subseteq (B^\delta)^\varepsilon \subseteq B^{\delta+\varepsilon} \subseteq C^{\delta+\varepsilon}$. For the cluster patch $\mathcal T(B)$ we have the inclusion $\mathcal T(B) \subseteq B^{\sigma_{\mathrm{shp}} h_B}$. We conclude this summary with the bounds $\mathrm{diam}_{\mathcal T}(B^\delta) \leq \mathrm{diam}_{\mathcal T}(B) + 2\delta$ and $h_{B^\delta} \leq \max\{h_B, \sigma_{\mathrm{shp}}\, \delta\}$.

For the construction of the cut-off function $\kappa^\delta_B$ in L.3.19, we will use a variant of the classic Clément operator [Clé75]. Definition 3.17. Let $\mathcal N \subseteq \overline\Omega$ be the nodes of the mesh $\mathcal T$ and denote by $\{b_N \mid N \in \mathcal N\} \subseteq S^{1,1}(\mathcal T)$ the well-known hat functions, i.e., $b_N(M) = \delta_{NM}$ for all $M \in \mathcal N$. We write $\langle v \rangle_T := |T|^{-1} \int_T v \,\mathrm dx \in \mathbb R$ for the mean value of a function $v \in L^2(\Omega)$ on an element $T \in \mathcal T$. Now, the Clément operator $J_{\mathcal T} : L^2(\Omega) \longrightarrow S^{1,1}(\mathcal T)$ is defined in a nodewise fashion: for every $v \in L^2(\Omega)$, we set $J_{\mathcal T} v := \sum_{N \in \mathcal N} \beta_N b_N$, where the nodal value $\beta_N$ is built from the mean values $\langle v \rangle_T$ on the elements of the patch of $N$.
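A minimal 1D sketch of a Clément-type operator, assuming the common variant in which $\beta_N$ is the average of the element mean values $\langle v \rangle_T$ over the patch of $N$ (the paper's precise choice of $\beta_N$ may differ):

```python
import numpy as np

def clement_1d(x, v):
    """P1 Clement-type quasi-interpolant on the 1D mesh with nodes x.
    Nodal value beta_N = average of the element mean values of v over the
    patch of node N (one common variant of the operator)."""
    a, b = x[:-1], x[1:]
    # element mean values via Simpson's rule (exact for cubics)
    means = (v(a) + 4*v(0.5*(a + b)) + v(b)) / 6
    beta = np.empty_like(x)
    beta[0], beta[-1] = means[0], means[-1]      # boundary patch: one element
    beta[1:-1] = 0.5*(means[:-1] + means[1:])    # interior patch: two elements
    return beta

x = np.linspace(0, 1, 11)
# constants are reproduced everywhere; linears at all interior nodes (uniform mesh)
assert np.allclose(clement_1d(x, lambda t: 0*t + 3.0), 3.0)
assert np.allclose(clement_1d(x, lambda t: t)[1:-1], x[1:-1])
```

Note that, in contrast to Lagrange interpolation, the operator is well defined for arbitrary $L^2$ functions, but it need not preserve boundary values, as the boundary nodes above illustrate.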

Figure 2. Approximation error and memory allocation for $N \approx 72{,}000$ degrees of freedom.