1 Introduction

Adaptive spectral (AS) decompositions have been proposed as low-dimensional search spaces during the iterative solution of inverse medium problems [1,2,3,4,5]. For piecewise constant media, in particular, AS decompositions have proved remarkably efficient and accurate. So far, however, these approximation properties have been supported only by numerical evidence. Here, starting from [5], we derive \(L^2\)-error estimates for AS approximations of piecewise constant functions.

In [1], De Buhan and Osses proposed to restrict the search space of an inverse medium problem to the span of a small basis of eigenfunctions of a judicious elliptic operator, repeatedly adapted during the nonlinear iteration. Their adaptive inversion approach relies on a decomposition

$$\begin{aligned} v=\sum _{k=1}^\infty \beta _k\varphi _k , \end{aligned}$$
(1.1)

for \(v\in W^{1,\infty }_0(\Omega )\), with \(\Omega \subset \mathbb {R}^d\). Here each \(\varphi _k\) is an eigenfunction of a v-dependent, linear, symmetric, and elliptic operator \(L_\varepsilon [v]\), i.e.,

$$\begin{aligned} L_\varepsilon [v]\varphi _k =\lambda _k \varphi _k \quad \text {in }\Omega , \qquad \varphi _k =0 \quad \text {on }\partial \Omega , \end{aligned}$$
(1.2)

for an eigenvalue \(\lambda _k\in \mathbb {R}\). In the sequel we shall in fact apply the AS decomposition to more general functions in \(W^{1,\infty }(\Omega )\) by extending their boundary data appropriately into the interior of \(\Omega \); here, for simplicity, we suppose \(v\in W^{1,\infty }_0(\Omega )\).

Clearly, the choice of \(L_\varepsilon [v]\) is crucial for obtaining an efficient approximation of v with as few basis functions as possible. Typically, we use

$$\begin{aligned} L_\varepsilon [v]w=-{\nabla \cdot }\left( \mu _{\varepsilon }[v]\nabla w\right) , \qquad \mu _{\varepsilon }[v](x)=\frac{1}{\sqrt{|\nabla v(x)|^2+\varepsilon ^2}} \, , \end{aligned}$$
(1.3)

where \(\varepsilon >0\) is a small parameter to avoid division by zero, but other forms have also been used in the past and are treated by our analysis.

Note that we cannot apply the above AS decomposition directly to piecewise constant u, because then \(\mu _\varepsilon [u]\) is a nonlinear function of the gradient \(\nabla u\), a singular measure at the jump set of u. Nevertheless, we may still decompose u at the cost of an additional step. We first approximate u by a more regular approximation, which we denote generically by \(u_\delta \), where \(\delta >0\) is a parameter that controls the error and is proportional to the width of the support of \(\nabla u_\delta \) near the jump discontinuities of u. Then we may expand u (or \(u_\delta \)) in the spectral basis of \(L_\varepsilon [u_\delta ]\) and obtain an approximate decomposition of u (or \(u_\delta \)) by truncating the expansion. Typically, \(u_\delta \) corresponds to the standard, continuous, piecewise polynomial FE interpolant of u on a regular triangulation with mesh size \(\delta =h\). Then, the eigenfunctions \(\varphi _k\) may correspond either to the (true continuous) eigenfunctions of \(L_\varepsilon [u_\delta ]\) or to their (discrete approximate) Galerkin FE counterparts, as our analysis encompasses both the continuous and the discrete setting.

Insight about the AS decomposition approach may be obtained from its connection to the total variation (TV) functional, which is commonly used for image denoising while preserving edges. In fact, \(L_\varepsilon [v]v\), with \(L_\varepsilon [v]\) given by (1.3), is the Fréchet derivative of the penalized TV functional—see [3, Remark 1]. The eigenvalue problem for \(L_\varepsilon [v]\) also bears a striking resemblance to nonlinear eigenvalue problems for the TV functional, which have been studied in the more general context of 1-homogeneous functionals for image processing—see [6,7,8] and the references therein.

The AS decomposition has been used as follows in various iterative Newton-like algorithms for the solution of inverse medium problems [2,3,4]: Given an approximation of the medium, \(u^{(m-1)}\), from the previous iteration, the approximation \(u^{(m)}\) at the current iteration is set as the minimizer of the misfit in the space \({\text {span}}(\varphi _k)_{k=1}^K\), where \(\varphi _k\), \(k=1,\ldots ,K\), satisfy (1.2) with \(v=u^{(m-1)}\). As the approximation \(u^{(m)}\) changes from one iteration to the next, so does the search space used for the subsequent minimization.
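The adaptive iteration just described can be sketched in a few lines. The following toy 1-D example is our own illustration, not the authors' implementation: a simple Gaussian blurring matrix `G` stands in for the PDE-constrained misfit, the adapted eigenbasis is built by finite differences on interior nodes, and each step minimizes the misfit in the span of the first eigenfunction. All names and parameter values are illustrative.

```python
import numpy as np
from scipy.linalg import eigh, lstsq

n, eps = 200, 1e-3
x = np.linspace(0.0, 1.0, n)
h = x[1] - x[0]
u_true = ((x > 0.3) & (x < 0.6)).astype(float)    # one inclusion, K = 1

# Toy linear forward operator: a Gaussian blur standing in for the PDE misfit.
G = np.exp(-((x[:, None] - x[None, :]) / 0.05) ** 2)
G /= G.sum(axis=1, keepdims=True)
data = G @ u_true                                  # synthetic observations

def eigenbasis(v, K):
    """First K Dirichlet eigenfunctions of L_eps[v], discretized on interior nodes."""
    mu = 1.0 / np.sqrt((np.diff(v) / h) ** 2 + eps**2)   # weight (1.3), per cell
    main = (mu[:-1] + mu[1:]) / h**2                     # tridiagonal stencil
    off = -mu[1:-1] / h**2
    L = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    Phi = np.zeros((n, K))
    Phi[1:-1] = eigh(L)[1][:, :K]                  # zero Dirichlet trace
    return Phi

u = np.zeros(n)                                    # initial guess u^(0)
for _ in range(5):
    Phi = eigenbasis(u, K=1)                       # adapt the search space
    beta = lstsq(G @ Phi, data)[0]                 # minimize the misfit in the span
    u = Phi @ beta                                 # u^(m)

residual = np.linalg.norm(G @ u - data)
```

As the iteration proceeds, the weight becomes small where the current iterate is steep, so the first eigenfunction develops plateaus matching the inclusion and the data misfit shrinks.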

By combining the adaptive inversion process with the TRAC (time reversed absorbing condition) approach, de Buhan and Kray [2] developed an effective solution strategy for time-dependent inverse scattering problems. In [3], Grote, Kray and Nahum proposed the AEI (adaptive eigenspace inversion) algorithm for inverse scattering problems in the frequency domain. In [4], the AEI algorithm was extended to multi-parameter inverse medium problems. Recently, it was extended to electromagnetic inverse scattering problems at fixed frequency [9] and also to time-dependent inverse scattering problems when the illuminating source is unknown [10]. In [11], AS decompositions were used for solving 2-D and 3-D seismic inverse problems for the Helmholtz equation. First theoretical estimates for AS decompositions together with an approach for adapting the dimension of the search space were derived in [5].

When u consists of a sum of K characteristic functions \(\chi _{A^k}\) of sets \(A^k\), each compactly contained in \(\Omega \), the expansion (1.1) in the spectral basis of \(L_\varepsilon [u_\delta ]\) truncated after K terms has proved remarkably accurate, as it essentially recovers u and in fact decomposes u into the characteristic functions comprising it. In [5], it is shown that the gradients of the first K eigenfunctions of \(L_\varepsilon [u_\delta ]\) are small away from the discontinuities of u. Thus, in regions where u is constant, \(\varphi _1,\ldots ,\varphi _K\) are also nearly constant and we expect that in their span, \(\Phi _K^{\varepsilon ,\delta }={\text {span}}\{\varphi _k\}_{k=1}^K\), u together with each of the characteristic functions comprising it can be well approximated. Here, our goal is to rigorously prove this proposition, even in the more general situation where u is not necessarily constant near the boundary \(\partial \Omega \).

Starting from [5], we derive \(L^2\) error estimates for the projection of any \(v\in u+{\text {span}}\{\chi _{A^k}\}\) onto the appropriate affine subspace. In our main result, given by Theorem 3.6, we prove that the \(L^2\)-projection error of v (in particular of u itself) is bounded by \(\mathcal {O}(\sqrt{\varepsilon +\delta })\). Similarly, we show that any of the K characteristic functions \(\chi _{A^k}\) is approximated by its \(L^2\)-projection on \(\Phi _K^{\varepsilon ,\delta }\) up to \(\mathcal {O}(\sqrt{\varepsilon +\delta })\). In Corollary 3.8, we particularize our estimates for two standard methods for obtaining \(u_\delta \). Our analysis treats both continuous AS formulations and their discrete Galerkin approximations. In particular, our results apply when \(u_\delta \) is a continuous, piecewise polynomial interpolant of u in a FE space \(V_h\) with mesh size \(h=\delta \), and the eigenfunctions \(\varphi _k\) are computed numerically by a Galerkin FE approximation in the same subspace.

The remainder of the paper is organized as follows. In Sect. 2, we describe the class of piecewise constant functions considered, provide definitions and introduce notation. Section 3 contains the analysis and the main results of the paper. Finally, we present in Sect. 4 various numerical examples which illustrate the accuracy of the AS decomposition for functions that either do, or do not, satisfy the assumptions of our theory. There we also illustrate the usefulness of the AS decomposition for the solution of a standard linear inverse problem from image deconvolution.

2 Notation and Definitions

The adaptive spectral (AS) decomposition (1.1) of a function v is based on the spectral decomposition of the v-dependent elliptic operator \(L_\varepsilon [v]\) given by

$$\begin{aligned} L_\varepsilon [v]w=-{\nabla \cdot }\left( \mu _{\varepsilon }[v]\nabla w\right) . \end{aligned}$$
(2.1)

Typically, the weight function \(\mu _\varepsilon [v]\) has the form of either

$$\begin{aligned} \mu _\varepsilon [v](x) =\ \frac{1}{(|\nabla v(x)|^q+\varepsilon ^q)^{1/q}} \, , \end{aligned}$$
(2.2)

for some \(q\in [1,\infty )\), or

$$\begin{aligned} \mu _\varepsilon [v](x) =\ \frac{1}{\max \{|\nabla v(x)|,\, \varepsilon \}} \, . \end{aligned}$$
(2.3)

For the analysis below, however, we allow for more general \(\mu _\varepsilon [v]\).
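Both standard choices can be checked numerically against the assumptions (2.20)–(2.21) stated in Sect. 2.3. The following sketch (ours, with \(q=2\) in (2.2)) verifies the required properties on a sample grid:

```python
import numpy as np

eps = 1e-3
t = np.linspace(0.0, 50.0, 50001)

mu_q = 1.0 / (t**2 + eps**2) ** 0.5        # form (2.2) with q = 2
mu_max = 1.0 / np.maximum(t, eps)          # form (2.3)

for mu in (mu_q, mu_max):
    assert np.isclose(mu[0], 1.0 / eps)       # hat{mu}_eps(0) = 1/eps
    assert np.all(mu > 0.0)                   # positivity
    assert np.all(t * mu <= 1.0 + 1e-12)      # t * hat{mu}_eps(t) <= 1
    assert np.all(np.diff(mu) <= 1e-12)       # non-increasing
    assert np.all((t * mu)[t >= 1.0] >= 0.9)  # lower bound for large t
```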

Here our goal is to study the application of AS decompositions to regular approximations of piecewise constant functions. Indeed, for a piecewise constant function u, \(\mu _\varepsilon [u]\) is a nonlinear function of the gradient \(\nabla u\), a singular measure at the jump set of u. Nevertheless, we may still decompose u at the cost of an additional step. We first approximate u by a more regular approximation, which we denote generically by \(u_\delta \), where \(\delta \) is a parameter that controls the error and is proportional to the width of the support of \(\nabla u_\delta \) near the jump discontinuities of u. Then we may expand u (or \(u_\delta \)) in the spectral basis of \(L_\varepsilon [u_\delta ]\), be it finite- or infinite-dimensional, and obtain an approximation by truncating the expansion. One important example of a method for obtaining \(u_\delta \) is the standard, continuous, piecewise polynomial interpolant of u in an \(H^1\)-conforming finite element (FE) space with underlying mesh size \(\delta =h\).

To include FE approximations in the analysis, we formulate boundary value problems in closed subspaces \(\mathcal {V}^\delta \subset H^1(\Omega )\) and \(\mathcal {V}^\delta _0=\mathcal {V}^\delta \cap H^1_0(\Omega )\). Hence, in the continuous setting \(\mathcal {V}^\delta = H^1(\Omega )\), independently of \(\delta \), whereas in the discrete FE setting \(\mathcal {V}^\delta \subsetneq H^1(\Omega )\) corresponds to the finite-dimensional FE space with underlying mesh size \(\delta =h\). As a consequence, all our results below are valid both for the continuous and the discrete setting and, in particular, for \(H^1\)-conforming FE approximations. We let \(\left\langle \cdot ,\cdot \right\rangle \) and \(\Vert \cdot \Vert _{L^2(\Omega )}\) denote the standard inner product and norm of \(L^2(\Omega )\), and \(|\cdot |\) denote the \(\ell ^2\)-norm. We use C, \(C_1,C_2\), etc. to denote generic constants which may depend on u, but are independent of \(\delta \) and \(\varepsilon \); their values may also vary depending on the context. We sometimes use the term “medium” to refer to functions on the domain of interest \(\Omega \subset \mathbb {R}^d\).

In the remainder of this section, we introduce notation, assumptions and definitions needed for our approximation theory in Sect. 3. Section 2.1 precisely defines the class of piecewise constant functions u to be decomposed. In Sect. 2.2, we introduce admissible approximation methods for obtaining \(u_\delta \) and provide examples of two standard methods which are admissible. In Sect. 2.3, we state our assumptions on the medium-dependent weight function \(\mu _\varepsilon [\cdot ]\), and in Sect. 2.4 we state the boundary-value problems defining the spectral basis of \(L_\varepsilon [u_\delta ]\) and the \(L_\varepsilon [u_\delta ]\)-lifting, \(\varphi _0\), of the boundary data of \(u_\delta \) into \(\Omega \).

2.1 Piecewise Constant Medium

Consider \(u:\Omega \rightarrow \mathbb {R}\) piecewise constant, where \(\Omega \subset \mathbb {R}^d\), with \(d\ge 2\), is a bounded Lipschitz domain. We assume u has the form

$$\begin{aligned} u(x) = u^0(x)+\widetilde{u}(x) , \qquad x\in \Omega , \end{aligned}$$
(2.4)

where the background \(u^0\) and the interior inclusions \(\widetilde{u}\) are given by

$$\begin{aligned} u^0 = \sum _{m=1}^{M} \omega _{m}\chi _{\Omega ^{m}}\, , \quad \omega _{m}\in \mathbb {R}, \qquad \widetilde{u} = \sum _{k=1}^{K} \alpha _{k}\chi _{A^k}\, , \quad \alpha _{k}\in \mathbb {R}\setminus \{0\} , \end{aligned}$$
(2.5)

with \(\chi _{A}\) denoting the characteristic function of a set \(A\subset \mathbb {R}^d\). In the decomposition (2.4) we distinguish the sets \(\Omega ^m\) connected to the boundary \(\partial \Omega \) from those that are not. We suppose the sets \(\Omega ^{1},\ldots ,\Omega ^{M}\) characterizing the background \(u^0\) are disjoint Lipschitz domains covering \(\Omega \),

$$\begin{aligned} \overline{\Omega } = \bigcup _{m=1}^M \overline{\Omega ^{m}} , \end{aligned}$$

and for each m, \(\partial \Omega ^m\cap \partial \Omega \) is open in (the relative topology of) \(\partial \Omega \), i.e.,

$$\begin{aligned} \Omega ^m = \Omega \cap \widetilde{\Omega }^m, \qquad \partial \Omega \cap \widetilde{\Omega }^m \ne \emptyset , \end{aligned}$$

for bounded, pairwise disjoint Lipschitz domains \(\widetilde{\Omega }^m\subset \mathbb {R}^d\). Moreover, we suppose \(A^{1},\ldots ,A^{K}\) are Lipschitz domains with mutually disjoint boundaries such that for each k, the boundary \(\partial A^{k}\) of \(A^{k}\) is connected, and \(A^{k}\subset \subset \Omega ^{m}\) for some \(m=1,\ldots ,M\). Hence \(\Omega \) is partitioned into finitely many subdomains \(\Omega ^{m}\) adjacent to its boundary \(\partial \Omega \), while each \(\Omega ^{m}\) may contain one or several inclusions \(A^k\) isolated from \(\partial \Omega \); Fig. 1 illustrates a possible configuration in two dimensions.

Fig. 1  Typical configuration in two dimensions. In this example \(K=3\) and \(M=4\). The frame on the left shows the sets \(A^1\), \(A^2\) and \(A^3\), and the frame on the right shows \(B^1=A^1\), \(B^2=A^2\setminus A^3\), and \(B^3=A^3\)

Note that u given by (2.4) is defined only a.e. in \(\Omega \). This will be significant only in Sect. 2.2, where we discuss admissible approximations of u; in the rest of the paper it causes no ambiguity, since we always consider u as an element of \(L^2(\Omega )\).

2.2 Admissible Approximation

To employ the estimates derived in [5], we assume \(u_\delta \) is obtained by an admissible method, i.e., by a method satisfying the following.

Definition 2.1

Consider a family of linear transformations \(\mathcal {I}_\delta :L^2(\Omega )\rightarrow \mathcal {V}^\delta \subset H^1(\Omega )\), with \(\delta >0\) ranging over some index set. We say that \(\{\mathcal {I}_\delta \}_{\delta }\) is an admissible method if, for every Lipschitz domain \(A\subset \Omega \), the following conditions are satisfied:

  1.
    $$\begin{aligned} \lim _{\delta \rightarrow 0} \Vert \mathcal {I}_\delta \chi _{A}-\chi _{A}\Vert _{L^2(\Omega )}=0 . \end{aligned}$$
    (2.6)
  2.
    $$\begin{aligned} \nabla (\mathcal {I}_\delta \chi _{A})\in L^\infty (\Omega ), \quad {\text {supp}}\!\big (\nabla (\mathcal {I}_\delta \chi _{A})\big ) \subset \overline{U_\delta } , \end{aligned}$$
    (2.7)

    where

    $$\begin{aligned} U_\delta = \big \{x\in \Omega \, |\ {\text {dist}}(x, \partial A\cap \Omega )<\delta \big \} , \end{aligned}$$

    with \({\text {dist}}(x,W)\) denoting the distance of \(x\in \mathbb {R}^d\) to the set \(W\subset \mathbb {R}^d\).

  3.

    There exists a constant C, such that for every \(\delta >0\) sufficiently small,

    $$\begin{aligned} \delta \Vert \nabla \mathcal {I}_\delta \chi _A \Vert _{L^\infty (\Omega )} \ \le \ C . \end{aligned}$$
    (2.8)
  4.

    If \(\Gamma \subset \partial \Omega \setminus \partial U_\delta \) with positive \((d-1)\)-dimensional Hausdorff measure, \(\mathcal {H}^{d-1}(\Gamma )>0\), then the trace of \(\chi _A\) on \(\Gamma \) coincides with that of \(\mathcal {I}_\delta \chi _A\).

Hence, for convenience, we shall say that \(u_\delta \) obtained by an admissible method is an admissible approximation of u. By Definition 2.1, we have

$$\begin{aligned} u_\delta \ = \ u^0_\delta +\widetilde{u}_\delta , \qquad u^0_\delta = \mathcal {I}_\delta u^0 \in \mathcal {V}^\delta , \quad \widetilde{u}_\delta = \mathcal {I}_\delta \widetilde{u} \in \mathcal {V}^\delta _0 . \end{aligned}$$
(2.9)

In addition, by (2.7), we have \(\nabla u_\delta =0\) in the open complement

$$\begin{aligned} D_\delta = \Omega \setminus \overline{\mathcal {M}_\delta } , \end{aligned}$$
(2.10)

of the \(\delta \)-wide neighborhood \(\mathcal {M}_\delta \) of all interfaces,

$$\begin{aligned} \mathcal {M}_\delta = \bigcup _{k=1}^K \big \{x\in \Omega :\ {\text {dist}}(x,\partial A^{k})<\delta \big \} \cup \bigcup _{m=1}^M \big \{x\in \Omega :\ {\text {dist}}(x,\partial \Omega ^{m}\cap \Omega )<\delta \big \} . \end{aligned}$$
(2.11)

By (2.8), there exists a constant C (which depends on u), such that for every \(\delta >0\) sufficiently small, \(u_\delta \) satisfies

$$\begin{aligned} \delta \Vert \nabla u_\delta \Vert _{L^\infty (\Omega )} \le C . \end{aligned}$$
(2.12)

Next we provide two examples [5, Corollary 6] of standard methods which are admissible.

Proposition 2.2

Let u be extended to \(\overline{\Omega }\) either by assigning to u at any x on the interfaces in \(\overline{\Omega }\) one of its values in a neighboring domain \(A^k\) or \(\Omega ^m\), or by replacing \(A^k\) and \(\Omega ^m\) in (2.5) by \(\overline{A^k}\) and \(\overline{\Omega ^m}\). For \(h>0\), let \(V_h\) denote an \(H^1\)-conforming \(\mathcal {P}^r\)-FE space associated with a simplicial mesh \(\mathcal {T}_h\) with mesh size h. If the family of meshes \(\{\mathcal {T}_h\}_{h}\) is regular and quasi-uniform (see, e.g., [12]), then the FE-interpolant \(u_h\) of u in \(V_h\) is admissible.

Proof

To prove the proposition, we have to verify the conditions of Definition 2.1. Most of them, namely linearity and conditions 1, 2 and 4, are clearly satisfied. The only condition which requires careful attention is 3, i.e., (2.8). The argument of the proof is similar to that of standard inverse inequalities.

Let \(u_h\) be the FE-interpolant of the characteristic function \(u=\chi _A\) of some set \(A\subset \Omega \). In every element \(\mathcal {K}\in \mathcal {T}_h\), \(u_h\) is the unique polynomial in \(\mathcal {P}^r\) which interpolates the values of u (0 or 1) at the nodes of \(\mathcal {K}\). By transforming \(\mathcal {K}\) to the (mesh independent) reference element \({\hat{\mathcal {K}}}\), we have

$$\begin{aligned} \nabla u_h(x) = J_\mathcal {K}^{-T}\nabla P_\mathcal {K}(F_\mathcal {K}^{-1}(x)) \qquad x\in \mathcal {K}, \end{aligned}$$

where \(F_\mathcal {K}^{-1}:\mathcal {K}\rightarrow {\hat{\mathcal {K}}}\) is the inverse of the affine mapping \(F_\mathcal {K}\) which transforms \({\hat{\mathcal {K}}}\) to \(\mathcal {K}\), \(J_\mathcal {K}\in \mathbb {R}^{d\times d}\) is the Jacobian matrix of \(F_\mathcal {K}\) and \(P_\mathcal {K}\in \mathcal {P}^r\) is a polynomial taking values of either 0 or 1 at the nodes of \({\hat{\mathcal {K}}}\). Since there is only a finite number of polynomials in \(\mathcal {P}^r\) whose image on the nodes of \({\hat{\mathcal {K}}}\) is a subset of \(\{0,1\}\), we can estimate all their gradients in \({\hat{\mathcal {K}}}\) by a single constant independently of the element \(\mathcal {K}\) and mesh size h. In addition, by [12, Lemma 4.3] and the assumption that the family of meshes \(\{\mathcal {T}_h\}\) is regular and quasi-uniform, we have \(\big |J_{\mathcal {K}}^{-1}\big | \le C h^{-1}\), where \(|\cdot |\) denotes the matrix norm induced on \(\mathbb {R}^{d\times d}\) by the \(\ell ^2\) norm of \(\mathbb {R}^d\). Note that while Lemma 4.3 in [12] is stated and proved in 2-D, its proof extends easily to any dimension. Hence we obtain (2.8) with \(\delta =h\) which yields that interpolation in \(V_h\) is an admissible approximation. \(\square \)

Fig. 2  The continuous, piecewise linear \(\mathcal {P}^1\)-FE interpolant \(u_h\) of the characteristic function u of a disk

The main effort in the proof of Proposition 2.2 is to show (2.8) with \(u=\chi _A\) a characteristic function. Figure 2 illustrates this situation for the standard interpolant \(u_h\) in a \(\mathcal {P}^1\)-FE space of the characteristic function u for a disk A. The right frame shows a part of the mesh where the solid black line marks the discontinuity of u along \(\partial A\). Outside the dark gray elements, u is constant and therefore so is \(u_h\). In particular, \(\nabla u_h=0\) outside the neighborhood of width \(\delta =h\) (light gray) around \(\partial A\).
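The same behavior is easy to observe numerically. The following check is our own illustration in one dimension (the paper works in \(d\ge 2\), but the scaling in (2.8) is identical): the \(\mathcal {P}^1\) interpolant of a characteristic function satisfies conditions (2.6)–(2.8) with \(\delta =h\).

```python
import numpy as np

chi = lambda x: ((x > 0.3) & (x < 0.6)).astype(float)   # u = chi_A, A = (0.3, 0.6)

for n in (100, 400, 1600):
    nodes = np.linspace(0.0, 1.0, n + 1)
    h = 1.0 / n
    uh = chi(nodes)                        # nodal values = P1 interpolant of chi
    grad = np.diff(uh) / h                 # gradient of u_h on each element
    # (2.8): h * ||grad u_h||_inf is bounded independently of h (here it equals 1)
    assert h * np.abs(grad).max() <= 1.0 + 1e-12
    # (2.7): grad u_h is supported within distance h of the jump set {0.3, 0.6}
    mids = 0.5 * (nodes[:-1] + nodes[1:])
    dist = np.minimum(np.abs(mids - 0.3), np.abs(mids - 0.6))
    assert dist[np.abs(grad) > 0].max() < h
    # (2.6): the L2 interpolation error vanishes as h -> 0 (at rate sqrt(h))
    l2_err = np.sqrt(np.sum(((uh[:-1] + uh[1:]) / 2 - chi(mids)) ** 2) * h)
    assert l2_err <= np.sqrt(2.0 * h)
```

The gradient is \(\pm 1/h\) on the two elements cut by \(\partial A\) and zero elsewhere, exactly the situation shown in Fig. 2.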

Proposition 2.3

If u is extended to a.e. \(x\in \mathbb {R}^d\) by

$$\begin{aligned} u =\sum _{m=1}^{M} \omega _{m}\chi _{\widetilde{\Omega }^{m}} + \sum _{k=1}^{K} \alpha _{k}\chi _{A^k} \end{aligned}$$
(2.13)

(compare with (2.4), (2.5)), and \(u_\delta \) is the convolution

$$\begin{aligned} u_\delta (x) = \zeta _\delta * u = \int _{\mathbb {R}^d} \zeta _\delta (x-y) u(y)\, d y , \qquad \zeta _\delta (x)=\delta ^{-d}\zeta (x/\delta ) \end{aligned}$$
(2.14)

with \(\zeta \) a standard mollifier (e.g., [13]), then \(u_\delta \) is admissible.

Proof

See Corollary 6 of [5]. \(\square \)
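A numerical sketch of the mollification (2.14) in one dimension (our illustration, not part of [5]), checking conditions (2.6)–(2.8) of Definition 2.1 for a characteristic function extended beyond \(\Omega \):

```python
import numpy as np

def mollify(u, dx, delta):
    """Approximate zeta_delta * u, cf. (2.14), on a uniform grid with spacing dx."""
    s = np.arange(-delta, delta + dx / 2, dx)
    r = s / delta
    zeta = np.zeros_like(s)
    inside = np.abs(r) < 1.0
    zeta[inside] = np.exp(-1.0 / (1.0 - r[inside] ** 2))   # standard mollifier
    zeta /= zeta.sum() * dx                                # unit mass
    return np.convolve(u, zeta, mode="same") * dx

dx = 5e-4
x = np.arange(-0.5, 1.5, dx)                               # u extended past Omega = (0,1)
u = ((x > 0.3) & (x < 0.6)).astype(float)                  # chi_A, A = (0.3, 0.6)

errs = []
for delta in (0.1, 0.05, 0.025):
    ud = mollify(u, dx, delta)
    errs.append(np.sqrt(np.sum((ud - u) ** 2) * dx))       # L2 error, cf. (2.6)
    grad = np.gradient(ud, dx)
    assert delta * np.abs(grad).max() < 2.0                # (2.8), C independent of delta
    # (2.7): grad u_delta vanishes outside the delta-neighborhood of the jumps
    dist = np.minimum(np.abs(x - 0.3), np.abs(x - 0.6))
    assert np.abs(grad[dist > delta + 2 * dx]).max() < 1e-10
assert errs[0] > errs[1] > errs[2]                         # error decreases with delta
```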

For the analysis below it is convenient to partition \(D_\delta \), given by (2.10), into its connected components. Hence, we let the sets \(A^1,\ldots ,A^K\) be indexed so that if \(i>k\), then either \(A^i \subset A^k\) or \(A^i \cap A^k=\emptyset \), and let \(B^k_\delta \) be the connected components of \(D_\delta \),

$$\begin{aligned} B^k_\delta = B^k \cap D_\delta , \qquad B^k = A^k\setminus \bigcup _{i> k} \overline{A^i} , \qquad k=1,\ldots ,K ; \end{aligned}$$
(2.15)

see Fig. 1. Similarly, we define outside the inclusions

$$\begin{aligned} E^m_\delta = E^m \cap D_\delta , \qquad E^m = \Omega ^m\setminus \bigcup _{k=1}^K \overline{A^k} , \qquad m=1,\ldots ,M . \end{aligned}$$
(2.16)

Here, we assume \(\delta >0\) sufficiently small so that \(B^k_\delta \) and \(E^m_\delta \) are indeed connected and that the \((d-1)\)-dimensional Hausdorff measure of \(\partial E^m_\delta \cap \partial \Omega \) is positive. Thus, for each k and \(\delta >0\) small, \(B^k\) and \(B^k_\delta \) are open and connected, and \(D_\delta \) is given by the disjoint union

$$\begin{aligned} D_\delta = E_\delta \cup \bigcup _{k=1}^K B^k_\delta , \end{aligned}$$

where \(E_\delta \) denotes the “\(\delta \)-exterior”,

$$\begin{aligned} E_\delta = \bigcup _{m=1}^M E^m_\delta . \end{aligned}$$
(2.17)

Now we may deduce from condition 4 of Definition 2.1 that

$$\begin{aligned} u=u_\delta =u^0_\delta , \qquad \widetilde{u}=\widetilde{u}_\delta =0 \qquad \text {a.e. in }E_\delta . \end{aligned}$$
(2.18)

Since we have a finite number of Lipschitz domains, \(B^k\) (\(k=1,\ldots ,K\)) and \(E^m\) (\(m=1,\ldots ,M\)), we may find a single constant \(\Lambda >0\) sufficiently large so that each of the sets, near its boundary, locally coincides with the epigraph of a \(\Lambda \)-Lipschitz function. While the optimal Lipschitz constant for a domain may depend on the scale of the open sets used for covering its boundary, when reducing the scale, the optimal constant cannot increase. Therefore, if for some scale the Lipschitz constant \(\Lambda \) is suitable for a domain, for simplicity, we shall say that it is a \(\Lambda \)-Lipschitz domain.

By Theorem A.1, for every sufficiently small \(\delta \), each \(B^k_\delta \) is also a \(\Lambda \)-Lipschitz domain. Note, however, that since a portion of the boundary of \(E^m_\delta \) coincides with the boundary of \(E^m\) for every \(\delta \), it does not have the form assumed in Theorem A.1. As a result, we cannot rely on the same theorem to deduce that \(E^m_\delta \) is a \(\Lambda \)-Lipschitz domain. Nevertheless, outside a neighborhood of \(\partial \Omega \cap \partial E^m\), the boundary of \(E^m_\delta \) is a \(\Lambda \)-Lipschitz surface with \(E^m_\delta \) lying to one of its sides, by Theorem A.3. It is therefore possible to modify the definition of \(\mathcal {M}_\delta \) so that for every \(\delta \) sufficiently small, \(E^m_\delta \) given by (2.16), is a \(\widetilde{\Lambda }\)-Lipschitz domain, for some \(\widetilde{\Lambda }\) independent of \(\delta \). Here, for simplicity, we assume the latter to be true and denote the uniform constant \(\max (\Lambda ,\widetilde{\Lambda })\) again by \(\Lambda \).

2.3 Medium Dependent Weight Function

For \(\varepsilon >0\) and \(v\in H^1(\Omega )\), with \(\nabla v\in L^{\infty }(\Omega )\), we assume the v-dependent weight function \(\mu _\varepsilon [v]\) has the form

$$\begin{aligned} \mu _\varepsilon [v](x)={\hat{\mu }}_\varepsilon (|\nabla v(x)|) , \qquad x\in \Omega , \end{aligned}$$
(2.19)

where \({\hat{\mu }}_\varepsilon :[0,\infty )\rightarrow \mathbb {R}\) is a non-increasing function that satisfies

$$\begin{aligned} {\hat{\mu }}_\varepsilon (0)=\varepsilon ^{-1} , \qquad 0<{\hat{\mu }}_\varepsilon (t) , \quad t {\hat{\mu }}_\varepsilon (t)\le 1 , \quad t\ge 0 , \end{aligned}$$
(2.20)

and

$$\begin{aligned} \exists \, C>0 \text { such that } C \le t{\hat{\mu }}_\varepsilon (t) \text { for every sufficiently large } t . \end{aligned}$$
(2.21)

In particular, for \({\hat{\mu }}_\varepsilon (t)=1/(t^q+\varepsilon ^q)^{1/q}\) and \({\hat{\mu }}_\varepsilon (t)=1/\max (t,\varepsilon )\), as in (2.2) and (2.3), respectively, (2.20)–(2.21) hold for any \(C<1\). From (2.20), we immediately conclude that

$$\begin{aligned} \mu _\varepsilon [v](x)|\nabla v(x)| \le 1 , \qquad \text {a.e. }x\in \Omega , \end{aligned}$$
(2.22)

and

$$\begin{aligned} 0 < {\hat{\mu }}_\varepsilon (\Vert \nabla v\Vert _{L^\infty (\Omega )}) \le \mu _\varepsilon [v](x) \qquad \text {a.e. }x\in \Omega . \end{aligned}$$
(2.23)

2.4 Boundary Value Problems

Let \(\mathcal {V}^\delta \) be a closed subspace of \(H^1(\Omega )\), possibly equal to \(H^1(\Omega )\), and \(\mathcal {V}^\delta _0= \mathcal {V}^\delta \cap H^1_0(\Omega )\). For sufficiently small and fixed \(\delta ,\varepsilon >0\), the operator \(L_\varepsilon [u_\delta ]\) in (2.1) is uniformly elliptic in \(\Omega \) [5]. Thus, it admits in \(\mathcal {V}^\delta _0\) a (possibly finite) non-decreasing sequence \(\{\lambda _k\}_{k\ge 1}\) of positive eigenvalues, each repeated according to its multiplicity, with corresponding eigenfunctions \(\{\varphi _k\}_{k\ge 1}\), which form an \(L^2\)-orthonormal basis of \(\mathcal {V}^\delta _0\). In addition, we denote by \(\varphi _0\in \mathcal {V}^\delta \) the \(L_\varepsilon [u_\delta ]\)-lifting of the boundary data of \(u_\delta \) into \(\Omega \). More precisely, we let \(\varphi _0\in \mathcal {V}^\delta \) satisfy

$$\begin{aligned} L_\varepsilon [u_\delta ]\varphi _0 =0 \quad \text {in }\Omega , \qquad \varphi _0 = u_\delta \quad \text {on }\partial \Omega \end{aligned}$$
(2.24)

in \(\mathcal {V}^\delta _0\), and for \(k\ge 1\) we let \(\varphi _k\in \mathcal {V}^\delta _0\), \(\varphi _k\ne 0\) satisfy

$$\begin{aligned} L_\varepsilon [u_\delta ]\varphi _k =\lambda _k \varphi _k \quad \text {in }\Omega , \qquad \varphi _k =0 \quad \text {on }\partial \Omega , \end{aligned}$$
(2.25)

in \(\mathcal {V}^\delta _0\). Clearly both (2.24) and (2.25) should be understood in a weak sense with respect to the bilinear form

$$\begin{aligned} B_{\varepsilon ,\delta }[w,v] = \left\langle \mu _{\varepsilon }[u_\delta ] \nabla w, \nabla v \right\rangle . \end{aligned}$$
(2.26)

For instance, if \(\mathcal {V}^\delta \) is a (finite-dimensional, \(H^1\)-conforming) FE space, the eigenvalue problem (2.25) is understood as the Galerkin FE formulation: find \(\varphi _k\in \mathcal {V}_0^{\delta }\) and \(\lambda _k\in \mathbb {R}\) such that

$$\begin{aligned} B_{\varepsilon ,\delta }[\varphi _k, \varphi ] = \lambda _k\left\langle \varphi _k,\varphi \right\rangle \qquad \forall \, \varphi \in \mathcal {V}_0^{\delta } . \end{aligned}$$
(2.27)

Thus, the framework above treats both continuous and discrete formulations.
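In the discrete setting, (2.27) is a generalized symmetric eigenvalue problem. The following compact sketch (ours; 1-D \(\mathcal {P}^1\) elements, all parameter values illustrative) assembles the weighted stiffness and mass matrices over the interior nodes and verifies that the computed eigenpairs have the properties stated above: positive, non-decreasing eigenvalues and \(L^2\)-orthonormal eigenfunctions.

```python
import numpy as np
from scipy.linalg import eigh

n = 100; h = 1.0 / n; eps = 1e-2
nodes = np.linspace(0.0, 1.0, n + 1)
# u_delta: P1 interpolant of chi_(0.3, 0.6), with ramps of width delta = h at the jumps
u_delta = ((nodes > 0.3) & (nodes < 0.6)).astype(float)
grad = np.diff(u_delta) / h                       # gradient on each element
mu = 1.0 / np.sqrt(grad**2 + eps**2)              # weight (2.2) with q = 2

# Assemble the weighted stiffness matrix K (the bilinear form (2.26)) and the
# mass matrix M over the interior nodes (zero Dirichlet trace).
K = np.zeros((n - 1, n - 1)); M = np.zeros((n - 1, n - 1))
for e in range(n):                                # element e = [nodes[e], nodes[e+1]]
    for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        i, j = e - 1 + a, e - 1 + b
        if 0 <= i < n - 1 and 0 <= j < n - 1:
            K[i, j] += mu[e] * (1.0 if a == b else -1.0) / h
            M[i, j] += h / 3.0 if a == b else h / 6.0

lam, Phi = eigh(K, M)                             # (2.27): K phi_k = lambda_k M phi_k
assert np.all(lam > 0.0)                          # positive eigenvalues
assert np.all(np.diff(lam) >= -1e-9)              # non-decreasing ordering
assert np.allclose(Phi.T @ M @ Phi, np.eye(n - 1), atol=1e-8)  # L2-orthonormal basis
```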

Remark 2.4

Note that \(\varphi _k\) (\(k\ge 0\)) and \(\lambda _k\) (\(k\ge 1\)) always depend on \(\varepsilon \) and \(u_\delta \), and thus on u and \(\delta \), regardless of any particular finite- or infinite-dimensional choice for \(\mathcal {V}^\delta \). For simplicity of notation, we do not indicate this dependency explicitly.

3 Error Estimates

Given a piecewise constant u, we shall now derive our estimates for the AS decomposition of \(u_\delta \) based on the assumptions and definitions introduced in Sect. 2. Since \(u_\delta \) is an admissible approximation of u, as defined in Sect. 2.2, for every \(\varepsilon >0\) and every sufficiently small \(\delta >0\), we have [5]

$$\begin{aligned} B_{\varepsilon ,\delta }[v,v] \le C , \qquad v\in \left\{ u_\delta ,\, u^0_\delta ,\, \widetilde{u}_\delta ,\, \varphi _0,\ldots ,\, \varphi _K\right\} . \end{aligned}$$
(3.1)

Here, and in the rest of the paper, the constants \(C,C_1,C_2,\ldots \) may depend on u (i.e., on its values and on the sets \(B^k\) and \(\Omega ^m\)), but not on \(\varepsilon ,\delta \). As a consequence of (3.1), the gradients of \(\varphi _k\), with \(k=0,\ldots ,K\), are small in \(D_\delta \) [5, Theorem 5]. Heuristically, this implies that each \(\varphi _k\) is almost constant in regions where u is constant and thus we expect u to be well approximated in \(\varphi _0+\Phi _K^{\varepsilon ,\delta }\), where

$$\begin{aligned} \Phi _K^{\varepsilon ,\delta } = {\text {span}}\{\varphi _k\}_{k=1}^K . \end{aligned}$$
(3.2)

Here, our goal is to rigorously prove this proposition.

More precisely, let \(\Pi _K^{\varepsilon }[u_\delta ]\) denote the standard orthogonal projection on \(\Phi _K^{\varepsilon ,\delta }\):

$$\begin{aligned} \Pi _K^{\varepsilon }[u_\delta ]:L^2(\Omega )\rightarrow \Phi _K^{\varepsilon ,\delta } , \qquad \left\langle v-\Pi _K^{\varepsilon }[u_\delta ]v,\varphi \right\rangle =0 , \quad \forall \, \varphi \in \Phi _K^{\varepsilon ,\delta } , \end{aligned}$$
(3.3)

and let \(X_K\) be given by

$$\begin{aligned} X_K ={\text {span}}\{\chi _{A^k}\}_{k=1}^K = {\text {span}}\{\chi _{B^k}\}_{k=1}^K . \end{aligned}$$
(3.4)

We shall show that every function \(v\in u+X_K\) is well approximated in \(\varphi _0+\Phi _K^{\varepsilon ,\delta }\) by its \(L^2\)-orthogonal projection

$$\begin{aligned} Q_K^{\varepsilon }[u_\delta ](v)= \varphi _0+\Pi _K^{\varepsilon }[u_\delta ](v-\varphi _0) . \end{aligned}$$
(3.5)

Similarly, we shall show that every \(v\in X_K\) is well approximated by its orthogonal projection \(\Pi _K^{\varepsilon }[u_\delta ]v\) on \(\Phi _K^{\varepsilon ,\delta }\). The main result, given by Theorem 3.6, provides estimates of the \(L^2\) errors in terms of \(\varepsilon \) and \(\delta \).
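Before turning to the proofs, the claimed behavior can be observed in a toy computation (our illustration: 1-D, finite differences on interior nodes, \(u_\delta \) taken as \(\chi \) itself so that \(\delta \sim h\)): the \(L^2\)-projection of \(\chi _A\) onto the span of the first eigenfunction essentially recovers \(\chi _A\), and the error decreases with \(\varepsilon \).

```python
import numpy as np
from scipy.linalg import eigh

n = 400
x = np.linspace(0.0, 1.0, n); h = x[1] - x[0]
chi = ((x > 0.3) & (x < 0.6)).astype(float)       # single inclusion, K = 1

def first_eigenfunction(eps):
    """Ground state of L_eps[u_delta] with zero Dirichlet conditions."""
    mu = 1.0 / np.sqrt((np.diff(chi) / h) ** 2 + eps**2)   # elementwise weight
    main = (mu[:-1] + mu[1:]) / h**2                       # interior diagonal
    off = -mu[1:-1] / h**2
    L = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    return eigh(L)[1][:, 0]

errs = []
for eps in (1e-1, 1e-2, 1e-3):
    phi = first_eigenfunction(eps)                # l2-normalized on interior nodes
    c = chi[1:-1]
    proj = phi * (phi @ c)                        # projection onto span{phi}
    errs.append(np.linalg.norm(c - proj) * np.sqrt(h))
assert errs[0] > errs[2]                          # error decreases with eps
assert errs[2] < 0.3                              # chi is essentially recovered
```

Since the weight is tiny on the two jump elements and large elsewhere, the ground state is nearly the normalized plateau \(\chi _A/\Vert \chi _A\Vert \), which is exactly the mechanism behind Theorem 3.6.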

3.1 Preliminary Results

From (2.23) with \(v=u_\delta \), the monotonicity of \({\hat{\mu }}_\varepsilon \), (2.12) and (2.21), we get

$$\begin{aligned} 0<C\delta \le \mu _\varepsilon [u_\delta ](x) \qquad \text {a.e. }x\in \Omega \end{aligned}$$
(3.6)

for every sufficiently small \(\delta \), where the constant C may depend on u, but is independent of \(\delta \) and \(\varepsilon \). Since \(\nabla u_\delta \) vanishes in \(D_\delta \) by (2.7), assumptions (2.19) and (2.20) on \({\hat{\mu }}_\varepsilon \) yield

$$\begin{aligned} \mu _\varepsilon [u_\delta ](x) = \varepsilon ^{-1} \qquad \text {a.e. }x\in D_\delta . \end{aligned}$$
(3.7)

Together with the definition of \(B_{\varepsilon ,\delta }[\cdot ,\cdot ]\) in (2.26), and (3.6) we obtain

$$\begin{aligned} \varepsilon ^{-1}\Vert \nabla v\Vert _{L^2(D_\delta )}^2 +C_1\delta \Vert \nabla v\Vert _{L^2(\mathcal {M}_\delta )}^2 \le B_{\varepsilon ,\delta }[v,v] \end{aligned}$$
(3.8)

for every \(\delta >0\) sufficiently small and every \(v\in H^1(\Omega )\). By substituting \(v=\varphi _k\) in the above and using (3.1) we get

$$\begin{aligned} \varepsilon ^{-1}\Vert \nabla \varphi _k\Vert _{L^2(D_\delta )}^2 +C_1\delta \Vert \nabla \varphi _k\Vert _{L^2(\mathcal {M}_\delta )}^2 \le B_{\varepsilon ,\delta }[\varphi _k,\varphi _k] \le C . \end{aligned}$$
(3.9)

Next we employ (3.9) and Poincaré-type inequalities to obtain \(L^2\) estimates for \(\varphi _k\) in \(D_\delta \). To do so, we require inequalities whose constants are independent of \(\delta \) for the connected components of \(D_\delta \). We use Theorems 1 and 2 of [14], which yield the following: Let \(p\ge 1\) and \(\Lambda >0\). There exists a constant \(C>0\) such that for every \(\Lambda \)-Lipschitz domain \(W\subset \Omega \) and \(v\in W^{1,p}(W)\),

$$\begin{aligned} \Vert v-\langle v \rangle _{W}\Vert _{L^p(W)} \le C\Vert \nabla v \Vert _{L^p(W)} , \end{aligned}$$
(3.10)

where \(\langle f \rangle _{W}\) denotes the average of f over W,

$$\begin{aligned} \langle f \rangle _{W} = \frac{1}{\mathcal {L}(W)} \int _W f(x) dx , \end{aligned}$$
(3.11)

with \(\mathcal {L}(W)\) the Lebesgue measure of W. Moreover, if \(\Gamma \subset \overline{\Omega }\) has positive \((d-1)\)-dimensional Hausdorff measure, then for every \(\Lambda \)-Lipschitz domain \(W\subset \Omega \), with \(\Gamma \subset \partial W\), and \(v\in W^{1,p}(W)\) satisfying \(v=0\) on \(\Gamma \),

$$\begin{aligned} \Vert v \Vert _{L^p(W)} \le C\Vert \nabla v \Vert _{L^p(W)} . \end{aligned}$$
(3.12)

Corollary 3.1

There exists a constant \(C>0\) such that for every sufficiently small \(\varepsilon ,\delta >0\) and every \(1\le j\le K\),

$$\begin{aligned} \Vert \varphi _0-u^0\Vert _{L^2(E_\delta )}^2 \le C\varepsilon , \qquad \Vert \varphi _0-\langle \varphi _0 \rangle _{B^j_\delta }\Vert _{L^2(B^j_\delta )}^2 \le C\varepsilon \end{aligned}$$
(3.13)

and

$$\begin{aligned} \Vert \varphi _k\Vert _{L^2(E_\delta )}^2 \le C\varepsilon , \qquad \Vert \varphi _k-\langle \varphi _k \rangle _{B^j_\delta }\Vert _{L^2(B^j_\delta )}^2 \le C\varepsilon , \qquad k=1,\ldots ,K . \end{aligned}$$
(3.14)

Proof

We show (3.13); the proof of (3.14) is similar. Fix \(1\le m\le M\). Then, for every sufficiently small \(\delta \), we have \(\eta =\varphi _0-u^0 \in H^1(E^m_\delta )\) with \(\eta =0\) on

$$\begin{aligned} \Gamma ^m = \partial \Omega \cap \partial E^m_\delta , \end{aligned}$$

by (2.18). As \(\Gamma ^m\) contains a set that is open in the relative topology of \(\partial \Omega \), its \((d-1)\)-dimensional Hausdorff measure is positive. Since \(E^m_\delta \) is \(\Lambda \)-Lipschitz, with \(\Lambda \) independent of \(\delta \), the Poincaré inequality (3.12) provides a constant \(C_1>0\) such that

$$\begin{aligned} \Vert \eta \Vert _{L^2(E^m_\delta )} \le C_1 \Vert \nabla \eta \Vert _{L^2(E^m_\delta )} . \end{aligned}$$
(3.15)

Now, we use the above combined with (3.9) and \(\nabla u^0=0\) in \(E^m_\delta \), to obtain

$$\begin{aligned} \Vert \varphi _0-u^0\Vert _{L^2(E^m_\delta )} = \Vert \eta \Vert _{L^2(E^m_\delta )} \le C_1 \Vert \nabla \varphi _0\Vert _{L^2(E^m_\delta )} \le C_2 \sqrt{\varepsilon } , \end{aligned}$$
(3.16)

which proves the first estimate in (3.13), since \(E_\delta \) is the disjoint (finite) union of \(E^m_\delta \). The proof of the second estimate in (3.13) is similar, but relies on (3.10) instead of (3.12); therefore, it is omitted here. \(\square \)

While Corollary 3.1 provides \(L^2\) estimates for \(\varphi _k\) in the connected components of \(D_\delta \), the following lemma provides \(L^2\) estimates in the neighborhood \(\mathcal {M}_\delta \) of the discontinuities of u. In particular, it yields that the contribution over \(\mathcal {M}_\delta \) to the norm of \(\varphi _k\) is small. Note that this conclusion does not follow merely from the small volume of \(\mathcal {M}_\delta \), since the functions \(\varphi _k\) themselves depend on \(\delta \).

Lemma 3.2

There exists a positive constant C such that for every sufficiently small \(\varepsilon ,\delta >0\),

$$\begin{aligned} \Vert \varphi _0-u^0\Vert _{L^2(\mathcal {M}_\delta )}^2&\le C\delta \end{aligned}$$
(3.17)
$$\begin{aligned} \Vert \varphi _k\Vert _{L^2(\mathcal {M}_\delta )}^2&\le C\delta \qquad k=1,\ldots ,K. \end{aligned}$$
(3.18)

Proof

Here we show only (3.18). Estimate (3.17) is stated here merely for brevity of presentation; its proof is similar but requires Lemma 3.5, so the correct logical order of our argument is (3.18), Lemma 3.3, Theorem 3.4, Lemma 3.5, and then (3.17). Note that, by (3.17), \(\varphi _0\) also satisfies an estimate of the form (3.18).

Fix \(1\le k\le K\), let \(W=B^j\) for some \(j=1,\ldots ,K\) or \(W=\Omega ^{m}\) for some \(m=1,\ldots ,M\), and let

$$\begin{aligned} U_\delta =\left\{ x\in W :\ {\text {dist}}(x,\partial W)<\delta \right\} . \end{aligned}$$
(3.19)

By Theorem B.1 we have

$$\begin{aligned} \Vert \varphi _k\Vert _{L^2(U_\delta )}^2 \le C\left( \delta ^{2}\Vert \nabla \varphi _k\Vert _{L^2(U_\delta )}^2 +\delta \Vert \varphi _k\Vert _{H^1(D_\delta )}^2\right) . \end{aligned}$$
(3.20)

Using \(\Vert \varphi _k\Vert _{L^2(\Omega )}=1\) and (3.9) to estimate the right-hand side of (3.20), we obtain, for \(\delta ,\varepsilon \) sufficiently small,

$$\begin{aligned} \Vert \varphi _k\Vert _{L^2(U_\delta )}^2 \le C_1\delta \left( 1+\varepsilon \right) \le C\delta . \end{aligned}$$
(3.21)

Since \(\mathcal {M}_\delta \) is a subset of the finite union of all \(\overline{U_\delta }\), we obtain (3.18), which completes the proof. \(\square \)

Following Corollary 3.1 and Lemma 3.2 we know that \(\varphi _1,\ldots ,\varphi _K\) are approximately piecewise constant, and that the contributions over \(\mathcal {M}_\delta \) to their \(L^2\)-norms are small. This implies that each \(\varphi _k\) is close to some function in \(X_K\). As we shall see in Theorem 3.6, the converse is also true; i.e., every function in \(X_K\) can be well approximated in \(\Phi _K^{\varepsilon ,\delta }\). Since in each \(B^k_\delta \) the functions \(\varphi _1,\ldots ,\varphi _K\) are close to their averages, this statement reduces to the invertibility of the matrix of the averages \(\langle \varphi _j \rangle _{B^k_\delta }\).

Lemma 3.3

Let the matrix \(\Sigma \in \mathbb {R}^{K\times K}\) be given by

$$\begin{aligned} \Sigma =(\sigma _{kj}) , \qquad \sigma _{kj} =\langle \varphi _j \rangle _{B^k_\delta } , \quad k,j=1,\ldots ,K . \end{aligned}$$
(3.22)

There exist constants \(0<C_1\le C_2\) such that for every sufficiently small \(\delta \) and \(\varepsilon \),

$$\begin{aligned} C_1|\beta | \le |\Sigma \beta | \le C_2|\beta | , \qquad \beta \in \mathbb {R}^K . \end{aligned}$$
(3.23)

Proof

Since the upper estimate in (3.23) is simple, here we only show the lower estimate \(C_1|\beta |\le |\Sigma \beta |\), for some positive constant \(C_1\) independent of \(\beta \), \(\varepsilon \), and \(\delta \). Let \(\beta \in \mathbb {R}^K\) with \(|\beta |=1\) and \(\varphi \in \Phi _K^{\varepsilon ,\delta }\) be given by

$$\begin{aligned} \varphi =\sum _{j=1}^K \beta _j\varphi _j . \end{aligned}$$
(3.24)

Then, we have

$$\begin{aligned} (\Sigma \beta )_k = \langle \varphi \rangle _{B^k_\delta } , \qquad k=1,\ldots ,K , \end{aligned}$$
(3.25)

where \((\Sigma \beta )_k\) denotes the k-th entry of \(\Sigma \beta \). Since \(\varphi _1,\ldots ,\varphi _K\) are orthonormal and \(|\beta |=1\), we get

$$\begin{aligned} 1=\Vert \varphi \Vert _{L^2(\Omega )}^2 =\Vert \varphi \Vert _{L^2(E_\delta )}^2+\Vert \varphi \Vert _{L^2(\mathcal {M}_\delta )}^2 +\sum _{k=1}^K \Vert \varphi \Vert _{L^2(B^k_\delta )}^2 . \end{aligned}$$
(3.26)

Due to (3.25), the function \(\varphi -(\Sigma \beta )_k\) has zero average over \(B^k_\delta \) and is, therefore, orthogonal to constant functions in \(L^2(B^k_\delta )\). Thus,

$$\begin{aligned} \Vert \varphi \Vert _{L^2(B^k_\delta )}^2 = \Vert \varphi -(\Sigma \beta )_k\Vert _{L^2(B^k_\delta )}^2 +\mathcal {L}(B^k_\delta )(\Sigma \beta )_k^2 , \qquad k=1,\ldots , K . \end{aligned}$$
(3.27)

By Poincaré’s inequality (3.10),

$$\begin{aligned} \Vert \varphi -(\Sigma \beta )_k\Vert _{L^2(B^k_\delta )}^2 \le C\Vert \nabla \varphi \Vert _{L^2(B^k_\delta )}^2 \end{aligned}$$
(3.28)

and by the triangle inequality and (3.9), we have

$$\begin{aligned} \Vert \nabla \varphi \Vert _{L^2(B^k_\delta )} \le \sum _{j=1}^K |\beta _j|\Vert \nabla \varphi _j\Vert _{L^2(B^k_\delta )} \le C\sqrt{\varepsilon } . \end{aligned}$$
(3.29)

We further estimate \(\Vert \varphi \Vert _{L^2(E_\delta )}\) and \(\Vert \varphi \Vert _{L^2(\mathcal {M}_\delta )}\) in (3.26) using (3.14) and (3.18), and use that \(B^k_\delta \subset B^k\), to obtain

$$\begin{aligned} 1 \le C(\varepsilon +\delta ) +\sum _{k=1}^K \mathcal {L}(B^k_\delta )(\Sigma \beta )_k^2 \le C(\varepsilon +\delta ) +\max _{k} \mathcal {L}(B^k)\, |\Sigma \beta |^2 . \end{aligned}$$
(3.30)

Thus, for every \(\delta \) and \(\varepsilon \) sufficiently small,

$$\begin{aligned} \widetilde{C} \le \max _{k} \mathcal {L}(B^k) \, |\Sigma \beta |^2 \end{aligned}$$
(3.31)

which completes the proof. \(\square \)

3.2 Main Results

Next we show that if \(\varepsilon ,\delta >0\) are sufficiently small, then the first eigenvalue of \(L_\varepsilon [u_\delta ]\) is bounded from below by a constant independent of \(\varepsilon ,\delta \).

Theorem 3.4

There exists a positive constant C such that for every \(\varepsilon ,\delta >0\) sufficiently small and for every \(v\in H^1_0(\Omega )\),

$$\begin{aligned} C\Vert \nabla v\Vert _{L^1(\Omega )} \le \sqrt{B_{\varepsilon ,\delta }[v,v]} , \qquad C\Vert v\Vert _{L^2(\Omega )}^2 \le B_{\varepsilon ,\delta }[v,v] . \end{aligned}$$
(3.32)

In particular, the second estimate yields \(\lambda _1\ge C>0\).

Proof

We begin by showing the first estimate of (3.32). Let \(v\in H^1_0(\Omega )\). Hölder’s inequality and Lemma 4 of [5] yield

$$\begin{aligned} \delta \Vert \nabla v\Vert _{L^2(\mathcal {M}_\delta )}^2 \ge \frac{\delta }{\mathcal {L}(\mathcal {M}_\delta )} \Vert \nabla v\Vert _{L^1(\mathcal {M}_\delta )}^2 \ge C \Vert \nabla v\Vert _{L^1(\mathcal {M}_\delta )}^2. \end{aligned}$$
(3.33)

Similarly we use Hölder’s inequality to estimate \(\Vert \nabla v\Vert _{L^2(D_\delta )}^2\) from below by \(C\Vert \nabla v\Vert _{L^1(D_\delta )}^2\) and thus, by (3.8), for \(\varepsilon >0\) sufficiently small we get

$$\begin{aligned} C_1 \Vert \nabla v\Vert _{L^1(\Omega )}^2 \le B_{\varepsilon ,\delta }[v,v] , \end{aligned}$$
(3.34)

which is equivalent to the first estimate of (3.32).

To verify the second estimate of (3.32), we only need to show that the smallest eigenvalue \(\lambda _1\) of \(L_{\varepsilon }[u_\delta ]\) in \(H^1_0(\Omega )\) is bounded from below by a constant C independent of \(\varepsilon \) and \(\delta \). Thus, for the proof we may set \(\mathcal {V}^\delta _0=H^1_0(\Omega )\) and show that for every \(\varepsilon ,\delta >0\) sufficiently small,

$$\begin{aligned} \lambda _1=B_{\varepsilon ,\delta }[\varphi _1,\varphi _1]\ge C . \end{aligned}$$

Substituting \(v=\varphi _1\) into (3.34) yields

$$\begin{aligned} \lambda _1 = B_{\varepsilon ,\delta }[\varphi _1,\varphi _1] \ge C_1 \Vert \nabla \varphi _1\Vert _{L^1(\Omega )}^2 . \end{aligned}$$
(3.35)

Thus for \(\varepsilon ,\delta >0\) sufficiently small, by Poincaré’s inequality (3.12) we get

$$\begin{aligned} \lambda _1\ge C_1 \Vert \nabla \varphi _1\Vert _{L^1(\Omega )}^2 \ge C_2 \Vert \varphi _1\Vert _{L^1(\Omega )}^2. \end{aligned}$$
(3.36)

Therefore,

$$\begin{aligned} \sqrt{\lambda _1} \ge \sqrt{C_2} \, \sum _{k=1}^K \mathcal {L}(B^k_\delta )|\langle \varphi _1 \rangle _{B^k_\delta }| . \end{aligned}$$
(3.37)

As a consequence, for every \(0<\delta \le \delta _0\), with \(\delta _0\) sufficiently small, we have

$$\begin{aligned} \sqrt{\lambda _1} \ge C_3\min _{k}\mathcal {L}(B^k_{\delta _0})\sum _{k=1}^K |\langle \varphi _1 \rangle _{B^k_\delta }| , \end{aligned}$$
(3.38)

where we have used that \(B^k_{\delta }\supset B^k_{\delta _0}\). Finally, Lemma 3.3 yields

$$\begin{aligned} \sum _{k=1}^K |\langle \varphi _1 \rangle _{B^k_\delta }| = |\Sigma e_1|_{\ell ^1} \ge |\Sigma e_1| \ge C>0 , \end{aligned}$$
(3.39)

where \(\Sigma \) is given by (3.22) and \(e_1=(1,0,\ldots ,0)^T\in \mathbb {R}^K\), and thus \(\lambda _1\ge C>0\). Since \(\lambda _1\) is the minimum of the Rayleigh quotient in \(H^1_0(\Omega )\setminus \{0\}\), the last two estimates yield the second inequality in (3.32), which completes the proof. \(\square \)

Recall that we have not yet derived estimate (3.17) of Lemma 3.2. To do so, we will need to estimate the norm of \(\varphi _0-u^0\) (or \(\varphi _0\)) in \(\Omega \).

Lemma 3.5

There exists a positive constant C such that for all sufficiently small \(\delta ,\varepsilon >0\),

$$\begin{aligned} \Vert \varphi _0-u^0\Vert _{L^2(\Omega )} \le C . \end{aligned}$$
(3.40)

Proof

Let \(\eta \in \mathcal {V}^\delta _0\) be given by \(\eta =\varphi _0-u^0_\delta \), where \(u^0_\delta \) is the admissible approximation of \(u^0\). Then, (2.24) implies

$$\begin{aligned} B_{\varepsilon ,\delta }[\varphi _0,\eta ]=0 \end{aligned}$$
(3.41)

and thus using (3.1) we obtain

$$\begin{aligned} B_{\varepsilon ,\delta }[\eta ,\eta ] \le B_{\varepsilon ,\delta }[\eta ,\eta ] +B_{\varepsilon ,\delta }[\varphi _0,\varphi _0] = B_{\varepsilon ,\delta }[u^0_\delta ,u^0_\delta ] \le C . \end{aligned}$$
(3.42)

Since \(\eta =0\) on \(\partial \Omega \), we also have

$$\begin{aligned} \lambda _1\Vert \eta \Vert _{L^2(\Omega )}^2 \le B_{\varepsilon ,\delta }[\eta ,\eta ] \le C . \end{aligned}$$
(3.43)

By Theorem 3.4, \(\lambda _1\) is bounded from below by a positive constant independent of \(\delta \) and \(\varepsilon \), therefore,

$$\begin{aligned} \Vert \varphi _0-u^0_\delta \Vert _{L^2(\Omega )} =\Vert \eta \Vert _{L^2(\Omega )} \le C , \end{aligned}$$
(3.44)

which yields (3.40) by the triangle inequality, (2.9) and (2.6) and thus completes the proof. \(\square \)

We can now prove our main results. Here, as above, we suppose u, given by (2.4), is approximated by admissible \(u_\delta \) as defined in Sect. 2.2, and let \(X_K\) be given by (3.4). For \(\varepsilon ,\delta \) positive, \(L_\varepsilon [\cdot ]\) is given by (2.1) with \(\mu _\varepsilon [\cdot ]\) given by (2.19). Finally, we let \(\varphi _0\) satisfy (2.24) and \(\varphi _1,\ldots ,\varphi _K\) satisfy (2.25) weakly in \(\mathcal {V}^\delta _0\).

Theorem 3.6

  1.

    Let \(\Pi _K^{\varepsilon }[u_\delta ]\) be the orthogonal projection on \(\Phi _K^{\varepsilon ,\delta }\), given by (3.3). There exists a positive constant C such that for every \(v\in X_K\) and every \(\varepsilon ,\delta \) sufficiently small,

    $$\begin{aligned} \left\| v-\Pi _K^{\varepsilon }[u_\delta ] v \right\| _{L^2(\Omega )} \le C\sqrt{\varepsilon +\delta }\, \Vert v\Vert _{L^2(\Omega )}. \end{aligned}$$
    (3.45)

    In particular, \(v=\widetilde{u}\) and \(v=\chi _{A^k}\) (\(k=1,\ldots ,K\)) satisfy (3.45).

  2.

    Let \(Q_K^{\varepsilon }[u_\delta ]\) be the \(L^2\)-orthogonal projection on \(\varphi _0+\Phi _K^{\varepsilon ,\delta }\), given by (3.5). There exists a positive constant C such that for every \(v\in u^0+X_K\) and every \(\varepsilon ,\delta \) sufficiently small,

    $$\begin{aligned} \big \Vert v-Q_K^{\varepsilon }[u_\delta ](v)\big \Vert _{L^2(\Omega )} \le C\sqrt{\varepsilon +\delta }\, \big (\Vert v-u^0\Vert _{L^2(\Omega )} + 1\big ) . \end{aligned}$$
    (3.46)

    In particular, \(v=u\) and \(v=u^0\) satisfy (3.46).

Remark 3.7

Theorem 3.6 estimates the projection error for piecewise constant functions in \(X_K\) and \(u^0+X_K\). From these estimates we immediately obtain error estimates for general \(L^2(\Omega )\) functions, e.g., admissible approximations of elements of \(X_K\) or \(u^0+X_K\). By the triangle inequality and Theorem 3.6, for every \(v\in X_K\) and \(w\in L^2(\Omega )\), we have

$$\begin{aligned} \big \Vert w-\Pi _K^{\varepsilon }[u_\delta ] w \big \Vert _{L^2(\Omega )} \le C\sqrt{\varepsilon +\delta } \, \Vert w\Vert _{L^2(\Omega )} + \Vert w-v\Vert _{L^2(\Omega )}\, , \end{aligned}$$
(3.47)

and, similarly, if \(v\in u^0 +X_K\) and \(w \in L^2(\Omega )\), then

$$\begin{aligned} \big \Vert w-Q_K^{\varepsilon }[u_\delta ](w)\big \Vert _{L^2(\Omega )} \le C\sqrt{\varepsilon +\delta } \, \big (\Vert w-u^0\Vert _{L^2(\Omega )} + 1\big ) + \Vert w-v\Vert _{L^2(\Omega )}\, . \end{aligned}$$
(3.48)

In particular, (3.47) is satisfied for \(v=\widetilde{u}\) and \(w=\widetilde{u}_\delta \), and (3.48) is satisfied for \(v=u\) and \(w=u_\delta \) and for \(v=u^0\) and \(w=u^0_\delta \).

Similarly to Corollary 6 of [5], we have the following:

Corollary 3.8

  1.

    If \(u_h\) is the interpolation of u in an FE space \(V_h\) as in Proposition 2.2, and either \(\mathcal {V}^h=V_h\) or \(\mathcal {V}^h=H^1(\Omega )\), then for every \(\varepsilon ,h>0\) sufficiently small, estimates (3.45) and (3.46) hold true with \(\delta =h\).

  2.

    If \(u_\delta \) is the mollification of u as in Proposition 2.3 and \(\mathcal {V}^\delta =H^1(\Omega )\), then for every \(\varepsilon ,\delta >0\) sufficiently small, estimates (3.45) and (3.46) hold true.

Proof

This corollary is a direct result of Theorem 3.6 and Propositions 2.2 and 2.3. \(\square \)

Assertion 1 of Corollary 3.8 is particularly important for applications, as it implies that our main estimates are valid not only in the continuous setting, but also for Galerkin FE discretizations, as follows: Let \(u_h\) be the interpolant of u in an FE space \(V_h\) with mesh size h, and let \(\varphi _0,\varphi _1,\ldots ,\varphi _K\) be the Galerkin FE solutions of (2.24), (2.25) with \(u_\delta =u_h\) in \(V_h\). By Assertion 1 of Corollary 3.8, the projections \(\Pi _K^{\varepsilon }[u_h], Q_K^{\varepsilon }[u_h]\) defined by (3.3) and (3.5) – using the computed FE solutions \(\varphi _0,\varphi _1,\ldots ,\varphi _K\) – satisfy (3.45) and (3.46) with \(\delta =h\).

Proof of Theorem 3.6

Here, we only show (3.46); the proof of (3.45) is similar. We have

$$\begin{aligned} \big \Vert v-Q_K^{\varepsilon }[u_\delta ](v) \big \Vert _{L^2(\Omega )} = \min _{\beta \in \mathbb {R}^K} \bigg \Vert (v -\varphi _0) -\sum _{k=1}^K \beta _k \varphi _k \bigg \Vert _{L^2(\Omega )} . \end{aligned}$$
(3.49)

By Lemma 3.3 there exists a unique vector \(\beta =(\beta _k)\in \mathbb {R}^K\) such that

$$\begin{aligned} \sum _{j=1}^K \beta _j \langle \varphi _j \rangle _{B^k_\delta } = \langle v-\varphi _0 \rangle _{B^k_\delta } , \qquad k=1,\ldots ,K, \end{aligned}$$
(3.50)

and, moreover,

$$\begin{aligned} |\beta |^2 \le C_1 \sum _{k=1}^K \langle v-\varphi _0 \rangle _{B^k_\delta }^2 \end{aligned}$$
(3.51)

for \(C_1>0\) independent of \(\varepsilon \), \(\delta \) and v. Thus, we get

$$\begin{aligned} |\beta | \le \sqrt{C_1} \left[ \sum _{k=1}^K \Vert v-\varphi _0\Vert _{L^2(B^k_\delta )}^2\right] ^{\frac{1}{2}}. \end{aligned}$$
(3.52)

By Lemma 3.5, we have

$$\begin{aligned} \Vert v-\varphi _0\Vert _{L^2(B^k_\delta )} \le \Vert v-u^0\Vert _{L^2(B^k_\delta )} + \Vert u^0-\varphi _0\Vert _{L^2(B^k_\delta )} \le \Vert v-u^0\Vert _{L^2(B^k_\delta )} + C \end{aligned}$$
(3.53)

which yields

$$\begin{aligned} |\beta | \le C \big (\Vert v-u^0\Vert _{L^2(\Omega )}+1\big ) , \end{aligned}$$
(3.54)

with \(C>0\) independent of \(\varepsilon \), \(\delta \) and v. Define

$$\begin{aligned} \varphi =\varphi _0 +\widetilde{\varphi }, \qquad \widetilde{\varphi }= \sum _{k=1}^K \beta _k \varphi _k \in \Phi _K^{\varepsilon ,\delta }. \end{aligned}$$
(3.55)

Thus, we have

$$\begin{aligned} \big \Vert v-Q_K^{\varepsilon }[u_\delta ](v) \big \Vert _{L^2(\Omega )} \le \Vert v-\varphi \Vert _{L^2(\Omega )} . \end{aligned}$$
(3.56)

By the triangle inequality, we get

$$\begin{aligned} \begin{aligned} \Vert v-\varphi \Vert _{L^2(\Omega )}&\le \Vert v-\varphi \Vert _{L^2(E_\delta )} +\Vert v-u^0\Vert _{L^2(\mathcal {M}_\delta )} +\Vert u^0- \varphi \Vert _{L^2(\mathcal {M}_\delta )} \\&\quad +\sum _{k=1}^K \Vert v-\varphi \Vert _{L^2(B^k_\delta )} . \end{aligned} \end{aligned}$$
(3.57)

Next we estimate each of the terms on the right hand side. Since \(v=u^0\) in \(E_\delta \), we can estimate the first term as follows:

$$\begin{aligned} \begin{aligned} \Vert v-\varphi \Vert _{L^2(E_\delta )}&\le \Vert u^0-\varphi _0\Vert _{L^2(E_\delta )} +\Vert \widetilde{\varphi }\Vert _{L^2(E_\delta )} \\&\le \Vert u^0-\varphi _0\Vert _{L^2(E_\delta )} +\sum _{k=1}^K |\beta _k| \Vert \varphi _k\Vert _{L^2(E_\delta )} \\&\le C\sqrt{\varepsilon } \, \big (\Vert v-u^0\Vert _{L^2(\Omega )}+1\big ) \end{aligned} \end{aligned}$$
(3.58)

because of (3.13), (3.14) and (3.54). The second term on the right hand side of (3.57) is the \(L^2\) norm of the piecewise constant function \(w=v-u^0\) in \(\mathcal {M}_\delta \). Since \(w=0\) a.e. in \(\Omega \setminus \overline{\bigcup _{k=1}^K B^k}\), we have

$$\begin{aligned} \Vert v-u^0\Vert _{L^2(\mathcal {M}_\delta )}^2 = \int _{\mathcal {M}_\delta } w^2 =\sum _{k=1}^K \int _{\mathcal {M}_\delta \cap B^k} w^2 . \end{aligned}$$
(3.59)

We now use that \(w^2\) is constant in each \(B^k\) and that \(\mathcal {L}(\mathcal {M}_\delta \cap B^k) = \mathcal {O}(\delta )\) [5, Lemma 4] to get

$$\begin{aligned} \int _{\mathcal {M}_\delta \cap B^k} w^2 = \mathcal {L}(\mathcal {M}_\delta \cap B^k)\, w^2|_{B^k} \le C \delta \, \mathcal {L}(B^k)\, w^2|_{B^k} = C\delta \int _{B^k} w^2 \end{aligned}$$
(3.60)

which yields

$$\begin{aligned} \Vert v-u^0\Vert _{L^2(\mathcal {M}_\delta )} \le C\sqrt{\delta } \, \Vert v-u^0\Vert _{L^2(\Omega )} . \end{aligned}$$
(3.61)

To estimate the third term we use Lemma 3.2 and (3.54) to obtain

$$\begin{aligned} \begin{aligned} \Vert u^0- \varphi \Vert _{L^2(\mathcal {M}_\delta )}&\le \Vert u^0-\varphi _0\Vert _{L^2(\mathcal {M}_\delta )} +\sum _{k=1}^K |\beta _k|\Vert \varphi _k\Vert _{L^2(\mathcal {M}_\delta )} \\&\le C\sqrt{\delta } \, \big (\Vert v-u^0\Vert _{L^2(\Omega )}+1\big ) . \end{aligned} \end{aligned}$$
(3.62)

For each \(k=1,\ldots ,K\), we estimate \(\Vert v-\varphi \Vert _{L^2(B^k_\delta )}\) as follows: Since \(\beta \) solves (3.50), we have \(\langle v-\varphi \rangle _{B^k_\delta }=0\), which by the Poincaré inequality (3.10) yields

$$\begin{aligned} \Vert v-\varphi \Vert _{L^2(B^k_\delta )} \le C\Vert \nabla (v-\varphi )\Vert _{L^2(B^k_\delta )} . \end{aligned}$$
(3.63)

Since \(\nabla v=0\) in \(B^k_\delta \), estimates (3.9) and (3.54) yield

$$\begin{aligned} \begin{aligned} \Vert v-\varphi \Vert _{L^2(B^k_\delta )}&\le C_1 \Vert \nabla \varphi \Vert _{L^2(B^k_\delta )} \le C_1 \left( \Vert \nabla \varphi _0\Vert _{L^2(B^k_\delta )} + \sum _{j=1}^K |\beta _j| \Vert \nabla \varphi _j\Vert _{L^2(B^k_\delta )}\right) \\&\le C_2\sqrt{\varepsilon } \big (\Vert v-u^0\Vert _{L^2(\Omega )}+1\big ) . \end{aligned} \end{aligned}$$
(3.64)

Finally, by combining the above, we obtain

$$\begin{aligned} \big \Vert v-Q_K^{\varepsilon }[u_\delta ](v) \big \Vert _{L^2(\Omega )} \le \Vert v-\varphi \Vert _{L^2(\Omega )} \le C\sqrt{\varepsilon +\delta } \, \big (\Vert v-u^0\Vert _{L^2(\Omega )}+1\big ) \end{aligned}$$
(3.65)

which completes the proof. \(\square \)

4 Numerical Examples

Here we present numerical examples which illustrate the main results of our analysis and, in particular, the remarkable accuracy of AS decompositions for piecewise constant media. First, we consider media comprised of a constant background \(u^0\) and a single characteristic function. Second, we consider a medium which consists of an inhomogeneous background comprised of five sets \(\Omega ^{m}\), \(m=1,\ldots ,5\), and four interior inclusions \(A^{k}\), \(k=1,\ldots ,4\) (see Sect. 2.1). In the third example, we consider a medium which consists of four adjacent squares in a constant background. Since the boundaries of the squares are not mutually disjoint, this example is not covered by our theory. Next we apply the AS decomposition to two more complex examples that are also not covered by our theory: a polygonal approximation of the map of Switzerland with its 26 cantons and the well-known Marmousi model from seismic imaging. Finally, we devise a simple iterative inversion algorithm based on AS decompositions to solve a standard deconvolution inverse problem from optical imaging [15].

In all examples the domain \(\Omega \subset \mathbb {R}^{2}\) is rectangular and we use a regular, uniform triangular mesh \(\mathcal {T}_{h}\) whose vertices lie on an equidistant Cartesian grid of size \(h>0\). We let \(\mathcal {V}^{\delta } \subset H^{1}(\Omega )\), with \(\delta = h\), be the standard \(\mathcal {P}^{1}\) FE space of continuous piecewise linear functions and set \(\mathcal {V}^{\delta }_{0} = \mathcal {V}^{\delta } \cap H^{1}_{0}(\Omega )\). For piecewise constant u, we let \(u_\delta \) denote the \(H^{1}\)-conforming (continuous) interpolation of u in the FE space \(\mathcal {V}^\delta \).

We consider decompositions associated with \(L_{\varepsilon }[u_{\delta }]\) given by (2.1) with \(\mu _\varepsilon [\cdot ]\) of the form (2.2) with \(q = 2\). We compute the approximation of the background \(\varphi _{0}\) and the first few eigenfunctions \(\varphi _{k}\) of \(L_{\varepsilon }[u_{\delta }]\) by numerically solving (2.24) and (2.25) using the Galerkin FE method. The discretization of (2.25) leads to a generalized eigenvalue problem

$$\begin{aligned} \textrm{A}\varphi _{k} = \lambda _{k}\textrm{M}\varphi _{k} \quad \text {for}\quad k = 1, \ldots , K, \end{aligned}$$
(4.1)

where the stiffness matrix \(\textrm{A}\) corresponds to the discretization of \(L_{\varepsilon }[u_{\delta }]\) and \(\text {M}\) is the mass matrix. We solve (4.1) numerically using the MATLAB function eigs.
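To make the discrete eigenproblem (4.1) concrete, the following one-dimensional Python sketch assembles the \(\mathcal {P}^{1}\) stiffness and mass matrices for an interpolated characteristic function and solves the generalized eigenvalue problem with a dense solver (a stand-in for MATLAB's eigs). This is not the authors' code: the one-dimensional setting, the inclusion \((0.3,0.7)\), and all names and parameters are our simplifying assumptions.

```python
import numpy as np
from scipy.linalg import eigh

# 1-D model problem on (0,1): u_delta is the P1 interpolant of the
# characteristic function chi_{(0.3,0.7)}, with coefficient
# mu_eps = (|u'|^2 + eps^2)^(-1/2) as in (1.3). Illustrative parameters.
n, eps = 200, 1e-8
h = 1.0 / n
x = np.linspace(0.0, 1.0, n + 1)
u = ((x > 0.3) & (x < 0.7)).astype(float)   # nodal values of u_delta

grad = np.diff(u) / h                       # elementwise gradient of u_delta
mu = 1.0 / np.sqrt(grad**2 + eps**2)        # elementwise coefficient mu_eps

# Assemble stiffness A and mass M over interior nodes (homogeneous Dirichlet).
N = n - 1
A = np.zeros((N, N))
M = np.zeros((N, N))
for e in range(n):                          # element e spans nodes e and e+1
    ke = mu[e] / h * np.array([[1.0, -1.0], [-1.0, 1.0]])
    me = h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
    for a, i in enumerate((e - 1, e)):      # interior indices of nodes e, e+1
        for b, j in enumerate((e - 1, e)):
            if 0 <= i < N and 0 <= j < N:
                A[i, j] += ke[a, b]
                M[i, j] += me[a, b]

# Dense analogue of eigs for A phi = lambda M phi; eigenvalues in ascending order.
lam, V = eigh(A, M)
phi1 = V[:, 0] / np.sqrt(V[:, 0] @ M @ V[:, 0])   # L2-normalized first mode
# lam[0] remains O(1) as eps -> 0 (cf. Theorem 3.4), and phi1 is nearly
# proportional to the interpolated characteristic function.
```

In this toy example the first eigenfunction is essentially constant on \((0.3,0.7)\) and essentially zero outside, mirroring the behavior of the AS basis observed in Sect. 4.1.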

Once we have obtained \(\varphi _{0} \in \mathcal {V}^{\delta }\) and \(\varphi _{k} \in \mathcal {V}^{\delta }_{0}\) for \(k=1,\ldots ,K\), we can compute the projections \(\Pi _K^{\varepsilon }[u_\delta ]\) and \(Q_K^{\varepsilon }[u_\delta ]\) given by (3.3) and (3.5). Since \(\{\varphi _{k}\}_{k=1}^K\) are computed numerically, they satisfy \(\langle \varphi _k,\varphi _j \rangle =\delta _{kj}\) only up to a small error. This slight loss of orthonormality causes small errors when computing the projection \(\Pi _K^{\varepsilon }[u_\delta ]\) directly from the Fourier expansion

$$\begin{aligned} \Pi _{K}^{\varepsilon }[u_{\delta }]v = \sum _{k=1}^{K} \langle \varphi _{k},v \rangle \, \varphi _{k} . \end{aligned}$$

To avoid these errors, we instead compute \(\Pi _K^{\varepsilon }[u_\delta ] v\) by solving the K-dimensional least squares problem

$$\begin{aligned} \Pi _{K}^{\varepsilon }[u_{\delta }]v = \mathop {\textrm{argmin}}\limits _{w \in \Phi _{K}^{\varepsilon ,\delta }}\Vert v-w\Vert _{L^2(\Omega )}, \qquad \Phi _{K}^{\varepsilon ,\delta } = {\text {span}}\{\varphi _{k}\}_{k=1}^{K} . \end{aligned}$$
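In coefficient form, this least-squares problem can be solved through the normal equations. The following sketch assumes the FE mass matrix \(\textrm{M}\) is available, so that \(\langle f,g \rangle _{L^2}=f^T\textrm{M}g\) for coefficient vectors; the function name and the dense linear algebra are ours, not the paper's implementation.

```python
import numpy as np

def project_ls(v, B, M):
    """Least-squares projection of v onto the span of the columns of B via
    the normal equations; M is the FE mass matrix, i.e. <f,g>_{L2} = f^T M g
    for coefficient vectors. Robust to a slightly non-orthonormal basis."""
    G = B.T @ M @ B       # Gram matrix; equals the identity for an exact ON basis
    beta = np.linalg.solve(G, B.T @ (M @ v))
    return B @ beta
```

For an exactly orthonormal basis this reduces to the Fourier expansion above; otherwise the Gram system corrects for the slight loss of orthonormality of the computed eigenfunctions.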

When validating the conclusion of Theorem 3.6 and its corollary in Remark 3.7, we shall focus on two types of errors

$$\begin{aligned} \Vert u - Q_{K}^{\varepsilon }[u_{\delta }](u) \Vert _{L^2(\Omega )} \qquad \text {and} \qquad \Vert u_{\delta } - Q_{K}^{\varepsilon }[u_{\delta }](u_{\delta }) \Vert _{L^2(\Omega )} ; \end{aligned}$$
(4.2)

the first measures the misfit to the true medium u whereas the second measures the misfit to the continuous interpolant \(u_{\delta }\). Note that in both cases the same AS basis is used. Computing these expressions requires the evaluation of \(L^2\) inner products. As the functions participating in the expression on the right lie in the FE space \(\mathcal {V}^\delta \), we can evaluate the needed integrals exactly. In contrast, the expression on the left includes inner products involving a piecewise constant function whose discontinuities are, in general, not aligned with the mesh. Thus, to evaluate the integrals for the error on the left in (4.2), we use a numerical quadrature rule from ACM TOMS Algorithm 584 [16] with degree of precision 8 and 19 quadrature points.
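For the error on the right of (4.2), whose arguments lie in \(\mathcal {V}^\delta \), the exact evaluation reduces to the mass matrix: if e is the nodal coefficient vector of the error, then \(\Vert e\Vert _{L^2(\Omega )}^2=e^T\textrm{M}e\). A one-dimensional sketch (function names are ours, and the 1-D mesh is an illustrative simplification of the triangular meshes used here):

```python
import numpy as np

def p1_mass_matrix(n):
    """Consistent mass matrix of the P1 space on a uniform mesh of (0,1)
    with n elements (n + 1 nodes)."""
    h = 1.0 / n
    M = np.zeros((n + 1, n + 1))
    me = h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])  # exact element mass
    for e in range(n):
        M[e:e + 2, e:e + 2] += me
    return M

def l2_norm_fe(e, M):
    """Exact L2 norm of the FE function with nodal coefficient vector e."""
    return np.sqrt(e @ M @ e)
```

Since the product of two \(\mathcal {P}^{1}\) functions is piecewise quadratic, the element mass matrix integrates it exactly; for instance, the nodal vector of \(v(x)=x\) yields \(\Vert v\Vert _{L^2}^2=1/3\) up to rounding.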

In principle, \(\varepsilon > 0\) should be as small as possible, yet large enough that the matrix \(\textrm{A}\) remains well-conditioned. Unless specified otherwise, we always use \({\varepsilon = 10^{-8}}\).

Fig. 3

Four simple shapes. The exact medium u (or \(u_{\delta }\)) consists of a single characteristic function \(\chi _{A^{1}}\) and vanishing \(u^{0}\)

4.1 Four Simple Shapes

We consider the four 2-dimensional piecewise constant media \({u:\Omega \rightarrow \mathbb {R}}\), in \(\Omega = (0,1)^{2}\), shown in Fig. 3. All four vanish on the boundary \(\partial \Omega \) and correspond to the characteristic function

$$\begin{aligned} u(x) = {\widetilde{u}}(x) = \chi _{A^{1}}(x), \quad x \in \Omega \end{aligned}$$
(4.3)

of a Lipschitz domain and are therefore covered by our analysis. The sets are chosen purposely with different geometric properties: the disc is convex with a smooth boundary; the square is convex, but its boundary is only piecewise smooth; the Pac-Man and the star are both non-convex with piecewise smooth boundaries.

Fig. 4

Four simple shapes. The error \(\Vert u - \Pi _{1}^{\varepsilon }[u_\delta ](u)\Vert _{L^2(\Omega )}\). Left: the error as a function of \(\delta \) for fixed \(\varepsilon = 10^{-8}\). Right: the error as a function of \(\varepsilon \) for fixed mesh-size \(\delta = 0.05/2^{6}\)

Fig. 5

Four simple shapes. The error \(\Vert u_\delta - \Pi _{1}^{\varepsilon }[u_\delta ](u_\delta )\Vert _{L^2(\Omega )}\). Left: the error as a function of \(\delta \) for fixed \(\varepsilon = 10^{-8}\). Right: the error as a function of \(\varepsilon \) for fixed mesh-size \(\delta = 0.05/2^{6}\)

In Fig. 4, we show the error \({\Vert u - \Pi _{1}^{\varepsilon }[u_\delta ](u)\Vert _{L^2(\Omega )}}\). The left frame shows the error for varying mesh-size \(\delta \) but fixed \(\varepsilon = 10^{-8}\). For all four shapes, the error decays as \(\mathcal {O}(\sqrt{\delta })\), as proved in Theorem 3.6. The right frame of Fig. 4 shows the error \(\Vert u - \Pi _{1}^{\varepsilon }[u_\delta ](u)\Vert _{L^2(\Omega )}\) for varying \(\varepsilon \) on the fixed finest mesh, i.e., with smallest \(\delta \). The error initially decreases as \(\varepsilon \) decreases, but then levels off at about \(10^{-2}\); at that point, it can only be improved by further refining the mesh.

To eliminate the interpolation error and thereby illustrate the estimates of Remark 3.7, we show in Fig. 5 the projection error \(\Vert u_{\delta } - \Pi _{1}^{\varepsilon }[u_\delta ](u_{\delta })\Vert _{L^2(\Omega )}\). On the left, we show the approximation error for varying \(\delta \), with \(\varepsilon = 10^{-8}\) fixed: The projections of the disc, the square, and the Pac-Man in the AS basis are remarkably good, with errors at about \(10^{-9}\). For these cases, the projection of each \(u_\delta \) (hence the first eigenfunction \(\varphi _1\) of \(L_\varepsilon [u_\delta ]\)) essentially coincides with \(u_\delta \) itself. In contrast, the error for the star is larger, though it decays at a rate of \(\mathcal {O}(\delta )\), still faster than the upper estimate of \(\mathcal {O}(\sqrt{\delta })\) in Remark 3.7. In all cases, the errors here are significantly smaller than those in the left frame of Fig. 4, indicating that the errors in Fig. 4 are mainly due to interpolating u in \(\mathcal {V}^\delta \).

The error \(\Vert u_{\delta } - \Pi _{1}^{\varepsilon }[u_\delta ](u_{\delta })\Vert _{L^2(\Omega )}\) for varying \(\varepsilon \) and fixed \(\delta \) is shown in the right frame of Fig. 5. Here we observe a decay rate of \(\mathcal {O}(\varepsilon )\), again faster than the upper estimate in Remark 3.7. For all shapes but the star, the error decreases with \(\varepsilon \) down to about \(10^{-9}\); in contrast, the error for the star levels off at about \(10^{-3}\).

The significant difference in the behavior of the error for the star compared to the other shapes, shown in Fig. 5, is due to the interplay between the geometry of the discontinuities in the medium and the mesh. Indeed, if we repeat the experiment for the star with a locally adapted mesh aligned with the star’s geometry, as shown in Fig. 6, the error \(\Vert u_{\delta } - \Pi _{1}^{\varepsilon }[u_\delta ](u_{\delta })\Vert _{L^2(\Omega )}\) also drops below \(10^{-8}\). Note that while \(\delta \) is smaller in this test than in the tests shown in Fig. 5, this reduction by itself is not sufficient to explain the difference of about six orders of magnitude between the errors in Figs. 5 and 6.

Fig. 6

Four simple shapes. Left: the aligned mesh for the star-shaped medium with \(\delta = 0.05/2^{2}\). Right: the error \(\Vert u_{\delta } - \Pi _{1}^{\varepsilon }[u_\delta ](u_{\delta })\Vert _{L^2(\Omega )}\) for mesh-sizes \(\delta = 0.05/2^{m}\), \(m = 1, \ldots , 6\), and fixed \(\varepsilon = 10^{-8}\)

Fig. 7

Nonuniform background. The exact medium u with its background \(\varphi _{0}\) and first four eigenpairs \((\lambda _{i},\varphi _{i})\), \(i=1,\ldots ,4\)

4.2 Nonuniform Background

Next we consider a medium u with non-constant background \(u^{0}\). We let \({u:\Omega \rightarrow \mathbb {R}}\) be the medium shown in frame (a) of Fig. 7, and \(\Omega = (0,1)^{2}\). Here u admits a decomposition (2.4), (2.5) with \({M = 5}\) and \({K = 4}\). Figure 7 also shows the approximation \(\varphi _{0}\) of the background and the first four eigenfunctions \(\varphi _1,\ldots ,\varphi _4\) of \(L_\varepsilon [u_\delta ]\).

Figure 8 (left) shows the error \(\Vert u - Q_{K}^{\varepsilon }[u_\delta ](u)\Vert _{L^2(\Omega )}\) with \(K=4\), for six different meshes with \(\delta = 0.05/2^{m}\), \(m=1,\ldots ,6\). Here we observe an error decay of \(\mathcal {O}(\sqrt{\delta })\), consistent with our theoretical estimates. The right frame of Fig. 8 shows the error \(\Vert u_{\delta } - Q_{K}^{\varepsilon }[u_\delta ](u_{\delta })\Vert _{L^2(\Omega )}\) with \(K=4\), as a function of \(\varepsilon \) with fixed \(\delta = 0.05/2^{6}\). Again, we observe a convergence rate of \(\mathcal {O}(\varepsilon )\), faster than the \(\mathcal {O}(\sqrt{\varepsilon })\) rate proved in Remark 3.7.

Fig. 8

Nonuniform background. Left: the error \(\Vert u - Q_{4}^{\varepsilon }[u_\delta ](u)\Vert _{L^2(\Omega )}\) for mesh-sizes \(\delta = 0.05/2^{m}\), \(m = 1, \ldots , 6\), and fixed \(\varepsilon = 10^{-8}\). Right: the error \(\Vert u_{\delta } - Q_{4}^{\varepsilon }[u_\delta ](u_{\delta })\Vert _{L^2(\Omega )}\) for \(\varepsilon = 10^{-m}\), \(m = 0,\ldots ,8\), and fixed mesh-size \(\delta = 0.05/2^6\)

Fig. 9

Adjacent squares. The medium u and the first four eigenfunctions \(\varphi _{k}\), \(k=1,\ldots ,4\), of the operator \(L_{\varepsilon }[u_{\delta }]\), together with its AS decomposition \(\Pi _{4}^{\varepsilon }[u_\delta ](u_{\delta })\) computed on a mesh with \(\delta = 0.05/2^{6}\)

4.3 Four Adjacent Squares

Let \(\Omega \) be the unit square \(\Omega = (0,1)^{2}\) and

$$\begin{aligned} u(x) = \sum _{k=1}^{4}\alpha _{k}\chi _{A^{k}}(x), \quad x \in \Omega , \end{aligned}$$
(4.4)

with \(\alpha _{k} = k\) for \(k = 1, \ldots , 4\); the resulting piecewise constant medium is shown in Fig. 9. Since the boundaries \(\partial A^{k}\) of the squares \(A^{k}\) are not mutually disjoint, this example is not covered by our analysis. However, we may still compute the AS approximation and measure the approximation error.
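For concreteness, a minimal sketch of (4.4); the paper specifies the squares \(A^{k}\) only through Fig. 9, so placing them as the four quadrants of \(\Omega \) is our own assumption.

```python
# Hypothetical realization of (4.4): we assume the A^k are the four
# quadrants of Omega = (0,1)^2, with alpha_k = k.
def u(x, y):
    k = 1 + int(x >= 0.5) + 2 * int(y >= 0.5)   # quadrant number 1..4
    return float(k)

# one sample point inside each assumed square A^1, ..., A^4
pts = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]
print([u(*p) for p in pts])   # -> [1.0, 2.0, 3.0, 4.0]
```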

In the left frame of Fig. 10 we still observe errors of \(\mathcal {O}(\sqrt{\delta })\), consistent with our theoretical estimates; in the right frame, the error with respect to \(\varepsilon \) again decays at a rate of \(\mathcal {O}(\varepsilon )\).

Fig. 10

Adjacent squares. Left: the error \(\Vert u - \Pi _{4}^{\varepsilon }[u_\delta ](u)\Vert _{L^2(\Omega )}\) for mesh-sizes \(\delta = 0.05/2^{m}\), \(m = 1, \ldots , 6\), and fixed \(\varepsilon = 10^{-8}\). Right: the error \(\Vert u_{\delta } - \Pi _{4}^{\varepsilon }[u_\delta ](u_{\delta })\Vert _{L^2(\Omega )}\) for \(\varepsilon = 10^{-m}\), \(m = 0,\ldots ,8\), and fixed mesh-size \(\delta = 0.05/2^6\)

4.4 Map of Switzerland

Here we consider the polygonal approximation of the map of Switzerland with its \(K = 26\) cantons, shown in frame (a) of Fig. 11, where u takes a constant value in each canton. The data of the map are given on a discrete rectangular pixel-based \(1563\,\textrm{px} \times 1002\,\textrm{px}\) grid with grid-size \(\delta = 1 \,\textrm{px}\). We interpolate the data to obtain \(u_{\delta } \in \mathcal {V}^{\delta }_{0}\) and compute the first \(K=26\) eigenfunctions \(\varphi _1,\ldots ,\varphi _K\) of \(L_\varepsilon [u_\delta ]\); frames (c), (d), and (e) of Fig. 11 show three of them.

Although a single eigenfunction does not necessarily correspond to any particular canton, we may still represent each canton in \(\Phi _{26}^{\varepsilon ,\delta }={\text {span}}\{\varphi _k\}_{k=1}^{26}\). If \(u^{\textrm{c}}\) is the characteristic function for a canton shown in the map in Fig. 11, and \(u^{\textrm{c}}_\delta \) is its continuous (piecewise linear) interpolant in \(\mathcal {V}^{\delta }\), we can use the AS basis \(\{\varphi _{k}\}_{k=1}^{K}\) to approximate \(u_{\delta }^{\textrm{c}}\) as

$$\begin{aligned} u_{\delta }^{\textrm{c}} \approx \Pi _{K}^{\varepsilon }[u_\delta ]\, u_{\delta }^{\textrm{c}} =\sum _{k=1}^{K} \beta _{k}\varphi _{k} , \end{aligned}$$

with \(K=26\). In Fig. 12 we show the approximations for the cantons of Bern, Grisons, and St. Gallen in \(\Phi _{26}^{\varepsilon ,\delta }\). These reconstructions approximate the exact cantons in Fig. 11 very well.

Fig. 11

Polygonal approximation of the map of Switzerland \(u_{\delta }\) and its 26 cantons (top left), together with three eigenfunctions \(\varphi _{k}\), \(k=2,5,15\), of the operator \(L_{\varepsilon }[u_{\delta }]\)

Fig. 12

Map of Switzerland. Three cantons approximated in the truncated AS basis \(\{\varphi _{k}\}_{k=1}^{K}\) with \(K=26\)

Fig. 13

The original Marmousi model with its background \(\varphi _{0}\) and AS decomposition with 100 eigenfunctions

4.5 The Marmousi Model

As a last example we consider the subsurface model of the P-wave velocity of the AGL elastic Marmousi model shown in Fig. 13, see [17, 18]. The data of the model are given as nodal values on a discrete rectangular mesh representing a \(17\, \textrm{km} \times 3.5\, \textrm{km}\) area. We interpolate the data in \(\mathcal {V}^{\delta }\) with \(\delta = 2.5 \,\textrm{m}\) to obtain \(u_{\delta }\). Next, we compute the background \(\varphi _{0} \in \mathcal {V}^{\delta }\) as well as the first 100 eigenfunctions of the operator \(L_{\varepsilon }[u_{\delta }]\).

Remarkably, the background \(\varphi _{0}\) already yields a good approximation of the model with a relative error of

$$\begin{aligned} \frac{\Vert u_{\delta }-Q_{0}^{\varepsilon }[u_{\delta }](u_{\delta })\Vert _{L^2(\Omega )}}{\Vert u_{\delta }\Vert _{L^2(\Omega )}} = \frac{\Vert u_{\delta }-\varphi _{0}\Vert _{L^2(\Omega )}}{\Vert u_{\delta }\Vert _{L^2(\Omega )}} \approx 12.8 \%, \end{aligned}$$

probably because many of the internal layers in the model reach the boundary and thus can be recovered by \(\varphi _{0}\). In contrast, the eigenfunctions \(\varphi _{k}\) (\(k\ge 1\)) account for variations of the medium in the interior of the domain. Here, the additional contribution of the first \(K=100\) eigenfunctions to the approximation further reduces the relative error to \(\Vert u_{\delta }-Q_{K}^{\varepsilon }[u_{\delta }](u_{\delta })\Vert _{L^2(\Omega )}/\Vert u_{\delta }\Vert _{L^2(\Omega )} \approx 3.8\%\).

4.6 Inverse Problem

Here we devise an iterative inversion algorithm based on AS decompositions to solve a standard linear deconvolution inverse problem, which occurs in optical imaging [15]. Specifically, we consider the Fredholm integral equation of the first kind

$$\begin{aligned} Fu =y \end{aligned}$$
(4.5)

where \(F: L^2(\Omega ) \rightarrow L^2(\Omega )\) is the convolution operator

$$\begin{aligned} Fu(x) = \int _\Omega g(x - x^\prime ) u(x^\prime ) \; dx^\prime , \end{aligned}$$
(4.6)

with \(\Omega = (0,1)^2\) and g the Gaussian kernel

$$\begin{aligned} g(x) = \frac{1}{2 \pi \gamma ^2} \, e^{-\frac{|x|^2}{2 \gamma ^2}}, \qquad \gamma = \frac{1}{32}. \end{aligned}$$
(4.7)
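A discrete version of the operator (4.6) with the kernel (4.7) can be sketched by midpoint quadrature; the grid size and the constant test function below are our own illustrative choices. Since g integrates to one, F maps constants approximately to themselves away from the boundary.

```python
import numpy as np

# Midpoint-quadrature sketch of Fu(x) = int_Omega g(x - x') u(x') dx'
# with the Gaussian kernel (4.7) on Omega = (0,1)^2.
gamma = 1.0 / 32.0
n = 64                                        # illustrative grid resolution
h = 1.0 / n
x = (np.arange(n) + 0.5) * h                  # cell midpoints in (0,1)
X, Y = np.meshgrid(x, x, indexing="ij")

def g(r2):
    """Gaussian kernel (4.7), evaluated at squared distance r2."""
    return np.exp(-r2 / (2 * gamma**2)) / (2 * np.pi * gamma**2)

def F(u):
    """Apply the discretized convolution operator to grid values u."""
    out = np.zeros_like(u)
    for i in range(n):
        for j in range(n):
            r2 = (X - x[i])**2 + (Y - x[j])**2
            out[i, j] = h**2 * np.sum(g(r2) * u)
    return out

u = np.ones((n, n))
Fu = F(u)
# g has unit mass and gamma is small, so F(1) ~ 1 away from the boundary
print(abs(Fu[n // 2, n // 2] - 1.0))          # small quadrature/truncation error
```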

Given the noisy observation \(y^\eta \) of \(y^\dagger =Fu^\dagger \) where \(\Vert y^\dagger - y^\eta \Vert _{L^2(\Omega )} \le \eta \), we wish to reconstruct the true medium/image \(u^\dagger \). In doing so, we assume the FE interpolant \(u^\dagger _h\) of \(u^\dagger \) is known on the boundary \(\partial \Omega \).

First, we formulate the problem as the minimization of

$$\begin{aligned} \mathcal {J}(u) = \frac{1}{2}\Vert Fu - y^\eta \Vert _{L^2(\Omega )}^2 \end{aligned}$$
(4.8)

in some appropriate space. Then, we proceed iteratively as follows: In the \(m\)-th iteration, given the previous estimate \(u^{(m-1)}\) of \(u^\dagger \), we compute \(\varphi _k^{(m)}\) \((k = 0, \ldots , K)\) by solving

$$\begin{aligned} \begin{aligned} L_{\varepsilon }[u^{(m-1)}] \varphi _0^{(m)}&= 0 \;{} & {} \text {in}\;\Omega , \qquad \varphi _0^{(m)} = u^\dagger _h \;{} & {} \text {on}\;\partial \Omega , \\ L_{\varepsilon }[u^{(m-1)}] \varphi _k^{(m)}&= \lambda _k \varphi _k^{(m)} \;{} & {} \text {in}\; \Omega , \qquad \varphi _k^{(m)} = 0 \;{} & {} \text {on}\;\partial \Omega . \end{aligned} \end{aligned}$$
(4.9)

Next, we compute the current estimate, \(u^{(m)}\), by solving the least squares (LS) problem:

$$\begin{aligned} u^{(m)} = \arg \min \Big \{\mathcal {J}(u):\ u \in \varphi _0^{(m)} + \Phi ^{(m)}_K\Big \} , \qquad \Phi _K^{(m)} = {\text {span}}\Big \{\varphi _k^{(m)}\Big \}_{k=1}^K. \end{aligned}$$
(4.10)

Since the dimension K of this LS problem is small, we may solve it directly. The iteration stops when the discrepancy principle,

$$\begin{aligned} \Vert F u^{(m)} - y^\eta \Vert _{L^2(\Omega )} \le \tau \eta , \end{aligned}$$
(4.11)

is satisfied for some fixed \(\tau \ge 1\); then \(u^\textrm{ASI}\) denotes the estimate \(u^{(m)}\) at the final iteration.
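The iteration (4.9)–(4.10) can be sketched in one dimension. The sketch below uses our own finite-difference eigensolver and a dense Gaussian blur matrix in place of the paper's 2D FE discretization, and it assumes \(u^\dagger = 0\) on \(\partial \Omega \), so that \(\varphi _0^{(m)} = 0\); the grid, kernel width, noise level, and K are illustrative.

```python
import numpy as np

# 1D sketch of the ASI loop (4.9)-(4.10) under simplifying assumptions:
# u^dagger vanishes on the boundary, so phi_0 = 0 in every iteration.
rng = np.random.default_rng(0)
n, eps, K = 200, 1e-6, 5
h = 1.0 / n
x = np.linspace(0, 1, n + 1)
u_true = 1.0 * ((x > 0.2) & (x < 0.45)) + 2.0 * ((x > 0.6) & (x < 0.8))

# forward operator: dense Gaussian blur matrix (midpoint quadrature)
gamma = 0.03
G = np.exp(-((x[:, None] - x[None, :])**2) / (2 * gamma**2))
F = G * h / (np.sqrt(2 * np.pi) * gamma)
y = F @ u_true + 0.01 * rng.standard_normal(n + 1)   # noisy data y^eta

def as_basis(v):
    """First K Dirichlet eigenfunctions of -(mu_eps[v] w')', cf. (1.3)."""
    mu = 1.0 / np.sqrt((np.diff(v) / h)**2 + eps**2)
    A = (np.diag(mu[:-1] + mu[1:]) - np.diag(mu[1:-1], 1)
         - np.diag(mu[1:-1], -1)) / h**2
    _, W = np.linalg.eigh(A)
    Phi = np.zeros((n + 1, K))
    Phi[1:-1, :] = W[:, :K]                          # zero Dirichlet BC
    return Phi

u = y.copy()                                         # u^(0) = y^eta
for m in range(3):                                   # a few ASI iterations
    Phi = as_basis(u)                                # adapted basis, (4.9)
    c, *_ = np.linalg.lstsq(F @ Phi, y, rcond=None)  # LS step, (4.10)
    u = Phi @ c
print(np.linalg.norm(u - u_true) / np.linalg.norm(u_true))
```

In a full implementation the loop would be terminated by the discrepancy principle (4.11) instead of a fixed iteration count.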

In practice, we solve the problem numerically with the FE method. As in the previous numerical examples, we use standard \(\mathcal {P}^1\)-FE on a uniform triangular mesh with mesh size \(h = \delta = 0.00625\) to discretize the deconvolution problem (4.5) and the AS problems (4.9), (4.10). Hence, the diameter \(2\gamma = 1/16\) (in the Euclidean norm) of the unit ball in the Mahalanobis distance of the Gaussian convolution kernel in (4.7) corresponds to approximately 10 triangular elements (“pixels”).

The discretization of (4.5) yields a linear system of equations

$$\begin{aligned} F_h \vec u_h = \vec y_h , \end{aligned}$$
(4.12)

which is ill-conditioned, as the smallest singular value of \(F_h\) is \(\sigma _\textrm{min} \approx 10^{-17}\). For the test below, we let the exact medium/image \(u^\dagger \) be given by Fig. 7a, set the noise level to \(\eta \approx 4\%\), and choose \(u^{(0)} = y^\eta \) and \(K = 100\). The dimension \(K = 100\) of the LS problem in (4.10) is thus indeed small compared to the dimension \(N \approx 26{,}000\) of the FE space, and we can solve it directly.

For comparison, we also solve the deconvolution problem with two other standard approaches. In the first, we solve (4.12) directly using the LU-decomposition to obtain the solution \(u^\textrm{LU}\). In the second, we apply the truncated singular value decomposition (TSVD), i.e., we regularize (4.12) by replacing all singular values of \(F_h\) smaller than \(\sqrt{\eta }\) by zeros; see [15, Chapter 8] or [19, Chapter 1] for more details; that solution is denoted by \(u^\textrm{TSVD}\).
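The TSVD regularization used for comparison amounts to applying a pseudoinverse with the small singular values zeroed out. The sketch below applies it to a small ill-conditioned matrix of our own choosing (a Hilbert-type matrix, not \(F_h\)); the truncation tolerance is illustrative.

```python
import numpy as np

def tsvd_solve(F, y, tol):
    """Solve F x = y with all singular values below tol replaced by zero."""
    U, s, Vt = np.linalg.svd(F, full_matrices=False)
    s_inv = np.where(s > tol, 1.0 / s, 0.0)   # truncated pseudoinverse of s
    return Vt.T @ (s_inv * (U.T @ y))

# tiny ill-conditioned test case (our own, not the F_h of the paper)
n = 8
F = 1.0 / (np.arange(n)[:, None] + np.arange(n)[None, :] + 1.0)
x_true = np.ones(n)
y = F @ x_true
x = tsvd_solve(F, y, tol=1e-8)
print(np.linalg.norm(F @ x - y))   # small residual despite the truncation
```

Truncation stabilizes the solve at the price of losing the components of the solution along the discarded singular vectors, which is why sharp discontinuities are smoothed out in \(u^\textrm{TSVD}\).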

Table 1 provides the relative \(L^2\) error

$$\begin{aligned} e_\textrm{r} = \frac{\Vert u - u^\dagger _h\Vert _{L^2(\Omega )}}{\Vert u^\dagger _h\Vert _{L^2(\Omega )}} \qquad \text {and the ratio} \qquad \tau = \frac{1}{\eta } \left\| F u - y^\eta \right\| _{L^2(\Omega )} \end{aligned}$$
(4.13)

for the discrepancy principle (4.11), for the three reconstructions \(u^\textrm{ASI}\), \(u^\textrm{TSVD}\), and \(u^\textrm{LU}\) shown in Fig. 14. As expected, solving the inverse problem via the LU-decomposition produces the solution with the largest relative \(L^2\) error, despite a rather small misfit. The TSVD solution \(u^\textrm{TSVD}\) yields an acceptable reconstruction with a relative error below \(20 \;\%\) and \(\tau \approx 1\); still, as shown in Fig. 14b, the discontinuities are not well represented. In contrast, the ASI solution in Fig. 14a has the smallest relative \(L^2\) error, and the discontinuities in the medium are better detected. Clearly, many image reconstruction techniques more sophisticated than TSVD are available [15, 19, 20]; TSVD is used here only for the purpose of illustration.

Table 1 Inverse problem. The relative error \(e_\textrm{r}\) and the relative misfit \(\tau \) given by (4.13) for the three reconstructions \(u^\textrm{ASI}\), \(u^\textrm{TSVD}\) and \(u^\textrm{LU}\) for the inverse problem (4.12) with \(4\;\%\) added noise
Fig. 14

Inverse problem. Solutions obtained by the three different methods to solve the inverse problem (4.12) with \(4\;\%\) added noise